OpenAI Unveils ChatGPT Images 2.0 with Enhanced Text and Reasoning
Kemal Sivri
OpenAI has launched ChatGPT Images 2.0, a significant upgrade to its AI image generation model. The new version renders non-Latin text more accurately and adds reasoning abilities intended to make its outputs more reliable.
OpenAI is rolling out ChatGPT Images 2.0, a substantial upgrade to its AI image generation capabilities. The updated system promises a “step change” in instruction following, detailed text rendering, and object placement within scenes. For the first time, OpenAI has integrated reasoning capabilities into its image model, allowing it to perform tasks like web searches and output verification, with the aim of improving accuracy, consistency, and visual cohesion.
A major focus for Images 2.0 is its improved understanding and rendering of non-Latin text, with OpenAI reporting significant advancements in handling Japanese, Korean, Chinese, Hindi, and Bengali. The company also claims the new model is better at capturing the nuances of different visual languages, making it more suitable for applications like game prototyping and storyboarding. Beyond text, the model offers more flexible output options: it supports aspect ratios from 3:1 (wide) to 1:3 (tall), generates images at resolutions up to 2K, and can produce up to eight images at once.
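To make the aspect-ratio range concrete, here is a minimal arithmetic sketch of the pixel dimensions those ratios would imply. It assumes "2K" means a long edge of 2,048 pixels, which the announcement does not specify. The function names are hypothetical, and nothing here reflects OpenAI's actual API or the sizes it really returns.

```python
from fractions import Fraction

# Assumption: "2K" is read as a 2048-pixel long edge; the article does not define it.
ASSUMED_LONG_EDGE = 2048

def is_supported_ratio(ratio_w: int, ratio_h: int) -> bool:
    """Check that a width:height ratio lies between 1:3 (tall) and 3:1 (wide)."""
    ratio = Fraction(ratio_w, ratio_h)
    return Fraction(1, 3) <= ratio <= Fraction(3, 1)

def dimensions_for_ratio(ratio_w: int, ratio_h: int,
                         long_edge: int = ASSUMED_LONG_EDGE) -> tuple[int, int]:
    """Return (width, height) in pixels with the longer side set to long_edge."""
    if ratio_w <= 0 or ratio_h <= 0:
        raise ValueError("ratio terms must be positive")
    if ratio_w >= ratio_h:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge * ratio_h / ratio_w)
    # Portrait: height is the long edge.
    return round(long_edge * ratio_w / ratio_h), long_edge

if __name__ == "__main__":
    for w, h in [(3, 1), (16, 9), (1, 1), (1, 3)]:
        print(f"{w}:{h}", is_supported_ratio(w, h), dimensions_for_ratio(w, h))
```

Under that assumption, the widest supported frame (3:1) would come out at roughly 2048 by 683 pixels, and the tallest (1:3) would be the same dimensions rotated.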
During a preview, Images 2.0 generated a tortoiseshell cat in the pixel art style of Pokémon's third generation, a task that often challenges AI models. It also successfully converted the image into a transparent PNG and created a four-page manga about a cat enjoying a sunny day. One output deviated slightly from the prompt, but the overall performance was commendable, especially on complex requests like the transparent PNG.
ChatGPT Images 2.0 is now accessible to all ChatGPT users, including those on the Free and Go tiers, with Plus and Pro subscribers gaining access to more advanced features. The model is also available via OpenAI's API and its Codex coding app. This launch comes shortly after Anthropic introduced its own design assistant, intensifying the competition in the AI-driven visual design space.
Original Source: https://www.engadget.com/ai/chatgpt-images-20-is-better-at-rendering-non-latin-text-190000153.html?src=rss