AI startup Stability AI recently unveiled the latest version of its image generator, Stable Diffusion XL (SDXL) v0.9, which produces noticeably more photorealistic imagery than its predecessor. With significant improvements in visual quality and detail, the new model could change how people create and use generative AI imagery.
Compared with the previous beta version, SDXL v0.9 responds better to text prompts and produces images with stronger composition and finer detail. One example is a prompt for a wolf in Yosemite National Park, which the new version renders far more realistically than the former one. This progress can be attributed to an increased number of parameters, which gives the new model more capacity to capture detail.
A noteworthy shift in the latest release is its ability to generate impressive results from simpler and less structured inputs, reducing the need for complicated prompts. This makes the model easier to communicate with and more user-friendly. Consequently, it could pose a substantial threat to rival applications such as Midjourney, whose primary selling point is ease of use.
Additionally, the cinematic aesthetics and precise object rendering of SDXL v0.9 bear a resemblance to Midjourney’s visual style, potentially serving as a robust selling point for Stability AI. Looking ahead, the new version is set to be integrated with Clipdrop, an AI image generation and editing tool, and will become accessible to the company’s API customers soon. However, the model is not yet available for training or fine-tuning and cannot yet be run locally. Once it is publicly released, it will require a powerful system to operate.
Stability AI’s consistent advancements in technology, such as SDXL v0.9, have garnered significant recognition, as demonstrated by the company’s inclusion in TIME’s list of the most influential companies of 2023. Despite this progress, the company is not resting on its laurels. It continues to work on other projects such as its large language model, StableLM, and DeepFloyd IF, a text-to-image generator capable of embedding legible text into images.
The public release of SDXL v0.9 as open-source software is anticipated in mid-July, marking another milestone for the company.
As AI-based image generation evolves, the potential applications and use cases for these models continue to expand. From realistic and immersive digital art to new layers of interactivity in design and marketing, the future holds exciting possibilities for both the AI industry and the creatives who will use these tools.
Source: Decrypt