AI artwork is all over the Internet, with DALL-E being the most prominent and most talked-about player. Think of a few words or a simple sentence, like “Tintin and Elon Musk go on an adventure”, and watch it come alive in a few seconds with varying degrees of success. The output isn’t perfect, but the technology is improving drastically every few months.
Those improvements are on display in Stable Diffusion, an open-source image synthesis model that generates an image to match a text prompt. The text-to-image programme is free to use for anyone with a decent computer and a little technical know-how. Released only a few weeks ago, it has been embraced by the AI art community.
We tried “Elvis reading a book” and the result was almost spot on, while “Shah Rukh Khan eating French fries” resulted in an artwork with a cool vibe. The company behind Stable Diffusion is now in discussions to raise $100 million from investors, according to three people with knowledge of the matter. Until now, according to Forbes, much of the company’s funds came directly from founder and CEO Emad Mostaque, a former hedge fund manager.
Mostaque, 39, hails from Bangladesh and grew up in England. He has a master’s degree in mathematics and computer science from Oxford University (2005) and spent 13 years working at UK hedge funds. One of the main differences between Stable Diffusion and other AI art generators is that it is open source. You can try the public demo, which is slow, or a software beta called DreamStudio that is fast and easy to use. There is also a full-fat version of the model that anyone can download and tinker with.
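For readers comfortable with Python, running the downloadable model might look something like the sketch below. It assumes the Hugging Face `diffusers` library, PyTorch, and the publicly released Stable Diffusion checkpoint; the exact model name and hardware requirements are assumptions, not details from this article.

```python
def generate(prompt: str, out_path: str = "artwork.png") -> str:
    """Render a text prompt to an image file and return the file path.

    A minimal sketch, assuming the Hugging Face `diffusers` library and
    the "runwayml/stable-diffusion-v1-5" checkpoint. The weights are a
    multi-gigabyte download on first run, and a GPU with roughly 8 GB
    of VRAM is typical for half-precision inference.
    """
    # Imports live inside the function so the sketch can be read
    # without the heavyweight dependencies installed.
    from diffusers import StableDiffusionPipeline
    import torch

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # assumed checkpoint name
        torch_dtype=torch.float16,          # half precision to save memory
    )
    pipe = pipe.to("cuda")                  # move the model to the GPU

    image = pipe(prompt).images[0]          # run the text-to-image pipeline
    image.save(out_path)
    return out_path


if __name__ == "__main__":
    # For example, one of the prompts we tried:
    generate("Elvis reading a book")
```

This is the appeal of the open release: the same few lines work on a home machine, with no waitlist or API key, which is exactly what closed platforms do not allow.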
Other platforms, like OpenAI’s GPT-3, initially had limited access over fears the software would be used to create spam and propaganda. Those fears have not materialised so far, and the artwork you get can be quite interesting. Stable Diffusion is trained on a vast dataset of images that it mines for patterns and learns to replicate.