
All that you need to know about Dall-E

It can revolutionise the usage of AI and creators have taken care that it’s not misused

Mathures Paul Published 21.06.22, 02:39 AM
Dall-E 2’s picture translation for ‘Teddy bears mixing sparkling chemicals as mad scientists, steampunk’. Pictures: OpenAI

The weekend was spent rolling in laughter as minutes became hours using Dall-E Mini, a neural network that can turn words (sentences, to be exact) into images. It is modelled on Dall-E, the powerful software from the folks at OpenAI that can produce a range of images in a matter of seconds. The AI engine has been trained by looking at millions of images available on the Internet as well as the text that accompanies them.

Results can be hilarious as well as amazing. What are the chances of keying in “Cat travelling in a car while having coffee” or “Astronaut picking flowers on the moon” or “Cat having hot dog at office” and coming away with a selection of images that are quirky but good enough to get the job done?

Dall-E 2 can revolutionise the usage of AI, and its creators have taken care that it’s not misused. Note that we say Dall-E 2: the original programme was created in 2021. The name is a mash-up of Salvador Dali and Wall-E, which suits the quirkiness of the results.

The seven-year-old company from San Francisco is already known for creating GPT-3, which generates complex text passages from simple prompts, and the technology behind Copilot, which helps automate writing code for software engineers. The original version of Dall-E restricted images to 256-by-256 pixel squares; Dall-E 2 raises that to 1,024 by 1,024 pixels, or 16 times as many pixels per image.

The advance goes far beyond pixel count. Dall-E 2 also offers something called “inpainting”, that is, replacing one or more elements in an image with another. It can combine unrelated ideas in one picture too, so you can have an astronaut picking flowers or a cat having coffee while riding in a car. Beyond creating an image in a single style, you can ask for different art techniques, such as a style of drawing, an oil painting, a 1960s movie poster and so on.
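For readers who like to see an idea in code, here is a toy Python sketch of inpainting as described above: a mask marks the part of a picture to be replaced, and everything outside the mask is left alone. It is an illustration only and not how Dall-E 2 works internally. The grid of labels stands in for pixels, and the `fill` function is a hypothetical stand-in for the AI model that would generate the new content.

```python
from typing import Callable, List

Image = List[List[str]]   # toy image: a grid of labels standing in for pixels
Mask = List[List[bool]]   # True marks the cells to be replaced


def inpaint(image: Image, mask: Mask, fill: Callable[[int, int], str]) -> Image:
    """Return a copy of `image` with masked cells replaced by `fill(row, col)`.

    `fill` is a hypothetical stand-in for the generative model; a real system
    would synthesise new pixels that match the prompt and the surroundings.
    """
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [fill(r, c) if mask[r][c] else image[r][c] for c in range(len(image[r]))]
        for r in range(len(image))
    ]


# A 3x3 "photo": a dog sitting on a sofa.
photo = [
    ["wall", "wall", "wall"],
    ["sofa", "dog", "sofa"],
    ["floor", "floor", "floor"],
]
# Mask only the cell that holds the dog.
dog_mask = [
    [False, False, False],
    [False, True, False],
    [False, False, False],
]

edited = inpaint(photo, dog_mask, lambda r, c: "cat")
print(edited[1])  # ['sofa', 'cat', 'sofa']
```

The original picture is not modified; only the masked cell changes in the copy, which is the essence of swapping one element for another.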

The results are mesmerising enough to have started a trend on social media as well as a subreddit (visit r/weirddalle). The company also has the confidence of investors, who include Microsoft, Reid Hoffman’s charitable foundation, and Khosla Ventures.

Access to the full-fledged Dall-E 2 is restricted at the moment, so the next best choice is to use Dall-E Mini, which draws on open-source code from a loosely organised team of developers and is often overloaded with demand.

Dall-E 2’s picture interpretation for ‘A photo of an astronaut riding a horse’

Using the Mini version takes time, as you will often be greeted with the message “Too much traffic, please try again.” The best way to tackle it is to keep trying; after hitting the ‘Run’ button three to 10 times, you should get lucky.
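The keep-trying approach can be written down as a small retry loop. The Python sketch below is only an illustration under stated assumptions: it does not talk to Dall-E Mini at all. The `TooMuchTraffic` error, the `run_with_retries` helper and the fake service are all hypothetical names invented for this example.

```python
import time
from typing import Callable


class TooMuchTraffic(Exception):
    """Hypothetical error standing in for the 'Too much traffic' message."""


def run_with_retries(generate: Callable[[], str], max_attempts: int = 10,
                     wait_seconds: float = 2.0) -> str:
    """Call `generate` until it succeeds or `max_attempts` is used up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return generate()
        except TooMuchTraffic:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(wait_seconds)  # short pause before pressing 'Run' again
    raise ValueError("max_attempts must be at least 1")


def make_flaky_service(failures_before_success: int) -> Callable[[], str]:
    """Build a fake service that is busy a fixed number of times, then answers."""
    remaining = [failures_before_success]

    def generate() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TooMuchTraffic("Too much traffic, please try again.")
        return "nine images of a cat having coffee in a car"

    return generate


# The fake service is busy three times and answers on the fourth attempt.
result = run_with_retries(make_flaky_service(3), max_attempts=10, wait_seconds=0)
print(result)
```

Capping the number of attempts matters: if the service stays busy, the loop gives up and reports the error instead of running forever.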

The developers have a strong content policy in place to stop bullying, harassment or the creation of sexual or political content.

Text-to-picture technology is having its moment. There is also Midjourney, which requires you to fill out a form; NightCafe, which offers something similar; and AI Art Maker, which takes a simpler approach.

A bigger question is what happens when these AI text-to-drawing tools reach a level of maturity that may threaten the livelihoods of artists who are employed to work on original artwork at studios. Plus, AI has got a bad name because of some other players, whose tools have given way to fake news and manipulated images. Dall-E 2 has a secure approach and hopefully it will remain that way. But what could be a more practical implementation of the technology? Only time will tell.
