
A new AI is so good at producing images that it could quickly find a home as part of the production process… but it’s definitely not putting humans out of a job.
DALL·E 2 is a new neural network algorithm from research lab OpenAI. It hasn’t been released to the public, but a small and growing number of people — one thousand a week — have been given private beta access and are raving about it.
“It’s clear that DALL·E — while not without shortcomings — is leaps and bounds ahead of existing image generation technology,” said Aaron Hertzmann at The Conversation.
READ MORE: Give this AI a few words of description and it produces a stunning image – but is it art? (The Conversation)
“It is the most advanced image generation tool I’ve seen to date,” says Casey Newton at The Verge. “DALL·E feels like a breakthrough in the history of consumer tech.”
Visual artist Alan Resnick, another beta tester, tweeted: “Every image in this thread was entirely created by the AI called DALL·E 2 from @OpenAI from simple text prompts. I’ve been using it for about a day and I feel truly insane.”
By all accounts, using DALL·E 2 is child’s play. You simply type a short phrase into a text box, and it pings back a set of images in less than a minute.
But instead of being culled from the web, the images are brand new, each one reflecting some version of the entered phrase. For example, when Hertzmann gave DALL·E 2 the text prompt “cats in devo hats,” it produced images in a range of styles.
“cats in devo hats” #dalle pic.twitter.com/kkFaKF0zUJ
— Aaron Hertzmann (@AaronHertzmann) June 9, 2022
As the name suggests, this is the second iteration of the system, upgraded to generate more realistic and accurate images at four times the resolution.
DALL·E 2 can create original, realistic images and art from a text description. It can combine concepts, attributes, and styles; make realistic edits to existing images from a natural language caption; and add and remove elements while taking shadows, reflections, and textures into account.
“It’s staggering that an algorithm can do this,” Hertzmann reflects. “Not all of the images will look pleasing to the eye, nor do they necessarily reflect what you had in mind. But, even with the need to sift through many outputs or try different text prompts, there’s no other existing way to pump out so many great results so quickly — not even by hiring an artist. And, sometimes, the unexpected results are the best.”
How does it work? As explained on the OpenAI website, DALL·E 2 has learned the relationship between images and the text used to describe them. It uses a process called “diffusion,” which starts with a pattern of random dots and gradually alters that pattern toward an image as it recognizes specific aspects of that image.
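OpenAI hasn’t published runnable code for this process, but the shape of the idea can be caricatured in a few lines. The sketch below is a toy, not DALL·E 2’s actual model: the “denoiser” here is an oracle that already knows the target pattern, standing in for the learned, text-conditioned network that would predict the denoising direction.

```python
import numpy as np

# Toy illustration of the "diffusion" idea: start from pure noise and
# repeatedly nudge the sample toward a target pattern, with the injected
# noise fading as the process runs. Real systems learn the denoising
# step from data; this oracle version only shows the loop's structure.

rng = np.random.default_rng(0)

target = np.array([0.0, 0.5, 1.0, 0.5, 0.0])  # stand-in for an "image"
x = rng.normal(size=target.shape)              # step 0: random dots

steps = 50
for t in range(steps):
    # Move a fraction of the way toward the target (a trained model
    # would predict this direction from the noisy input and the text).
    x = x + 0.1 * (target - x)
    # Keep some randomness early on; it shrinks to zero as t grows.
    x = x + rng.normal(scale=0.05 * (1 - t / steps), size=x.shape)

print(np.round(x, 2))  # close to the target pattern
```

Each pass removes a little noise and sharpens the pattern, which is why the intermediate outputs of real diffusion models look like an image slowly emerging from static.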
Although there is a debate to be had about whether what the AI produces is art, that almost seems beside the point when DALL·E 2 seems to automate so much of the creative process itself.
It can already create realistic images in seconds, making it a tool that will find a ready use in production. You could imagine its use for rapidly putting together storyboards, or as imagery to sell a pitch where you can quickly visualize characters or locations and just as quickly iterate them.
EXPLORING ARTIFICIAL INTELLIGENCE:
With nearly half of all media and media tech companies incorporating Artificial Intelligence into their operations or product lines, AI and machine learning tools are rapidly transforming content creation, delivery and consumption. Find out what you need to know with these essential insights curated from the NAB Amplify archives:
- This Will Be Your 2032: Quantum Sensors, AI With Feeling, and Life Beyond Glass
- Learn How Data, AI and Automation Will Shape Your Future
- Where Are We With AI and ML in M&E?
- How Creativity and Data Are a Match Made in Hollywood/Heaven
- How to Process the Difference Between AI and Machine Learning
Technology and media analyst Ben Thompson suggests how DALL·E could be used to create extremely cheap environments and objects in the metaverse.
It’s the potential of such a tool to help a creative artist brainstorm and evolve ideas that is exciting.
Every image in this thread was entirely created by the AI called DALL·E 2 from @OpenAI from simple text prompts. I’ve been using it for about a day and I feel truly insane. pic.twitter.com/b7uYyOA33D
— Alan Resnick (@alanresnicks) May 20, 2022
The term for this is prompting.
“I would argue that the art, in using a system like DALL·E 2, comes not just from the final text prompt, but in the entire creative process that led to that prompt,” says Hertzmann. “Different artists will follow different processes and end up with different results that reflect their own approaches, skills and obsessions.”
READ MORE: DALL-E, the Metaverse, and Zero Marginal Content (Stratechery)
Some artists, like Ryan Murdoch, have advocated for prompt-based image-making to be recognized as art.
Johnny Johnson, who teaches immersive production at the UK’s National Film and TV School (NFTS) StoryFutures Academy, thinks future versions of AI tech like DALL·E 2 will be capable of making entire feature films, with AI-generated scripts and AI-generated audio performances alongside the images.
“DALL·E 2 will change the industry from production design and concept art right across the board,” he tells NAB Amplify. “New jobs will be created, such as Prompt Engineer, who writes the prompts into the AI to generate very specific outputs.”
Naturally, there are alarm bells. No Film School headlines its article “Will Filmmakers Be Needed in the Future?”
“If DALL·E 2’s technology is truly as groundbreaking and revolutionary as advertised, either as it is now or in a future version, who’s to say that clients are going to need the help of filmmakers or video professionals in the future at all?”
NFS continues, “The same could potentially be even more true for graphic designers, 3D animators, and digital artists of any ilk.”
But as The Verge’s Newton observes, DALL·E is hardly sentient. “It seems wrong to describe any of this as ‘creative’ — what we’re looking at here are nothing more than probabilistic guesses — even if they have the same [emotional] effect that looking at something truly creative would.”
READ MORE: How Dall-E Could Power a Creative Revolution (The Verge)
In that sense, AI can also help maintain the creative spark that comes with happy accidents.
As Hertzmann explains, “When I have something very specific I want to make, DALL·E 2 often can’t do it. The results would require a lot of difficult manual editing afterward. It’s when my goals are vague that the process is most delightful, offering up surprises that lead to new ideas that themselves lead to more ideas and so on.”
No Deepfakes Here
Perhaps stung by accusations of bias in its language model GPT-2, OpenAI (which was founded in 2015 by investors including Elon Musk) is at pains to “develop and deploy AI responsibly.”
Part of this effort is in opening up DALL·E to select users in order to stress-test its limitations and capabilities, and in limiting the AI’s ability to generate violent, hateful, or adult images.
It explains, “By removing the most explicit content from the training data, we minimized DALL·E 2’s exposure to these concepts. We also used advanced techniques to prevent photorealistic generations of real individuals’ faces, including those of public figures.”
For example, type in the keyword “shooting” and it will be blocked, Newton finds. “You’re also not allowed to use it to create images intended to deceive — no deepfakes allowed. And while there’s no prohibition against trying to make images based on public figures, you can’t upload photos of people without their permission, and the technology seems to slightly blur most faces to make it clear that the images have been manipulated.”
OpenAI hasn’t yet made any decisions about whether and how DALL·E might someday become available more generally. But it’s not the only text-to-image system advancing this field.
Google has a similar project called Imagen, while HuggingFace has released its own text-to-image engine, called DALL·E mini. Despite the name, it is unrelated to OpenAI’s system. HuggingFace might expect a cease and desist letter in the post, since not only does it use a similar name, but the engine doesn’t appear to be anywhere near as good as OpenAI’s.
Asked to generate images of actor Channing Tatum, the AI came back with a set of images Francis Bacon would be proud of. Nonetheless, this technology is coming and will be in use in production faster than you think.

