I took DALL-E 3 for a Test Run. How did it do?

There’s been a lot of excitement recently about DALL-E 3, the text-to-image generator from OpenAI. So I decided to try it.

I’ve been revamping my website, and finding (free) photos and images is a challenge, especially to illustrate a topic such as writing. I usually spend a while searching Google Images or Freepik for the right photos. I don’t always find them.

So this time, I tried DALL-E 3, and it was wild.

Words Turned into Images

A brief tutorial I found suggested you go to bing.com, select Chat and start your prompt with “Create this image:”. So I did that, sometimes adding the word please. (I’ve heard using manners and encouragement with generative AI chatbots can produce better results and, also, why not?)

Once I entered my prompt, it usually responded: “I’ll try to create that.” Then maybe 10, 20 or even 30 seconds elapsed as it worked to create my requested image. It offered a “Stop Responding” button in case I changed my mind while it was processing.

Instant Images

It generated four similar images from one prompt request. The differences from one image to another were usually subtle: a plant in a different spot, the rug a different shade, the bookcase a different style.

With each prompt, it also generated three automatic prompt buttons to alter the image, such as: Would you like to add a window (yes), make the walls blue (no), add a table (no)? I could click one of them, add more detail, or change my original prompt.

Trial and Error

Like ChatGPT, it was a lot of trial and error. The more specific I was, the better the results generally were. (You’re permitted up to 4,000 characters for your prompt.)

Here’s what I found DALL-E 3 was good at and not so good at (yet).

Grading DALL-E 3

  • B+: It created decent inanimate objects such as books, rugs, plants and bookshelves. The images it created for me look fairly realistic, except the books on my bookshelves have no titles or authors on their spines. I didn’t ask for those, but you’d think they would be included when requesting a book.

  • A-: It did a good job creating a specific breed of dog I requested (a beagle). It was challenged by my instruction to show the dog lying down with its chin on the rug, though. I think it produced one decent image of this; most of the time it gave me dogs lying with their heads up.

  • B-: It produced a realistic photo of a single woman, but there were oddities, or perhaps biases: it produced mainly young Asian women. Nothing wrong with that, but if I wanted a Black woman or a middle-aged white woman, I had to ask for that specifically. And it didn’t always deliver realistically: the skin of the middle-aged women looked smoother and better than that of most middle-aged women I know!

  • D: It had trouble creating word art. When I asked for a colorful image with the words, “Two Words,” it created images in which letters were missing or broken or words were misspelled. It took several tries to get a good image.

  • C+: Like overloaded wifi at a concert, DALL-E 3 couldn’t deliver service at times. Once, or maybe twice, it told me it couldn’t respond to any more requests at the moment because there were too many for it to handle.

  • B: After several prompts and responses, I had to start a new chat. And my chats weren’t saved, so I had to download any images I wanted to keep. Maybe that’s something I can remedy; I’ll have to figure that out.

  • F: When I asked for a realistic image of a family sitting around the dinner table, it couldn’t deliver. All the images I received (and I tried multiple times) were cartoonish or looked animated. It just couldn’t produce a realistic family photo.

More Data

I suspect, as with most things generative AI, DALL-E 3 will get better. But the basic premise is remarkable: images from words. What’s so interesting about this AI phenomenon is how many products are being released when they’re still not great, or even very good. I suspect it’s all about data: more use will provide more data and information for these tools to improve.

A Shout-Out to Photographers

I’ve worked with many fantastic photographers in my career, and I don’t advocate or foresee DALL-E 3 or similar products replacing them. I certainly respect their right to protect their work. As a writer, I can relate to that.

I don’t know how DALL-E 3 generated my images. It would be great to have transparency on that, if that’s even possible.

I do appreciate the simplicity and relative effectiveness of it as a tool, though. I suspect it will only get better.

The image for this blog was generated by AI from the prompt: Create an image that is representative of DALL-E 3.

Sue Valerian