I recently got access to DALL-E and wanted to experiment with what it could and couldn't understand. (I'm a little behind the times, I know. We non-famous people have to spend a while longer on the waitlist.) Most of the other investigations into DALL-E seemed to focus on highly complicated real-world concepts, so I went the other way and tried very simple, abstract things.

DALL-E is incredibly bad at this. It doesn't seem to be able to generalize at all, and it refuses to draw anything simple without adding a bunch of extra details I didn't ask for.

Full post here.


For some reason this felt incredibly funny, especially the texts generated by DALL-E.

  • "Saquaraham" = square
  • "Heboneckoton" = hexagon
  • "Deyogenition" = pentagon (with four sides)
  • "Do yo d hocs 100" = five dogs

Also, the entire context is silly: an artificial intelligence that makes great art but sucks at elementary math.

I bet this is mostly a training data limitation.