Midjourney is best at producing a diverse and aesthetically pleasing range of styles and doesn’t refuse “in the style of…” requests. However, it is worst at text-in-images, at avoiding uncanny AI artifacts (like extra fingers or unrealistic postures), and at precise instruction-following (it messes up the specifics). Another major downside is that Midjourney doesn’t offer an API.
GPT-5 produces less artistic outputs but is better at following precise instructions on text and composition details.
Gemini “Nano Banana” sits somewhere in the middle: ok-ish at everything, better at style than GPT-5 but worse than Midjourney, and better at instruction-following than Midjourney but worse than GPT-5.
Image-generation models are better at making some styles look good than others. Key characteristics of styles that look good are:
I find asking for aquarelle/watercolor paintings particularly effective, inspired by the LessWrong team.
Think about the resolution at which the image will be displayed. Models sometimes produce detailed images that look good at a glance, but the detail often falls apart when you zoom in.
Avoid anything where getting the detail exactly right makes or breaks the image (e.g. careful hand positioning).
Hopefully this will no longer be needed as models improve, but for now I still find it necessary to be conservative with the composition to avoid weird alien elements.
Post-processing images in Python is a useful hack for removing annoying LLM color artifacts. It often helps to automatically set all pixels of a certain color to white or your background color of choice (example code; a sketch is included at the end of this section). For example, I used this trick to generate this stylistically-consistent set of city illustrations on white backgrounds.
Here is an example Gemini output. Note the off-white background:
Here it is after the programmatic correction:
Unless you're doing this in bulk, you don't need to write the code yourself; just ask ChatGPT to process your image using Python (thank you Chris for the reminder).
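If you do want a reusable script, here is a minimal sketch of the kind of correction I mean, using Pillow and NumPy. The filenames, the corner-sampling heuristic, and the tolerance value are illustrative assumptions, not the exact code linked above:

```python
# Minimal sketch: snap near-background pixels to pure white with Pillow + NumPy.
# Filenames and the tolerance are placeholders; tune them for your images.
import numpy as np
from PIL import Image

def flatten_background(in_path, out_path, target=(255, 255, 255), tolerance=20):
    """Set every pixel within `tolerance` of the sampled background color to `target`."""
    img = Image.open(in_path).convert("RGB")
    pixels = np.asarray(img).astype(int)

    # Assumption: the top-left corner is representative of the off-white background.
    background = pixels[0, 0]

    # Mask pixels whose every channel is within `tolerance` of the background color.
    mask = np.all(np.abs(pixels - background) <= tolerance, axis=-1)
    pixels[mask] = target

    Image.fromarray(pixels.astype(np.uint8)).save(out_path)

flatten_background("gemini_output.png", "gemini_output_clean.png")
```

Sampling the background from a corner keeps the script generic; if your images have foreground detail in the corners, hard-code the off-white value instead.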