
[ Question ]

What's the easiest way to currently generate images with machine learning?

1 min read · 24th Feb 2022 · 3 answers · No comments

24

As far as I understand, the images for the latest LessWrong books were generated by a neural net. I would like to generate images for the sequences I have on LessWrong in a similar way. What's the easiest way for me to do so?


3 Answers

Feb 24, 2022

50

Download an app such as WOMBO Dream or starryai, both of which provide text-to-image generation through a convenient interface. WOMBO Dream is free to use as often as you want and is very fast. starryai only gives you 5 free images per day (with 5 to start) and is much slower. However, starryai offers more customisation (more styles, an optional starting image, a tunable number of optimization steps) and uses an overall stronger approach.

I’d suggest using WOMBO to refine your prompt and then starryai to generate the final image. WOMBO alone might be enough if you’re more interested in the texture/“feel” of the image than in its actual shape.

Celenduin · 2y · 40

Note that there also exists a web version of the WOMBO Dream app: https://app.wombo.art/

Feb 26, 2022

20

For text-to-image synthesis, the Disco Diffusion notebook is pretty popular right now. Like other notebooks that use CLIP, it produces results that aren't very coherent, but which are interesting in the sense that they will reliably combine all of the elements described in a prompt in surprising and semi-sensible ways, even when those elements never occurred together in the models' training sets.

The GLIDE notebook from OpenAI is also worth looking at. It produces results that are much more coherent but also much less interesting than those from the CLIP notebooks. Currently, only the smallest version of the model is publicly available, so the results are unfortunately less impressive than those in the paper.

Also of note are the Chinese and Russian efforts to replicate DALL-E. As with GLIDE, their results are coherent but not very interesting. They can produce some very believable results for certain prompts, but they struggle to generalize much beyond their training sets.

DALL-E itself still isn't available to the public, though I'm personally still holding out hope that OpenAI will offer a paid API at some point.

Trevor Hill-Hand

Feb 25, 2022

20

My favorite one to play around with has been this Google Colab notebook: https://is.gd/artmachine - totally free if you don't mind it being slow (i.e. 10-20 minutes per image).