2 Comments

I feel like general-purpose image generators like Midjourney will be used by the general public for fun, but professionals and corporations will use custom models trained on datasets of their choosing. Rather than trying to elicit a particular style via prompting, the model will inherently embody a style, and they will just use prompts to describe the scene.

Have you tried running Stable Diffusion locally on your PC? My 6-year-old PC with a GTX 1080 runs it great, and it's extremely flexible. You can train your own model checkpoint with Dreambooth, or create a LoRA, which seems to be what you might want. A LoRA is basically a small file (around 20MB) that adapts an existing SD checkpoint to produce images of a specific person or animal, or in a specific style. Beyond that, extensions such as ControlNet and OpenPose let you tailor the scene you are trying to elicit, even down to specifying poses and room layout. In contrast, I have found Midjourney extremely limited, as all you can do is prompt.
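To give a rough feel for why a LoRA file can be so small, here is a minimal pure-Python sketch of the low-rank idea: instead of storing a new copy of a weight matrix, you store two thin matrices whose product is added on top of the original. The function names, the `scale` parameter, and the 768x768 layer size are illustrative assumptions, not taken from any particular Stable Diffusion tool, and real file sizes depend on the rank and on which layers are adapted.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a); the base weight is not modified."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def full_param_count(d_out, d_in):
    """Number of values in a full d_out x d_in weight matrix."""
    return d_out * d_in


def lora_param_count(d_out, d_in, rank):
    """Number of values in the two low-rank factors (d_out x rank and rank x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_out, d_in, rank = 768, 768, 8
    full = full_param_count(d_out, d_in)
    small = lora_param_count(d_out, d_in, rank)
    print(f"full layer: {full} values, LoRA factors: {small} values "
          f"({full // small}x fewer)")
```

Because the base checkpoint is left untouched and only the small factors are saved, the same checkpoint can be combined with many different LoRAs.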

This person trained a LoRA in 6 hours on their own PC with about 25 pictures of their dog, and can now reliably produce images of their dog in different scenarios.

https://www.reddit.com/r/StableDiffusion/comments/15enp0y/trained_a_lora_on_one_of_my_dogs/

If you don't have a decent PC, you can buy equivalent compute time in the cloud for a few dollars.

I haven't needed or tried to train my own LoRA. Instead, I browse Civitai, which has hundreds of user-created custom models and LoRAs that I can download and try out on my PC.

Great summary
