Which feels a bit sketchy(?) to me, seeing as all models are built on imagery scraped from the net without anyone's permission. It's one reason why I've spent the last month training my own models using my own imagery. If these stock photo sites had any brains, they would also start training models on images in their databases, especially since they already have everything sorted into categories based on keywords (which I'll spend the next year doing, until I can get img2text tools working in recursive batch mode).
Thanks! I've been documenting settings and progress in a g-doc here[0], although I just now realized I might not have been using my diffusion model correctly for most of my tests. With iteration 509 of my model I seem to have finally nailed it, though! :) I partially "blame" Visions of Chaos, since the amazing dev (or devs?) drops updates almost every day with new machine learning features, and model training was only added recently. I must have reset something by accident.
Also, I realize there's a lot of image prep work required, not to mention I have a less than ideal amount of VRAM (3060 Ti w/ 8GB, but no monitors attached, i.e. 8GB free), so I have to lower some settings. The source images have to be in 1:1 format (which none of my photos are), so I'm using a script to batch-call ImageMagick's 'convert' to add white borders to the top/bottom, which results in my renders also having white borders.
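For anyone curious what that kind of padding script might look like, here's a rough sketch in Python. It is not my actual script, just one way to do it: the function names (square_side, build_pad_command, read_size, pad_image) are made up for illustration, and it assumes ImageMagick's 'convert' and 'identify' are on the PATH (newer ImageMagick 7 installs may want 'magick' instead). It pads each image onto a white square canvas whose side is the longer edge, so a landscape photo gets borders on the top and bottom.

```python
import subprocess


def square_side(width: int, height: int) -> int:
    """Side of the smallest square canvas that holds the whole image."""
    return max(width, height)


def build_pad_command(src: str, dst: str, width: int, height: int) -> list[str]:
    """Build the ImageMagick argument list that pads src onto a white square."""
    side = square_side(width, height)
    return [
        "convert", str(src),
        "-background", "white",   # fill colour for the added border
        "-gravity", "center",     # keep the photo centred on the canvas
        "-extent", f"{side}x{side}",
        str(dst),
    ]


def read_size(src: str) -> tuple[int, int]:
    """Ask ImageMagick's 'identify' for the pixel dimensions (needs ImageMagick)."""
    out = subprocess.run(
        ["identify", "-format", "%w %h", str(src)],
        capture_output=True, text=True, check=True,
    ).stdout
    width, height = out.split()[:2]
    return int(width), int(height)


def pad_image(src: str, dst: str) -> None:
    """Pad one image to 1:1 by actually running the command (needs ImageMagick)."""
    width, height = read_size(src)
    subprocess.run(build_pad_command(src, dst, width, height), check=True)
```

To batch it, you could loop over something like pathlib.Path("photos").rglob("*.jpg") and call pad_image on each file. The white borders in the renders come from this padding step, since the model learns them along with the photo.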
They all look great! But the usual customers of stock photos are not looking for dragons or cavemen taking a group selfie. And none of these images are the result of a beginner trying their first prompt; they all took many tries to generate.
I've used MidJourney and I really don't think it takes that many tries to get what you want. When you enter a prompt you get 4 results, and then you can either upscale one of them or generate 4 variations of it.
Here are a couple of "first results" that I personally tried, and you can judge for yourself:
Obviously these prompts are a bit more artsy than stock photos are meant to be, but the point is just to give you an idea of how it does on the first try. All of these took less than a minute to produce.
crystal dragon thing:
https://cdn.discordapp.com/attachments/951197655021797436/10...
https://cdn.discordapp.com/attachments/951197655021797436/10...
https://cdn.discordapp.com/attachments/951197655021797436/10...
davinci-style notebook of flying machines:
https://cdn.discordapp.com/attachments/1008049109338443829/1...
https://cdn.discordapp.com/attachments/1008049109338443829/1...
this person tried to show the life cycle of an alien:
https://cdn.discordapp.com/attachments/1010211132671275058/1...
https://cdn.discordapp.com/attachments/1010211132671275058/1...
https://cdn.discordapp.com/attachments/1010211132671275058/1...
https://cdn.discordapp.com/attachments/1010211132671275058/1...
cavemen taking a group selfie (lots of faces):
https://cdn.discordapp.com/attachments/1011408429170044928/1...