
With several thousand images on each, I agree with this -- to a degree.

Dall-E does seem more aware of relationships among things, but using parens and careful word order in some of the SD builds can beat it. By contrast, even most failed images from MidJourney could still hang in an outsider art gallery. The MJ aesthetic works, while Dall-E looks as if a 9-year-old was taken hostage, clipped Rapunzel and the paper shredder out of magazines, and pasted them onto a ransom note.

That said, I have not been able to get any of Dall-E, MJ, or SD to give me a coherent black Ford Excursion towing a silver camping trailer on the surface of the moon beneath an earthrise.

At the current cost per image, I could pay to have complex concepts such as this rendered via any number of art-for-hire sites at less expense and with guaranteed results.



Do you have any links to share on how to build a proper SD query, with parens for example? I have not seen that done.


Only some builds support it. This is the one I'm familiar with: [0]. Wrapping a word in () makes the model pay more attention to it; wrapping it in [] makes it pay less. There's an example at the link.

[0] https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki...
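To make the () and [] idea concrete, here is a rough sketch of the bookkeeping, not the webui's actual code. It assumes each level of () multiplies a word's weight by 1.1 and each level of [] divides it by 1.1; that factor is my recollection of that build's docs, so treat it as an assumption. The function name emphasis_weights is made up for illustration.

```python
def emphasis_weights(prompt, factor=1.1):
    """Split a prompt into (text, weight) pairs based on () and [] nesting.

    Assumption: each enclosing "(" multiplies the weight by `factor`
    and each enclosing "[" divides it by `factor`.
    """
    chunks = []
    paren_depth = 0    # how many "(" currently enclose the text
    bracket_depth = 0  # how many "[" currently enclose the text
    buf = ""

    def flush():
        # Emit the buffered text with the weight implied by the current nesting.
        nonlocal buf
        text = buf.strip()
        if text:
            weight = factor ** (paren_depth - bracket_depth)
            chunks.append((text, round(weight, 4)))
        buf = ""

    for ch in prompt:
        if ch in "()[]":
            flush()  # close out the text seen so far before the depth changes
            if ch == "(":
                paren_depth += 1
            elif ch == ")":
                paren_depth = max(paren_depth - 1, 0)
            elif ch == "[":
                bracket_depth += 1
            else:
                bracket_depth = max(bracket_depth - 1, 0)
        else:
            buf += ch
    flush()
    return chunks


for text, weight in emphasis_weights("a (black) truck [trailer] on the ((moon))"):
    print(f"{weight:<7} {text}")
```

Doubling the parens, as in ((moon)), stacks the factor, which is why nesting gives a stronger push than a single pair.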



