What You May Have Missed #7
DALL·E API & Midjourney v4 / 3 new Gen AI apps / AI ethicists' hardships / When generative AI becomes unethical / WebSummit: Gary Marcus and Noam Chomsky / Meta protein folding model / Google AI@ '22
DALL·E API, Midjourney v4, and the benefits of hiding prompts
OpenAI finally made DALL·E available through an API. This news comes quite late given the popularity of Stable Diffusion (SD), but it’ll still spark the emergence of new gen AI companies. The reason is that DALL·E, in contrast to SD, removes the burden of “good prompting” from the user: it hides the additions it automatically makes to each prompt to make the images more appealing.
Levelsio (creator of InteriorAI and AvatarAI) tweeted about this recently: “most of us already automated prompt writing away with a front end interface with big buttons and selectors. Regular people don't have the time to figure out prompts.” I agree that, although prompt engineering will be ubiquitous, the skill required to obtain good results will go down over time, as companies move the complexity of prompts behind the scenes.
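To make the "big buttons and selectors" idea concrete, here is a minimal sketch of how a front end might turn a plain user request plus a style selection into a longer prompt. Everything in it is hypothetical: the preset names, the modifier strings, and the PromptBuilder class are invented for illustration, and none of it reflects how DALL·E, Midjourney, or Levelsio's apps actually build their prompts.

```python
from dataclasses import dataclass

# Hypothetical style presets a UI might expose as buttons.
# The modifier strings are illustrative, not taken from any real product.
STYLE_PRESETS = {
    "photo": ["photorealistic", "natural lighting", "high detail"],
    "illustration": ["digital illustration", "clean lines", "vibrant colors"],
}


@dataclass
class PromptBuilder:
    """Combines the user's plain request with hidden style modifiers."""

    style: str = "photo"

    def build(self, user_prompt: str) -> str:
        # The user only ever sees and types `user_prompt`;
        # the modifiers are appended behind the scenes.
        modifiers = STYLE_PRESETS[self.style]
        return ", ".join([user_prompt.strip(), *modifiers])


if __name__ == "__main__":
    builder = PromptBuilder(style="photo")
    print(builder.build("a penguin in Venice"))
```

The user picks "photo" from a selector and types five words; the string that would reach the image model is the longer, engineered one.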
Midjourney does something very similar to always generate beautiful images, and they just took it to the next level with the release of v4. The new version is significantly better than anything I’ve seen. Here’s a side-by-side comparison of “a penguin in Venice” between Midjourney v3 and v4:
Midjourney v4 was trained from scratch (it doesn’t use SD or the DALL·E API).
It can handle more complex prompts, it’s better with small details, and, maybe most importantly, it’s better with multi-object scenes:
Although it doesn’t seem to have mastered compositionality: