What You May Have Missed #22
Beyond GPT-4: An eventful week / GPT-4 (security challenges, hallucinations, and misinfo) / Open-source AI is dying / An array of generalized bad sensations about AI (anxiety, exhaustion, FOMO, etc.)
Beyond GPT-4: An eventful week
Some people have called this week the most eventful or significant week ever in AI. It's a bold claim driven by excitement, but it's undeniable that a lot has happened. GPT-4 is the most salient news, but there's much more.
Stanford Alpaca and Alpaca-LoRA
Stanford Alpaca is an instruction-tuned model based on Meta's LLaMA (7B). According to its authors, it approaches GPT-3.5-level performance at a fraction of the cost. From the blog post:
“On our preliminary evaluation of single-turn instruction following, Alpaca behaves qualitatively similarly to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (<600$).”
Alpaca-LoRA applies Low-Rank Adaptation (LoRA) to Alpaca: instead of updating all of the model's weights, it trains only small low-rank matrices added on top of the frozen base model, which makes the results reproducible on consumer hardware. The era of locally-deployed GPT-3.5-level language models is here.
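To make the idea concrete, here is a minimal NumPy sketch of the LoRA mechanism. The shapes, names, and initialization are illustrative assumptions for this sketch, not code from the Alpaca-LoRA repository: a frozen weight matrix W is augmented with a trainable low-rank update B @ A, which has far fewer parameters than W itself.

```python
import numpy as np

# Minimal sketch of Low-Rank Adaptation (LoRA). The pretrained weight W
# is frozen; only the small matrices A (rank x d_in) and B (d_out x rank)
# would be trained. All names and shapes here are illustrative.
rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 8

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, init to 0

def lora_forward(x):
    # Effective weight is W + B @ A. Because B starts at zero, the layer
    # initially behaves exactly like the frozen pretrained layer.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identical until B is updated

full_params = d_in * d_out          # parameters in a full fine-tune of W
lora_params = rank * (d_in + d_out)  # parameters LoRA actually trains
print(f"trainable: {lora_params} vs full fine-tune: {full_params}")
```

The parameter savings are what put fine-tuning within reach of consumer GPUs: here the low-rank update trains roughly 3% of the parameters a full fine-tune of the same layer would.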
If you want to know more about Alpaca, read this short thread by ML researcher Sebastian Raschka.
Midjourney V5
Midjourney has done it again. MJ V5 is reaching the levels of quality, realism, and detail of human-made work. Here are a few threads with examples:
Nick St. Pierre comparisons of V4 and V5:
Prompts with more natural language (in contrast to mere descriptors or modifiers):
Carlos Santana says it produces images with double resolution, more quality, and a larger style range:
And Julie W. Design shows how MJ V5 can solve “shots through a window”:
Maybe the most relevant improvement for casual users is the ease with which impressive results are attainable. As many predicted (and St. Pierre's comparisons show), prompts may soon no longer require careful crafting to achieve competitive image quality.
The question now is: how will we value mastery once anyone can trivially get results at a level where most people can no longer distinguish X from X+delta quality, i.e., where a slight improvement in quality no longer yields a corresponding perceived improvement?
Anthropic and Google Releases
Anthropic is opening up Claude, the startup's version of ChatGPT, through a chat interface and an API. Google is opening up PaLM (which held the crown of language-model benchmark performance until GPT-4) through an API.
These releases give developers alternatives to OpenAI's offering. Still, neither model is accessible at the weights or code level, so researchers can't truly study them independently. In that regard, Alpaca is the more important announcement.
Generative AI in office software
Google (Workspace) and Microsoft (Microsoft 365 Copilot) have taken another step toward integrating generative AI features into their product suites, bringing language models, and the performance and productivity boost they provide, to everyone who uses their products.
As Benedict Evans argues, this is an example of incumbents turning innovations into features of older products. How long until someone comes along with an idea that redefines those products from the ground up instead? That's where a potential "threat" to Google and Microsoft lies.