What You May Have Missed #22
Beyond GPT-4: An eventful week / GPT-4 (security challenges, hallucinations, and misinfo) / Open-source AI is dying / An array of generalized bad sensations about AI (anxiety, exhaustion, FOMO, etc.)
Beyond GPT-4: An eventful week
Some people have called this week the most eventful or significant week ever in AI. It's a bold claim driven by excitement, but it's undeniable that a lot has happened. GPT-4 is the most salient news, but there's much more.
Stanford Alpaca and Alpaca-LoRA
Stanford Alpaca is an instruction-tuned model based on Meta's LLaMA (7B). According to its authors, it approaches GPT-3.5-level performance at a fraction of the cost. From the blog post:
“On our preliminary evaluation of single-turn instruction following, Alpaca behaves qualitatively similarly to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (<600$).”
Alpaca-LoRA applies Low-Rank Adaptation (LoRA) to Alpaca: instead of updating all of the model's weights, it trains only small low-rank matrices added on top of the frozen base model, which makes the results reproducible on consumer hardware. The era of locally-deployed GPT-3.5-level language models is here.
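To make the idea concrete, here is a minimal NumPy sketch of the LoRA mechanism. The shapes, names, and initialization are illustrative assumptions for this sketch, not code from the Alpaca-LoRA repository: a frozen weight matrix W is augmented with a trainable low-rank update B @ A, which has far fewer parameters than W itself.

```python
import numpy as np

# Minimal sketch of Low-Rank Adaptation (LoRA). The pretrained weight W
# is frozen; only the small matrices A (rank x d_in) and B (d_out x rank)
# would be trained. All names and shapes here are illustrative.
rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 8

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, init to 0

def lora_forward(x):
    # Effective weight is W + B @ A. Because B starts at zero, the layer
    # initially behaves exactly like the frozen pretrained layer.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identical until B is updated

full_params = d_in * d_out          # parameters in a full fine-tune of W
lora_params = rank * (d_in + d_out)  # parameters LoRA actually trains
print(f"trainable: {lora_params} vs full fine-tune: {full_params}")
```

The parameter savings are what put fine-tuning within reach of consumer GPUs: here the low-rank update trains roughly 3% of the parameters a full fine-tune of the same layer would.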
If you want to know more about Alpaca, read this short thread by ML researcher Sebastian Raschka.
Midjourney V5
Midjourney has done it again. MJ V5 is reaching the levels of quality, realism, and detail of human-made work. Here are a few threads with examples:
Nick St. Pierre comparisons of V4 and V5:
Prompts with more natural language (in contrast to mere descriptors or modifiers):
Carlos Santana says it produces images with double resolution, more quality, and a larger style range:
And Julie W. Design shows how MJ V5 can solve “shots through a window”:
Maybe the most relevant improvement for casual users is the ease with which impressive results are attainable. As many predicted (and St. Pierre's comparisons show), prompts may soon no longer require careful crafting to achieve competitive image quality.
The question now is: how will we value mastery once anyone can trivially get results at a level where most people can no longer distinguish X from X+delta quality, i.e., where a slight improvement in quality no longer yields a corresponding perceived improvement?
Anthropic and Google Releases
Anthropic is opening up Claude, the startup's version of ChatGPT, through a chat interface and an API. Google is opening up PaLM (which held the crown of language-model benchmark performance until GPT-4) through an API.
These releases give developers alternatives to OpenAI's offering. Still, neither model is accessible at the weights or code level, so researchers can't truly study them independently. In that regard, Alpaca is the more important announcement.
Generative AI in office software
Google (Workspace) and Microsoft (Microsoft 365 Copilot) have taken another step toward integrating generative AI features into their product suites, bringing language models, and the performance and productivity boost they provide, to everyone who uses their products.
As Benedict Evans argues, this is an example of incumbents turning innovations into features of older products. How long until someone comes along with an idea that redefines those products from the ground up instead? That's where a potential "threat" to Google and Microsoft lies.