What You May Have Missed #20
ChatGPT API / Generative AI beyond ChatGPT: Meta's LLaMA, Microsoft's Kosmos-1 and Bing updates, OpenAI rival, and Stable Diffusion / Who's responsible for AI harm, users or companies?
ChatGPT API
The most important news of the week is OpenAI’s announcement of the ChatGPT (and Whisper) API. We knew it was coming, but not the details. Three key points:
10x price reduction
OpenAI set the price per token of GPT-3.5-turbo—the model that powers (the recently improved) ChatGPT—at 90% less than the existing GPT-3.5 models (e.g., text-davinci-003), while “matching/better at … pretty much any task (not just chat).” It costs $0.002/1K tokens, which means you could generate about 750K words for $2.
Jim Fan calculated that you could process the entire Harry Potter series—all 7 books!—for less than $5.
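The arithmetic behind those figures is easy to check. A quick sketch, assuming the common rule of thumb of roughly 0.75 English words per token (the word counts below are rough estimates, not exact):

```python
PRICE_PER_1K_TOKENS = 0.002  # USD, gpt-3.5-turbo at launch
WORDS_PER_TOKEN = 0.75       # rough average for English text

def cost_for_tokens(tokens: int) -> float:
    """Dollar cost of processing a given number of tokens."""
    return tokens / 1000 * PRICE_PER_1K_TOKENS

def tokens_for_words(words: int) -> int:
    """Approximate token count for a given word count."""
    return round(words / WORDS_PER_TOKEN)

# $2 buys 1M tokens, i.e. roughly 750K words:
print(cost_for_tokens(tokens_for_words(750_000)))  # ~ $2

# The Harry Potter series is on the order of 1.1M words (a rough,
# commonly cited figure), so processing it all costs only a few dollars:
print(cost_for_tokens(tokens_for_words(1_100_000)))  # ~ $3, well under $5
```

The word-to-token ratio varies with the text, so these are ballpark numbers, but they match the orders of magnitude quoted above.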
Max Woolf, data scientist at BuzzFeed, commented on Hacker News that the 10x price reduction is “a massive, massive deal … A much better model [than text-davinci-003] and a 1/10th cost warps the economics completely to the point that it may be better than in-house finetuned LLMs.”
He then wonders whether OpenAI can ever make money at this price, or if it’s just a “loss-leader to lock out competitors before they even get off the ground.”
System-wide optimizations
One explanation is that OpenAI engineers achieved this impressive 10x price reduction through “system-wide optimizations.” Cameron R. Wolfe, director of AI at Rebuy Engine, lists four methods that could have enabled it: pruning, smaller models, quantization, and multi-query attention (and maybe something else).
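Of those, quantization is the easiest to illustrate: store weights as 8-bit integers instead of 32-bit floats, cutting memory (and often compute) by roughly 4x at a small accuracy cost. A minimal sketch of symmetric int8 quantization (illustrative only; this says nothing about what OpenAI actually did):

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats onto the int8 range [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]

weights = [0.3, -1.0, 0.25, 0.9]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Each recovered weight is within half a quantization step of the original,
# so the model's behavior barely changes while storage shrinks 4x.
```

Production systems quantize per-layer (or per-channel) tensors rather than flat lists, but the principle is the same: trade a little precision for a lot of throughput.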
Still, one can wonder why OpenAI didn’t price it at, for instance, $0.005/1K tokens, achieving a 4x price reduction while keeping margins twice as large. Why settle at such a low price—possibly lower than necessary, given what customers would be willing to pay?
I think they’re playing the long game here. They’d rather stay on top with a significant edge than earn twice the money from the start. With the strong funding they got from Microsoft and others, they’re financially safe, and I get why maximizing the advantage of their engineering talent—which is only growing stronger—would be the better strategy.
A more developer-friendly API
Improvements, from the blog post:
“Data submitted through the API is no longer used for service improvements (including model training) unless the organization opts in
Implementing a default 30-day data retention policy for API users, with options for stricter retention depending on user needs.
Removing our pre-launch review (unlocked by improving our automated monitoring)
Improving developer documentation
Simplifying our Terms of Service and Usage Policies, including terms around data ownership: users own the input and output of the models.”
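One concrete developer-facing change is the request format itself: the chat endpoint takes a list of role-tagged messages rather than a single prompt string. A minimal sketch of the request payload (the API key is a placeholder, and no request is actually sent here):

```python
import json

API_KEY = "sk-..."  # placeholder; load your real key from an environment variable

# The chat completions endpoint takes role-tagged messages, not a raw prompt.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the ChatGPT API announcement."},
    ],
}
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)
# To send for real (requires the `requests` package and a valid key):
# requests.post("https://api.openai.com/v1/chat/completions",
#               headers=headers, data=body)
```

The system message sets the assistant’s behavior, and the conversation history goes in as alternating user/assistant messages, which is what makes building chatbots on top of it so direct.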
So far, it’s going great for them:
A group of developers organized an “emergency hackathon” to explore the possibilities of the API (the quantitative price reduction translates into a qualitative change in application options: many apps that were prohibitively costly before aren’t anymore). Ben Tossell, who publishes Ben’s Bites (a daily newsletter on generative AI tools), says Friday’s issue, “chatGPT API aftermath,” was the longest email he’s sent.
And many companies are already leveraging the API (in particular, the dedicated instances available through Foundry, intended for large workloads) to create their in-house chatbots. Shopify and Snapchat are among the best-known, but we can expect many others to follow suit soon.