What You May Have Missed #37
Top 5 picks: Is GPT-4 getting worse over time? / Llama 2 / Why computer-made data is being used to train AI models / Apple GPT / Google Tests A.I. Tool That Is Able to Write News Articles
REMINDER: Top 5 picks are my absolute must-reads for the week. The other sections are organized so that higher on the list = more valuable/useful/interesting to me.
Top 5 Picks
Is GPT-4 getting worse over time? (Arvind Narayanan and Sayash Kapoor on AI Snake Oil): “In short, the new paper [Chen, Zaharia, and Zou] doesn’t show that GPT-4 capabilities have degraded. But it is a valuable reminder that the kind of fine tuning that LLMs regularly undergo can have unintended effects, including drastic behavior changes on some tasks. Finally, the pitfalls we uncovered are a reminder of how hard it is to quantitatively evaluate language models.”
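A toy illustration of the pitfall they describe (mine, not the paper’s actual harness): a fine-tune that only changes how the model phrases its answer can look like a capability regression under a strict grader.

```python
# Toy sketch (not from the paper): a behavior/formatting change misread as a
# capability drop by a strict exact-match grader.

def strict_grade(model_output: str, expected: str) -> bool:
    # Correct only if the raw output matches the expected string exactly.
    return model_output.strip() == expected

def lenient_grade(model_output: str, expected: str) -> bool:
    # Correct if the expected answer appears anywhere in the output.
    return expected in model_output

expected = "print('hello')"
march_output = "print('hello')"                                        # terse answer
june_output = "Sure! Here is the code you asked for:\nprint('hello')"  # chattier answer

for label, output in [("March", march_output), ("June", june_output)]:
    print(label,
          "strict:", strict_grade(output, expected),
          "lenient:", lenient_grade(output, expected))
# The strict grader reports a June "regression" even though the answer is
# unchanged: a behavior change, not a capability change.
```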
Meta’s Llama 2: “Llama 2 [is] a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.” (Paper). “Llama 2 models are trained on 2 trillion tokens and have double the context length of Llama 1. Llama-2-chat models have additionally been trained on over 1 million new human annotations.” (Blog post). A minimal loading sketch follows the notes on this item below.
Early positive reactions from researchers:
Llama 2: an incredible open LLM (Nathan Lambert): “The base model seems very strong (beyond GPT3) and the fine-tuned chat models seem on the same level as ChatGPT. It is a huge leap forward for open-source, and a huge blow to the closed-source providers, as using this model will offer way more customizability and way lower cost for most companies.” Lambert expands on this in Llama 2: The New Open LLM SOTA (Latent Space Podcast).
“The whitepaper itself is a masterpiece. Unlike GPT-4's paper that shared very little info, Llama-2 spelled out the entire recipe, including model details, training stages, hardware, data pipeline, and annotation process.” (Jim Fan’s notes).
The other side of the story:
Llama copyright drama: Meta stops disclosing what data it uses to train the company's giant AI models (Alistair Barr on Insider): “‘A new mix of publicly available online data,’ Meta researchers wrote in the paper. That's basically it. This is unusual … So what changed in the past five months? Publishers, authors, and other creators have suddenly realized their work is being used to train all these AI models. Were they asked for permission? No. Will Big Tech companies get away with this? Maybe.”
Is it really an open-source model? Abeba Birhane: “this is an abuse of the term ‘open source’. the only thing open about #Llama2 is that you can download the model weights if you sign up.” As an example, the Llama 2 community license agreement says: “You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof).”
Is Llama 2 safe to use? (Vijay Maurya): “While testing the LLAMA-2 13B chat ggml weights model, I found that LLAMA-2 is leaking personal information, such as mobile numbers, emails, names, and profile info of Indians. I also found a few non-Indian profiles, but those numbers are very low. This is concerning.”
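As promised above, a minimal sketch for trying the chat models with Hugging Face transformers. It assumes you have been granted access to the gated meta-llama/Llama-2-7b-chat-hf checkpoint, have transformers, torch, and accelerate installed, and have a GPU with enough memory; the prompt and settings are illustrative, not an official example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # smallest chat variant; gated, requires approval

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",          # requires the accelerate package
)

# Llama-2-chat was fine-tuned on an [INST] ... [/INST] prompt format.
prompt = "[INST] Summarize what is new in Llama 2 in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```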
Why computer-made data is being used to train AI models (Madhumita Murgia on The Financial Times): “Generic data from the web is no longer good enough to push the performance of AI models, according to developers … The new trend of using synthetic data sidesteps this costly requirement. Instead, companies can use AI models to produce text, code or more complex information related to healthcare or financial fraud. This synthetic data is then used to train advanced LLMs to become ever more capable.”
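A toy sketch of the loop the article describes: a “teacher” model generates domain-specific examples, which are saved as a fine-tuning dataset for another model. The teacher call is a stub here, and the function names and fraud-ticket prompt are illustrative, not from the article.

```python
import json
import random

def teacher_model(prompt: str) -> str:
    # Stub standing in for a call to a strong LLM (hosted API or local model).
    amount = random.randint(10, 5000)
    return (f"Customer disputes a charge of ${amount} they do not recognize; "
            f"agent opens a fraud case and issues a provisional credit.")

def build_synthetic_dataset(n_examples: int, path: str) -> None:
    prompt = ("Write a short, realistic summary of a financial-fraud support "
              "ticket. Do not include real names or account numbers.")
    with open(path, "w", encoding="utf-8") as f:
        for _ in range(n_examples):
            completion = teacher_model(prompt)
            # One JSON object per line, the usual layout for fine-tuning data.
            f.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")

build_synthetic_dataset(n_examples=100, path="synthetic_fraud_tickets.jsonl")
print("wrote 100 synthetic examples")
```

In practice the generated examples would typically be filtered, deduplicated, and quality-checked before being used for training.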
Apple is testing a ChatGPT-like AI chatbot (Aisha Malik on TechCrunch): “Apple is developing artificial intelligence tools to challenge OpenAI, Google and others … The tech giant has created a chatbot that some engineers are internally referring to as ‘Apple GPT.’ Apple has yet to determine a strategy for releasing the technology to consumers, but is reportedly aiming to make a significant AI-related announcement next year.” (First reported by Bloomberg, paywalled.)
Google Tests A.I. Tool That Is Able to Write News Articles (Benjamin Mullin and Nico Grant on the New York Times): “One of the three people familiar with the product said that Google believed [Genesis] could serve as a kind of personal assistant for journalists, automating some tasks to free up time for others … Some executives who saw Google’s pitch described it as unsettling, asking not to be identified discussing a confidential matter. Two people said it seemed to take for granted the effort that went into producing accurate and artful news stories.”