What You May Have Missed #18
Short overview of Microsoft vs Google and Bing/Sydney / Robust AI and robust testing methods / ChatGPT and the limitations and applicability of LLMs / Other generative AI news
Short overview of Google vs Microsoft and Bing/Sydney for the latecomers
I didn't publish WYMHM last week because I was focused on the three-article series on Google vs Microsoft (today's issue covers both weeks). In case you aren't willing to read the 45+ minutes it takes to get through the series (totally understandable), here's a short overview with the key links to stay updated on the ongoing AI arms race: the release of the new Microsoft Bing (blog post, demo event, The Verge interview) and the upcoming release of Google's Bard (blog post, demo event). (Here are parts one, two, and three of the series in case you want to read them anyway.)
Besides the business aspects, I also analyzed the consequences, partly predictable and partly not, of this new way of doing AI—what I called the ship-then-fix modus operandi. Among the unforeseeable ones was the emergence of the Sydney persona out of a supposedly search-enhancing chat tool. People with access to Bing found that the chatbot would behave in weird ways under particular circumstances (not too hard to set up). Dmitri Brereton was among the first to note that the new “Bing AI can't be trusted,” a claim then reinforced by Kevin Liu and Marvin von Hagen's discovery of Sydney.
Simon Willison and Ben Thompson have great reviews of Bing's unhinged behavior: it threatened and gaslit users, had an existential crisis, professed love—and was eventually erased from existence (I'll dig deeper into this in a future article).
Eliezer Yudkowsky shared (though not necessarily endorsed) a Change.org petition to “unplug the evil AI,” while people began expressing unease about Bing's exceptional conversational skills and the feeling that it might be sentient.
Now, Microsoft has limited its usage. Here's Peter Yang's summary:
People have reacted in various ways. Some are apparently sad, and others have expressed their intention to switch back to the “old” ways (browser + ChatGPT). Ethan Mollick, a professor at Wharton, reflected on Twitter on the before and after of Sydney.
Two takeaways from all this: First, you can't scale a search-enhancing chatbot that lies to and threatens users to hundreds of millions of people, however funny those in the know may find it. Second, if you create a product, people repurpose it for something else, and they reject the change when you remove that new use, maybe you should rethink the approach completely (of course, the existing evidence that people would prefer Sydney over a mere search product is anecdotal, given that only a small group of people had access—and most likely all of them were quite acquainted with the state of the art in AI).