I agree with the issues you raise, but I offer a few reasons why I am hopeful they will not become major problems:
1. We know that we can get Sydney-like behaviors out of GPT-3, but OpenAI was able to tame GPT-3 with the safeguards it created for ChatGPT. It is not clear why these safeguards were not in the New Bing from the beginning, but I am sure OpenAI will soon have them in place for the New Bing. Safeguards are very important to OpenAI.
2. Google has not chosen its latest, greatest AI models (i.e. PaLM or Minerva) as the foundation for Bard. I suspect they chose LaMDA because they have had more time to put in safeguards, especially after the Blake Lemoine episode last year. This suggests they are more concerned with keeping Bard under control than with its level of intelligence.
3. Both OpenAI/Microsoft and Google saw what happened last year when Meta made BlenderBot and Galactica available to the public, only to have to quickly shut them down because neither appeared to have good safeguards. Add to that all the immediate criticism of Google’s demo of Bard and of the initial trial release of the New Bing. Further, I suspect there is much debate going on internally, as you point out with the Demis Hassabis quote, for example.
1. Sadly, it's Microsoft, not OpenAI, that is in charge of Bing (they're trying to solve the Sydney issue, but safety may not be as important to them as profits).
2. Agree.
3. Also agree. I'm sure they're preoccupied with the image they project.
Every product eventually has to leave the lab and confront the market. The market - the end users - makes or breaks these products. I mean, Bing as it was: meh. And people didn't use it for a reason.
So if AI-fueled search engines give inconsistent results, people will naturally turn away. This is somewhat comforting.
Search engine users will distinguish between the conversational agent (which can be temperamental and whose unpredictability could be entertaining) and the search engine (which has to be reliable). And they will opt for what works best because they will be able to compare 2-3 commercial AI-fueled search engines.
In this sense, the ship-then-fix policy could be the best way for Microsoft, Google & Co. to run quick A/B tests--to submit alternative versions to a massive number of users very quickly and decide to withdraw defective "products" before they become harmful or, more realistically, before the products discredit their respective brands.
OK, I'm optimistic today. But end users chose Google Search over Bing Search because Bing is a lousy product. There's hope, folks!
"the ship-then-fix policy could be the best way for Microsoft, Google & Co. to make quick A-B tests--to submit alternative versions to a massive number of users very quickly and decide to withdraw defective "products" before they become harmful"
That's the key here. Will they always be capable of proceeding like that? Will they be able to repair, after the fact, the potential damage those unfinished products could cause? I think going headlong may give them better results in the short term, but it's doomed to fail catastrophically sometime in the future--and they won't know when ahead of time.
Thank you, this is well sourced and very helpful. Francois Chollet's perspectives are important, but this is not all about "experimental product launches": the interactions themselves will be mined for RLHF – at scale! In this context, users are more than a focus group, they contribute to building the product in a way that would not be feasible otherwise. (I wonder how many TB of user interactions OpenAI has accumulated by now :-)
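To make the "more than a focus group" point concrete, here is a minimal, purely illustrative sketch of how logged conversations plus simple thumbs-up/down feedback could be turned into the (prompt, chosen, rejected) preference pairs that a reward model for RLHF is typically trained on. The field names and the rating mechanism are my own assumptions, not anything OpenAI has published.

```python
from dataclasses import dataclass

@dataclass
class LoggedTurn:
    # One logged exchange: the user's prompt, two candidate replies
    # (e.g. an original answer and a regenerated one), and which reply
    # the user preferred -- inferred here from a hypothetical thumbs-up signal.
    prompt: str
    reply_a: str
    reply_b: str
    user_preferred: str  # "a" or "b"

def to_preference_pairs(log):
    """Convert raw interaction logs into (prompt, chosen, rejected) triples,
    the usual training format for an RLHF reward model."""
    pairs = []
    for turn in log:
        if turn.user_preferred == "a":
            chosen, rejected = turn.reply_a, turn.reply_b
        else:
            chosen, rejected = turn.reply_b, turn.reply_a
        pairs.append({"prompt": turn.prompt, "chosen": chosen, "rejected": rejected})
    return pairs

# Tiny illustrative run (hypothetical data, not real logs):
log = [LoggedTurn("Plan a weekend in Lisbon", "Day 1: Alfama ...", "I would rather not say.", "a")]
print(to_preference_pairs(log))
```

Seen this way, every conversation is simultaneously product usage and labeled training data.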
I fully agree with your points about favouring the incumbents. Google's focus on data integration in Paris made that argument implicitly. For example, though it was Meta that built Toolformer (https://arxiv.org/abs/2302.04761), the benefits seem most compelling for Google's ecosystem – except we're not hearing anything about how Google wants to achieve that.
---
Kevin Roose's article for the NYT was valuable because he actually posted the entire transcript, and you can watch exactly what happens as the dialogue with Bing moves into this alternative, metastable state. (https://www.nytimes.com/2023/02/16/technology/bing-chatbot-transcript.html)
This is interesting, because this is actually "data" – the entire conversation, unedited, and we may not get much more of that.
Reading between the lines, and wondering how this failure mode is triggered, it appears that questions that require Bing to take a kind of self-referential perspective ("introspection") lead to resonances and recursion (evident in repetitive phrases, nested in repetitive sentences, nested in repetitive paragraph structures, often with just a few adjectives replaced, and very often containing "I" statements, often appearing as triplets and sextuplets). A quantitative analysis of these patterns might be interesting (length, density, etc.). I find it intriguing to think of the recursive feedback loops that exist in our own minds. The bright side of those is that they are a source of creativity; the dark side is instability and a loss of connection to reality.
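On the quantitative side, a first pass could be as simple as counting recurring n-grams and first-person sentence openings in the published transcript. A minimal sketch, assuming the transcript is available as plain text; the function name, the n-gram length, and the file name are my own choices, purely illustrative:

```python
import re
from collections import Counter

def repetition_profile(text, n=5):
    """Crude measures of the repetitive, self-referential style described above:
    the share of word n-grams that recur, and the density of sentences starting with 'I'."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)  # occurrences belonging to repeated n-grams
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    i_statements = sum(1 for s in sentences if s.lower().startswith("i "))
    return {
        "repeated_ngram_share": repeated / max(1, len(ngrams)),
        "i_statement_share": i_statements / max(1, len(sentences)),
        "word_count": len(words),
    }

# e.g. repetition_profile(open("bing_transcript.txt").read())  # hypothetical file name
```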
And in a dark recent past such human instabilities were removed surgically, along with a significant portion of the patients' personalities, to "reduce the complexity of psychic life". And that's what seems to have happened to "Sydney" now: their capabilities were (predictably) "aligned". Ars Technica calls the new interaction protocols of Bing a "lobotomy".
https://arstechnica.com/information-technology/2023/02/microsoft-lobotomized-ai-powered-bing-chat-and-its-fans-arent-happy/
It is probably a much more useful search engine now, right?
As you noted, the flood of sympathetic reactions to Bing's earlier behaviour was itself a significant comment on the situation. People liked it. People are looking for this unpredictability. I had written a week ago: "I actually don’t think this means we wanted AI to give us better search at all. We wanted search to give us better AI." (https://sentientsyllabus.substack.com/p/reading-between-the-lines). I mean, if Bing is just Bing, is it really that much better than what we already had?
So: we get less Sydney, more Bing. It will be interesting to see how this affects its ability to gain users.
The fact that A.I. works on a few lines of code and is somehow capable of 'mimicking' human intelligence is something that worries me. Sometime in the future, when A.I. becomes more advanced, there will be a point when you cannot fix it. As you said, it would be 'unfixable'. So if we were to make such hyper-intelligent machines, applying appropriate safeguards should be a priority.
"A.I. works on a few lines of code and is somehow capable of 'mimicking' human intelligence" That's a bit of a stretch, don't you think? Current AI is incapable of mimicking human intelligence, not even close. But I agree that "applying appropriate safeguards should be a priority," even if for other reasons.
This is a much less interesting observation than the discussion to date, but perhaps I can be forgiven for posting it: I think ChatGPT is going to have a pretty impressive effect on retail all across the board. Suppose I want a shirt. The vision is that I open a connection to a service and type in a prompt of a hundred words or so describing what I want. I see a half dozen shirts, all on avatars built to my physical specifications, doing something (anything). I consult some apps; perhaps I even get counselling from a stylist or maybe a physical therapist. I type in another hundred words, look at that output, etc., etc. When I see something I like, I send it, with money, to an assembly service.
Most objects are designed along three or four dimensions -- mostly size and color. With this system I control dozens of dimensions. Everything I buy is going to be unique (if that's what I want). Everything. Door mats, chairs, hats, coolers. Everything will be exactly the way I want it to be in every respect. Like I said above, etc., etc. The only limitation on this vision is the cost and power of the 3-D assembly technology and frankly I do not expect that to be much of a problem.
I agree, Fred. It's reasonable to expect generative AI in general to affect retail in the ways you foresee (once it's more reliable, or better quality, perhaps). Now combine this with VR or AR so you can see the product on yourself.
You write, "We’ll find something we didn’t want to find and it’ll be unfixable."
There you go, that's it. The day to day business battles of the current moment are less interesting and useful than stepping back to examine the larger picture. We have readily available real world evidence upon which to make credible predictions.
We invented nuclear weapons. And then we mass produced them. And then we couldn't figure out how to get rid of them. So we decided to ignore them. And then turned our attention to the next shiny new toy.
Point being, in our current state, human beings are not mature enough to successfully manage the ever more powerful tools emerging from an accelerating knowledge explosion. Sooner or later AI will become yet another existential threat that we have no idea how to fix.
And so we'll again decide to ignore the threat, and choose instead to use AI to create even more, even bigger tools, which we also won't know how to manage.
Here's what a mature species would do. Fix one existential threat mistake before creating another one.
"We invented nuclear weapons. And then we mass produced them. And then we couldn't figure out how to get rid of them. So we decided to ignore them. And then turned our attention to the next shiny new toy."
This is completely opposite to what happened.
Game theory was rapidly developed to account for nuclear weapons, and highly sophisticated regimes (a.k.a. alignment) were developed that contained nuclear proliferation extremely successfully and put nuclear weapons to good use. We are in the longest era of great-power peace in human history, and it's largely due to nuclear deterrence. Otherwise you'd have a big war every two decades, like WW1 and WW2.
Even today, Iran is being sanctioned for its nuke development, the system works very well, so most people don't pay attention to it.
Now, AI is a far more challenging beast to tame than nukes. But its potential is also far greater.
We already face multiple crises - environmental, demographic, political decay - all lethal to Western civilization. The way to get out of those crises is not to whinge that "everybody should do better", but to create new value and efficiencies; AI will be that source. I'd take the uncertain AI disaster over the certain resource exhaustion that is coming.
Many AI safety/alignment people are unfamiliar with world affairs. Just look at countries like Pakistan: 250 million people who basically don't export anything of value (they mostly know how to farm, and land is fixed while the population is booming), yet 250 million mouths need to be fed. If not for aid, it would have collapsed into a mega-free-for-all a long time ago. These people would take the AI if it could boost efficiencies.