Brilliant writing as always, Alberto; something you should be very proud of.
Emergence is absolutely fascinating and, it seems to me, inherent to intelligence?
We have found that transformer-based LLMs self-organize around Noun-Phrase Routes akin to how CNNs automatically create filters around edges, shapes, etc. We have also found every RAG-based hallucination to be caused by errant Noun-Phrase routes — without exception. While there is much more to learn, we've found this insight extremely useful for creating accurate chatbots. Here's a brief intro: https://youtu.be/ZBWoUVZuGao?si=f83tTbh7q12LjgsJ&t=639
Thanks for the link Michael!
That sounds very interesting, but the video doesn't explain how the problem was really solved, it's just an AI-generated promotion video. How were the Noun-Phrase collisions avoided?
An example of how to apply the noun-phrase route collision avoidance to a real-world problem is given at the end of the video. Kindly see: https://youtu.be/ZBWoUVZuGao?si=ublEF50jxfookXYn&t=902. Also, since writing the above post, I've released an API to fully automate the process, and I also finished the video which explains what types of chatbots can be built using the API: https://youtu.be/K4Wg6QzPfyI. The API is released at https://www.ragfix.ai. Not only are free tokens available, but it is priced just high enough to cover costs to make the automated solution available to everyone.
Provided the API is used properly, it returns 100% accurate responses every time. To document this, I've posted the API response to all the Evident Conflict and Subtle Conflict hallucinations in the RAGTruth corpus for GPT-4 and GPT-3.5-Turbo. The same model that responds with a hallucination returns a 100% accurate response to the same query and same passages. Simply remove the noun-phrase route collisions and the models work perfectly. For documentation see: https://hallucination-analyzer.ragfix.ai/.
By removing noun-phrase route collisions, we don't need bigger models to achieve 100% accuracy. In fact, I am now testing GPT-4o. So far every response has come back 100% accurate — including all the hallucinations in the RAGTruth corpus.
Kindly let me know if you need any help.
Does anyone like the comparison of neural nets to holograms? We can examine the partially transparent medium the light shines through, but no human brain is going to grok how the varying opacities converge on the 3-D image.
And thank you for writing such a fine perspective.
Much love.
This sounds like cybernetics to an old bloke like me. Oh my.
Dear Geoff, the answer is simple: the billions of parameters do not map onto anything related to human intelligence in any explicit way. They contribute to interpolating a continuous function in a multidimensional space. That function approximates the most probable sequences of tokens in the given language, based on the training set. There is no understanding or reasoning involved, only correlation between abstract sequences of opaque symbols. If the LLM were trained on sequences of words encrypted with a secret key it does not know, it would give the same answers in encrypted form, like a dumb automaton.
Moreover, the way natural neurons work in the brain is very different from artificial neurons in machine learning. Making analogies between artificial neural nets and biological brains at the cognitive level is very speculative.
Ok, but there’s a lot of complexity hiding in the simple statement “they approximate the most probable sequence.”
And there is a way to make that human intelligible.
For instance, is the LLM using the proximity of tokens to one another throughout the data set to determine the probability one token will come after another? Ok, how many parameters can you get out of that? Or is the grammar of the preceding tokens considered when choosing the next token? Or is there semantic meaning, perhaps etymology, maybe even metaphorical relationships elucidated through repetition in the training data? Maybe there are underlying structures to the ways we speak that we have no idea about whatsoever, but the LLMs picked them up as parameters.
All it does is look for patterns right?
Those are all types of patterns that it can replicate without “understanding.”
The same way most English speakers know "pong ping" and "tok tik" sound weird, but few know there's a rule that the "i" sound should generally come before the "o" sound.
And don't forget, LLMs aren't the only type of neural net. What about recommendation algorithms? They maximize the probability you'll spend more time on an app.
What are their parameters? How do we get to 1 trillion?
Dear Geoff, yes, it is precisely that. It predicts the answer to a question based on the probability of the next word following the prompt (and some hidden context that the LLM adds based on some pre-programmed heuristics and rules). The free parameters of the model are like the interpolation coefficients for curve fitting. It is like tabulating a function by listing all of its values, except that you don't include all values, just some, which allow the others to be approximated.
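A minimal sketch of that curve-fitting analogy, purely illustrative and not how an LLM is actually built: store a function at a few points, fit free parameters to them, and use the parameters to approximate the values you never stored.

```python
import numpy as np

# "Tabulate" a function at only a few points...
xs = np.linspace(0, 2 * np.pi, 8)
ys = np.sin(xs)

# ...then fit free parameters (here, polynomial coefficients) to those samples.
coeffs = np.polyfit(xs, ys, deg=5)

# The fitted parameters let us approximate values we never stored.
x_new = 1.3
print(np.polyval(coeffs, x_new), np.sin(x_new))  # close, not identical
```

The coefficients are not individually meaningful; they only matter jointly, as a compressed description of the curve, which is the point being made here about LLM parameters.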
Awesome! We agree. That's it precisely.
Now name one trillion (an unfathomably huge number) of such parameters without handwaving. Literally name them.
You can’t.
I think the point is no one can.
Can you see how that pertains to your original question? That’s what more there is to understand.
Geoff, but who in their right mind would care about that, or even understand what the purpose of that would be?
Look, respectfully, sometimes to fully grasp technical concepts we first need to understand them to a certain depth and rigour. Otherwise, we are discussing in a vacuum of naive remarks. The first time in history the Lumière brothers projected a film of an approaching train on a screen, the audience ran out of the room in panic, fearing that the train would somehow come out of the projector and kill them all. That was because they were creating a completely naive and wrong model of a technology completely new to them.
I am afraid the question you are asking is as relevant as asking what the name of each byte is in the mp4 file of a recording of Beethoven's Ninth Symphony. The parameters of an LLM encode, in a compressed way, the shape of a mathematical function that predicts next tokens given a previous string. The parameters do not stand for domain concepts, pretty much like the bytes of the mp4 file do not stand for trumpets or for the words of Schiller's poem "An die Freude".
This is an analogy I hope may help you. Otherwise, if you need something more precise, consider the role of parameters in a regression model, routinely used every day in predictive statistics.
I get the sense you didn't read this well-researched article. Or that you didn't understand it. And I think the same may be true of my comments. Maybe you don't see the difference between what something's function is and how it works?
I find that unlikely. But I don’t understand why we aren’t connecting and it’s making me feel frustrated.
You’re clearly incredibly smart—slinging words around like depth, rigor and “a vacuum of naive remarks” (🤯!)—and you obviously have some specialization in this area… but I think I’ll go back to talking to ChatGPT. It at least pantomimes a desire to take the perspective of the person whose point of view it is criticizing.
Cheers!
Yes, I see... The problem here is that there is no pain, no gain; it is always easier to confuse everything. The point is that it is inglorious to look for meaning in the parameters of an LLM.
This article points out nicely that ML AI lacks explanatory power, but it misses the point of WHY it is inherently so.
As for reading, instead of ChatGPT you might look at this open-source book on deep learning. Sorry for any misunderstanding on my part; it just appalls me to see so much hype and misunderstanding about the way LLMs work... People keep anthropomorphizing and making misplaced comparisons with the human brain. The brain is a very complex dynamic system, a thinking machine, while an artificial neural net is essentially a huge compressed lookup table, fixed at training time.
https://www.deeplearningbook.org/
"All our inventions are but improved means to an unimproved end".
"We are incurring a huge intellectual debt in the form of unintelligible tech that we use without wisdom or restraint."
Well said! You articulated what I've personally been feeling about this whole thing. I think what makes it even worse is the gaslighting around AI adoption normalized by tech giants. In spite of everything we don't know, it's sold as "forward-thinking" behavior in a highly competitive urgency culture--and it works. I'm seeing a lot of image-obsessed executives buying "the shiny new object on the market" just to be able to slap AI buzzwords on their resumes/portfolios. Talented folks are at risk of losing their jobs, thinking *they're* worthless and obsolete (Mira Murati's viral comment about creative jobs says it all). Yet the irony is that the same folks who then take credit for implementing AI are generally tone-deaf when it comes to ethics, human safety, and best practices, and all in all out of touch with reality.
It's difficult for our minds to comprehend the vast complexity of the universe we live in. We may have a relatively solid grasp of the simple interactions which make up a system - quantum chromodynamics, for example, models the interactions of elementary particles quite accurately. Of course, it takes one of the larger supercomputers to run a model of just a few hundred particles. At this level, there is no hint of the astonishing diversity of structure we see at our scale. Then we consider that there are more particles in our little finger than there are stars in the visible universe, and realise we exist amid an immense fractally detailed layering of emergent phenomena.
It's not just the function of neural networks, artificial or biological, that confounds us. Everywhere we look, from the workings of the cosmos down to the function of the cell, the more we find out, the more we realise we don't know.
I don't mean to belittle conscious understanding or abstract rational processes - these parts of our mind obviously have their uses. We can be justifiably proud of the mountain of knowledge we've accumulated. However, admitting that we still have a planet of undiscovered knowledge ahead of us might instill some balancing humility. Then we might see that, when it comes to intelligence, we tend to put the cart before the horse.
Lately, AI research and neurology have been revealing to us what Mother Nature knew all along - that intelligence grows up from its roots rather than being bestowed from above.
Neurologists witness, with their fMRI scans, the various "subconscious" parts of the human brain making a decision well before the conscious part even becomes aware of it. The conscious mind is not the instigator it likes to believe, rather an explainer after the fact, a storyteller. Our abstract models of reality are not the source of cognition, but the condensate of a much larger mass of mentation, a cooperative network of specialised behaviours trained over millions of years by our sense data.
The abstract models are nonetheless very important to modern humans. Our storyteller function allows us to follow chains of cause and effect, to envision sequences of events over time, to map reality. These maps have become a crucial link in the feedback loops which enable us to navigate the world, to function and survive in it.
Problems arise, though, when we confuse the map with the territory. If, rather than allowing reality to inform our model, we attempt to impose our abstractions in a one-way top-down manner, then we will end up following a faulty map, lost and in denial.
This confusion is writ large in our society these days. We spend more and more time arguing about what is or isn't real, less and less time negotiating in good faith how we will act in response. AI will just be adding to the noise obscuring the signal until we wake up to ourselves.
The answer to our problems comes from the ground up. Our supposed leaders are the symptom of our disease, not the cure. They cannot understand the functioning of massed humanity any more than AI researchers can understand how neural nets work. There is no law that can be enacted, no regulatory institution that could be established that will resolve our impasse. No-one can lead us out of this mess.
There is no one in charge.
I think it's a bit misleading to assert that "no one knows how AI works." In some sense, we know exactly how it works -- everything is open and visible, including all model details and every single one of the trillions of calculations. What we don't know is how to reduce this down to something our brains can grasp, analogous to discovering physical laws that govern the large-scale behavior of trillions (or more) of particles and waves.
Unfortunately, neural networks may well be irreducible. That is, in order to understand the system's behavior, we may simply have to run it and go through the trillions of calculations. There might be no shortcuts.
We've already seen examples like AlphaZero's "magic" moves, for which no short explanation was found -- one needed to enumerate every possibility. The same goes for some computer-assisted proofs (e.g., the Four Color Theorem), where there's similarly no known "short" explanation and only a computer can explore every single one of a vast number of proof paths.
In short, explainability could well be a dead end. Perhaps short explanations could be found in specific cases, as with certain brilliant grandmaster chess moves, but it seems hopeless in general.
You mean "computationally irreducible"? Could be.
Also, if we know how a neural network works and my headline is misleading, can you explain how it works?
Only in a specific sense, as I mentioned. At the highest level, a neural network is nothing more and nothing less than a function, generally mapping vectors of floats to vectors of floats. That's one way to explain how it works.
Another way is to describe exactly what happens as the input is transformed into the output -- i.e., multiply and sum using the weights, pass through the activation function, etc., for every element and layer of the model. In this case, that's exactly how it works, down to the last detail.
If we were smarter, we might even be able to use this to understand why neural networks produce the behaviors they do and what precisely they're capable of. I suppose it's a matter of semantics, but that's bigger than merely knowing how a neural network works. Nonetheless, I won't argue if that's what you meant by "how it works."
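For what it's worth, that layer-by-layer description fits in a few lines. A toy two-layer network with made-up weights (nothing here comes from a real model), just to show that the whole computation is multiply, sum, and activate:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# Made-up weights and biases for a tiny 3 -> 4 -> 2 network.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

def forward(x):
    h = relu(W1 @ x + b1)  # multiply and sum using the weights, then pass through the activation
    return W2 @ h + b2     # same again for the output layer

print(forward(np.array([1.0, -0.5, 2.0])))  # vector of floats in, vector of floats out
```

Every number in the computation is visible, yet nothing printed here says anything human-interpretable about why the output is what it is, which is the gap being pointed at.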
12 lines of python. Who knew a language created as and for a joke would become the conduit for the next wave of human insanity. The universe certainly delights in irony.
It seems like the AI community is focused more on capability than on the process leading to the capability? Is that the case?
Building intelligence from the ground up, like it arose for us, does not seem to be of interest. I see that happening in iRobot's design of the Roomba but not in generative AI LLMs.
There is some obfuscation in the typical way the idea of the black box is presented, which is badly motivated by financial incentives to upsell the models' intelligence.
Lay outsiders are manipulated by the black box idea to believe that we don't even know how neural net algorithms are implemented, as if the 'discovered invention' is some sort of encrypted device where we only interact with the data input and output but don't see or understand any code. This is silly of course. We write, design and therefore understand neural nets algorithmically. To the programmer, the black box is what you get when you deliberately overlook the quite simple mathematical models to ask about the 'emergent' behaviour of the model on the pretense that you don't know how it's grounded. But what justification is there for even framing the analysis this way, other than wanting to indulge in a primordial fantasy about creating new life?
The fact that we know how the algorithms work is important, even if there are questions to answer about what they signify cognitively, because while we might not be able to say how they do work, we can say how they don't work. For example, they don't work by mapping embodied sense data into a space of abstracted concepts which are used consistently and compositionally in out-of-domain tests. That much is enough to know that this is not a path to human-like intelligence.
"except we design it."
Also yeah, we don't ignore everything. But we don't know how to explain behavior from neural activity. We also don't know in humans (or any relatively complex animal for that matter). Denying that's a worthy scientific question is ridiculous. That's what this essay is about.
How are you able to ensure that advanced AI systems remain aligned with human values and intentions when Anthropic's CEO just said we know around 3% of how these dumb models work?
How do we ensure that any complex system aligns with human values? We fail in general because our values are heterogeneous, hard to define, and usually overruled by economic incentives anyway. Most of our society is already an offense to our ideals. Framing it as a problem particular to AI is again just a rhetorical ploy.
I did not intend it to be a ploy. If we are planning to let these advanced systems take hold of the critical assets of humanity (which, based on current news, looks to be the case), then how do we recognize those different values for humanity, which ones do we choose, and who will decide what to impart to advanced AI systems?
The right title should be "No one knows how neural networks, which are presented as something with intelligence, work."
But you wouldn't read something with that title, would you? Also, the field is named "artificial intelligence" -- everything is permitted after that.
The field named "artificial intelligence" has a lot of thing in addition to neural networks, for which the statement "no one knows how it work" is totally wrong.
I didn't make myself clear. The fact that they used the words "artificial" and "intelligence" is the greatest lie and the most successful marketing trick in the history of the field. The title I used is just fine in comparison. More so when I clarify in the first paragraph.
I agree that AI is constructing and progressing now with its own statistical intelligence, writing its own language, and that it will help humanity and nature evolve in the perfect direction.
Fundamentally, a neural network is a mathematical function that projects an interleaved multidimensional space of sparse points along hyperplanes that allow those points to be discriminated more easily - but not perfectly.
The layering of the network is used to discriminate the features of those points along various axes of this multidimensional space, at different scales, in order to correctly place the projected point.
Now, the more sophisticated a problem is to solve, the greater the number of dimensions its hyperspace will have, and the greater the number of parameters and the complexity of the function will be.
Such mathematical objects are intractable; that is a fact. But this is nothing new in mathematics - think of the Navier-Stokes equations, which describe the motion of viscous fluids. It simply means that the explainability of a complex neural network will never be achieved analytically, and that we need to think of the equivalent of a "numerical" approach whose result is accurate enough for us to establish the degree of reliability of the network...
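A hedged toy illustration of the hyperplane-and-layering picture described above (hand-chosen weights on an XOR problem; nothing here is trained or taken from a real model): no single hyperplane separates the four points, but stacking two layers does.

```python
import numpy as np

def step(x):
    return (x > 0).astype(float)

# Four points labelled by XOR: no single line (hyperplane) separates the classes.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# First layer: two hand-chosen hyperplanes acting as feature detectors
# (h1 fires for "x1 OR x2", h2 fires for "x1 AND x2").
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])
H = step(X @ W1.T + b1)

# Second layer: one hyperplane in the new feature space computes "OR and not AND" = XOR.
w2 = np.array([1.0, -2.0])
b2 = -0.5
pred = step(H @ w2 + b2)

print(pred)  # [0. 1. 1. 0.] -- matches the XOR labels
```

With two inputs the hyperplanes are lines we can draw and name; with billions of parameters in thousands of dimensions the same construction exists, but it resists any comparably short description, which is the intractability point above.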
Neural nets are a tool to tune (by training) the parameters that approximately fit a function to a model. What the heck is there to understand beyond this? The explanation is mathematical and statistical.
That's how it works. The LLM works the same whether it's fed internet text or temperature reports for every square mile on planet Earth.
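A minimal, hedged sketch of "tune the parameters by training": one free parameter, a squared-error loss, and gradient descent. The loop is indifferent to whether the numbers come from text statistics or temperature reports, which is the point being made.

```python
import numpy as np

# Data generated by an unknown rule (here y = 3x plus noise);
# the training loop never "knows" what the numbers mean.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.1 * rng.normal(size=100)

w = 0.0   # the single tunable parameter
lr = 0.1  # learning rate
for _ in range(200):
    grad = np.mean(2 * (w * x - y) * x)  # gradient of the mean squared error
    w -= lr * grad                       # nudge the parameter to reduce the error

print(w)  # ends up close to 3.0, recovered purely from correlations in the data
```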
So you read this and decide to argue with the world experts on this topic? That's fine, choose your arguments well!
I think the part that isn’t understood is what the billions or trillions of individual parameters each represent (in human intelligible terms) and how the values are assigned to them, right?
That, and how neural activity (changes in the activations when faced with a given input) causes behavior. How does "make a poem" cause ChatGPT to write a specific poem and not do something else? (We also don't know this about humans, or even about a tiny worm, C. elegans, whose 302 neurons are completely mapped.)
is this relevant? https://www.anthropic.com/news/mapping-mind-language-model
I still don't understand these findings even after asking Claude 3.5 to explain it to me in simple terms....
Yeah, that paper from Anthropic (and the scaling monosemanticity one) prompted me to write this. I'm researching that topic to publish a deeper dive sometime down the line.
Looking forward to it. Seems like some level of progress, although it does not seem to give full certainty about what the AI will produce for a given prompt... assuming that is when we can call AI explainability solved?
"How does "make a poem" causes ChatGPT to write a specific poem and not do something else." ? Because of the probabilistic structure of language extracted from the training set. It's a very powerful version of an auto-complete algorithm. This is already discussed by shannon on his seminal 1948 paper (see section 3).
https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf
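A minimal sketch of the kind of Shannon-style approximation that paper describes (a toy word-bigram model on a made-up corpus; real LLMs are vastly more sophisticated, but the "powerful auto-complete" framing starts here):

```python
import random
from collections import defaultdict, Counter

corpus = "the cat sat on the mat and the cat saw the dog".split()

# Count which word follows which (roughly Shannon's second-order word approximation).
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def next_word(word):
    counts = follows[word]
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]  # sample in proportion to frequency

random.seed(0)
word, output = "the", ["the"]
for _ in range(8):
    if word not in follows:  # dead end: this word never appeared with a successor
        break
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```

Whether scaling this idea up counts as an explanation of behavior is exactly what the reply below disputes.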
How is that an explanation for how neural activity produces behavior?
Dear Alberto, it is not, because in an LLM there is no such activity (see below, please).
However, it is an explanation of how machine-learning neural nets work. Artificial intelligence "neurons" do not have any kind of "neural activity" as you are imagining; they are just implementing a simple mathematical function, and all the rest is linear algebra and statistics.
Perhaps, if you won't believe me, you could trust the explanation I just got by asking ChatGPT about it; it looks pretty clear to me.
https://chatgpt.com/share/fb070753-3cfa-4d73-8640-cc32d2f19bbf
Neural activity is a placeholder for "activity that takes place between neurons, biological or otherwise." I know biological neurons are much more complex than artificial ones and have written about it several times. This is a semantics discussion that will get us nowhere. So let me rephrase: we can't explain the behavior of an LLM by looking at the parameters because we lack a robust explanatory framework to do so.