This is by far one of the most interesting articles I’ve read on LLMs. Thank you!
Thank you, Jeanne!
On the topic of AlphaFold, I just wrote this. AlphaFold definitely doesn't understand the physics.
https://clauswilke.substack.com/p/no-alphafold-has-not-completely-solved
Gary Marcus wrote at length just a few weeks ago about LLMs not being able to encode world models. He’s either having a field day with this paper right now, or he’s annoyed that people found it shocking even though he said the same thing a while ago. Nevertheless, thank you for your nice additions and philosophical discussions, especially in the latter parts of this piece. Very informative and interesting!
Yep, he's been saying this for a while
The paragraph “If you see someone pour juice into a cup and then knock the cup over, you understand, even without seeing the spill, that the juice will be on the floor” reminded me of the classic stages of spatial development in children proposed by the Swiss developmental psychologist Jean Piaget. By way of parallel, let me compare children’s cognitive development with current LLMs: it looks like current LLMs lack the kind of cognitive development, such as the stages of spatial development that all children pass through, needed to build a “world model”.
In a nutshell, the stages are as follows: a) Sensorimotor, 0–2 years: spatial understanding is action-based; b) Preoperational, 2–7 years: symbolic but egocentric, with limited perspective-taking; c) Concrete Operational, 7–11 years: understands conservation of space, accurate spatial reasoning; d) Formal Operational, 11+ years: abstract and hypothetical spatial concepts.
In children, development is embodied, evolving, and experiential; in an LLM it is static, data-driven, and non-developmental. Children learn grounded, contextual meaning and understanding, whereas an LLM merely simulates it through associative, statistical next-token prediction. LLMs lack grounded semantics: children can “feel” volume and space, while an LLM does not know what it is to “feel”; its substrate prevents it from building a “real” world model. An LLM’s lack of embodiment (sensorimotor experience), of learning through action in the world, is a huge limitation for building object permanence, spatial reasoning, or proprioception.
A huge breakthrough is needed before AI models are ready to make scientific discoveries; although LLM reasoning is a step forward, it is not enough. Maybe a neuro-symbolic AI, or a brand-new AI cognitive architecture that encompasses not only "artificial intelligence" but a full cognitive system. Who knows? (Sakana.ai is trying.) So an AI with a "world model" capable of emulating the discoveries of Kepler and Newton is still a long way off.
It always goes back to Judea Pearl's statement that without modelling causality, you get only "glorified regressors". And causal analysis is freaking hard compared to the "throw data at ever bigger models and hope it sticks" approach of the usual suspects.
This is my favourite article on AI! I've never felt so satisfied with an explanation for the limits of LLMs. Thank you!
Good write up on a very nuanced topic. I learned enough to sound smart in my next argument on LLMs.
A very nice article! From which I'm taking that Judea Pearl is still right after all these years...
Brilliant. 10/10 article. A shorter version would fit well on the pages of the NYT.
super insightful, thank you!!
Really interesting, and I like all the references + short video. I created a video of it for myself initially, but thought: why not also share it here? https://app.symvol.io/videos/harvard-and-mit-study-ai-models-are-not-ready-to-make-scientific-discoveries-1b07
I recently wrote about how this problem applies to biology, and why AI for bio is still a long way from producing the types of breakthroughs that headlines imagine. This is a great piece breaking down the generalized problem, I look forward to sharing it!
Thank you Kennedy!
Great essay and insights!
In the final part of the essay, the humorous infinite monkey theorem came to mind, and a thought hit me: even if an LLM did make a discovery, wouldn't it automatically discard it, precisely because it violates in some way its predictive-probability programming of the "what", labelling it an error or a deviation from the expected result? Whereas if it also, or instead, took into consideration the "why", at least partially, then it might be inclined to treat a deviation from the expected as a potential "breakthrough" and go back to improve the model so it could better represent the "what" (or whatever knowledge it's addressing). By way of analogy, even if a monkey did hammer out a work of Shakespeare, it would be just as meaningless as everything preceding it and consigned to the trash heap just as quickly.
World Labs is trying to solve this and avoid the pure next-token prediction paradigm. Perhaps that is the next unlock beyond Scaling Laws on pure Transformers.
Amazing. Great article
Thank you Mark! Glad you liked it
Excellent article. Thanks for writing it. You explain very clearly.
The whole topic throws up the question of types of knowledge and reasoning, which as you say is deeply philosophical.
Aristotle had a good go at chopping it up into five different categories, some of which overlap with yours.
See for instance this article.
https://compass.onlinelibrary.wiley.com/doi/abs/10.1111/phc3.12799
Your article also suggests that there might be some different kind of machine reasoning, one that doesn't follow the model of the human mind but could be useful anyway. That potentially opens up the question of AI eventually reaching a different form of consciousness (especially if consciousness turns out to be emergent from sheer complexity), one that we may never understand.
Like you though, I think this is a long way off, if it ever happens.
Thank you Julia!! Uuh the topic of consciousness, don't get me started (although I don't think intelligence and consciousness necessarily go together)
I realised after I wrote and sent that comment that intelligence and consciousness are not the same thing, but it was too late to retract it. That’s the penalty of replying really late at night, as I am doing again now! I’m a southern-hemisphere person who reads Substack at ludicrously late hours when I’m tired.
Certainly the relationship between machine intelligence and alternative epistemologies is worth exploring. It’s deeply affected by the available explanatory frameworks, which are themselves shaped by the state of contemporary understanding at any given time and by the physical mechanisms you have available to perceive the world and build an explanatory framework from. These will be radically different between: a human, who is tuned to wildly overdetermine meaning, reading patterns into everything from the face of Jesus on a piece of toast to a scratching sound at night; a machine, which has great difficulty generalising; and an octopus, which must have some kind of epistemological understanding, since it is intelligent and capable of problem-solving, but which would be deeply influenced by its short life, limiting its capacity for experience, and by its radically different physical information-sensing systems to ours.