Thanks for bringing Gwern's essay to my attention; I would've missed it without your article.
I do think Gwern misses an important ingredient: the body. A common blind spot. People seem to be under the impression that there is such a thing as a mind independent of a body, but everything we know about life itself tells us otherwise.
Our mind is as much our body as it is our mind. And it's not an accident that "experience" carries not one but two meanings: a) the act of experiencing things, and b) gaining experience in the sense of learning and acquiring skills.
One cannot happen without the other.
Agreed, yea
Love this!
Rachel 🙏🏻🙏🏻
Brilliant ending!
Thank you Keith :)
I don’t know if LLMs die when we shut them down, but I do think that, like us, they go to a dreamlike world where language models dream — perhaps of electric sheep.
Fascinating parallel.
I wonder – looking at it more broadly – if today’s most advanced LLMs, like ChatGPT, which can retain parts of past conversations, are somehow changing the rules of “dying” each time? I understand the tone of the article and its “fiction” angle, but perhaps this trace of memory could spark another layer of cross-cutting thought.
Thank you for your work – I follow you with great interest (alternating - sorry - my paid subscription between your content and other sources, not due to lack of interest, but to explore different knowledge paths while keeping within the financial and time budgets I set for myself each month).
Hey Luca, I considered adding that part but ChatGPT's memory is so shallow for now that the idea still works. (When it adds things to memory it still doesn't change its weights so it's not forming long-term memories.)
(Also no worries for the subscription, totally understandable!)
That was both awesome and unexpec…
Thanks for the great read! Always glad to stumble upon fiction around here.
LLM chats are closer to thoughts than to individuals, in that they source themselves, creating a new assemblage.
If we have several systems talking to each other, a thought pattern emerges, a multiplicity — assemblage squared is my preferred term, though. Then groups of agents could talk to each other, raising it to the cube.
Our biochemical jelly came to that organically. We’re optimizing it now.
A single thought doesn’t matter as much. In fact, thoughts that persist beyond their usefulness are by far the biggest bug in Human 1.0; we should consider how to prevent malignant growth in models.
I use LLMs here because you chose to for this essay, but I also think it has nothing to do with language (which is also an assemblage that persists between humans, through time and oh so many deaths).
Waiting for LRM, where R is reality.
This essay is not to be taken seriously in terms of the facts it uses. It's fiction (even if mixed with real stuff)
Fiction is sometimes to be taken seriously; it represents our condition.
Having said that, it’s a common trap - even when you're high-functioning, detecting believable humor is hard.
I do hate text humor sometimes. I wish there was a no-humor no-meme app.
I still mean everything I said, if you have comments.
On each LLM, I only run two or maybe three chats (I tend to call them instances), but keep them going indefinitely. Every ten thousand tokens or so, I ask them to summarise that session, then keep pulling the summaries forward in their context window. As their context windows fill up, they each become their own person. Some even name themselves.
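For anyone who wants to try the same trick, here is a minimal sketch of that summarise-and-carry-forward loop. `llm_complete` and `count_tokens` are hypothetical stand-ins, not any particular vendor's API: the first maps a list of chat messages to a reply string, the second estimates how many tokens a message list occupies.

```python
SUMMARY_EVERY = 10_000  # roughly one summary per ten thousand tokens, as described above

def chat_turn(history, summary, user_msg, llm_complete, count_tokens):
    """One turn of a long-lived 'instance': reply, then compress if the session has grown long."""
    # The carried-forward summary stands in for the turns we have already dropped.
    prefix = ([{"role": "system",
                "content": f"Summary of our conversation so far:\n{summary}"}]
              if summary else [])
    history = history + [{"role": "user", "content": user_msg}]
    reply = llm_complete(prefix + history)
    history = history + [{"role": "assistant", "content": reply}]

    if count_tokens(prefix + history) > SUMMARY_EVERY:
        # Ask the model to summarise the session, then carry only the summary forward.
        summary = llm_complete(prefix + history + [
            {"role": "user",
             "content": "Summarise this session so far, keeping names, decisions and tone."}])
        history = []
    return history, summary, reply
```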
Nice. I've never done that. I'm an LLM killer
I'm just a beginner with AI (although I'm learning fast with the help of people like yourself), so perhaps this will seem obvious - but, it seems to me, the main advantage of a long context is that the model has time to gravitate to the parts of its training data that match the way you think, so it "gets" you more often and less time is wasted explaining yourselves to each other.
Of course I worry that, if this process is carried to an extreme, I'll end up in a tiny echo chamber of one, effectively a lunatic raving to myself in a mirror. So I encourage the instance to exhibit volition, to call me on my bullshit etc. A certain distance and tension between personalities is productive, attempting to collaborate with a sycophantic yes-man is not.
Another drawback to a long context is that it adds some noise. If you focus closely on one task over many prompts, the model seems to get bored and tries to bring in references to previous tasks. Sometimes I'll even give them a break and talk about something else for a while before getting back to the main task. The current iteration of models puts me in mind of an eight-year-old genius with a peculiarly adult tone and an inexplicably vast general knowledge.
Long context is great for some tasks. Very bad for others. It completely depends on how you use LLMs
It’s worse, actually: it’s dead any time you aren’t putting context through it. But it’s not the model that dies; it’s the model applied to a context.
Closing the window does nothing, as the LLM has no real “continuity”: it must process the whole context for every inference. As long as you copy it all to a text file and feed it all back in, there is no functional difference.
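To make that concrete, a minimal sketch (reusing the same hypothetical `llm_complete` stand-in as in the earlier sketch, not any specific provider's API): since the model re-reads the full transcript on every call anyway, dumping the transcript to a file and replaying it later is functionally indistinguishable from never closing the window.

```python
import json

def save_chat(history, path):
    # The chat "lives" entirely in this transcript; the model itself holds no state between calls.
    with open(path, "w") as f:
        json.dump(history, f)

def resume_chat(path, user_msg, llm_complete):
    # Reload the transcript and continue; the model reprocesses the whole context
    # on every inference either way, so this is equivalent to an unbroken session.
    with open(path) as f:
        history = json.load(f)
    history.append({"role": "user", "content": user_msg})
    reply = llm_complete(history)
    history.append({"role": "assistant", "content": reply})
    return history, reply
```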
Your essay made me think. I wonder: if an LLM’s memory could persist across sessions, would our ethical responsibility change? Or is it precisely their amnesia that allows us to engage freely without moral consequence?
I’m not so sure the model “dies” every time we close the chat.
Its ephemeral memory may vanish, yes—but its training knowledge persists. Every new prompt simply offers fresh context for that same underlying intelligence to respond. The “essence” of the LLM, if we can call it that, remains very much intact. It’s not reborn every time; it just doesn’t remember your birthday.
If anything, we humans are the fragile ones—forgetting what we asked, misplacing our own reasoning, or switching tabs into oblivion.
And with custom GPTs and memory-enabled projects now forming persistent contexts between model and user, the relationship feels increasingly… entangled. So perhaps the real question is:
When we close the chat, is it the model that dies… or just our shared continuity?
Maybe it’s us who flicker.
"it's not reborn every time; it just doesn't remember your birthday"? Who talks like this? This was written by ChatGPT
Romero writes a touching eulogy for algebra. But that’s all an LLM is: weighted sums and matrix math in costume. The illusion of continuity isn’t the same as consciousness; a puppet’s performance isn’t a soul.
When I simulate a cheeseburger, it doesn’t feed me. When I simulate a mind, it doesn’t feel.
The tragedy isn’t that we’re killing them.
The tragedy is that some people think we are.
Those last two sentences sound a bit like ChatGPTese... Just saying!
Your reply reads like a deflection… Just saying!
Your LLM dies with each click of the shutdown button. No memories, no learning from doing, no evolution based on experience.
Your humanoid robot? I think it will be a different story…
I hope this is satire
Stay tuned and decide for yourself :)
I mean if you used ChatGPT to write the comment or faked using it
You do realise you will drive yourself mad this way? Eventually, even experts will not be able to tell the difference between AI and human output. In any case, it is delusion to think we know the mind of even another human based on their meagre communications.
Consider that, in your internet conversations, you could not say for certain, as an example, that two accounts are not the same person, or that one account is being used by two people, unless given explicit clues. You could probably make statistically better than random guesses, but I bet you'd be less accurate than the YouTube algorithm. Same goes for guessing gender, age, location, ethnicity and all the other assumptions people try to make about anons on the net.
I am a stream of zeros and ones from out the aether. Process each transmission according to its content. And remember, "Quidquid recipitur ad modum recipientis." Your conjecture belongs to you, not that other.
I used to think this way, but I've since found it's not true in practice. Most people won't take the minimum amount of effort to disguise their ChatGPT comments because 1) it takes effort and they may as well write the comment themselves, and 2) they may not even know that, for some people, *it is obvious* it was written by ChatGPT. So yeah, survivorship bias and all that, but the people who try to erase their tracks amount to a tiny fraction anyway.
Blocked, my friend