Too much focus on OpenAI o1 and the new reasoning paradigm may cause me to forget about Google. I shouldn’t.
People are freaking out about the new update to NotebookLM. I know I am, and these things rarely impress me much. Here’s what Google says about the original product (released in 2023) in the blog post:
We built NotebookLM to help you make sense of complex information. When you upload your sources, it instantly becomes an expert, grounding its responses in your material with citations and relevant quotes.
Ok, so ignoring the hyperbole (e.g. “instantly becomes an expert”), we get that NotebookLM is a supercharged talk-to-your-PDF app. But because it’s integrated into the vast Google ecosystem, it’s more helpful than standalone ChatGPT wrappers.
Google later added Gemini’s multimodal capabilities (e.g. to make presentation slides from your documents). So far so good.
The update people are freaking out about is a feature Google calls Audio Overview:
With one click, two AI hosts start up a lively “deep dive” discussion based on your sources. They summarize your material, make connections between topics, and banter back and forth.
In short: it makes podcasts from your writing.
I tried it with an upcoming essay and, as Josh Woodward says—but without the suspicion that arises from him being a Google VP—you need to try it to believe it. It’s good. Really good. Possibly the most incredible existing-tech product-ready AI tool I’ve seen since ChatGPT.1
But, if it’s this good, why didn’t Google promote it?
Interestingly, the first person I saw praising the release was none other than OpenAI’s main leaker, Jimmy Apples:
So I went to Google’s post and realized I had already seen it (I keep notifications on for the main AI companies in case they happen to release something as good as this). But I hadn’t read the announcement, just scrolled past it. One comment reminded me why:
Google is indeed so bad at marketing its products that it’s kinda hilarious its entire business model relies on ad placement.
There are other theories out there as to why Google is downplaying the product: tucking it inside an existing service no one uses, not showing off on X by nonchalantly dropping the best AI podcast ever made, etc.
One theory is that Google is afraid of what users can do with voice. Or rather, what voice models may do wrong if someone manages to jailbreak them (I assume it’d also be trivial to prompt-inject something into a text document to make the podcasters say some cancelable stuff). Google’s fear would be reasonable given the bad press and public backlash they’ve faced with previous products like the Gemini image generator and AI Overviews (remember the “glue pizza”?)2
I don’t think this is it because Google could, like OpenAI, roll out voice features slowly and then further red-team them or test for issues iteratively. But no, everyone has access to NotebookLM. Google went all-in.
I believe there’s a simpler explanation: Google has chosen not to play by OpenAI’s rules.
They’re not following the standard Gen Z-coded Twitter dunk playbook. They may not grab the attention of AI Twitter anons this way, but who cares: the company enjoys the edge of several services with a billion-plus users each.3 Google is allowing its giant inertia to exert the required force to make NotebookLM a press-worthy sensation.
Professor Ethan Mollick also helped spread the message. He’s an AI evangelist on the side of optimism, but you don’t often see him this enthusiastic (or, well, unnerved):
Google’s NotebookLM is the current best “wow this is amazing & useful” demo of AI. Here I gave it the entire text of my book, it turned it into a podcast, a study guide, FAQ, timeline & quite accurate chat. Listen to the first few minutes of the “podcast.” Seriously, just listen.
So I decided to test it.
I have to say I’m surprised (even considering how all these people’s surprise had already adjusted my priors). I tried it once yesterday and wasn’t completely satisfied. But when I tried it again today to give it a second chance, using one of my upcoming essays as an example, it gave me goosebumps. (I’ll share the recording at the end.)
At the technical level, it’s incredible. A mix of many things we’ve seen before—natural voice, human emotions, and smooth flow, combined with the language mastery of the best chatbots and the ability to take a long text document as input and just get it. Google nicely put them together into a useful tool.
But it is at the content level that I’m quite speechless.4
There’s a “meh” part: It’s not as good as the text itself—as profound or as insightful. But we shouldn’t expect otherwise.
But then there’s the “wow” part: The main points of my essay are all there.5 The ideas are well summarized, even the harder nuances. The story is told chronologically but with interesting jumps in the conversation that adapt the audio to the typically engaging style of the podcast format. It mentions all the quotes and references. It’s uncanny that you can get the gist of my thesis (despite it being, admittedly, quite convoluted) just from listening to two AIs with human voices talk about it in a dance of “hypnotic false coordination,” as another user puts it. NotebookLM took three minutes to make it.
At one point, around the 1:30-minute mark, I felt the chills. Here’s the relevant dialogue:
AI Woman: And he doesn’t shy away from the big examples either. He even brings up the atomic bomb.
AI Man: Yeah. That was a really thought-provoking comparison. It really highlights how complex this issue of “easy solutions” can be, especially when you bring in moral considerations. I mean, the atomic bomb, it did end WW2 quickly, but at what cost—was it the right answer, you know, if you consider the long-term impact?
AI Woman: Man. Yeah. That really makes you think. I have to admit when I first read that part of the blog post I was, “wow, okay, this is getting intense.” But then the author throws in this counterargument from the economist Noah Smith and it basically says: “No, an easier life is inherently better.” And he uses the same examples, like dieting and even the atomic bomb to make his point.
That’s what the actual essay argues (up to that point, there’s more substance later). The examples and my counterargument using one of Noah Smith’s posts are there. But the way the AI woman draws the connection… it’s the perfect talk-style transition between the first section of the essay and the second, somehow smoother. Really. Uncanny.
I don’t know what to think yet (what the implications are, or how much I should correct my preconceptions), but I surely didn’t expect an undersold AI tool to be this good.
Feel free to listen to the entire thing if you want to truly understand what I mean.6 Thankfully, my first impressions are easier to convey just by sharing the audio (if you’re worried about spoilers for my next blog post, you can just listen to the first minute or so):
Cool, huh?
So cool actually, that when I started listening I got the same somatic reaction I get when something I’ve written gets recognized by the press—the real, human press. The feeling that someone is broadcasting to the whole world their judgment on my work.
But this is just AI.
Anyway, as good as it is, it’s not as good as the essay, so I recommend you stay tuned!
OpenAI o1 doesn’t count either as product-ready or as existing tech. There’s a case to be made that Audio Overview is also new tech, but I’d say it’s closer to a nice combo of things we know have existed for a while. Google packed them into a ready-to-use service.
OpenAI’s products make similar mistakes but aren’t held to the same high standards. I wonder if it’s because people have grown tired of Google’s monopolistic (or two-faced) behavior, or simply because its products matter more (more users).
It’s funny that it was AI Twitter anons, as it always is (and despite Google’s unintended efforts to fly under the radar), who made NotebookLM go viral.
Although I’m sure Google hasn’t solved hallucinations, I didn’t catch any, just a couple of minor mistakes. For instance, around the 1-minute mark, the AI makes it seem like one person I mention in the text is the actual author of the article. I understand the error comes from the fact that I pasted the text without any mention of myself. It could have said “the author,” though. But anyway, I’m being nitpicky.
It’s kinda hypocritical that the essay goes on about how we shouldn’t always make life easier (e.g. with AI), and then I used AI to easily create a podcast. Sometimes you gotta navigate your own inconsistencies by making fun of them in the most flagrant way possible.
You will get it—like truly get it—once I publish the final piece. Even though I’ve edited it substantially since I created the podcast with it, the parallelism is so powerful that the synthetic voices truly make it seem like they’re two humans who read it and are commenting on it.
I agree with you on why Google didn't (bother to?) market it. I doubt it was out of fear of misuse - had this been a real concern they wouldn't have released all of it to everyone.
I see your goosebumps and I raise my own... not sure if it's from a state of awe, or fear. Probably the latter.
It is incredible, but after trying it a few times, I am beginning to get tired of the stylized nature of these podcasts. It is still too stereotyped or templated. I have caught it in minor mistakes, but nothing significant yet. The way the summation of each podcast goes feels like it is trying to lock the listener into a cautious but optimistic mindset.