NotebookLM: neural network for document analysis, self-education and investigations

Analyzing documents with neural networks is trickier than it looks, even though an AI trained on a huge body of data would seem well equipped for the job. In practice, popular chatbots hallucinate far more often than usual in such situations and need careful tuning to produce a correct result. NotebookLM from Google handles this problem effectively.

NotebookLM is built on the Gemini Pro model. Unlike traditional chatbots, which rely on general knowledge from the Internet, it bases its responses solely on documents uploaded by the user. This significantly increases the accuracy of the analysis and makes the tool an excellent assistant for working with large volumes of data. At the time of publication, it works with Google Docs, PDF, Markdown and TXT files, audio recordings, web links and YouTube videos. Like other Google products, NotebookLM has a collaboration mode.

Description and basic functions 

On the main page, a new user is invited to create their first notebook and add documents to it. Each notebook can contain up to 50 sources (documents, links or files) in the free version and up to 300 in the paid version.

This is what a “fresh” NotebookLM looks like

And this is what the NotebookLM of Pavel Bannikov, editor of the “Checked Manual”, looks like

Inside a notebook, the left side of the screen shows the list of uploaded sources, and the center holds the chatbot that analyzes them; this is what the user interacts with first. In the chat, you can search the uploaded documents and ask substantive questions about them. Each answer can be saved as a note inside the notebook, and notes can be turned into new sources.

On the right side of the screen is the so-called “Studio”, which offers the user five automated ways to study the documents in a notebook:

  • "Brief overview";
  • "Manual";
  • "Chronology";
  • "Frequently Asked Questions";
  • "Audio retelling".

"Brief overview" and "Manual" are a godsend for students and teachers. The first highlights the main points of one or all files in the notebook. The second helps build an entire training course from them: lecture notes, practical assignments, essay topics and test questions. We tried this function on Hans Rosling's book "Factfulness" and in 7 seconds received well-structured lecture notes on the book, complete with a final quiz for students.

This function is also useful for investigations. For example, in an experiment with data on several sites suspected of spreading disinformation in Kazakhstan and other countries, NotebookLM generated a fully sound methodology for conducting an investigation, finding connections between the sites, and in theory even reaching the owners of the network.

The "Chronology" function works in a similar way: it extracts the dates found in the notebook together with the events or items associated with them. Note that the model may mistakenly attach a date to an item that is merely mentioned in nearby text. In such cases it is designed to flag the uncertainty, but it is better to double-check yourself.

The "Frequently Asked Questions" function generates a block of questions that are likely to arise before or while reading the contents of the notebook. The model then finds the answers and shows them to the user.

"Audio retelling" is the most impressive feature. It generates a podcast based on a notebook and can do so in 50 languages, including Russian (the language can be selected in the settings). The conversation is conducted by two AI characters, one with a male voice and one with a female voice. In English they sound very natural; in Russian some words are mispronounced, but the quality of the analysis makes this forgivable.

Example of a generated podcast

Next to the podcast generation button there is a “Customize” button. By clicking on it, you can adjust the details of the podcast and the main topics that the “hosts” will discuss - you just need to set the parameters that are important to you in the prompt.

Let's say you are analyzing two years' worth of financial records for several companies (possibly dozens of reports that would take days to sift through) and you need to look for potential evidence of corruption. You can give exactly this instruction and get an audio retelling focused on the topic. Then you can switch back to text mode and analyze the files that passed this pre-selection.

Just a couple of days before this material was published, the audio retelling function became interactive (so far only in beta and only in English). The user can now take part in the generated conversation: interrupt the podcast “hosts” with a clarifying question, and after a short wait the conversation will turn in the direction you want.

In the free version, each notebook can store only one audio retelling, but this is not a problem: you can download the generated file, delete the retelling from the notebook, and write a new prompt to generate a podcast with different tasks. If desired, the downloaded file can then be uploaded as one of the sources.

Restrictions

  • Each source in the notebook can contain no more than 500,000 words and/or be no larger than 200 MB.
  • The same limit applies to videos with subtitles: the transcript must not exceed 500,000 words. This is hardly a constraint in practice: even Yuri Dud’s guests cannot utter that many words in a three-hour interview.
  • You can store up to 1000 notes in one notebook. 
  • Deleted notes cannot be restored.
  • Tables (for example, files in XLSX and CSV formats) are not currently supported; they must first be converted to PDF for analysis.
  • As of May 2025, synchronization with Google Drive is also incomplete: only documents and presentations are supported.
  • When analyzing audio, the tool does not separate lines by speaker and does not provide time codes.
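
The per-source limits above can be checked locally before uploading. The following is a minimal sketch for plain-text files, not part of NotebookLM itself: the function name `check_source` is hypothetical, the figures are the ones quoted in the list, and counting whitespace-separated tokens is only an assumption about how words are counted.

```python
import os

# Limits as quoted in the list above; they may change over time.
MAX_WORDS_PER_SOURCE = 500_000
MAX_BYTES_PER_SOURCE = 200 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_source(path):
    """Return a list of problems; an empty list means the file fits both limits."""
    problems = []

    size = os.path.getsize(path)
    if size > MAX_BYTES_PER_SOURCE:
        problems.append(f"file is {size} bytes, limit is {MAX_BYTES_PER_SOURCE}")

    # Whitespace-separated tokens are only a rough proxy for a word count.
    words = 0
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            words += len(line.split())
    if words > MAX_WORDS_PER_SOURCE:
        problems.append(f"file has {words} words, limit is {MAX_WORDS_PER_SOURCE}")

    return problems
```

Calling `check_source("report.txt")` on a hypothetical file returns an empty list when it fits; a file that is too long could then be split into several sources.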

Some of these shortcomings will likely be eliminated in the future. It is logical to assume that Google's tool will one day start working with Google Sheets and will carry over time codes from YouTube videos. But even in its current configuration, NotebookLM can significantly reduce the time spent studying a large amount of data. In the free version, you can analyze up to 25 million words within one notebook, which is three complete 90-volume collected works of Leo Tolstoy.
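
The 25 million figure follows from two numbers given earlier in the article: 50 sources per notebook in the free version and 500,000 words per source. A short sketch of the arithmetic, with constant and function names chosen here purely for illustration:

```python
# Figures quoted in the article for the free version; treat them as a snapshot.
FREE_SOURCES_PER_NOTEBOOK = 50
MAX_WORDS_PER_SOURCE = 500_000


def notebook_capacity(sources, words_per_source=MAX_WORDS_PER_SOURCE):
    """Upper bound on words in one notebook if every source is at the limit."""
    return sources * words_per_source


free_capacity = notebook_capacity(FREE_SOURCES_PER_NOTEBOOK)
print(f"{free_capacity:,}")  # prints 25,000,000
```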
