Andrej Karpathy Reveals How LLMs Can Build Personal Knowledge Bases, Shifting the Focus from Code to Knowledge


The Workflow: From Raw Data to a Living Wiki

Andrej Karpathy, one of the world’s leading AI experts and former AI Director at Tesla and founding member of OpenAI, has shared a system that is transforming his own research productivity. Instead of using large language models (LLMs) such as Claude or GPT merely to generate code or isolated answers, Karpathy employs them to create and maintain personal knowledge bases. The process starts with a folder called “raw/” that stores original documents: scientific articles, papers, code repositories, datasets, and even images. An LLM then incrementally “compiles” everything into an organized wiki of Markdown (.md) files—a simple, universal text format that supports links, lists, and basic formatting without complexity.

The model automatically generates summaries, thematic articles on key concepts, and cross-references (backlinks), and it categorizes the information. The process is progressive: each new document integrates without disrupting the existing structure. Karpathy avoids complex retrieval-augmented generation (RAG) tools because the LLM itself maintains indexes and brief summaries, allowing seamless navigation even at scale.
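Karpathy has not published his scripts, so the following is only a minimal sketch of what an incremental “compile” step might look like. The `llm_summarize` function is a hypothetical placeholder for the real API call to Claude or GPT; the directory names `raw/` and the idea of a regenerated index come from the description above.

```python
from pathlib import Path

# Hypothetical stand-in for the LLM call; in a real setup this would send
# the document to Claude or GPT and ask for a Markdown wiki article.
def llm_summarize(text: str, title: str) -> str:
    return f"# {title}\n\nSummary of {len(text.split())} words of source material.\n"

def compile_raw(raw_dir: str, wiki_dir: str) -> list[str]:
    """Incrementally 'compile' raw documents into Markdown wiki articles.

    Only files without an existing article are processed, so each new
    document integrates without disturbing the existing structure.
    """
    raw, wiki = Path(raw_dir), Path(wiki_dir)
    wiki.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(raw.glob("*.txt")):
        dest = wiki / f"{src.stem}.md"
        if dest.exists():  # already compiled on a previous run
            continue
        dest.write_text(llm_summarize(src.read_text(), src.stem))
        written.append(dest.name)
    # Regenerate a lightweight index so the model (and the reader) can
    # navigate the wiki without any RAG machinery.
    articles = sorted(p.stem for p in wiki.glob("*.md") if p.stem != "index")
    index = "# Index\n\n" + "\n".join(f"- [[{a}]]" for a in articles) + "\n"
    (wiki / "index.md").write_text(index)
    return written
```

The `[[article]]` syntax in the index is Obsidian’s internal-link format, which is what makes the backlink graph described below possible.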

Obsidian as the Interface: The “Frontend” Where the LLM Takes Control

The visual tool is Obsidian, a free application that turns Markdown files into an interconnected knowledge network—much like a personal Wikipedia. It automatically creates visual graphs of relationships between notes. Karpathy rarely edits the wiki manually; the LLM writes and updates everything. He uses the Obsidian Web Clipper to capture web articles and downloads related images so the model can reference them directly.

Plugins enable rendering presentations (via Marp) or visualizations. The result is an environment where the researcher only observes and queries: artificial intelligence becomes the true author and maintainer of the knowledge base.

Complex Queries and Outputs That Enrich the Wiki

Once the collection reaches critical mass – around 100 articles and 400,000 words in his case – the real power appears. Karpathy poses complex questions to the LLM, which researches internally across the entire wiki, cross-references data, and delivers deep answers. Instead of plain text, the model outputs new Markdown files, Marp-format presentations, or Matplotlib-generated charts. All outputs are automatically saved back into the wiki, turning every query into permanent knowledge that compounds over time.
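This query-and-save-back loop can be sketched as follows. Again, `llm_answer` is a hypothetical placeholder for the real model call, and the `qa-` filename convention is an illustrative assumption, not Karpathy’s actual naming scheme.

```python
import datetime
from pathlib import Path

# Hypothetical LLM callable: takes a question plus the full wiki text and
# returns a Markdown answer. A placeholder stands in for the real API call.
def llm_answer(question: str, context: str) -> str:
    return f"# {question}\n\nAnswer drawn from {context.count('#')} wiki headings.\n"

def ask_wiki(wiki_dir: str, question: str) -> Path:
    """Answer a question across the whole wiki and save the answer back
    into it, so every query compounds into permanent knowledge."""
    wiki = Path(wiki_dir)
    # Concatenate every article as context; at ~400,000 words this still
    # fits the long context windows of current frontier models.
    context = "\n\n".join(p.read_text() for p in sorted(wiki.glob("*.md")))
    answer = llm_answer(question, context)
    slug = "".join(c if c.isalnum() else "-" for c in question.lower()).strip("-")
    dest = wiki / f"qa-{slug}.md"
    stamp = datetime.date.today().isoformat()
    dest.write_text(answer + f"\n*Generated {stamp}*\n")
    return dest
```

Because the answer lands in the wiki as an ordinary Markdown file, the next query can build on it just like any other article.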

Intelligent Maintenance and the Path to Dedicated Products

Karpathy goes further. He runs LLM-powered “health checks” to detect inconsistencies, fill missing data (using web searches when needed), or suggest new articles based on unexpected connections. He has also built auxiliary tools, such as a simple search engine over the wiki that the LLM can invoke directly.
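The “simple search engine over the wiki” could be as small as a keyword scorer like the sketch below; the details of Karpathy’s actual tool are not public, so the scoring scheme here is an assumption.

```python
from pathlib import Path

def search_wiki(wiki_dir: str, query: str, top_k: int = 5) -> list[tuple[str, int]]:
    """A minimal keyword search over the wiki: score each article by how
    often the query terms appear, return the best matches. This is the
    kind of trivial tool an LLM can be told to invoke on its own."""
    terms = [t.lower() for t in query.split()]
    scores = []
    for path in Path(wiki_dir).glob("*.md"):
        text = path.read_text().lower()
        score = sum(text.count(t) for t in terms)
        if score:
            scores.append((path.name, score))
    # Highest score first; ties broken alphabetically for stable output.
    scores.sort(key=lambda x: (-x[1], x[0]))
    return scores[:top_k]
```

Exposing a function like this to the model lets it narrow down which articles to re-read before answering, without any embedding database.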

The expert acknowledges that the current setup relies on hacky scripts, but states unequivocally that there is massive room for a professional product that removes the technical friction. In the near future, he plans to generate synthetic data from the wiki and fine-tune models so the knowledge lives directly in the model weights rather than depending solely on context windows.

Why This Approach Challenges Conventional AI Use

For years, LLMs have been used mainly for one-off tasks or code generation. Karpathy demonstrates that their true value lies in becoming long-term knowledge collaborators. This shift is not merely technical; it is structural. It reduces the researcher’s cognitive load, accelerates idea synthesis, and enables explorations that previously required entire teams. However, it demands initial discipline in data ingestion and trust that the model will preserve integrity. Those who adopt this method correctly will leave behind fragmented note-taking and enter an era where AI does not merely answer—it builds and refines the user’s own intellect.
