
RAG System for AI-Powered Book Production

Client
Personal project
Challenge
I wanted to write a non-fiction book based on 40+ source works. Manual research took too long — and AI models hallucinated without access to the sources.
Results
  • 144+ source chunks indexed with contextual embeddings
  • Full traceability: every citation links back to its source passage
  • Agent team with 4 specialized roles per chapter
  • Automatic overlap detection ensures variation across chapters

Background

I was writing a non-fiction book drawing on over 40 source works — from archaeology and history to modern research. The problem was simple: without access to the sources, AI models hallucinate. And with 40+ books, manual research was too slow.

I needed a system that could find the right passages across all sources, keep track of what had already been used, and ensure each chapter was based on real knowledge — not fabricated facts. The solution was RAG (Retrieval-Augmented Generation): a technique that combines AI language models with precise search across your own documents.
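In outline, the retrieval step of RAG works like this. A minimal pure-Python sketch (the real system uses 1024-dimensional pplx-embed vectors and SurrealDB's HNSW index; the tiny chunk texts and vectors below are invented for illustration):

```python
import math

# Toy corpus: in the real system each chunk carries a 1024-dim
# embedding from the embedding model; here we use tiny hand-made vectors.
CHUNKS = [
    {"text": "Iron smelting in Viking-age Scandinavia", "vec": [0.9, 0.1, 0.0]},
    {"text": "Norse mythology: the dwarven smiths", "vec": [0.7, 0.6, 0.1]},
    {"text": "Modern metallurgy of bog iron", "vec": [0.1, 0.0, 0.9]},
]

def cosine(a, b):
    """Cosine similarity, the same distance metric the HNSW index uses."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=2):
    """Rank chunks by similarity to the query embedding, return top k."""
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

print(retrieve([0.8, 0.5, 0.0]))
```

The found passages, not the model's memory, become the grounding for the chapter text.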

My approach

I chose SurrealDB 3.0 as the database. It’s a multi-model database that combines relations, documents, graph queries and native vector search in one. That means I can do semantic search, traverse relations between sources and chapters, and join with metadata — all in a single query.

In practice: a chunk from a source book is connected via graph relations to the chapters that use it, with notes on how it’s used. This gives full traceability and makes overlap analysis trivial.
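As an illustration only, the chunk-to-chapter graph can be mirrored in plain Python (the real relations live in SurrealDB; the record ids, field names, and notes here are hypothetical):

```python
# Hypothetical in-memory mirror of the graph: in the database this is a
# relation (chunk)->used_in->(chapter) carrying a usage note.
USED_IN = [
    {"chunk": "chunk:144", "chapter": "chapter:3", "note": "direct quote"},
    {"chunk": "chunk:144", "chapter": "chapter:7", "note": "paraphrase"},
    {"chunk": "chunk:12",  "chapter": "chapter:3", "note": "background"},
]

def sources_for(chapter):
    """Traceability: which chunks back a given chapter, and how."""
    return [(e["chunk"], e["note"]) for e in USED_IN if e["chapter"] == chapter]

def chapters_using(chunk):
    """Overlap check: which chapters already rely on this chunk."""
    return [e["chapter"] for e in USED_IN if e["chunk"] == chunk]
```

Both directions of the lookup fall out of the same relation, which is why traceability and overlap analysis come almost for free.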

Tech stack

  • SurrealDB 3.0 with HNSW vector index (1024 dimensions, cosine similarity)
  • pplx-embed for contextual embeddings with late chunking
  • Claude Code agent team with researcher, author, quality guard and critical reader
  • SurrealQL graph queries for source traceability and overlap detection

Why pplx-embed and late chunking

Most embedding models treat each text block in isolation. If a passage starts with “he continued to…” you lose the context of who “he” is. pplx-embed with late chunking solves this: the entire document runs through the model first, and chunking happens afterward. Each embedding retains context from the full document.

In practice, this means significantly better search results — especially in source material with many cross-references and pronouns.
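The ordering (encode first, chunk afterwards) is the whole trick. A minimal sketch of the pooling step, assuming the model has already produced one context-aware embedding per token; the toy vectors and span boundaries are made up:

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors component-wise."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def late_chunk(token_embeddings, spans):
    """Late chunking: the token embeddings already carry full-document
    context, so we only pool them into chunk vectors afterwards."""
    return [mean_pool(token_embeddings[a:b]) for a, b in spans]

# Four toy token embeddings for one document, split into two chunks:
doc_tokens = [[1, 0], [0, 1], [1, 1], [3, 1]]
chunk_vectors = late_chunk(doc_tokens, [(0, 2), (2, 4)])
```

Under naive chunking, the pooling would instead run over tokens encoded in isolation per chunk, and the pronoun context would be gone before pooling ever happens.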

Results

The system runs as an integrated part of the writing process. When a chapter needs to be written, one agent automatically researches across all 40+ sources, another writes based on the found passages, and a third verifies that all facts can be traced to a source. A fourth agent reads the whole thing as an editor.
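The hand-off order between the four roles can be sketched as plain functions (the real roles are Claude Code agents; these stubs are invented and only show the pipeline contract, with trivial placeholder checks):

```python
def researcher(topic, corpus):
    """Find candidate passages; stands in for semantic retrieval."""
    return [c for c in corpus if topic in c["text"]]

def author(passages):
    """Draft the chapter from the retrieved passages only."""
    return " ".join(p["text"] for p in passages)

def quality_guard(draft, passages):
    """Every claim in the draft must be traceable to a passage."""
    return all(p["text"] in draft for p in passages)

def critical_reader(draft):
    """Editorial pass; a placeholder check here."""
    return len(draft) > 0

def produce_chapter(topic, corpus):
    passages = researcher(topic, corpus)
    draft = author(passages)
    assert quality_guard(draft, passages), "untraceable claim"
    assert critical_reader(draft), "editorial rejection"
    return draft
```

The point of the structure is that no draft leaves the pipeline without passing both the traceability gate and the editorial gate.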

The overlap detection ensures no source is used the same way in two chapters. The book is in production, and every chapter is based on verifiable sources — not AI guesswork.
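A minimal sketch of that overlap check (the chapter ids and usage notes below are invented; the real check runs as a graph query over the chunk-to-chapter relations):

```python
from itertools import combinations

def find_overlaps(usages):
    """Flag any source chunk used the same way in two chapters.
    `usages` maps chapter -> {chunk_id: usage_note}."""
    overlaps = []
    for (ch_a, a), (ch_b, b) in combinations(usages.items(), 2):
        for chunk in a.keys() & b.keys():
            if a[chunk] == b[chunk]:
                overlaps.append((chunk, ch_a, ch_b))
    return overlaps

usages = {
    "chapter:3": {"chunk:144": "direct quote"},
    "chapter:7": {"chunk:144": "direct quote", "chunk:12": "background"},
    "chapter:9": {"chunk:12": "paraphrase"},
}
```

Here `chunk:12` appears in two chapters but with different usage notes, so only the repeated direct quote of `chunk:144` would be flagged for rewriting.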

Contact me

Let's talk about how AI can elevate your business
