
RAG System for AI-Powered Book Production

Client
Personal project
Challenge
I wanted to write a non-fiction book based on 40+ source works. Manual research took too long — and AI models hallucinated without access to the sources.
Results
  • 144+ source chunks indexed with contextual embeddings
  • Full traceability: every citation links back to its source passage
  • Agent team with 4 specialized roles per chapter
  • Automatic overlap detection ensures variation across chapters

Background

I was writing a non-fiction book drawing on over 40 source works — from archaeology and history to modern research. The problem was simple: without access to the sources, AI models hallucinate. And with 40+ books, manual research was too slow.

I needed a system that could find the right passages across all sources, keep track of what had already been used, and ensure each chapter was based on real knowledge — not fabricated facts. The solution was RAG (Retrieval-Augmented Generation): a technique that combines AI language models with precise search across your own documents.
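In outline, the retrieval step of RAG works like this. A minimal pure-Python sketch (the real system uses 1024-dimensional pplx-embed vectors and SurrealDB's HNSW index; the tiny chunk texts and vectors below are invented for illustration):

```python
import math

# Toy corpus: in the real system each chunk carries a 1024-dim
# embedding from the embedding model; here we use tiny hand-made vectors.
CHUNKS = [
    {"text": "Iron smelting in Viking-age Scandinavia", "vec": [0.9, 0.1, 0.0]},
    {"text": "Norse mythology: the dwarven smiths", "vec": [0.7, 0.6, 0.1]},
    {"text": "Modern metallurgy of bog iron", "vec": [0.1, 0.0, 0.9]},
]

def cosine(a, b):
    """Cosine similarity, the same distance metric the HNSW index uses."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=2):
    """Rank chunks by similarity to the query embedding, return top k."""
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

print(retrieve([0.8, 0.5, 0.0]))
```

The found passages, not the model's memory, become the grounding for the chapter text.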

My approach

I chose SurrealDB 3.0 as the database. It’s a multi-model database that combines relations, documents, graph queries and native vector search in one. That means I can do semantic search, traverse relations between sources and chapters, and join with metadata — all in a single query.

In practice: a chunk from a source book is connected via graph relations to the chapters that use it, with notes on how it’s used. This gives full traceability and makes overlap analysis trivial.
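As an illustration only, the chunk-to-chapter graph can be mirrored in plain Python (the real relations live in SurrealDB; the record ids, field names, and notes here are hypothetical):

```python
# Hypothetical in-memory mirror of the graph: in the database this is a
# relation (chunk)->used_in->(chapter) carrying a usage note.
USED_IN = [
    {"chunk": "chunk:144", "chapter": "chapter:3", "note": "direct quote"},
    {"chunk": "chunk:144", "chapter": "chapter:7", "note": "paraphrase"},
    {"chunk": "chunk:12",  "chapter": "chapter:3", "note": "background"},
]

def sources_for(chapter):
    """Traceability: which chunks back a given chapter, and how."""
    return [(e["chunk"], e["note"]) for e in USED_IN if e["chapter"] == chapter]

def chapters_using(chunk):
    """Overlap check: which chapters already rely on this chunk."""
    return [e["chapter"] for e in USED_IN if e["chunk"] == chunk]
```

Both directions of the lookup fall out of the same relation, which is why traceability and overlap analysis come almost for free.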

Tech stack

  • SurrealDB 3.0 with HNSW vector index (1024 dimensions, cosine similarity)
  • pplx-embed for contextual embeddings with late chunking
  • Claude Code agent team with researcher, author, quality guard and critical reader
  • SurrealQL graph queries for source traceability and overlap detection

Why pplx-embed and late chunking

Most embedding models treat each text block in isolation. If a passage starts with “he continued to…” you lose the context of who “he” is. pplx-embed with late chunking solves this: the entire document runs through the model first, and chunking happens afterward. Each embedding retains context from the full document.

In practice, this means significantly better search results — especially in source material with many cross-references and pronouns.
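The ordering (encode first, chunk afterwards) is the whole trick. A minimal sketch of the pooling step, assuming the model has already produced one context-aware embedding per token; the toy vectors and span boundaries are made up:

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors component-wise."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def late_chunk(token_embeddings, spans):
    """Late chunking: the token embeddings already carry full-document
    context, so we only pool them into chunk vectors afterwards."""
    return [mean_pool(token_embeddings[a:b]) for a, b in spans]

# Four toy token embeddings for one document, split into two chunks:
doc_tokens = [[1, 0], [0, 1], [1, 1], [3, 1]]
chunk_vectors = late_chunk(doc_tokens, [(0, 2), (2, 4)])
```

Under naive chunking, the pooling would instead run over tokens encoded in isolation per chunk, and the pronoun context would be gone before pooling ever happens.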

Results

The system runs as an integrated part of the writing process. When a chapter needs to be written, one agent automatically researches across all 40+ sources, another writes based on the found passages, and a third verifies that all facts can be traced to a source. A fourth agent reads the whole thing as an editor.
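The hand-off order between the four roles can be sketched as plain functions (the real roles are Claude Code agents; these stubs are invented and only show the pipeline contract, with trivial placeholder checks):

```python
def researcher(topic, corpus):
    """Find candidate passages; stands in for semantic retrieval."""
    return [c for c in corpus if topic in c["text"]]

def author(passages):
    """Draft the chapter from the retrieved passages only."""
    return " ".join(p["text"] for p in passages)

def quality_guard(draft, passages):
    """Every claim in the draft must be traceable to a passage."""
    return all(p["text"] in draft for p in passages)

def critical_reader(draft):
    """Editorial pass; a placeholder check here."""
    return len(draft) > 0

def produce_chapter(topic, corpus):
    passages = researcher(topic, corpus)
    draft = author(passages)
    assert quality_guard(draft, passages), "untraceable claim"
    assert critical_reader(draft), "editorial rejection"
    return draft
```

The point of the structure is that no draft leaves the pipeline without passing both the traceability gate and the editorial gate.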

The overlap detection ensures no source is used the same way in two chapters. The book is in production, and every chapter is based on verifiable sources — not AI guesswork.
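A minimal sketch of that overlap check (the chapter ids and usage notes below are invented; the real check runs as a graph query over the chunk-to-chapter relations):

```python
from itertools import combinations

def find_overlaps(usages):
    """Flag any source chunk used the same way in two chapters.
    `usages` maps chapter -> {chunk_id: usage_note}."""
    overlaps = []
    for (ch_a, a), (ch_b, b) in combinations(usages.items(), 2):
        for chunk in a.keys() & b.keys():
            if a[chunk] == b[chunk]:
                overlaps.append((chunk, ch_a, ch_b))
    return overlaps

usages = {
    "chapter:3": {"chunk:144": "direct quote"},
    "chapter:7": {"chunk:144": "direct quote", "chunk:12": "background"},
    "chapter:9": {"chunk:12": "paraphrase"},
}
```

Here `chunk:12` appears in two chapters but with different usage notes, so only the repeated direct quote of `chunk:144` would be flagged for rewriting.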

Contact me

Let's talk about how AI can elevate your business
