I Tried Using ChatGPT to Index My Book. Here's What Happened (And What to Use Instead)

The instinct is completely logical. You've been using ChatGPT or Claude for outlines, research summaries, cover copy, and email rewrites. It processes text fast. It can read a chapter and produce something coherent. Why not ask it to build your book index?
I did exactly this. And I'm going to walk you through what came back — the actual output, the specific failure modes, and why general-purpose AI isn't the right tool for this job, even in 2026 when it's genuinely excellent at almost everything else.
What I Expected vs. What I Got
The experiment: I pasted three chapters from a 220-page nonfiction business manuscript into ChatGPT (GPT-4) and gave it a detailed prompt: "Act as a professional book indexer. Create a back-of-book index for this content. Include main entries, subentries, cross-references, and accurate page numbers. Follow Chicago Manual of Style conventions."
What came back looked credible at first glance. The entries were alphabetized. There were subentries under major concepts. I could even see a few "see also" cross-references. I understand why an author who hadn't recently compared this against a professional index might accept it and move on.
But here's what emerged when I checked it carefully against my actual typeset PDF.
The page numbers were fabricated. Not approximately right — fabricated. ChatGPT had no access to my typeset document. It was estimating page locations based on the sequence of text in the chat window, which has no reliable relationship to where page breaks fall in a formatted InDesign file. When I cross-referenced the output against my final PDF, roughly 60% of the page number entries were wrong — off by anywhere from 2 to 15 pages in either direction.
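Here's a minimal sketch of the kind of check I mean, assuming you already have the text of each typeset page as a list of strings. The function name `check_page_refs` and the three-page sample data are hypothetical, and a literal phrase match is only a rough proxy, since a real index entry can point to a page that discusses a concept without using the exact term.

```python
def check_page_refs(index, pages):
    """Return (term, page) pairs whose term does not appear on the cited page.

    index: dict mapping an index term to a list of 1-based page numbers.
    pages: list of page texts, where pages[0] is page 1 of the typeset PDF.
    """
    wrong = []
    for term, refs in index.items():
        for page in refs:
            in_range = 1 <= page <= len(pages)
            if not in_range or term.lower() not in pages[page - 1].lower():
                wrong.append((term, page))
    return wrong


# Hypothetical three-page book and a model-generated index fragment.
pages = [
    "Pricing strategy starts with knowing your costs.",
    "Situational leadership adapts to the team.",
    "Pricing strategy also depends on positioning.",
]
index = {"pricing strategy": [1, 2], "situational leadership": [2, 5]}

wrong = check_page_refs(index, pages)
total = sum(len(refs) for refs in index.values())
error_rate = len(wrong) / total  # share of page references that fail the check
```

In this toy case two of the four references fail: one cites a page where the term never appears, and one cites a page that doesn't exist.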
Large portions of the text went missing. Because of context window limits, ChatGPT wasn't processing my full chapter content — it was working with whatever fit after the system prompt. Concepts introduced mid-chapter or near the end of a section were either absent from the index or represented by a single page reference when they actually appeared on four or five pages throughout the chapter.
The cross-references were inferred, not reasoned. A "see also: leadership, situational" cross-reference looks authoritative on screen. But when I traced it back to the manuscript, the connection was loose at best — the model had associated two terms that appeared near each other, not terms that a skilled indexer reading for meaning would link as genuinely related navigation points.
The subentries were mechanical. A professional indexer groups references to "leadership" under subentries that reflect how readers navigate the concept: leadership, in crisis situations; leadership, building vs. managing teams; leadership, transformational approaches. What I got was a list of page numbers with vague modifiers — the structural appearance of depth without the editorial judgment that makes subentries useful.
The Research Confirms It
This wasn't an isolated experience. A 2025 peer-reviewed study published by Liverpool University Press tested large language models against professionally created book indexes. The results were specific: ChatGPT failed to index 67 of 319 pages, leaving 21% of the book's pages completely unrepresented in the generated index. Claude performed better on raw page coverage, but both models produced indexes with significant structural deficiencies that would require substantial human correction before they could be used.
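To make the coverage figure concrete, here is a small sketch under one assumed reading of "failed to index": a page counts as missed when no index entry points to it. The helper `uncovered_pages` and the sample index are hypothetical and are not taken from the study.

```python
def uncovered_pages(total_pages, index):
    """Return sorted 1-based page numbers that no index entry points to."""
    referenced = {p for refs in index.values() for p in refs}
    return [p for p in range(1, total_pages + 1) if p not in referenced]


# Hypothetical six-page book with a sparse index.
sample_index = {"leadership": [1, 3], "pricing": [3, 4]}
missing = uncovered_pages(6, sample_index)

# The study's headline figure as plain arithmetic: 67 missed pages out of 319.
share = 67 / 319
print(f"{share:.0%}")
```

Running the same check on your own output is a quick way to see whether whole stretches of the book are missing before you start reviewing individual entries.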
The coverage problem isn't a bug that better prompting will fix. It's structural. A general-purpose LLM processing pasted text has no reliable mechanism to know that your chapter on pricing strategy starts on page 134 rather than page 129, because it never encountered the typeset document where that page break actually exists. The pagination is gone the moment you paste text into a chat window.
Professional indexers understand this completely. Their first and non-negotiable requirement is always the final, typeset PDF — not the manuscript, not the Word file, not a page-numbered draft that might shift. They know that even a few formatting changes can alter page breaks across an entire book, which is why they never start until every other production step is locked.
Why the Workaround Doesn't Work
Many authors discover the token limit problem and try to work around it by processing chapters individually: paste Chapter 1, get an index fragment, paste Chapter 2, combine them manually.
This approach compounds the errors rather than solving them.
Index entries that should appear across the full book get siloed by chapter. A recurring framework mentioned in Chapters 2, 5, 7, and 11 shows up as four separate partial entries unless you manually reconcile them — which requires holding the full book's conceptual structure in your head while doing detailed editorial work. Cross-references can only connect concepts within the chunk you're currently processing. Page numbers remain wrong because pasting text into a chat window doesn't restore the layout information that determines where page breaks fall in your formatted file.
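For a sense of what the manual reconciliation involves, here is a rough sketch of only the mechanical part: merging per-chapter fragments into one book-wide list. The function `merge_fragments` and the chapter data are hypothetical. Note what it cannot do: it won't recognize variant phrasings of the same concept, it won't create cross-references between chapters, and it won't correct page numbers that were wrong to begin with.

```python
from collections import defaultdict


def merge_fragments(fragments):
    """Combine per-chapter index fragments into one book-wide index.

    fragments: list of dicts mapping a term to page numbers. Terms are
    matched case-insensitively; page lists are de-duplicated and sorted.
    """
    merged = defaultdict(set)
    for fragment in fragments:
        for term, refs in fragment.items():
            merged[term.strip().lower()].update(refs)
    return {term: sorted(refs) for term, refs in sorted(merged.items())}


# Hypothetical fragments produced by three separate chat sessions.
chapter_2 = {"Flywheel model": [31, 34]}
chapter_5 = {"flywheel model": [88], "Pricing": [90]}
chapter_7 = {"Flywheel Model ": [131, 131]}

book_index = merge_fragments([chapter_2, chapter_5, chapter_7])
```

Everything this code skips is the editorial work, which is where the hours go.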
The editing time required to fix a chapter-by-chapter ChatGPT index consistently runs 6–10 hours for a 250-page book, according to authors who've documented the process. That is more total time than most DIY indexing approaches, and it produces less reliable output.
What Professional Indexers Actually Charge
Before looking at the alternative, it helps to understand why so many authors turn to AI in the first place.
Professional book indexers charge $2.50–$6.00 per indexable page, according to current industry rate guides. For a 250-page nonfiction manuscript with 210 indexable pages, that works out to roughly $525–$1,260 before any complexity adjustments. Academic texts, legal references, and technical manuals with dense terminology consistently land at the high end. A serious 300-page reference book can run $1,500–$2,400 once those adjustments are added.
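If you want to run the numbers for your own page count, the arithmetic is simple. This sketch applies the quoted per-page range directly, with no complexity surcharge; the function name `indexing_cost` is just an illustration.

```python
def indexing_cost(indexable_pages, low_rate=2.50, high_rate=6.00):
    """Return the (low, high) cost in dollars at the quoted per-page rates."""
    return indexable_pages * low_rate, indexable_pages * high_rate


low, high = indexing_cost(210)
print(f"${low:,.0f} to ${high:,.0f}")  # prints "$525 to $1,260"
```

Swap in your own indexable page count and any rates you have actually been quoted.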
The timeline compounds the budget concern. Indexers require your final typeset PDF and typically work at 8–10 pages per hour, so a 250-page book represents roughly 25–31 hours of professional work. Most quote a two-to-four week turnaround. If your print submission window is ten days away, no budget on earth closes that gap.
These constraints are real and they're not going away. They're exactly why authors are searching for "ChatGPT book index" in the first place.
What a Purpose-Built Tool Does Differently
The key technical difference between a general-purpose AI and a purpose-built indexing tool is where the work starts.
General-purpose AI tools accept text. Purpose-built indexing tools ingest your actual typeset PDF — the identical file your professional indexer would require. They extract text with page-location data from the document structure itself, not from probabilistic inference about where content might fall. That structural difference is what makes accurate page references technically possible in the first place.
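A toy illustration of why page-aware input matters, not a description of how any particular tool works internally: once the text arrives one page at a time, a page reference becomes a lookup instead of a guess. The function `locate_terms` and the sample pages are hypothetical. The sketch also ignores the gap between a PDF's physical page order and the printed page numbers, which front matter usually shifts.

```python
import re

# In practice the per-page strings might come from a PDF library, for example
# pypdf's PdfReader(path).pages with page.extract_text(). That extraction
# step is assumed here rather than shown.


def locate_terms(pages, terms):
    """Map each term to the 1-based pages where it appears as a whole phrase."""
    locations = {}
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        locations[term] = [
            number
            for number, text in enumerate(pages, start=1)
            if pattern.search(text)
        ]
    return locations


# Hypothetical page texts, one string per typeset page.
pages = [
    "Chapter 9. Pricing strategy begins with costs.",
    "Anchoring shapes how buyers read a price.",
    "A pricing strategy without anchoring leaves money on the table.",
]
found = locate_terms(pages, ["pricing strategy", "anchoring"])
```

Paste those same three pages into a chat window as one block and the page boundaries, and therefore the answers, are gone.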
From there, these tools take an approach that mirrors the professional workflow rather than replacing it with keyword extraction:
They identify indexable entities in context — people, organizations, concepts, frameworks, and key terms — and present them to you before generating the final index. You can review what the tool found, promote entries that deserve more weight, merge variant phrasings of the same concept, and remove terms that don't need to be indexed. You're exercising editorial judgment over the content, not just accepting AI output.
The tool then handles the structural work: grouping variant phrasings under canonical main entries, generating meaningful subentries (not just page lists), adding cross-references where genuine relationships exist, and producing CMOS-compliant output in two-column format ready for InDesign insertion.
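The grouping step can be sketched roughly as follows. This is my own simplified illustration, not any tool's implementation: the `canonical` mapping is assumed to come from the author's review pass, both function names are hypothetical, and the range formatting writes page numbers in full without applying Chicago's rules for abbreviating inclusive numbers.

```python
def collapse_ranges(page_numbers):
    """Turn [12, 13, 14, 20] into '12\u201314, 20' (en dash between range ends)."""
    pages = sorted(set(page_numbers))
    parts, start, prev = [], None, None
    for p in pages:
        if start is None:
            start = prev = p
        elif p == prev + 1:
            prev = p  # extend the current run of consecutive pages
        else:
            parts.append(str(start) if start == prev else f"{start}\u2013{prev}")
            start = prev = p
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}\u2013{prev}")
    return ", ".join(parts)


def group_variants(raw_entries, canonical):
    """Fold variant phrasings into canonical main entries with merged locators."""
    grouped = {}
    for term, refs in raw_entries.items():
        main = canonical.get(term.lower(), term.lower())
        grouped.setdefault(main, []).extend(refs)
    return {main: collapse_ranges(refs) for main, refs in sorted(grouped.items())}


# Hypothetical variants approved during the entity review step.
canonical = {
    "okrs": "objectives and key results",
    "okr framework": "objectives and key results",
}
raw = {
    "OKRs": [12, 13],
    "OKR framework": [14, 40],
    "objectives and key results": [41],
    "Pricing": [90],
}
index = group_variants(raw, canonical)
```

Here three phrasings collapse into one main entry with the locator "12–14, 40–41". Deciding which phrasings belong together is the editorial judgment; the code only applies it.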
The complete workflow (PDF upload, entity review, index generation, and a thorough human editing pass) runs under two hours for most nonfiction manuscripts. A professional indexer performing the same work needs 25 hours or more. The output quality is not identical to the best professional indexers working on complex academic texts. But for trade nonfiction, business books, self-help, and most other commercially published manuscripts, it consistently clears the editorial bar that matters: accepted by publishers, useful to readers, professionally formatted.
The Right Tool for the Right Job
The framing that resolves most of the confusion: general-purpose AI and purpose-built indexing tools aren't competing for the same task. They have different designs, different inputs, and different appropriate uses.
ChatGPT and Claude are exceptional for tasks where language generation is the primary challenge: drafts, rewrites, research synthesis, promotional copy, outlines, and summaries. They're not designed to process document structure, maintain page accuracy across hundreds of continuous pages, or apply the editorial logic that distinguishes a navigable index from a keyword list.
Purpose-built indexing tools are narrow by design. They do one thing: accept your typeset PDF and produce a professionally structured index. That specialization is what makes them technically capable of getting it right.
Here's how to think about which option fits your situation:
Hire a professional indexer if your publisher formally evaluates index quality (academic presses, legal publishers), your contract specifies professional indexer credentials, or you have timeline flexibility and the production budget to accommodate a two-to-four week turnaround. The quality ceiling for complex reference texts is higher than any current tool can reach.
Use a purpose-built AI tool if you're self-published or hybrid-published, your production timeline is tight, your budget needs to stay realistic, or you want to understand what your index will look like before making any financial commitment.
Skip general AI for indexing. These tools aren't bad; they're extraordinary. They're simply the wrong tool for this specific task. A hallucinated page number in a back-of-book index isn't a cosmetic issue. It actively misdirects a reader looking for something specific. And spending six or more hours fixing hallucinated output is worse than starting from scratch with a tool designed for the job.
See What Your Index Actually Looks Like
The fastest way to understand the quality difference is to try it with your own manuscript.
Onomastic processes your actual typeset PDF and returns a CMOS-compliant index with accurate page references, hierarchical entries, meaningful subentries, and cross-references built for reader navigation, rather than keyword extraction dressed up to look like an index.
Upload your final PDF. Review the extracted entities. See what the generated index looks like. You'll know within an hour whether the output clears your bar — and you'll have a concrete, real-output basis for comparing it against a professional quote, against your DIY estimate, or against whatever ChatGPT produced when you tried it first.
The experiment I ran with ChatGPT cost several frustrating hours and produced an index I couldn't use. With a purpose-built tool, getting it right took under two hours, and the result passed editorial review without revision requests. That comparison is the one that matters.




