Introduction
Learn what RAG PDF Highlighter does and how it fits into your pipeline.
Quickstart
Install the package and highlight your first PDF in under five minutes.
API Reference
Explore the POST /highlight endpoint, request schema, and response format.
Guides
Wire RAG PDF Highlighter into a LangChain or LlamaIndex retrieval pipeline.
How It Works
Install the package
Add RAG PDF Highlighter to your project with a single command, or pull the pre-built Docker image to run it as a standalone microservice with no additional dependencies.
Retrieve your chunks
Run your RAG pipeline as normal. Collect the Document objects your retriever returns. Each one carries the matched page_content and a metadata.page value that tells the highlighter which page to search.
POST to /highlight
Send a single request with the URL of the source PDF and your list of Document objects. The service fetches the PDF, locates every chunk using a 3-tier matching strategy (exact → sentence → collapsed-whitespace fallback), and applies highlight annotations in one pass.
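The request described above might be assembled roughly as follows. This is a minimal sketch, not the documented schema: the field names pdf_url and documents, the helper names build_highlight_payload and post_highlight, and the local Document stand-in are all assumptions made for illustration. Check the API Reference for the actual request format.

```python
import json
import urllib.request
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for a retriever Document: page_content plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_highlight_payload(pdf_url, documents):
    """Assemble a JSON-serializable request body.

    The keys "pdf_url" and "documents" are hypothetical, not the confirmed schema.
    """
    return {
        "pdf_url": pdf_url,
        "documents": [
            {
                "page_content": doc.page_content,
                # Only the page number is forwarded; other metadata is dropped.
                "metadata": {"page": doc.metadata["page"]},
            }
            for doc in documents
        ],
    }


def post_highlight(base_url, payload, timeout=30.0):
    """POST the payload as JSON to <base_url>/highlight and return the raw response bytes."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/highlight",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Assuming the service is reachable at a base URL of your choosing, the result of build_highlight_payload(pdf_url, docs) would be passed to post_highlight. The sketch returns the response body unparsed because the response format is not described in this section.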
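To make the exact → sentence → collapsed-whitespace order concrete, here is one way such a fallback could look on plain page text. It is an illustrative sketch only: the function names, the naive sentence splitter, and the reading of the "sentence" tier as matching each sentence of the chunk individually are assumptions, not the service's actual implementation.

```python
import re


def collapse_whitespace(text):
    """Replace every run of whitespace with a single space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def match_chunk(page_text, chunk):
    """Return (tier, matched_strings) for the first tier that succeeds, else (None, [])."""
    # Tier 1: the whole chunk appears verbatim on the page.
    if chunk and chunk in page_text:
        return "exact", [chunk]

    # Tier 2: keep only the individual sentences that appear verbatim.
    sentences = [s for s in split_sentences(chunk) if s in page_text]
    if sentences:
        return "sentence", sentences

    # Tier 3: compare after collapsing whitespace on both sides.
    collapsed_chunk = collapse_whitespace(chunk)
    if collapsed_chunk and collapsed_chunk in collapse_whitespace(page_text):
        return "collapsed-whitespace", [collapsed_chunk]

    return None, []
```

The tiers run from strictest to loosest, so a chunk that matches verbatim never falls through to a fuzzier comparison. A real highlighter would also have to map each matched string back to coordinates on the PDF page, which this sketch leaves out.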