RAG PDF Highlighter: Annotate PDFs for RAG Pipelines

RAG PDF Highlighter is a FastAPI microservice and an importable Python library. It takes the text chunks your retrieval pipeline surfaces, locates them in the source PDF, and returns a copy of the document with those passages highlighted, ready for your users to scan. Whether you call it over HTTP or import it as a library, you get the same highlighting; the service itself is stateless, Docker-ready, and requires no authentication.

Introduction

Learn what RAG PDF Highlighter does and how it fits into your pipeline.

Quickstart

Install the package and highlight your first PDF in under five minutes.

API Reference

Explore the POST /highlight endpoint, request schema, and response format.

Guides

Wire RAG PDF Highlighter into a LangChain or LlamaIndex retrieval pipeline.

How It Works

Install the package

Add RAG PDF Highlighter to your project with a single command:

pip install rag-pdf-highlighter

Or pull the pre-built Docker image to run it as a standalone microservice, with no additional dependencies to install.

Retrieve your chunks

Run your RAG pipeline as normal. Collect the Document objects your retriever returns — each one carries the matched page_content and a metadata.page value that tells the highlighter which page to search.
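As a rough illustration of this step, the sketch below turns retriever results into the request body used in the next step. The `Document` dataclass is only a stand-in with the same two attributes (`page_content`, `metadata`), and `to_highlight_payload` is a hypothetical helper, not part of the package. Skipping documents that lack `metadata["page"]` is an assumption made for this example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    """Stand-in for a retriever result (hypothetical; mirrors page_content/metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def to_highlight_payload(pdf_url: str, documents: list) -> dict:
    """Build a JSON-serializable body for POST /highlight (hypothetical helper).

    Assumption: documents without a "page" entry in their metadata are
    skipped, because the highlighter needs that value to know where to search.
    """
    items = []
    for doc in documents:
        if "page" not in doc.metadata:
            continue
        items.append({
            "page_content": doc.page_content,
            # Forward only the page number; other metadata keys are dropped.
            "metadata": {"page": doc.metadata["page"]},
        })
    return {"pdf_url": pdf_url, "documents": items}
```

Any object exposing `page_content` and `metadata` should work with this helper, so results from a LangChain retriever could be passed in directly.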

POST to /highlight

Send a single request with the URL of the source PDF and your list of Document objects. The service fetches the PDF, locates every chunk using a 3-tier matching strategy (exact → sentence → collapsed-whitespace fallback), and applies highlight annotations in one pass.

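import requests

# Send the PDF URL and the retrieved chunks in one request.
response = requests.post(
    "http://localhost:8000/highlight",
    json={
        "pdf_url": "https://example.com/paper.pdf",
        "documents": [
            {"page_content": "Key finding from the paper.", "metadata": {"page": 3}}
        ],
    },
    timeout=60,  # the service has to fetch and annotate the PDF, so allow some time
)
response.raise_for_status()  # fail on a 4xx/5xx instead of saving an error body

# The response body is the annotated PDF.
with open("annotated.pdf", "wb") as f:
    f.write(response.content)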
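To make the 3-tier matching strategy mentioned above more concrete, here is a minimal sketch of how an exact, then sentence, then collapsed-whitespace fallback could work on plain page text. It is not the service's actual implementation: `find_matches`, `split_sentences`, and `collapse_whitespace` are hypothetical names, the sentence splitter is deliberately naive, and the real service works on PDF pages rather than strings.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace with one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> list:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def find_matches(page_text: str, chunk: str) -> tuple:
    """Return (tier, matched_strings) for one chunk against one page's text."""
    if not chunk.strip():
        return "none", []

    # Tier 1: the whole chunk appears verbatim on the page.
    if chunk in page_text:
        return "exact", [chunk]

    # Tier 2: try each sentence of the chunk and keep those found verbatim.
    sentences = [s for s in split_sentences(chunk) if s in page_text]
    if sentences:
        return "sentence", sentences

    # Tier 3: compare after collapsing whitespace on both sides, which
    # tolerates line breaks and repeated spaces left by PDF text extraction.
    collapsed_chunk = collapse_whitespace(chunk)
    if collapsed_chunk in collapse_whitespace(page_text):
        return "collapsed", [collapsed_chunk]

    return "none", []
```

Trying the strictest tier first keeps highlights tight when the chunk matches cleanly, and the looser tiers only come into play when extraction has altered the text.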

Serve the annotated PDF

The endpoint returns an application/pdf binary: the original document with every matched chunk highlighted and ready to display. Show it in your document viewer, write it to an object store, or email it to your users.
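Before handing the bytes to a viewer or object store, it can be worth confirming that the response really is a PDF and not an error payload. The check below is a small sketch; `is_pdf_response` is a hypothetical helper, and it relies only on the `application/pdf` content type stated above plus the standard `%PDF-` file signature.

```python
from typing import Optional


def is_pdf_response(content_type: Optional[str], body: bytes) -> bool:
    """Return True if the header says application/pdf and the body starts like a PDF."""
    if not content_type:
        return False
    # Drop any parameters (e.g. "; charset=...") and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Every PDF file begins with the ASCII signature "%PDF-".
    return media_type == "application/pdf" and body.startswith(b"%PDF-")
```

With the `requests` example from the previous step, this could be called as `is_pdf_response(response.headers.get("Content-Type"), response.content)` before writing the file.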
