RAG PDF Highlighter is a stateless microservice that accepts a PDF URL and a list of text chunks, finds each chunk inside the document, and returns an annotated PDF with yellow highlights applied — all within a single HTTP request. Understanding the request lifecycle helps you debug unexpected behaviour and get the most out of the service.

Request lifecycle

1. PDF Download

When a request arrives, the service downloads the PDF from the URL you provide using httpx. The download is asynchronous, so it does not block other concurrent requests. The file is written to a temporary location on disk so that PyMuPDF can open it page by page.

2. Chunk Location

For each Document object in your request, the service reads metadata.page to identify the target page, then attempts to locate page_content on that page using three successive matching strategies — exact match, sentence-level match, and collapsed-whitespace match. The service tries each strategy in order and stops as soon as a match is found, returning a set of bounding boxes that cover the matched text.
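
To make the fallback order concrete, here is a minimal sketch that applies the three strategies to plain page text. It is an approximation under stated assumptions: the real service works with PyMuPDF and returns bounding boxes, whereas this version returns the matched text fragments. The sentence splitter and the rule "any sentence found counts as a match" are guesses at what sentence-level matching means, and `locate_chunk` is a hypothetical name.

```python
import re
from typing import Optional


def locate_chunk(page_text: str, chunk: str) -> Optional[tuple[str, list[str]]]:
    """Return (strategy, matched_fragments) for the first strategy that hits."""
    if not chunk.strip():
        return None

    # 1. Exact match: the whole chunk appears verbatim on the page.
    if chunk in page_text:
        return "exact", [chunk]

    # 2. Sentence-level match (assumed behaviour): split the chunk into
    #    sentences and keep every sentence that appears verbatim.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", chunk) if s.strip()]
    found = [s for s in sentences if s in page_text]
    if found:
        return "sentence", found

    # 3. Collapsed-whitespace match: compare after squeezing every run of
    #    whitespace (including line breaks) down to a single space.
    squeezed_chunk = " ".join(chunk.split())
    squeezed_page = " ".join(page_text.split())
    if squeezed_chunk in squeezed_page:
        return "collapsed-whitespace", [squeezed_chunk]

    return None
```

A chunk whose text was re-wrapped during extraction typically fails the first two strategies and is caught by the third, which is why the order runs from strictest to most lenient.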

3. Highlight Application

Once bounding boxes are found for a chunk, PyMuPDF draws a yellow highlight annotation over each rectangle on the correct page. Near-duplicate bounding boxes are removed automatically before annotations are written, preventing double-highlights on the same region.
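
One way to remove near-duplicate boxes is to compare corner coordinates within a small tolerance. The sketch below is an assumption about how such a filter could work, not the service's actual rule: the name `dedupe_rects`, the `(x0, y0, x1, y1)` tuple layout, and the default tolerance of one point are all illustrative.

```python
Rect = tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points


def dedupe_rects(rects: list[Rect], tolerance: float = 1.0) -> list[Rect]:
    """Keep the first of any group of rectangles whose corners nearly coincide."""
    kept: list[Rect] = []
    for rect in rects:
        # A rectangle is a near-duplicate if all four coordinates are within
        # `tolerance` of a rectangle that has already been kept.
        is_duplicate = any(
            all(abs(a - b) <= tolerance for a, b in zip(rect, other))
            for other in kept
        )
        if not is_duplicate:
            kept.append(rect)
    return kept
```

The comparison is quadratic in the number of boxes, which is acceptable for the handful of rectangles a single chunk usually produces.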

4. Response

After all chunks have been processed, the fully annotated PDF is serialised to bytes and returned to the caller as an application/pdf binary response. You can save this directly to disk or stream it to a browser.
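
On the client side, it can be worth checking the response before writing it to disk. The helper below is a hypothetical sketch that takes the `Content-Type` header value and the body bytes from whichever HTTP client you use. The `%PDF-` prefix check is a basic sanity test, not full validation.

```python
from pathlib import Path


def save_pdf_response(content_type: str, body: bytes, destination: Path) -> Path:
    """Validate an application/pdf response and write it to `destination`."""
    # Ignore any parameters after the media type, e.g. "; charset=...".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/pdf":
        raise ValueError(f"expected application/pdf, got {content_type!r}")
    # PDF files normally begin with a "%PDF-" header line.
    if not body.startswith(b"%PDF-"):
        raise ValueError("response body does not start with a PDF header")
    destination.write_bytes(body)
    return destination
```

Rejecting a non-PDF body early keeps an error payload from being saved under a `.pdf` name and surfacing later as a corrupt-file message.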

5. Cleanup

Temporary files created during the download step are deleted automatically once the response has been sent. No data is retained on the server between requests.
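
A common way to guarantee this kind of cleanup is a context manager whose `finally` clause removes the file even when processing fails. The sketch below shows the pattern. `temporary_pdf` is a hypothetical helper, and the real service may tie deletion to the response lifecycle differently.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temporary_pdf(data: bytes) -> Iterator[str]:
    """Write `data` to a temporary .pdf file and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        # Runs on normal exit and on exceptions, so nothing is left behind.
        if os.path.exists(path):
            os.remove(path)
```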

Stateless design

Every request to RAG PDF Highlighter is fully self-contained. The service holds no session state, no cached PDFs, and no stored annotations between calls. This means you can scale horizontally behind a load balancer without sticky sessions, and each request produces the same result given the same inputs, provided the PDF served at the URL has not changed. Because there is no shared state, retrying a failed request is always safe: a retry starts from a fresh download, so it can never build on a partially annotated document from a previous attempt.
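
Because retries are safe, a client can wrap the call in a simple retry loop. The following is a minimal sketch with exponential backoff. `retry` and its parameters are illustrative, and `call` stands for whatever function sends your request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Invoke `call` until it succeeds or `attempts` tries have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # out of tries: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

In practice you would probably narrow the `except` clause to network and 5xx errors, since retrying a request that the service rejected as invalid will fail the same way each time.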
Chunks whose metadata.page value does not match any page in the PDF, or whose text cannot be found by any of the three matching strategies, are silently skipped. The service still returns a valid annotated PDF — it simply contains no highlight for that chunk. No error or warning is raised for unmatched chunks.
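
Since unmatched chunks produce no error, one option is to check page numbers on the client before sending the request. The sketch below assumes each chunk is a dict with a `metadata.page` integer and that you already know the document's page count. Whether the service counts pages from 0 or 1 is not stated here, so `first_page` is an explicit parameter you should set to match your data. `split_by_page_range` is a hypothetical helper.

```python
def split_by_page_range(
    chunks: list[dict], page_count: int, first_page: int = 0
) -> tuple[list[dict], list[dict]]:
    """Separate chunks whose metadata.page is valid from those that are not."""
    in_range: list[dict] = []
    out_of_range: list[dict] = []
    last_page = first_page + page_count - 1
    for chunk in chunks:
        page = chunk.get("metadata", {}).get("page")
        if isinstance(page, int) and first_page <= page <= last_page:
            in_range.append(chunk)
        else:
            # Missing, non-integer, or out-of-range page: the service would
            # skip this chunk without reporting it.
            out_of_range.append(chunk)
    return in_range, out_of_range
```

Logging the second list gives you the visibility the service itself does not provide for page mismatches. Text that is on the right page but cannot be matched still has to be detected by inspecting the returned PDF.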