Docker is the fastest way to get RAG PDF Highlighter running in any environment. The included Dockerfile packages the FastAPI service and all native dependencies so you can go from zero to a working highlight endpoint in a few commands.

Prerequisites

  • Docker installed and running on your machine or server.

Deploy the Service

1. Clone or download the project

Grab the source code from the repository and change into the project directory.
git clone https://github.com/MuhammadSalmanAhmad/rag-pdf-highlighter.git
cd rag-pdf-highlighter
2. Build the Docker image

Build the image and tag it rag-pdf-highlighter.
docker build -t rag-pdf-highlighter .
The build step installs Python dependencies and the required native PDF libraries inside the image. This only needs to run once (or whenever you update the project).
3. Run the container

Start the container and map port 8000 on your host to port 8000 inside the container.
docker run -p 8000:8000 rag-pdf-highlighter
The service starts automatically. You should see Uvicorn log output confirming it is listening.
4. Verify the deployment

Send a health-check request to confirm the service is up.
curl http://localhost:8000/
A successful response looks like this:
{"status": "ok the app is running"}
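Immediately after starting the container, the service may need a moment before it accepts connections, so a single curl can fail even though nothing is wrong. The sketch below retries the same health-check request a few times before giving up. The function name wait_for_service, the default of 30 attempts, and the one-second delay are assumptions made for illustration, not part of the project.

```shell
# Hypothetical helper: poll the health endpoint until it answers.
# Usage: wait_for_service [url] [attempts] [delay_seconds]
wait_for_service() {
  url="${1:-http://localhost:8000/}"
  attempts="${2:-30}"   # assumed default, adjust as needed
  delay="${3:-1}"       # assumed default, in seconds
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; -sS keeps it quiet
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "service is up at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no response from $url after $attempts attempts" >&2
  return 1
}
```

Calling wait_for_service with no arguments checks http://localhost:8000/ and returns a non-zero exit status if the service never responds, which makes it usable as a gate in a deployment script.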

Use a Custom Port

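By default the service listens on port 8000. To change it, set the PORT environment variable with -e and point the container side of the -p mapping at the same port. The host side of -p can be any free port, but keeping both sides equal, as below, is the simplest setup.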
docker run -e PORT=9000 -p 9000:9000 rag-pdf-highlighter
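Because the port appears three times in that command, it is easy to change one occurrence and miss another. As a minimal sketch, the hypothetical helper below (run_command_for_port is not part of the project) prints the full command for a given port so all three values stay in sync.

```shell
# Hypothetical helper: print a docker run command whose PORT variable
# and -p mapping all use the same port number.
run_command_for_port() {
  port="${1:-8000}"
  case "$port" in
    *[!0-9]*)
      echo "port must be a number: $port" >&2
      return 1
      ;;
  esac
  echo "docker run -e PORT=$port -p $port:$port rag-pdf-highlighter"
}
```

Running run_command_for_port 9000 prints the same command shown above, and a non-numeric argument is rejected with an error instead of producing a broken mapping.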
The image is built on python:3.12-slim and ships with libmupdf-dev preinstalled, which provides the native PDF rendering behind the highlighting engine. You do not need to install any additional system packages.

Production Tips

  • Scale to zero freely. RAG PDF Highlighter is fully stateless — it holds no in-memory session data between requests. You can safely stop, restart, or scale the container without losing any state.
  • Serve over HTTPS. Put the container behind a reverse proxy such as Nginx or a cloud load balancer that terminates TLS. Avoid exposing port 8000 directly to the public internet.
  • Resource limits. PDF processing is CPU-bound. Set appropriate --cpus and --memory limits in production to prevent a single large PDF from starving other workloads.
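The tips above can be combined into one start command. This is a sketch under stated assumptions: the limit values (1.0 CPU, 512m of memory), the container name, and the restart policy are illustrative choices, not recommendations from the project. Publishing on 127.0.0.1 keeps port 8000 reachable only from the host itself, which suits a reverse proxy running on the same machine.

```shell
# Hypothetical wrapper: start the container detached, with CPU and memory
# caps, published on loopback only so a local reverse proxy fronts it.
# Usage: start_limited [cpus] [memory]
start_limited() {
  cpus="${1:-1.0}"      # assumed default, tune for your workload
  memory="${2:-512m}"   # assumed default, tune for your PDFs
  docker run -d \
    --name rag-pdf-highlighter \
    --cpus "$cpus" \
    --memory "$memory" \
    --restart unless-stopped \
    -p 127.0.0.1:8000:8000 \
    rag-pdf-highlighter
}
```

For example, start_limited 2 1g starts the service with two CPUs and one gigabyte of memory. If the proxy runs on another host, the loopback binding needs to change to an address that host can reach.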