docling-serve/docs/deployment.md
Rui Dias Gomes 525a43ff6f docs: update deployment examples (#135)
Signed-off-by: rmdg88 <rmdg88@gmail.com>
Signed-off-by: Rui Dias Gomes <66125272+rmdg88@users.noreply.github.com>
2025-04-17 14:29:34 +02:00

Deployment Examples

This document provides deployment examples for running the application in different environments.

Choose the deployment option that best fits your setup.

  • Local GPU: For deploying the application locally on a machine with an NVIDIA GPU (using Docker Compose).
  • OpenShift: For deploying the application on an OpenShift cluster, designed for cloud-native environments.

Local GPU

Docker Compose

Manifest example: compose-gpu.yaml

This deployment has the following features:

  • NVIDIA CUDA enabled

Install the app with:

docker compose -f docs/deploy-examples/compose-gpu.yaml up -d

To use the API:

# Make a test query
curl -X 'POST' \
  "localhost:5001/v1alpha/convert/source/async" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
  }'
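
Because the endpoint above is asynchronous, its response identifies a queued task rather than carrying the converted document. The following is a minimal shell sketch of waiting for such a task. The `task_id` and `task_status` field names, the `/v1alpha/status/poll/<task_id>` path, and the `success`/`failure` status values are assumptions that do not come from this document, so verify them against the API documentation of your docling-serve version. `json_field` and `wait_for_task` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers. Field names and the status path are ASSUMPTIONS.

# Print the value of a top-level string field from a one-line JSON object.
# Deliberately naive sed-based extraction; prefer jq where it is available.
json_field() {
  printf '%s\n' "$1" |
    sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Poll until the task reaches a terminal state or the attempts run out.
# $1 = base URL, $2 = task id, $3 = maximum attempts (default 60)
# Returns 0 on success, 1 on failure, 2 on a curl error, 3 on timeout.
wait_for_task() {
  base_url=$1
  task_id=$2
  attempts=${3:-60}
  while [ "$attempts" -gt 0 ]; do
    body=$(curl -sS -H "accept: application/json" \
      "${base_url}/v1alpha/status/poll/${task_id}") || return 2
    status=$(json_field "$body" task_status)
    case "$status" in
      success) return 0 ;;
      failure) return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep 2
  done
  return 3
}

# Usage, assuming RESPONSE holds the JSON body returned by the test query:
#   wait_for_task "localhost:5001" "$(json_field "$RESPONSE" task_id)"
```

The helpers only define functions, so nothing contacts the server until `wait_for_task` is called.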

Requirements

  • Debian/Ubuntu/RHEL/Fedora/openSUSE
  • Docker
  • NVIDIA drivers >= 550.54.14
  • nvidia-container-toolkit
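
The driver requirement above can be checked mechanically. The sketch below compares the installed driver with the stated minimum using version ordering, so that, for example, 550.100.01 counts as newer than 550.54.14. It assumes GNU `sort -V` is available, and `version_ge` and `check_driver` are hypothetical helper names.

```shell
#!/bin/sh
# Minimum driver version, taken from the requirements list above.
MIN_DRIVER="550.54.14"

# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed driver with MIN_DRIVER and report the result.
# Returns 0 if the driver is new enough, 1 if too old, 2 if nvidia-smi is missing.
check_driver() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: is the NVIDIA driver installed?" >&2
    return 2
  fi
  # With several GPUs the driver version is the same for all; keep the first line.
  installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  if version_ge "$installed" "$MIN_DRIVER"; then
    echo "OK: driver $installed >= $MIN_DRIVER"
  else
    echo "Driver $installed is older than $MIN_DRIVER" >&2
    return 1
  fi
}

# Usage: check_driver
```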

Docs:

Steps
  1. Check the driver version and decide which GPU to use (index 0, 1, 2, 3, ...). Then update the compose-gpu.yaml file accordingly, or use count: all.

    nvidia-smi
    
  2. Check that the NVIDIA Container Toolkit is installed and up to date

    # Debian/Ubuntu
    dpkg -l | grep nvidia-container-toolkit
    
    # RHEL/Fedora/openSUSE
    rpm -q nvidia-container-toolkit
    

    NVIDIA Container Toolkit install steps can be found here:

    https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

  3. Check which runtime is being used by Docker

    # docker
    docker info | grep -i runtime
    
  4. (Optional) If the default Docker runtime changes back from 'nvidia' to 'default' after the Docker service restarts, make the setting persistent:

    Back up the daemon.json file:

    sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
    

    Update the daemon.json file (the command below overwrites it, so merge in any existing settings you need to keep):

    echo '{
      "runtimes": {
        "nvidia": {
          "path": "nvidia-container-runtime"
        }
      },
      "default-runtime": "nvidia"
    }' | sudo tee /etc/docker/daemon.json > /dev/null
    

    Restart the Docker service:

    sudo systemctl restart docker
    

    Confirm 'nvidia' is the default runtime used by Docker by repeating step 3.

  5. Run the container:

    docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
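
To confirm the outcome that steps 3 and 4 aim for, the check can be scripted instead of read by eye. The sketch below parses the `Default Runtime:` line printed by `docker info`. `default_runtime_of` and `assert_nvidia_default` are hypothetical helper names, and the exact layout of `docker info` output is assumed to match current Docker releases.

```shell
#!/bin/sh
# Read `docker info` text on stdin and print the value of "Default Runtime:".
default_runtime_of() {
  sed -n 's/^[[:space:]]*Default Runtime:[[:space:]]*//p' | head -n 1
}

# Succeed only when Docker reports 'nvidia' as its default runtime.
assert_nvidia_default() {
  runtime=$(docker info 2>/dev/null | default_runtime_of)
  if [ "$runtime" = "nvidia" ]; then
    echo "OK: default runtime is nvidia"
  else
    echo "Default runtime is '${runtime:-unknown}', expected 'nvidia' (see step 4)" >&2
    return 1
  fi
}

# Usage: assert_nvidia_default && docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
```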
    

OpenShift

Simple deployment

Manifest example: docling-serve-simple.yaml

This deployment example has the following features:

  • Deployment configuration
  • Service configuration
  • NVIDIA CUDA enabled

Install the app with:

oc apply -f docs/deploy-examples/docling-serve-simple.yaml

To use the API:

# Port-forward the service
oc port-forward svc/docling-serve 5001:5001

# Make a test query
curl -X 'POST' \
  "localhost:5001/v1alpha/convert/source/async" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
  }'

Secure deployment with oauth-proxy

Manifest example: docling-serve-oauth.yaml

This deployment has the following features:

  • TLS encryption between all components (using the cluster-internal CA).
  • Authentication via a secure oauth-proxy sidecar.
  • Service exposed through a secure OpenShift Route.

Install the app with:

oc apply -f docs/deploy-examples/docling-serve-oauth.yaml

To use the API:

# Retrieve the endpoint
DOCLING_NAME=docling-serve
DOCLING_ROUTE="https://$(oc get routes ${DOCLING_NAME} --template={{.spec.host}})"

# Retrieve the authentication token
OCP_AUTH_TOKEN=$(oc whoami --show-token)

# Make a test query
curl -X 'POST' \
  "${DOCLING_ROUTE}/v1alpha/convert/source/async" \
  -H "Authorization: Bearer ${OCP_AUTH_TOKEN}" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
  }'
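
If the route lookup fails, `DOCLING_ROUTE` is left as a bare `https://`, and if the session has expired, `OCP_AUTH_TOKEN` is empty. In both cases the query fails with an error that does not point at the cause. A small guard, sketched below with the hypothetical helper names `is_valid_route` and `check_request_inputs`, can catch both conditions before the request is sent.

```shell
#!/bin/sh
# Succeed only for an https URL with a non-empty host part.
is_valid_route() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed only when both the route and the token look usable.
# $1 = route URL, $2 = bearer token
check_request_inputs() {
  if ! is_valid_route "$1"; then
    echo "Route lookup failed: got '$1'" >&2
    return 1
  fi
  if [ -z "$2" ]; then
    echo "No token available: log in with 'oc login' first" >&2
    return 1
  fi
}

# Usage: check_request_inputs "$DOCLING_ROUTE" "$OCP_AUTH_TOKEN" || exit 1
```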