chore: bump version to 0.4.0 [skip ci]

feat: New container images (#68)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-11-29 16:43:24 +00:00 · 2025-02-26 13:05:03 +00:00 · 2025-02-26 12:49:20 +01:00 · 2025-02-25 14:49:50 -08:00 · 2025-02-25 11:27:42 +01:00 · 2025-02-24 08:21:17 +01:00
31 changed files with 5172 additions and 5409 deletions
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,12 @@
+<!-- Thank you for contributing to Docling! -->
+
+<!-- STEPS TO FOLLOW:
+  1. Add a description of the changes (frequently the same as the commit description)
+  2. Enter the issue number next to "Resolves #" below (if there is no tracking issue resolved, **remove that section**)
+  3. Make sure the PR title follows the **Commit Message Formatting**: https://www.conventionalcommits.org/en/v1.0.0/#summary.
+-->
+
+<!-- Uncomment this section with the issue number if an issue is being resolved
+**Issue resolved by this Pull Request:**
+Resolves #
+--->
--- a/.github/SECURITY.md
+++ b/.github/SECURITY.md
@@ -0,0 +1,23 @@
+# Security and Disclosure Information Policy for the Docling Project
+
+The Docling team and community take security bugs seriously. We appreciate your efforts to responsibly disclose your findings, and will make every effort to acknowledge your contributions.
+
+## Reporting a Vulnerability
+
+If you think you've identified a security issue in a Docling project repository, please DO NOT report the issue publicly via the GitHub issue tracker, etc.
+
+Instead, send an email with as many details as possible to [deepsearch-core@zurich.ibm.com](mailto:deepsearch-core@zurich.ibm.com). This is a private mailing list for the maintainers team.
+
+Please do not create a public issue.
+
+## Security Vulnerability Response
+
+Each report is acknowledged and analyzed by the core maintainers within 3 working days.
+
+Any vulnerability information shared with core maintainers stays within the Docling project and will not be disseminated to other projects unless it is necessary to get the issue fixed.
+
+After the initial reply to your report, the security team will keep you informed of the progress towards a fix and full announcement, and may ask for additional information or guidance.
+
+## Security Alerts
+
+We will post announcements of security vulnerabilities and the steps to remediate them in [Docling announcements](https://github.com/DS4SD/docling/discussions/categories/announcements).
--- a/.github/actions/setup-poetry/action.yml
+++ b/.github/actions/setup-poetry/action.yml
@@ -1,19 +0,0 @@
-name: 'Set up Poetry and install'
-description: 'Set up a specific version of Poetry and install dependencies using caching.'
-inputs:
-  python-version:
-    description: "Version range or exact version of Python or PyPy to use, using SemVer's version range syntax."
-    default: '3.11'
-runs:
-  using: 'composite'
-  steps:
-    - name: Install poetry
-      run: pipx install poetry==1.8.3
-      shell: bash
-    - uses: actions/setup-python@v4
-      with:
-        python-version: ${{ inputs.python-version }}
-        cache: 'poetry'
-    - name: Install dependencies
-      run: poetry install --all-extras
-      shell: bash
--- a/.github/mergify.yml
+++ b/.github/mergify.yml
@@ -0,0 +1,9 @@
+merge_protections:
+  - name: Enforce conventional commit
+    description: Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
+    if:
+      - base = main
+    success_conditions:
+      - "title ~=
+        ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\\(.+\
+        \\))?(!)?:"
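Note on the rule above: it accepts a PR title only when it starts with a Conventional Commits type, an optional scope in parentheses, an optional `!`, and a colon. A minimal bash sketch of the same check follows, for trying titles locally. It assumes bash's `[[ =~ ]]` operator, whose ERE dialect has no `(?:...)`, so a plain group stands in for the non-capturing one. `title_ok` is a hypothetical helper name and is not part of this repository or of Mergify.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when a PR title carries a conventional-commit prefix.
title_ok() {
    # Same alternatives as the Mergify rule; (\(.+\))? is the optional scope.
    local re='^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(\(.+\))?(!)?:'
    [[ "$1" =~ $re ]]
}

title_ok "feat: New container images (#68)" && echo "accepted"
title_ok "New container images" || echo "rejected"
```

The two commit titles at the top of this page both pass this check, which is what the merge protection is meant to enforce on `main`.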
--- a/.github/scripts/release.sh
+++ b/.github/scripts/release.sh
@@ -0,0 +1,39 @@
+#!/bin/bash
+
+set -e  # trigger failure on error - do not remove!
+set -x  # display command on output
+
+if [ -z "${TARGET_VERSION}" ]; then
+    >&2 echo "No TARGET_VERSION specified"
+    exit 1
+fi
+CHGLOG_FILE="${CHGLOG_FILE:-CHANGELOG.md}"
+
+# update package version
+uvx --from=toml-cli toml set --toml-path=pyproject.toml project.version "${TARGET_VERSION}"
+
+# collect release notes
+REL_NOTES=$(mktemp)
+uv run --no-sync semantic-release changelog --unreleased >> "${REL_NOTES}"
+
+# update changelog
+TMP_CHGLOG=$(mktemp)
+TARGET_TAG_NAME="v${TARGET_VERSION}"
+RELEASE_URL="$(gh repo view --json url -q ".url")/releases/tag/${TARGET_TAG_NAME}"
+printf "## [${TARGET_TAG_NAME}](${RELEASE_URL}) - $(date -Idate)\n\n" >> "${TMP_CHGLOG}"
+cat "${REL_NOTES}" >> "${TMP_CHGLOG}"
+if [ -f "${CHGLOG_FILE}" ]; then
+    printf "\n" | cat - "${CHGLOG_FILE}" >> "${TMP_CHGLOG}"
+fi
+mv "${TMP_CHGLOG}" "${CHGLOG_FILE}"
+
+# push changes
+git config --global user.name 'github-actions[bot]'
+git config --global user.email 'github-actions[bot]@users.noreply.github.com'
+git add pyproject.toml "${CHGLOG_FILE}"
+COMMIT_MSG="chore: bump version to ${TARGET_VERSION} [skip ci]"
+git commit -m "${COMMIT_MSG}"
+git push origin main
+
+# create GitHub release (incl. Git tag)
+gh release create "${TARGET_TAG_NAME}" -F "${REL_NOTES}"
--- a/.github/workflows/cd.yml
+++ b/.github/workflows/cd.yml
@@ -0,0 +1,59 @@
+name: "Run CD"
+
+on:
+  workflow_dispatch:
+
+jobs:
+  code-checks:
+    uses: ./.github/workflows/job-checks.yml
+  pre-release-check:
+    runs-on: ubuntu-latest
+    outputs:
+      TARGET_TAG_V: ${{ steps.version_check.outputs.TRGT_VERSION }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0  # for fetching tags, required for semantic-release
+      - name: Install uv and set the python version
+        uses: astral-sh/setup-uv@v5
+        with:
+          enable-cache: true
+      - name: Install dependencies
+        run: uv sync --only-dev
+      - name: Check version of potential release
+        id: version_check
+        run: |
+          TRGT_VERSION=$(uv run --no-sync semantic-release print-version)
+          echo "TRGT_VERSION=${TRGT_VERSION}" >> "$GITHUB_OUTPUT"
+          echo "${TRGT_VERSION}"
+      - name: Check notes of potential release
+        run: uv run --no-sync semantic-release changelog --unreleased
+  release:
+    needs: [code-checks, pre-release-check]
+    if: needs.pre-release-check.outputs.TARGET_TAG_V != ''
+    environment: auto-release
+    runs-on: ubuntu-latest
+    concurrency: release
+    steps:
+      - uses: actions/create-github-app-token@v1
+        id: app-token
+        with:
+          app-id: ${{ vars.CI_APP_ID }}
+          private-key: ${{ secrets.CI_PRIVATE_KEY }}
+      - uses: actions/checkout@v4
+        with:
+          token: ${{ steps.app-token.outputs.token }}
+          fetch-depth: 0  # for fetching tags, required for semantic-release
+      - name: Install uv and set the python version
+        uses: astral-sh/setup-uv@v5
+        with:
+          enable-cache: true
+      - name: Install dependencies
+        run: uv sync --only-dev
+      - name: Run release script
+        env:
+          GH_TOKEN: ${{ steps.app-token.outputs.token }}
+          TARGET_VERSION: ${{ needs.pre-release-check.outputs.TARGET_TAG_V }}
+          CHGLOG_FILE: CHANGELOG.md
+        run: ./.github/scripts/release.sh
+        shell: bash
--- a/.github/workflows/ci-images-dryrun.yml
+++ b/.github/workflows/ci-images-dryrun.yml
@@ -0,0 +1,41 @@
+name: Dry run docling-serve image building
+
+on:
+  workflow_call:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  build_image:
+    name: Build ${{ matrix.spec.name }} container image
+    strategy:
+      matrix:
+        spec:
+          - name: ds4sd/docling-serve
+            build_args: |
+              UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu
+            platforms: linux/amd64, linux/arm64
+          - name: ds4sd/docling-serve-cpu
+            build_args: |
+              UV_SYNC_EXTRA_ARGS=--no-extra cu124
+            platforms: linux/amd64, linux/arm64
+          - name: ds4sd/docling-serve-cu124
+            build_args: |
+              UV_SYNC_EXTRA_ARGS=--no-extra cpu
+            platforms: linux/amd64
+
+    permissions:
+      packages: write
+      contents: read
+      attestations: write
+      id-token: write
+
+    uses: ./.github/workflows/job-image.yml
+    with:
+      publish: false
+      build_args: ${{ matrix.spec.build_args }}
+      ghcr_image_name: ${{ matrix.spec.name }}
+      quay_image_name: ""
+      platforms: ${{ matrix.spec.platforms }}
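Note on the matrix above: each image variant is selected by the extras it excludes, which can read backwards at first. The `-cpu` image drops the `cu124` extra, the `-cu124` image drops the `cpu` extra, and the default image drops both. The sketch below restates that mapping in bash and prints (does not run) an equivalent local build command, modelled on the `docker build` line in the Makefile target of this change. `extra_args_for` and the `dev` tag are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: map an image name to the UV_SYNC_EXTRA_ARGS value from the matrix.
extra_args_for() {
    case "$1" in
        ds4sd/docling-serve)       echo "--no-extra cu124 --no-extra cpu" ;;
        ds4sd/docling-serve-cpu)   echo "--no-extra cu124" ;;
        ds4sd/docling-serve-cu124) echo "--no-extra cpu" ;;
        *) echo "unknown image: $1" >&2; return 1 ;;
    esac
}

# Print the local build command for each variant; the "dev" tag is an assumption.
for image in ds4sd/docling-serve ds4sd/docling-serve-cpu ds4sd/docling-serve-cu124; do
    echo "docker build --build-arg \"UV_SYNC_EXTRA_ARGS=$(extra_args_for "$image")\" -f Containerfile -t ghcr.io/${image}:dev ."
done
```

Printing instead of executing keeps the sketch usable on machines without Docker. The workflow itself additionally restricts the `cu124` variant to `linux/amd64`.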
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -0,0 +1,25 @@
+name: "Run CI"
+
+on:
+  push:
+    branches: ["main"]
+  pull_request:
+    branches: ["main"]
+
+jobs:
+  code-checks:
+    # if: ${{ github.event_name == 'push' || (github.event.pull_request.head.repo.full_name != 'DS4SD/docling-serve' && github.event.pull_request.head.repo.full_name != 'ds4sd/docling-serve') }}
+    uses: ./.github/workflows/job-checks.yml
+    permissions:
+      packages: write
+      contents: read
+      attestations: write
+      id-token: write
+
+  build-images:
+    uses: ./.github/workflows/ci-images-dryrun.yml
+    permissions:
+      packages: write
+      contents: read
+      attestations: write
+      id-token: write
--- a/.github/workflows/images-dryrun.yml
+++ b/.github/workflows/images-dryrun.yml
@@ -1,105 +0,0 @@
-name: Dry run docling-serve image building
-
-on:
-  pull_request:
-    branches: ["main"]
-
-env:
-  GHCR_REGISTRY: ghcr.io
-  GHCR_DOCLING_SERVE_CPU_IMAGE_NAME: ds4sd/docling-serve-cpu
-  GHCR_DOCLING_SERVE_GPU_IMAGE_NAME: ds4sd/docling-serve
-
-jobs:
-  build_cpu_image:
-    name: Build docling-serve "CPU only" container image
-    runs-on: ubuntu-latest
-    permissions:
-      packages: write
-      contents: read
-      attestations: write
-      id-token: write
-
-    steps:
-      - name: Check out the repo
-        uses: actions/checkout@v4
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-
-      - name: Cache Docker layers
-        uses: actions/cache@v4
-        with:
-          path: /tmp/.buildx-cache
-          key: ${{ runner.os }}-buildx-${{ github.sha }}
-          restore-keys: |
-            ${{ runner.os }}-buildx-
-
-      - name: Extract metadata (tags, labels) for docling-serve (CPU only) ghcr image
-        id: ghcr_serve_cpu_meta
-        uses: docker/metadata-action@v5
-        with:
-          images: ${{ env.GHCR_REGISTRY }}/${{ env.GHCR_DOCLING_SERVE_CPU_IMAGE_NAME }}
-
-      - name: Build docling-serve-cpu image
-        id: build-serve-cpu-ghcr
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          push: false
-          tags: ${{ steps.ghcr_serve_cpu_meta.outputs.tags }}
-          labels: ${{ steps.ghcr_serve_cpu_meta.outputs.labels }}
-          platforms: linux/amd64, linux/arm64
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-          file: Containerfile
-          build-args: |
-            --build-arg CPU_ONLY=true
-
-      - name: Remove Local Docker Images
-        run: |
-          docker image prune -af
-
-  build_gpu_image:
-    name: Build docling-serve (with GPU support) container image
-    runs-on: ubuntu-latest
-    permissions:
-      packages: write
-      contents: read
-      attestations: write
-      id-token: write
-
-    steps:
-      - name: Check out the repo
-        uses: actions/checkout@v4
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-
-      - name: Cache Docker layers
-        uses: actions/cache@v4
-        with:
-          path: /tmp/.buildx-cache
-          key: ${{ runner.os }}-buildx-${{ github.sha }}
-          restore-keys: |
-            ${{ runner.os }}-buildx-
-
-      - name: Extract metadata (tags, labels) for docling-serve (GPU) ghcr image
-        id: ghcr_serve_gpu_meta
-        uses: docker/metadata-action@v5
-        with:
-          images: ${{ env.GHCR_REGISTRY }}/${{ env.GHCR_DOCLING_SERVE_GPU_IMAGE_NAME }}
-
-      - name: Build docling-serve (GPU) image
-        id: build-serve-gpu-ghcr
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          push: false
-          tags: ${{ steps.ghcr_serve_gpu_meta.outputs.tags }}
-          labels: ${{ steps.ghcr_serve_gpu_meta.outputs.labels }}
-          platforms: linux/amd64,linux/arm64
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-          file: Containerfile
-          build-args: |
-            --build-arg CPU_ONLY=false
--- a/.github/workflows/images.yml
+++ b/.github/workflows/images.yml
@@ -4,193 +4,44 @@ on:
  push:
    branches:
      - main
-    tags:
-      - 'v*'
+  release:
+    types: [published]

-env:
-  GHCR_REGISTRY: ghcr.io
-  GHCR_DOCLING_SERVE_CPU_IMAGE_NAME: ds4sd/docling-serve-cpu
-  GHCR_DOCLING_SERVE_GPU_IMAGE_NAME: ds4sd/docling-serve
-  QUAY_REGISTRY: quay.io
-  QUAY_DOCLING_SERVE_CPU_IMAGE_NAME: ds4sd/docling-serve-cpu
-  QUAY_DOCLING_SERVE_GPU_IMAGE_NAME: ds4sd/docling-serve
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true

 jobs:
-  build_and_publish_cpu_images:
-    name: Push docling-serve "CPU only" container image to GHCR and QUAY
-    runs-on: ubuntu-latest
-    environment: registry-creds
+  build_and_publish_images:
+    name: Build and push ${{ matrix.spec.name }} container image to GHCR and QUAY
+    strategy:
+      matrix:
+        spec:
+          - name: ds4sd/docling-serve
+            build_args: |
+              UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu
+            platforms: linux/amd64, linux/arm64
+          - name: ds4sd/docling-serve-cpu
+            build_args: |
+              UV_SYNC_EXTRA_ARGS=--no-extra cu124
+            platforms: linux/amd64, linux/arm64
+          - name: ds4sd/docling-serve-cu124
+            build_args: |
+              UV_SYNC_EXTRA_ARGS=--no-extra cpu
+            platforms: linux/amd64
+
    permissions:
      packages: write
      contents: read
      attestations: write
      id-token: write
+    secrets: inherit

-    steps:
-      - name: Check out the repo
-        uses: actions/checkout@v4
-
-      - name: Log in to the GHCR container image registry
-        uses: docker/login-action@v3
-        with:
-          registry: ${{ env.GHCR_REGISTRY }}
-          username: ${{ github.actor }}
-          password: ${{ secrets.GITHUB_TOKEN }}
-
-      - name: Log in to the Quay container image registry
-        uses: docker/login-action@v3
-        with:
-          registry: ${{ env.QUAY_REGISTRY }}
-          username: ${{ secrets.QUAY_USERNAME }}
-          password: ${{ secrets.QUAY_TOKEN }}
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-
-      - name: Cache Docker layers
-        uses: actions/cache@v4
-        with:
-          path: /tmp/.buildx-cache
-          key: ${{ runner.os }}-buildx-${{ github.sha }}
-          restore-keys: |
-            ${{ runner.os }}-buildx-
-
-      - name: Extract metadata (tags, labels) for docling-serve (CPU only) ghcr image
-        id: ghcr_serve_cpu_meta
-        uses: docker/metadata-action@v5
-        with:
-          images: ${{ env.GHCR_REGISTRY }}/${{ env.GHCR_DOCLING_SERVE_CPU_IMAGE_NAME }}
-
-      - name: Build and push docling-serve-cpu image to ghcr.io
-        id: push-serve-cpu-ghcr
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          push: true
-          tags: ${{ steps.ghcr_serve_cpu_meta.outputs.tags }}
-          labels: ${{ steps.ghcr_serve_cpu_meta.outputs.labels }}
-          platforms: linux/amd64, linux/arm64
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-          file: Containerfile
-          build-args: |
-            --build-arg CPU_ONLY=true
-
-      - name: Generate artifact attestation
-        uses: actions/attest-build-provenance@v1
-        with:
-          subject-name: ${{ env.GHCR_REGISTRY }}/${{ env.GHCR_DOCLING_SERVE_CPU_IMAGE_NAME}}
-          subject-digest: ${{ steps.push-serve-cpu-ghcr.outputs.digest }}
-          push-to-registry: true
-
-      - name: Extract metadata (tags, labels) for docling-serve (CPU only) quay image
-        id: quay_serve_cpu_meta
-        uses: docker/metadata-action@v5
-        with:
-          images: ${{ env.QUAY_REGISTRY }}/${{ env.QUAY_DOCLING_SERVE_CPU_IMAGE_NAME }}
-
-      - name: Build and push docling-serve-cpu image to quay.io
-        id: push-serve-cpu-quay
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          push: true
-          tags: ${{ steps.quay_serve_cpu_meta.outputs.tags }}
-          labels: ${{ steps.quay_serve_cpu_meta.outputs.labels }}
-          platforms: linux/amd64, linux/arm64
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-          file: Containerfile
-          build-args: |
-            --build-arg CPU_ONLY=true
-      - name: Remove Local Docker Images
-        run: |
-          docker image prune -af
-
-  build_and_publish_gpu_images:
-    name: Push docling-serve (with GPU support) container image to GHCR and QUAY
-    runs-on: ubuntu-latest
-    environment: registry-creds
-    permissions:
-      packages: write
-      contents: read
-      attestations: write
-      id-token: write
-
-    steps:
-      - name: Check out the repo
-        uses: actions/checkout@v4
-
-      - name: Log in to the GHCR container image registry
-        uses: docker/login-action@v3
-        with:
-          registry: ${{ env.GHCR_REGISTRY }}
-          username: ${{ github.actor }}
-          password: ${{ secrets.GITHUB_TOKEN }}
-
-      - name: Log in to the Quay container image registry
-        uses: docker/login-action@v3
-        with:
-          registry: ${{ env.QUAY_REGISTRY }}
-          username: ${{ secrets.QUAY_USERNAME }}
-          password: ${{ secrets.QUAY_TOKEN }}
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-
-      - name: Cache Docker layers
-        uses: actions/cache@v4
-        with:
-          path: /tmp/.buildx-cache
-          key: ${{ runner.os }}-buildx-${{ github.sha }}
-          restore-keys: |
-            ${{ runner.os }}-buildx-
-
-      - name: Extract metadata (tags, labels) for docling-serve (GPU) ghcr image
-        id: ghcr_serve_gpu_meta
-        uses: docker/metadata-action@v5
-        with:
-          images: ${{ env.GHCR_REGISTRY }}/${{ env.GHCR_DOCLING_SERVE_GPU_IMAGE_NAME }}
-
-      - name: Build and push docling-serve (GPU) image to ghcr.io
-        id: push-serve-gpu-ghcr
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          push: true
-          tags: ${{ steps.ghcr_serve_gpu_meta.outputs.tags }}
-          labels: ${{ steps.ghcr_serve_gpu_meta.outputs.labels }}
-          platforms: linux/amd64,linux/arm64
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-          file: Containerfile
-          build-args: |
-            --build-arg CPU_ONLY=false
-
-      - name: Generate artifact attestation
-        uses: actions/attest-build-provenance@v1
-        with:
-          subject-name: ${{ env.GHCR_REGISTRY }}/${{ env.GHCR_DOCLING_SERVE_GPU_IMAGE_NAME}}
-          subject-digest: ${{ steps.push-serve-gpu-ghcr.outputs.digest }}
-          push-to-registry: true
-
-      - name: Extract metadata (tags, labels) for docling-serve (GPU) quay image
-        id: quay_serve_gpu_meta
-        uses: docker/metadata-action@v5
-        with:
-          images: ${{ env.QUAY_REGISTRY }}/${{ env.QUAY_DOCLING_SERVE_GPU_IMAGE_NAME }}
-
-      - name: Build and push docling-serve (GPU) image to quay.io
-        id: push-serve-gpu-quay
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          push: true
-          tags: ${{ steps.quay_serve_gpu_meta.outputs.tags }}
-          labels: ${{ steps.quay_serve_gpu_meta.outputs.labels }}
-          platforms: linux/amd64,linux/arm64
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-          file: Containerfile
-          build-args: |
-            --build-arg CPU_ONLY=false
+    uses: ./.github/workflows/job-image.yml
+    with:
+      publish: true
+      environment: registry-creds
+      build_args: ${{ matrix.spec.build_args }}
+      ghcr_image_name: ${{ matrix.spec.name }}
+      quay_image_name: ${{ matrix.spec.name }}
+      platforms: ${{ matrix.spec.platforms }}
--- a/.github/workflows/job-checks.yml
+++ b/.github/workflows/job-checks.yml
@@ -1,27 +1,25 @@
-name: Run linter checks
-on:
-  push:
-    branches: ["main"]
-  pull_request:
-    branches: ["main"]
+name: Run checks

-concurrency:
-  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
-  cancel-in-progress: true
+on:
+  workflow_call:

 jobs:
  py-lint:
    runs-on: ubuntu-latest
    strategy:
      matrix:
-        python-version: ['3.11']
+        python-version: ['3.12']
    steps:
      - uses: actions/checkout@v4
-      - uses: ./.github/actions/setup-poetry
+      - name: Install uv and set the python version
+        uses: astral-sh/setup-uv@v5
        with:
          python-version: ${{ matrix.python-version }}
+          enable-cache: true
+      - name: Install dependencies
+        run: uv sync --all-extras --no-extra cu124
      - name: Run styling check
-        run: poetry run pre-commit run --all-files
+        run: uv run --no-sync pre-commit run --all-files

  markdown-lint:
    runs-on: ubuntu-latest
--- a/.github/workflows/job-image.yml
+++ b/.github/workflows/job-image.yml
@@ -0,0 +1,141 @@
+name: Build docling-serve container image
+
+on:
+  workflow_call:
+    inputs:
+      build_args:
+        type: string
+        description: "Extra build arguments for the build."
+        default: ""
+      ghcr_image_name:
+        type: string
+        description: "Name of the image for GHCR."
+      quay_image_name:
+        type: string
+        description: "Name of the image for Quay."
+      platforms:
+        type: string
+        description: "Platform argument for building images."
+        default: linux/amd64, linux/arm64
+      publish:
+        type: boolean
+        description: "If true, the images will be published."
+        default: false
+      environment:
+        type: string
+        description: "GH Action environment"
+        default: ""
+
+env:
+  GHCR_REGISTRY: ghcr.io
+  QUAY_REGISTRY: quay.io
+
+jobs:
+  image:
+    runs-on: ubuntu-latest
+    permissions:
+      packages: write
+      contents: read
+      attestations: write
+      id-token: write
+    environment: ${{ inputs.environment }}
+
+    steps:
+      - name: Free up space in github runner
+        # Free space as indicated here : https://github.com/actions/runner-images/issues/2840#issuecomment-790492173
+        run: |
+            df -h
+            sudo rm -rf "/usr/local/share/boost"
+            sudo rm -rf "$AGENT_TOOLSDIRECTORY"
+            sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android /usr/local/share/powershell /usr/share/swift /usr/local/.ghcup
+            # shellcheck disable=SC2046
+            sudo docker rmi $(docker image ls -aq) >/dev/null 2>&1 || true
+            df -h
+
+      - name: Check out the repo
+        uses: actions/checkout@v4
+
+      - name: Log in to the GHCR container image registry
+        if: ${{ inputs.publish }}
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.GHCR_REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Log in to the Quay container image registry
+        if: ${{ inputs.publish }}
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.QUAY_REGISTRY }}
+          username: ${{ secrets.QUAY_USERNAME }}
+          password: ${{ secrets.QUAY_TOKEN }}
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Cache Docker layers
+        uses: actions/cache@v4
+        with:
+          path: /tmp/.buildx-cache
+          key: ${{ runner.os }}-buildx-${{ github.sha }}
+          restore-keys: |
+            ${{ runner.os }}-buildx-
+
+      - name: Extract metadata (tags, labels) for docling-serve ghcr image
+        id: ghcr_meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ${{ env.GHCR_REGISTRY }}/${{ inputs.ghcr_image_name }}
+
+      - name: Build and push image to ghcr.io
+        id: ghcr_push
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          push: ${{ inputs.publish }}
+          tags: ${{ steps.ghcr_meta.outputs.tags }}
+          labels: ${{ steps.ghcr_meta.outputs.labels }}
+          platforms: ${{ inputs.platforms}}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+          file: Containerfile
+          build-args: ${{ inputs.build_args }}
+
+      - name: Generate artifact attestation
+        if: ${{ inputs.publish }}
+        uses: actions/attest-build-provenance@v1
+        with:
+          subject-name: ${{ env.GHCR_REGISTRY }}/${{ inputs.ghcr_image_name }}
+          subject-digest: ${{ steps.ghcr_push.outputs.digest }}
+          push-to-registry: true
+
+      - name: Extract metadata (tags, labels) for docling-serve quay image
+        if: ${{ inputs.publish }}
+        id: quay_meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ${{ env.QUAY_REGISTRY }}/${{ inputs.quay_image_name }}
+
+      - name: Build and push image to quay.io
+        if: ${{ inputs.publish }}
+        # id: push-serve-cpu-quay
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          push: ${{ inputs.publish }}
+          tags: ${{ steps.quay_meta.outputs.tags }}
+          labels: ${{ steps.quay_meta.outputs.labels }}
+          platforms: ${{ inputs.platforms}}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+          file: Containerfile
+          build-args: ${{ inputs.build_args }}
+
+      # - name: Inspect the image details
+      #   run: |
+      #     echo "${{ steps.ghcr_push.outputs.metadata }}"
+
+      - name: Remove Local Docker Images
+        run: |
+          docker image prune -af
--- a/.github/workflows/pypi.yml
+++ b/.github/workflows/pypi.yml
@@ -0,0 +1,32 @@
+name: "Build and publish package"
+
+on:
+  release:
+    types: [published]
+
+permissions:
+  contents: read
+
+jobs:
+  build-and-publish:
+    runs-on: ubuntu-latest
+    environment:
+      name: pypi
+      url: https://pypi.org/p/docling-serve
+    permissions:
+      id-token: write  # IMPORTANT: mandatory for trusted publishing
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install uv and set the python version
+        uses: astral-sh/setup-uv@v5
+        with:
+          enable-cache: true
+      - name: Install dependencies
+        run: uv sync --all-extras --no-extra cu124
+      - name: Build
+        run: uv build
+      - name: Publish distribution 📦 to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          # currently not working with reusable workflows
+          attestations: false
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,7 @@
 model_artifacts/
 scratch/
+.md-lint
+actionlint

 # Created by https://www.toptal.com/developers/gitignore/api/python,macos,virtualenv,pycharm,visualstudiocode,emacs,vim,jupyternotebooks
 # Edit at https://www.toptal.com/developers/gitignore?templates=python,macos,virtualenv,pycharm,visualstudiocode,emacs,vim,jupyternotebooks
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -1,49 +1,24 @@
 fail_fast: true
 repos:
-  - repo: local
-    hooks:
-      - id: system
-        name: Black
-        entry: poetry run black docling_serve tests
-        pass_filenames: false
-        language: system
-        files: '\.py$'
-  - repo: local
-    hooks:
-      - id: system
-        name: isort
-        entry: poetry run isort docling_serve tests
-        pass_filenames: false
-        language: system
-        files: '\.py$'
-  - repo: local
-    hooks:
-      - id: autoflake
-        name: autoflake
-        entry: poetry run autoflake docling_serve tests
-        pass_filenames: false
-        language: system
-        files: '\.py$'
-  - repo: local
-    hooks:
-      - id: system
-        name: flake8
-        entry: poetry run flake8 docling_serve
-        pass_filenames: false
-        language: system
-        files: '\.py$'
  - repo: local
    hooks:
      - id: system
        name: MyPy
-        entry: poetry run mypy docling_serve
+        entry: uv run --no-sync mypy docling_serve
        pass_filenames: false
        language: system
        files: '\.py$'
-  - repo: local
+  - repo: https://github.com/astral-sh/uv-pre-commit
+    # uv version.
+    rev: 0.6.1
    hooks:
-      - id: system
-        name: Poetry check
-        entry: poetry check --lock
-        pass_filenames: false
-        language: system
+      - id: uv-lock
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.9.6
+    hooks:
+      # Run the Ruff linter.
+      - id: ruff
+        args: [--exit-non-zero-on-fix, --config=pyproject.toml]
+      # Run the Ruff formatter.
+      # - id: ruff-format
+      #   args: [--config=pyproject.toml]
--- a/.python-version
+++ b/.python-version
@@ -0,0 +1 @@
+3.12
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,18 @@
+## [v0.4.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.4.0) - 2025-02-26
+
+### Feature
+
+* New container images ([#68](https://github.com/DS4SD/docling-serve/issues/68)) ([`7e6d9cd`](https://github.com/DS4SD/docling-serve/commit/7e6d9cdef398df70a5b4d626aeb523c428c10d56))
+* Render DoclingDocument with npm docling-components in the example UI ([#65](https://github.com/DS4SD/docling-serve/issues/65)) ([`c430d9b`](https://github.com/DS4SD/docling-serve/commit/c430d9b1a162ab29104d86ebaa1ac5a5488b1f09))
+
+## [v0.3.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.3.0) - 2025-02-19
+
+### Feature
+
+* Add new docling-serve cli ([#50](https://github.com/DS4SD/docling-serve/issues/50)) ([`ec33a61`](https://github.com/DS4SD/docling-serve/commit/ec33a61faa7846b9b7998fbf557ebe39a3b800f6))
+
+### Fix
+
+* Set DOCLING_SERVE_ARTIFACTS_PATH in images ([#53](https://github.com/DS4SD/docling-serve/issues/53)) ([`4877248`](https://github.com/DS4SD/docling-serve/commit/487724836896576ca4f98e84abf15fd1c383bec8))
+* Set root UI path when behind proxy ([#38](https://github.com/DS4SD/docling-serve/issues/38)) ([`c64a450`](https://github.com/DS4SD/docling-serve/commit/c64a450bf9ba9947ab180e92bef2763ff710b210))
+* Support python 3.13 and docling updates and switch to uv ([#48](https://github.com/DS4SD/docling-serve/issues/48)) ([`ae3b490`](https://github.com/DS4SD/docling-serve/commit/ae3b4906f1c0829b1331ea491f3518741cabff71))
--- a/35
+++ b/35
@@ -2,7 +2,8 @@ ARG BASE_IMAGE=quay.io/sclorg/python-312-c9s:c9s

 FROM ${BASE_IMAGE}

-ARG CPU_ONLY=false
+ARG MODELS_LIST="layout tableformer picture_classifier easyocr"
+ARG UV_SYNC_EXTRA_ARGS=""

 USER 0

@@ -21,6 +22,8 @@ RUN --mount=type=bind,source=os-packages.txt,target=/tmp/os-packages.txt \

 ENV TESSDATA_PREFIX=/usr/share/tesseract/tessdata/

+COPY --from=ghcr.io/astral-sh/uv:0.6.1 /uv /uvx /bin/
+
 ###################################################################################################
 # Docling layer                                                                                   #
 ###################################################################################################
@@ -35,27 +38,25 @@ ENV OMP_NUM_THREADS=4
 ENV LANG=en_US.UTF-8
 ENV LC_ALL=en_US.UTF-8
 ENV PYTHONIOENCODING=utf-8
+ENV UV_COMPILE_BYTECODE=1 UV_LINK_MODE=copy
+ENV UV_PROJECT_ENVIRONMENT=/opt/app-root

-ENV WITH_UI=True
+ENV DOCLING_SERVE_ARTIFACTS_PATH=/opt/app-root/src/.cache/docling/models

-COPY --chown=1001:0 pyproject.toml poetry.lock models_download.py README.md ./
+COPY --chown=1001:0 pyproject.toml uv.lock README.md ./

-RUN pip install --no-cache-dir poetry && \
-    # We already are in a virtual environment, so we don't need to create a new one, only activate it.
-    poetry config virtualenvs.create false && \
-    source /opt/app-root/bin/activate && \
-    if [ "$CPU_ONLY" = "true" ]; then \
-        poetry install --no-root --no-cache --no-interaction --all-extras --with cpu --without dev; \
-    else \
-        poetry install --no-root --no-cache --no-interaction --all-extras --without dev; \
-    fi && \
-    echo "Downloading models..." && \
-    python models_download.py && \
-    chown -R 1001:0 /opt/app-root/src && \
-    chmod -R g=u /opt/app-root/src
+RUN --mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
+    uv sync --frozen --no-install-project --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}   # --no-extra ${NO_EXTRA}
+
+RUN echo "Downloading models..." && \
+    docling-tools models download -o "${DOCLING_SERVE_ARTIFACTS_PATH}" ${MODELS_LIST} && \
+    chown -R 1001:0 /opt/app-root/src/.cache && \
+    chmod -R g=u /opt/app-root/src/.cache

 COPY --chown=1001:0 --chmod=664 ./docling_serve ./docling_serve
+RUN --mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
+    uv sync --frozen --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}   # --no-extra ${NO_EXTRA}

 EXPOSE 5001

-CMD ["python", "-m", "docling_serve"]
+CMD ["docling-serve", "run"]
--- a/32
+++ b/32
@@ -24,19 +24,26 @@ action-lint-file:
 md-lint-file:
 	$(CMD_PREFIX) touch .markdown-lint

+.PHONY: docling-serve-image
+docling-serve-image: Containerfile
+	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve]"
+	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu" -f Containerfile -t ghcr.io/ds4sd/docling-serve:$(TAG) .
+	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) ghcr.io/ds4sd/docling-serve:main
+	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) quay.io/ds4sd/docling-serve:main
+
 .PHONY: docling-serve-cpu-image
 docling-serve-cpu-image: Containerfile ## Build docling-serve "cpu only" container image
-	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve CPU ONLY]"
-	$(CMD_PREFIX) docker build --build-arg CPU_ONLY=true -f Containerfile --platform linux/amd64 -t ghcr.io/ds4sd/docling-serve-cpu:$(TAG) .
+	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve CPU]"
+	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124" -f Containerfile -t ghcr.io/ds4sd/docling-serve-cpu:$(TAG) .
 	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cpu:$(TAG) ghcr.io/ds4sd/docling-serve-cpu:main
 	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cpu:$(TAG) quay.io/ds4sd/docling-serve-cpu:main

-.PHONY: docling-serve-gpu-image
-docling-serve-gpu-image: Containerfile ## Build docling-serve container image with GPU support
-	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve with GPU]"
-	$(CMD_PREFIX) docker build --build-arg CPU_ONLY=false -f Containerfile --platform linux/amd64 -t ghcr.io/ds4sd/docling-serve:$(TAG) .
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) ghcr.io/ds4sd/docling-serve:main
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) quay.io/ds4sd/docling-serve:main
+.PHONY: docling-serve-cu124-image
+docling-serve-cu124-image: Containerfile ## Build docling-serve container image with GPU support
+	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve with Cuda 12.4]"
+	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cpu" -f Containerfile --platform linux/amd64 -t ghcr.io/ds4sd/docling-serve-cu124:$(TAG) .
+	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cu124:$(TAG) ghcr.io/ds4sd/docling-serve-cu124:main
+	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cu124:$(TAG) quay.io/ds4sd/docling-serve-cu124:main

 .PHONY: action-lint
 action-lint: .action-lint ##      Lint GitHub Action workflows
@@ -65,13 +72,12 @@ md-lint: .md-lint ##      Lint markdown files
 .PHONY: py-Lint
 py-lint: ##      Lint Python files
 	$(ECHO_PREFIX) printf "  %-12s ./...\n" "[PY LINT]"
-	$(CMD_PREFIX) if ! which poetry $(PIPE_DEV_NULL) ; then \
-		echo "Please install poetry." ; \
-		echo "pip install poetry" ; \
+	$(CMD_PREFIX) if ! which uv $(PIPE_DEV_NULL) ; then \
+		echo "Please install uv." ; \
 		exit 1 ; \
 	fi
-	$(CMD_PREFIX) poetry install --all-extras
-	$(CMD_PREFIX) poetry run pre-commit run --all-files
+	$(CMD_PREFIX) uv sync --extra ui
+	$(CMD_PREFIX) uv run pre-commit run --all-files

 .PHONY: run-docling-cpu
 run-docling-cpu: ## Run the docling-serve container with CPU support and assign a container name
--- a/README.md
+++ b/README.md
@@ -276,6 +276,17 @@ The response can be a JSON Document or a File.
 - If you set the parameter `return_as_file` to True, the response will be a zip file.
 - If multiple files are generated (multiple inputs, or one input but multiple outputs with `return_as_file` True), the response will be a zip file.

+## Run docling-serve
+
+Clone the repository and run the following from within the cloned directory root.
+
+```bash
+python -m venv venv
+source venv/bin/activate
+pip install "docling-serve[ui]"
+docling-serve run --enable-ui
+```
+
 ## Helpers

 - A full Swagger UI is available at the `/docs` endpoint.
@@ -293,11 +304,11 @@ The response can be a JSON Document or a File.
 ### CPU only

 ```sh
-# Install poetry if not already available
-curl -sSL https://install.python-poetry.org | python3 -
+# Install uv if not already available
+curl -LsSf https://astral.sh/uv/install.sh | sh

 # Install dependencies
-poetry install --with cpu
+uv sync --extra cpu
 ```

 ### Cuda GPU
@@ -306,29 +317,107 @@ For GPU support use the following command:

 ```sh
 # Install dependencies
-poetry install
+uv sync
 ```

+### Gradio UI and different OCR backends
+
+The `/ui` endpoint (built with `gradio`) and additional OCR backends can be enabled via package extras:
+
+```sh
+# Enable ui and rapidocr
+uv sync --extra ui --extra rapidocr
+```
+
+```sh
+# Enable tesserocr
+uv sync --extra tesserocr
+```
+
+See the `[project.optional-dependencies]` section in `pyproject.toml` for the full list of extras, and run `uv run docling-serve --help` for the runtime options.
+
 ### Run the server

-The [start_server.sh](./start_server.sh) executable is a convenient script for launching the local webserver.
+The `docling-serve` executable launches the webserver in either development or
+production mode.

 ```sh
-# Run the server
-bash start_server.sh
+# Run the server in development mode
+# - reload is enabled by default
+# - listening on the 127.0.0.1 address
+# - ui is enabled by default
+docling-serve dev

-# Run the server with live reload
-RELOAD=true bash start_server.sh
+# Run the server in production mode
+# - reload is disabled by default
+# - listening on the 0.0.0.0 address
+# - ui is disabled by default
+docling-serve run
 ```

-### Environment variables
+### Options

-The following variables are available:
+The `docling-serve` executable is controlled with both command line
+options and environment variables.

-`TESSDATA_PREFIX`: Tesseract data location, example `/usr/share/tesseract/tessdata/`.
-`UVICORN_WORKERS`: Number of workers to use.
-`RELOAD`: If `True`, this will enable auto-reload when you modify files, useful for development.
-`WITH_UI`: If `True`, The Gradio UI will be available at `/ui`.
+<details>
+<summary>`docling-serve` help message</summary>
+
+```sh
+$ docling-serve dev --help
+                                                                                                              
+ Usage: docling-serve dev [OPTIONS]                                                                           
+                                                                                                              
+ Run a Docling Serve app in development mode. 🧪                                                              
+ This is equivalent to docling-serve run but with reload                                                      
+ enabled and listening on the 127.0.0.1 address.                                                              
+                                                                                                              
+ Options can be set also with the corresponding ENV variable, with the exception                              
+ of --enable-ui, --host and --reload.                                                                         
+                                                                                                              
+╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────╮
+│ --host                                   TEXT     The host to serve on. For local development in localhost │
+│                                                   use 127.0.0.1. To enable public access, e.g. in a        │
+│                                                   container, use all the IP addresses available with       │
+│                                                   0.0.0.0.                                                 │
+│                                                   [default: 127.0.0.1]                                     │
+│ --port                                   INTEGER  The port to serve on. [default: 5001]                    │
+│ --reload           --no-reload                    Enable auto-reload of the server when (code) files       │
+│                                                   change. This is resource intensive, use it only during   │
+│                                                   development.                                             │
+│                                                   [default: reload]                                        │
+│ --root-path                              TEXT     The root path is used to tell your app that it is being  │
+│                                                   served to the outside world with some path prefix set up │
+│                                                   in some termination proxy or similar.                    │
+│ --proxy-headers    --no-proxy-headers             Enable/Disable X-Forwarded-Proto, X-Forwarded-For,       │
+│                                                   X-Forwarded-Port to populate remote address info.        │
+│                                                   [default: proxy-headers]                                 │
+│ --artifacts-path                         PATH     If set to a valid directory, the model weights will be   │
+│                                                   loaded from this path.                                   │
+│                                                   [default: None]                                          │
+│ --enable-ui        --no-enable-ui                 Enable the development UI. [default: enable-ui]          │
+│ --help                                            Show this message and exit.                              │
+╰────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+```
+
+</details>
+
+#### Environment variables
+
+The environment variables controlling the `uvicorn` execution can be specified with the `UVICORN_` prefix:
+
+- `UVICORN_WORKERS`: Number of workers to use.
+- `UVICORN_RELOAD`: If `True`, this will enable auto-reload when you modify files, useful for development.
+
+The environment variables controlling specifics of the Docling Serve app can be specified with the
+`DOCLING_SERVE_` prefix:
+
+- `DOCLING_SERVE_ARTIFACTS_PATH`: If set, Docling will use only the local model weights stored at this path, for example `/opt/app-root/src/.cache/docling/models`.
+- `DOCLING_SERVE_ENABLE_UI`: If `True`, the Gradio UI will be available at `/ui`.
+
+Others:
+
+- `TESSDATA_PREFIX`: Tesseract data location, for example `/usr/share/tesseract/tessdata/`.

 ## Get help and support

--- a/docling_serve/.env.example
+++ b/docling_serve/.env.example
@@ -1,3 +1,3 @@
 TESSDATA_PREFIX=/usr/share/tesseract/tessdata/
 UVICORN_WORKERS=2
-RELOAD=True
+UVICORN_RELOAD=True
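
The renamed `UVICORN_RELOAD` variable follows the prefix convention from the README changes: `UVICORN_` for server options and `DOCLING_SERVE_` for app options. The `docling_serve/settings.py` module that implements this is not part of this excerpt, so the following is only a minimal stdlib sketch of prefix-based loading. The `load_prefixed` and `as_bool` helpers and the accepted truthy spellings are assumptions for illustration, not the project's code.

```python
import os
from typing import Dict, Mapping, Optional


def load_prefixed(
    prefix: str, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect variables starting with `prefix`, keyed by lowercased suffix.

    Hypothetical helper; falls back to os.environ when no mapping is given.
    """
    env = os.environ if environ is None else environ
    return {
        name[len(prefix):].lower(): value
        for name, value in env.items()
        if name.startswith(prefix)
    }


def as_bool(value: str) -> bool:
    """Interpret common truthy spellings (the accepted set is an assumption)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


# Values mirror docling_serve/.env.example after this change.
example_env = {
    "TESSDATA_PREFIX": "/usr/share/tesseract/tessdata/",
    "UVICORN_WORKERS": "2",
    "UVICORN_RELOAD": "True",
}

uvicorn_opts = load_prefixed("UVICORN_", example_env)
workers = int(uvicorn_opts["workers"])
reload_enabled = as_bool(uvicorn_opts["reload"])
print(workers, reload_enabled)
```

Variables without a matching prefix, such as `TESSDATA_PREFIX`, are simply ignored by the prefixed lookup, which matches the README listing it under "Others".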
--- a/docling_serve/main.py
+++ b/docling_serve/main.py
@@ -1,20 +1,302 @@
-import os
+import importlib.metadata
+import logging
+import platform
+import sys
+import warnings
+from pathlib import Path
+from typing import Annotated, Any, Optional, Union

-from docling_serve.app import app
-from docling_serve.helper_functions import _str_to_bool
+import typer
+import uvicorn
+from rich.console import Console

-# Launch the FastAPI server
-if __name__ == "__main__":
-    from uvicorn import run
+from docling_serve.settings import docling_serve_settings, uvicorn_settings

-    port = int(os.getenv("PORT", "5001"))
-    workers = int(os.getenv("UVICORN_WORKERS", "1"))
-    reload = _str_to_bool(os.getenv("RELOAD", "False"))
-    run(
-        app,
-        host="0.0.0.0",
-        port=port,
-        workers=workers,
-        timeout_keep_alive=600,
-        reload=reload,
+warnings.filterwarnings(action="ignore", category=UserWarning, module="pydantic|torch")
+warnings.filterwarnings(action="ignore", category=FutureWarning, module="easyocr")
+
+
+err_console = Console(stderr=True)
+console = Console()
+
+app = typer.Typer(
+    no_args_is_help=True,
+    rich_markup_mode="rich",
+)
+
+logger = logging.getLogger(__name__)
+
+
+def version_callback(value: bool) -> None:
+    if value:
+        docling_serve_version = importlib.metadata.version("docling_serve")
+        docling_version = importlib.metadata.version("docling")
+        docling_core_version = importlib.metadata.version("docling-core")
+        docling_ibm_models_version = importlib.metadata.version("docling-ibm-models")
+        docling_parse_version = importlib.metadata.version("docling-parse")
+        platform_str = platform.platform()
+        py_impl_version = sys.implementation.cache_tag
+        py_lang_version = platform.python_version()
+        console.print(f"Docling Serve version: {docling_serve_version}")
+        console.print(f"Docling version: {docling_version}")
+        console.print(f"Docling Core version: {docling_core_version}")
+        console.print(f"Docling IBM Models version: {docling_ibm_models_version}")
+        console.print(f"Docling Parse version: {docling_parse_version}")
+        console.print(f"Python: {py_impl_version} ({py_lang_version})")
+        console.print(f"Platform: {platform_str}")
+        raise typer.Exit()
+
+
+@app.callback()
+def callback(
+    version: Annotated[
+        Union[bool, None],
+        typer.Option(
+            "--version", help="Show the version and exit.", callback=version_callback
+        ),
+    ] = None,
+    verbose: Annotated[
+        int,
+        typer.Option(
+            "--verbose",
+            "-v",
+            count=True,
+            help="Set the verbosity level. -v for info logging, -vv for debug logging.",
+        ),
+    ] = 0,
+) -> None:
+    if verbose == 0:
+        logging.basicConfig(level=logging.WARNING)
+    elif verbose == 1:
+        logging.basicConfig(level=logging.INFO)
+    elif verbose >= 2:
+        logging.basicConfig(level=logging.DEBUG)
+
+
+def _run(
+    *,
+    command: str,
+) -> None:
+    server_type = "development" if command == "dev" else "production"
+
+    console.print(f"Starting {server_type} server 🚀")
+
+    url = f"http://{uvicorn_settings.host}:{uvicorn_settings.port}"
+    url_docs = f"{url}/docs"
+    url_ui = f"{url}/ui"
+
+    console.print("")
+    console.print(f"Server started at [link={url}]{url}[/]")
+    console.print(f"Documentation at [link={url_docs}]{url_docs}[/]")
+    if docling_serve_settings.enable_ui:
+        console.print(f"UI at [link={url_ui}]{url_ui}[/]")
+
+    if command == "dev":
+        console.print("")
+        console.print(
+            "Running in development mode, for production use: "
+            "[bold]docling-serve run[/]",
+        )
+
+    console.print("")
+    console.print("Logs:")
+
+    uvicorn.run(
+        app="docling_serve.app:create_app",
+        factory=True,
+        host=uvicorn_settings.host,
+        port=uvicorn_settings.port,
+        reload=uvicorn_settings.reload,
+        workers=uvicorn_settings.workers,
+        root_path=uvicorn_settings.root_path,
+        proxy_headers=uvicorn_settings.proxy_headers,
    )
+
+
+@app.command()
+def dev(
+    *,
+    # uvicorn options
+    host: Annotated[
+        str,
+        typer.Option(
+            help=(
+                "The host to serve on. For local development in localhost "
+                "use [blue]127.0.0.1[/blue]. To enable public access, "
+                "e.g. in a container, use all the IP addresses "
+                "available with [blue]0.0.0.0[/blue]."
+            )
+        ),
+    ] = "127.0.0.1",
+    port: Annotated[
+        int,
+        typer.Option(help="The port to serve on."),
+    ] = uvicorn_settings.port,
+    reload: Annotated[
+        bool,
+        typer.Option(
+            help=(
+                "Enable auto-reload of the server when (code) files change. "
+                "This is [bold]resource intensive[/bold], "
+                "use it only during development."
+            )
+        ),
+    ] = True,
+    root_path: Annotated[
+        str,
+        typer.Option(
+            help=(
+                "The root path is used to tell your app that it is being served "
+                "to the outside world with some [bold]path prefix[/bold] "
+                "set up in some termination proxy or similar."
+            )
+        ),
+    ] = uvicorn_settings.root_path,
+    proxy_headers: Annotated[
+        bool,
+        typer.Option(
+            help=(
+                "Enable/Disable X-Forwarded-Proto, X-Forwarded-For, "
+                "X-Forwarded-Port to populate remote address info."
+            )
+        ),
+    ] = uvicorn_settings.proxy_headers,
+    # docling options
+    artifacts_path: Annotated[
+        Optional[Path],
+        typer.Option(
+            help=(
+                "If set to a valid directory, "
+                "the model weights will be loaded from this path."
+            )
+        ),
+    ] = docling_serve_settings.artifacts_path,
+    enable_ui: Annotated[bool, typer.Option(help="Enable the development UI.")] = True,
+) -> Any:
+    """
+    Run a [bold]Docling Serve[/bold] app in [yellow]development[/yellow] mode. 🧪
+
+    This is equivalent to [bold]docling-serve run[/bold] but with [bold]reload[/bold]
+    enabled and listening on the [blue]127.0.0.1[/blue] address.
+
+    Options can be set also with the corresponding ENV variable, with the exception
+    of --enable-ui, --host and --reload.
+    """
+
+    uvicorn_settings.host = host
+    uvicorn_settings.port = port
+    uvicorn_settings.reload = reload
+    uvicorn_settings.root_path = root_path
+    uvicorn_settings.proxy_headers = proxy_headers
+
+    docling_serve_settings.artifacts_path = artifacts_path
+    docling_serve_settings.enable_ui = enable_ui
+
+    _run(
+        command="dev",
+    )
+
+
+@app.command()
+def run(
+    *,
+    host: Annotated[
+        str,
+        typer.Option(
+            help=(
+                "The host to serve on. For local development in localhost "
+                "use [blue]127.0.0.1[/blue]. To enable public access, "
+                "e.g. in a container, use all the IP addresses "
+                "available with [blue]0.0.0.0[/blue]."
+            )
+        ),
+    ] = uvicorn_settings.host,
+    port: Annotated[
+        int,
+        typer.Option(help="The port to serve on."),
+    ] = uvicorn_settings.port,
+    reload: Annotated[
+        bool,
+        typer.Option(
+            help=(
+                "Enable auto-reload of the server when (code) files change. "
+                "This is [bold]resource intensive[/bold], "
+                "use it only during development."
+            )
+        ),
+    ] = uvicorn_settings.reload,
+    workers: Annotated[
+        Union[int, None],
+        typer.Option(
+            help=(
+                "Use multiple worker processes. "
+                "Mutually exclusive with the --reload flag."
+            )
+        ),
+    ] = uvicorn_settings.workers,
+    root_path: Annotated[
+        str,
+        typer.Option(
+            help=(
+                "The root path is used to tell your app that it is being served "
+                "to the outside world with some [bold]path prefix[/bold] "
+                "set up in some termination proxy or similar."
+            )
+        ),
+    ] = uvicorn_settings.root_path,
+    proxy_headers: Annotated[
+        bool,
+        typer.Option(
+            help=(
+                "Enable/Disable X-Forwarded-Proto, X-Forwarded-For, "
+                "X-Forwarded-Port to populate remote address info."
+            )
+        ),
+    ] = uvicorn_settings.proxy_headers,
+    # docling options
+    artifacts_path: Annotated[
+        Optional[Path],
+        typer.Option(
+            help=(
+                "If set to a valid directory, "
+                "the model weights will be loaded from this path."
+            )
+        ),
+    ] = docling_serve_settings.artifacts_path,
+    enable_ui: Annotated[
+        bool, typer.Option(help="Enable the development UI.")
+    ] = docling_serve_settings.enable_ui,
+) -> Any:
+    """
+    Run a [bold]Docling Serve[/bold] app in [green]production[/green] mode. 🚀
+
+    This is equivalent to [bold]docling-serve dev[/bold] but with [bold]reload[/bold]
+    disabled and listening on the [blue]0.0.0.0[/blue] address.
+
+    Options can be set also with the corresponding ENV variable, e.g. UVICORN_PORT
+    or DOCLING_SERVE_ENABLE_UI.
+    """
+
+    uvicorn_settings.host = host
+    uvicorn_settings.port = port
+    uvicorn_settings.reload = reload
+    uvicorn_settings.workers = workers
+    uvicorn_settings.root_path = root_path
+    uvicorn_settings.proxy_headers = proxy_headers
+
+    docling_serve_settings.artifacts_path = artifacts_path
+    docling_serve_settings.enable_ui = enable_ui
+
+    _run(
+        command="run",
+    )
+
+
+def main() -> None:
+    app()
+
+
+# Launch the CLI when calling python -m docling_serve
+if __name__ == "__main__":
+
+    main()
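
The `callback` function above maps the `-v` count onto logging levels with an `if`/`elif` chain. As a standalone illustration of that mapping, here is a minimal sketch that also clamps counts above two to `DEBUG`. The `level_for_verbosity` name is hypothetical and the clamping is a design choice of this sketch, not something the diff states.

```python
import logging


def level_for_verbosity(count: int) -> int:
    """Map a -v count to a logging level, clamping out-of-range counts."""
    levels = [logging.WARNING, logging.INFO, logging.DEBUG]
    # Negative counts fall back to WARNING, counts above 2 to DEBUG.
    return levels[max(0, min(count, len(levels) - 1))]


for flags, count in (("", 0), ("-v", 1), ("-vv", 2), ("-vvv", 3)):
    level_name = logging.getLevelName(level_for_verbosity(count))
    print(f"{flags or '(none)':7} -> {level_name}")
```

A lookup table keeps the mapping in one place, so adding a further level only means extending the list.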
--- a/docling_serve/app.py
+++ b/docling_serve/app.py
@@ -1,5 +1,4 @@
 import logging
-import os
 import tempfile
 from contextlib import asynccontextmanager
 from io import BytesIO
@@ -8,7 +7,6 @@ from typing import Annotated, Any, Dict, List, Optional, Union

 from docling.datamodel.base_models import DocumentStream, InputFormat
 from docling.document_converter import DocumentConverter
-from dotenv import load_dotenv
 from fastapi import BackgroundTasks, FastAPI, UploadFile
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import RedirectResponse
@@ -22,17 +20,9 @@ from docling_serve.docling_conversion import (
    converters,
    get_pdf_pipeline_opts,
 )
-from docling_serve.helper_functions import FormDepends, _str_to_bool
+from docling_serve.helper_functions import FormDepends
 from docling_serve.response_preparation import ConvertDocumentResponse, process_results
-
-# Load local env vars if present
-load_dotenv()
-
-WITH_UI = _str_to_bool(os.getenv("WITH_UI", "False"))
-if WITH_UI:
-    import gradio as gr
-
-    from docling_serve.gradio_ui import ui as gradio_ui
+from docling_serve.settings import docling_serve_settings


 # Set up custom logging as we'll be intermixes with FastAPI/Uvicorn's logging
@@ -70,7 +60,6 @@ _log = logging.getLogger(__name__)
 # Context manager to initialize and clean up the lifespan of the FastAPI app
@asynccontextmanager
 async def lifespan(app: FastAPI):
-    # settings = Settings()

    # Converter with default options
    pdf_format_option, options_hash = get_pdf_pipeline_opts(ConvertDocumentsOptions())
@@ -86,139 +75,156 @@ async def lifespan(app: FastAPI):
    yield

    converters.clear()
-    if WITH_UI:
-        gradio_ui.close()
+    # if WITH_UI:
+    #     gradio_ui.close()


 ##################################
 # App creation and configuration #
 ##################################

-app = FastAPI(
-    title="Docling Serve",
-    lifespan=lifespan,
-)

-origins = ["*"]
-methods = ["*"]
-headers = ["*"]
-
-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=origins,
-    allow_credentials=True,
-    allow_methods=methods,
-    allow_headers=headers,
-)
-
-# Mount the Gradio app
-if WITH_UI:
-    tmp_output_dir = Path(tempfile.mkdtemp())
-    gradio_ui.gradio_output_dir = tmp_output_dir
-    app = gr.mount_gradio_app(
-        app, gradio_ui, path="/ui", allowed_paths=["./logo.png", tmp_output_dir]
+def create_app():
+    app = FastAPI(
+        title="Docling Serve",
+        lifespan=lifespan,
    )

+    origins = ["*"]
+    methods = ["*"]
+    headers = ["*"]

-#############################
-# API Endpoints definitions #
-#############################
-
-
-# Favicon
-@app.get("/favicon.ico", include_in_schema=False)
-async def favicon():
-    response = RedirectResponse(url="https://ds4sd.github.io/docling/assets/logo.png")
-    return response
-
-
-# Status
-class HealthCheckResponse(BaseModel):
-    status: str = "ok"
-
-
-@app.get("/health")
-def health() -> HealthCheckResponse:
-    return HealthCheckResponse()
-
-
-# API readiness compatibility for OpenShift AI Workbench
-@app.get("/api", include_in_schema=False)
-def api_check() -> HealthCheckResponse:
-    return HealthCheckResponse()
-
-
-# Convert a document from URL(s)
-@app.post(
-    "/v1alpha/convert/source",
-    response_model=ConvertDocumentResponse,
-    responses={
-        200: {
-            "content": {"application/zip": {}},
-            # "description": "Return the JSON item or an image.",
-        }
-    },
-)
-def process_url(
-    background_tasks: BackgroundTasks, conversion_request: ConvertDocumentsRequest
-):
-    sources: List[Union[str, DocumentStream]] = []
-    headers: Optional[Dict[str, Any]] = None
-    if isinstance(conversion_request, ConvertDocumentFileSourcesRequest):
-        for file_source in conversion_request.file_sources:
-            sources.append(file_source.to_document_stream())
-    else:
-        for http_source in conversion_request.http_sources:
-            sources.append(http_source.url)
-            if headers is None and http_source.headers:
-                headers = http_source.headers
-
-    # Note: results are only an iterator->lazy evaluation
-    results = convert_documents(
-        sources=sources, options=conversion_request.options, headers=headers
+    app.add_middleware(
+        CORSMiddleware,
+        allow_origins=origins,
+        allow_credentials=True,
+        allow_methods=methods,
+        allow_headers=headers,
    )

-    # The real processing will happen here
-    response = process_results(
-        background_tasks=background_tasks,
-        conversion_options=conversion_request.options,
-        conv_results=results,
+    # Mount the Gradio app
+    if docling_serve_settings.enable_ui:
+
+        try:
+            import gradio as gr
+
+            from docling_serve.gradio_ui import ui as gradio_ui
+
+            tmp_output_dir = Path(tempfile.mkdtemp())
+            gradio_ui.gradio_output_dir = tmp_output_dir
+            app = gr.mount_gradio_app(
+                app,
+                gradio_ui,
+                path="/ui",
+                allowed_paths=["./logo.png", tmp_output_dir],
+                root_path="/ui",
+            )
+        except ImportError:
+            _log.warning(
+                "Docling Serve enable_ui is activated, but gradio is not installed. "
+                "Install it with `pip install docling-serve[ui]` "
+                "or `pip install gradio`"
+            )
+
+    #############################
+    # API Endpoints definitions #
+    #############################
+
+    # Favicon
+    @app.get("/favicon.ico", include_in_schema=False)
+    async def favicon():
+        response = RedirectResponse(
+            url="https://ds4sd.github.io/docling/assets/logo.png"
+        )
+        return response
+
+    # Status
+    class HealthCheckResponse(BaseModel):
+        status: str = "ok"
+
+    @app.get("/health")
+    def health() -> HealthCheckResponse:
+        return HealthCheckResponse()
+
+    # API readiness compatibility for OpenShift AI Workbench
+    @app.get("/api", include_in_schema=False)
+    def api_check() -> HealthCheckResponse:
+        return HealthCheckResponse()
+
+    # Convert a document from URL(s)
+    @app.post(
+        "/v1alpha/convert/source",
+        response_model=ConvertDocumentResponse,
+        responses={
+            200: {
+                "content": {"application/zip": {}},
+                # "description": "Return the JSON item or an image.",
+            }
+        },
    )
+    def process_url(
+        background_tasks: BackgroundTasks, conversion_request: ConvertDocumentsRequest
+    ):
+        sources: List[Union[str, DocumentStream]] = []
+        headers: Optional[Dict[str, Any]] = None
+        if isinstance(conversion_request, ConvertDocumentFileSourcesRequest):
+            for file_source in conversion_request.file_sources:
+                sources.append(file_source.to_document_stream())
+        else:
+            for http_source in conversion_request.http_sources:
+                sources.append(http_source.url)
+                if headers is None and http_source.headers:
+                    headers = http_source.headers

-    return response
+        # Note: results are only an iterator->lazy evaluation
+        results = convert_documents(
+            sources=sources, options=conversion_request.options, headers=headers
+        )

+        # The real processing will happen here
+        response = process_results(
+            background_tasks=background_tasks,
+            conversion_options=conversion_request.options,
+            conv_results=results,
+        )

-# Convert a document from file(s)
-@app.post(
-    "/v1alpha/convert/file",
-    response_model=ConvertDocumentResponse,
-    responses={
-        200: {
-            "content": {"application/zip": {}},
-        }
-    },
-)
-async def process_file(
-    background_tasks: BackgroundTasks,
-    files: List[UploadFile],
-    options: Annotated[ConvertDocumentsOptions, FormDepends(ConvertDocumentsOptions)],
-):
+        return response

-    _log.info(f"Received {len(files)} files for processing.")
-
-    # Load the uploaded files to Docling DocumentStream
-    file_sources = []
-    for file in files:
-        buf = BytesIO(file.file.read())
-        name = file.filename if file.filename else "file.pdf"
-        file_sources.append(DocumentStream(name=name, stream=buf))
-
-    results = convert_documents(sources=file_sources, options=options)
-
-    response = process_results(
-        background_tasks=background_tasks,
-        conversion_options=options,
-        conv_results=results,
+    # Convert a document from file(s)
+    @app.post(
+        "/v1alpha/convert/file",
+        response_model=ConvertDocumentResponse,
+        responses={
+            200: {
+                "content": {"application/zip": {}},
+            }
+        },
    )
+    async def process_file(
+        background_tasks: BackgroundTasks,
+        files: List[UploadFile],
+        options: Annotated[
+            ConvertDocumentsOptions, FormDepends(ConvertDocumentsOptions)
+        ],
+    ):

-    return response
+        _log.info(f"Received {len(files)} files for processing.")
+
+        # Load the uploaded files to Docling DocumentStream
+        file_sources = []
+        for file in files:
+            buf = BytesIO(file.file.read())
+            name = file.filename if file.filename else "file.pdf"
+            file_sources.append(DocumentStream(name=name, stream=buf))
+
+        results = convert_documents(sources=file_sources, options=options)
+
+        response = process_results(
+            background_tasks=background_tasks,
+            conversion_options=options,
+            conv_results=results,
+        )
+
+        return response
+
+    return app
--- a/docling_serve/docling_conversion.py
+++ b/docling_serve/docling_conversion.py
@@ -39,6 +39,7 @@ from fastapi import HTTPException
 from pydantic import BaseModel, Field

 from docling_serve.helper_functions import _to_list_of_strings
+from docling_serve.settings import docling_serve_settings

 _log = logging.getLogger(__name__)

@@ -55,7 +56,7 @@ class ConvertDocumentsOptions(BaseModel):
            ),
            examples=[[v.value for v in InputFormat]],
        ),
-    ] = [v for v in InputFormat]
+    ] = list(InputFormat)

    to_formats: Annotated[
        List[OutputFormat],
@@ -161,7 +162,7 @@ class ConvertDocumentsOptions(BaseModel):
        bool,
        Field(
            description=(
-                "Abort on error if enabled. " "Boolean. Optional, defaults to false."
+                "Abort on error if enabled. Boolean. Optional, defaults to false."
            ),
            # examples=[False],
        ),
@@ -264,7 +265,7 @@ ConvertDocumentsRequest = Union[


 # Document converters will be preloaded and stored in a dictionary
-converters: Dict[str, DocumentConverter] = {}
+converters: Dict[bytes, DocumentConverter] = {}


 # Custom serializer for PdfFormatOption
@@ -276,6 +277,11 @@ def _serialize_pdf_format_option(pdf_format_option: PdfFormatOption) -> str:
    if pdf_format_option.pipeline_options:
        data["pipeline_options"] = pdf_format_option.pipeline_options.model_dump()

+        # Replace `artifacts_path` with a string representation
+        data["pipeline_options"]["artifacts_path"] = repr(
+            data["pipeline_options"]["artifacts_path"]
+        )
+
    # Replace `pipeline_cls` with a string representation
    data["pipeline_cls"] = repr(data["pipeline_cls"])

@@ -293,10 +299,9 @@ def _serialize_pdf_format_option(pdf_format_option: PdfFormatOption) -> str:


 # Computes the PDF pipeline options and returns the PdfFormatOption and its hash
-def get_pdf_pipeline_opts(
+def get_pdf_pipeline_opts(  # noqa: C901
    request: ConvertDocumentsOptions,
-) -> Tuple[PdfFormatOption, str]:
-
+) -> Tuple[PdfFormatOption, bytes]:
    if request.ocr_engine == OcrEngine.EASYOCR:
        try:
            import easyocr  # noqa: F401
@@ -364,6 +369,31 @@ def get_pdf_pipeline_opts(
    else:
        raise RuntimeError(f"Unexpected PDF backend type {request.pdf_backend}")

+    if docling_serve_settings.artifacts_path is not None:
+        if str(docling_serve_settings.artifacts_path.absolute()) == "":
+            _log.info(
+                "artifacts_path is an empty path, model weights will be downloaded "
+                "at runtime."
+            )
+            pipeline_options.artifacts_path = None
+        elif docling_serve_settings.artifacts_path.is_dir():
+            _log.info(
+                "artifacts_path is set to a valid directory. "
+                "No model weights will be downloaded at runtime."
+            )
+            pipeline_options.artifacts_path = docling_serve_settings.artifacts_path
+        else:
+            _log.warning(
+                "artifacts_path is set to an invalid directory. "
+                "The system will download the model weights at runtime."
+            )
+            pipeline_options.artifacts_path = None
+    else:
+        _log.info(
+            "artifacts_path is unset. "
+            "The system will download the model weights at runtime."
+        )
+
    pdf_format_option = PdfFormatOption(
        pipeline_options=pipeline_options,
        backend=backend,
@@ -371,7 +401,7 @@ def get_pdf_pipeline_opts(

    serialized_data = _serialize_pdf_format_option(pdf_format_option)

-    options_hash = hashlib.sha1(serialized_data.encode()).hexdigest()
+    options_hash = hashlib.sha1(serialized_data.encode()).digest()

    return pdf_format_option, options_hash
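
Note on the key type change above: switching from `hexdigest()` to `digest()` means the `converters` cache is keyed by the 20 raw SHA-1 bytes instead of a 40-character hex string. The sketch below illustrates the idea only. `cache`, `options_key` and `get_or_create` are hypothetical names, and a JSON dump stands in for `_serialize_pdf_format_option`.

```python
import hashlib
import json
from typing import Callable, Dict

# Hypothetical stand-in for the module-level `converters` dictionary.
cache: Dict[bytes, object] = {}


def options_key(options: dict) -> bytes:
    """Hash a JSON-serializable options dict into a raw SHA-1 digest."""
    # sort_keys makes the serialization independent of insertion order.
    serialized = json.dumps(options, sort_keys=True)
    return hashlib.sha1(serialized.encode()).digest()


def get_or_create(options: dict, factory: Callable[[dict], object]) -> object:
    """Return the cached object for these options, building it on first use."""
    key = options_key(options)
    if key not in cache:
        cache[key] = factory(options)
    return cache[key]


# Same options in a different order map to the same key, so the second
# call reuses the object built by the first one.
first = get_or_create({"do_ocr": True, "ocr_engine": "easyocr"}, dict)
second = get_or_create({"ocr_engine": "easyocr", "do_ocr": True}, dict)
```

A `bytes` key is half the size of the hex string and hashes the same way as any other immutable key, so lookups behave as before.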

--- a/docling_serve/gradio_ui.py
+++ b/docling_serve/gradio_ui.py
@@ -12,6 +12,13 @@ from docling_serve.helper_functions import _to_list_of_strings

 logger = logging.getLogger(__name__)

+##############################
+# Head JS for web components #
+##############################
+head = """
+    <script src="https://unpkg.com/@docling/docling-components@0.0.3" type="module"></script>
+"""
+
 #################
 # CSS and theme #
 #################
@@ -49,6 +56,14 @@ css = """
 #file_input_zone {
    height: 140px;
 }
+
+docling-img::part(pages) {
+    gap: 1rem;
+}
+
+docling-img::part(page) {
+    box-shadow: 0 0.5rem 1rem 0 rgba(0, 0, 0, 0.2);
+}
 """

 theme = gr.themes.Default(
@@ -110,6 +125,7 @@ def set_download_button_label(label_text: gr.State):
 def clear_outputs():
    markdown_content = ""
    json_content = ""
+    json_rendered_content = ""
    html_content = ""
    text_content = ""
    doctags_content = ""
@@ -118,6 +134,7 @@ def clear_outputs():
        markdown_content,
        markdown_content,
        json_content,
+        json_rendered_content,
        html_content,
        html_content,
        text_content,
@@ -260,6 +277,7 @@ def process_file(
 def response_to_output(response, return_as_file):
    markdown_content = ""
    json_content = ""
+    json_rendered_content = ""
    html_content = ""
    text_content = ""
    doctags_content = ""
@@ -282,6 +300,12 @@ def response_to_output(response, return_as_file):
        json_content = json.dumps(
            full_content.get("document").get("json_content"), indent=2
        )
+        # Embed document JSON and trigger load at client via an image.
+        json_rendered_content = f"""
+            <docling-img id="dclimg" pagenumbers tooltip="parsed"></docling-img>
+            <script id="dcljson" type="application/json" onload="document.getElementById('dclimg').src = JSON.parse(document.getElementById('dcljson').textContent);">{json_content}</script>
+            <img src onerror="document.getElementById('dclimg').src = JSON.parse(document.getElementById('dcljson').textContent);" />
+            """
        html_content = full_content.get("document").get("html_content")
        text_content = full_content.get("document").get("text_content")
        doctags_content = full_content.get("document").get("doctags_content")
@@ -289,6 +313,7 @@ def response_to_output(response, return_as_file):
        markdown_content,
        markdown_content,
        json_content,
+        json_rendered_content,
        html_content,
        html_content,
        text_content,
@@ -302,6 +327,7 @@ def response_to_output(response, return_as_file):
 ############

 with gr.Blocks(
+    head=head,
    css=css,
    theme=theme,
    title="Docling Serve",
@@ -464,6 +490,8 @@ with gr.Blocks(
            output_markdown_rendered = gr.Markdown(label="Response")
        with gr.Tab("Docling (JSON)"):
            output_json = gr.Code(language="json", wrap_lines=True, show_label=False)
+        with gr.Tab("Docling-Rendered"):
+            output_json_rendered = gr.HTML()
        with gr.Tab("HTML"):
            output_html = gr.Code(language="html", wrap_lines=True, show_label=False)
        with gr.Tab("HTML-Rendered"):
@@ -514,6 +542,7 @@ with gr.Blocks(
            output_markdown,
            output_markdown_rendered,
            output_json,
+            output_json_rendered,
            output_html,
            output_html_rendered,
            output_text,
@@ -538,6 +567,7 @@ with gr.Blocks(
            output_markdown,
            output_markdown_rendered,
            output_json,
+            output_json_rendered,
            output_html,
            output_html_rendered,
            output_text,
@@ -553,6 +583,7 @@ with gr.Blocks(
            output_markdown,
            output_markdown_rendered,
            output_json,
+            output_json_rendered,
            output_html,
            output_html_rendered,
            output_text,
@@ -582,6 +613,7 @@ with gr.Blocks(
            output_markdown,
            output_markdown_rendered,
            output_json,
+            output_json_rendered,
            output_html,
            output_html_rendered,
            output_text,
@@ -606,6 +638,7 @@ with gr.Blocks(
            output_markdown,
            output_markdown_rendered,
            output_json,
+            output_json_rendered,
            output_html,
            output_html_rendered,
            output_text,
@@ -621,6 +654,7 @@ with gr.Blocks(
            output_markdown,
            output_markdown_rendered,
            output_json,
+            output_json_rendered,
            output_html,
            output_html_rendered,
            output_text,
--- a/docling_serve/settings.py
+++ b/docling_serve/settings.py
@@ -1,6 +1,33 @@
+from pathlib import Path
+from typing import Optional, Union
+
 from pydantic_settings import BaseSettings, SettingsConfigDict


-class Settings(BaseSettings):
+class UvicornSettings(BaseSettings):
+    model_config = SettingsConfigDict(
+        env_prefix="UVICORN_", env_file=".env", extra="allow"
+    )

-    model_config = SettingsConfigDict(env_prefix="DOCLING_")
+    host: str = "0.0.0.0"
+    port: int = 5001
+    reload: bool = False
+    root_path: str = ""
+    proxy_headers: bool = True
+    workers: Union[int, None] = None
+
+
+class DoclingServeSettings(BaseSettings):
+    model_config = SettingsConfigDict(
+        env_prefix="DOCLING_SERVE_",
+        env_file=".env",
+        env_parse_none_str="",
+        extra="allow",
+    )
+
+    enable_ui: bool = False
+    artifacts_path: Optional[Path] = None
+
+
+uvicorn_settings = UvicornSettings()
+docling_serve_settings = DoclingServeSettings()
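
For orientation, the way `artifacts_path` is consumed in `get_pdf_pipeline_opts` can be approximated with the standard library alone. This is a simplified sketch, not the project's code: `resolve_artifacts_path` is a hypothetical helper, and the real value comes from `DoclingServeSettings` (environment variable `DOCLING_SERVE_ARTIFACTS_PATH`), where `env_parse_none_str=""` turns an empty string into `None`.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def resolve_artifacts_path(raw: Optional[str]) -> Optional[Path]:
    """Return a usable model directory, or None to download weights at runtime."""
    # Unset or empty: nothing was configured, fall back to runtime download.
    if raw is None or raw == "":
        return None
    path = Path(raw)
    # Only an existing directory is accepted; anything else falls back too.
    return path if path.is_dir() else None


with tempfile.TemporaryDirectory() as models_dir:
    valid = resolve_artifacts_path(models_dir)
    missing = resolve_artifacts_path(os.path.join(models_dir, "does-not-exist"))

unset = resolve_artifacts_path(None)
empty = resolve_artifacts_path("")
```

Only the `valid` case yields a path. The other three return `None`, which corresponds to the branches that log a runtime download.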
--- a/models_download.py
+++ b/models_download.py
@@ -1,36 +0,0 @@
-import os
-import zipfile
-
-import requests
-from deepsearch_glm.utils.load_pretrained_models import load_pretrained_nlp_models
-from docling.pipeline.standard_pdf_pipeline import StandardPdfPipeline
-
-# Download Docling models
-StandardPdfPipeline.download_models_hf(force=True)
-load_pretrained_nlp_models(verbose=True)
-
-# Download EasyOCR models
-urls = [
-    "https://github.com/JaidedAI/EasyOCR/releases/download/v1.3/latin_g2.zip",
-    "https://github.com/JaidedAI/EasyOCR/releases/download/pre-v1.1.6/craft_mlt_25k.zip"
-]
-
-local_zip_paths = [
-    "/opt/app-root/src/latin_g2.zip",
-    "/opt/app-root/src/craft_mlt_25k.zip"
-]
-
-extract_path = "/opt/app-root/src/.EasyOCR/model/"
-
-for url, local_zip_path in zip(urls, local_zip_paths):
-    # Download the file
-    response = requests.get(url)
-    with open(local_zip_path, "wb") as file:
-        file.write(response.content)
-
-    # Unzip the file
-    with zipfile.ZipFile(local_zip_path, "r") as zip_ref:
-        zip_ref.extractall(extract_path)
-
-    # Clean up the zip file
-    os.remove(local_zip_path)
--- a/poetry.lock
+++ b/poetry.lock
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,25 +1,25 @@
-[tool.poetry]
+[project]
 name = "docling-serve"
-version = "0.2.0"
+version = "0.4.0"  # DO NOT EDIT, updated automatically
 description = "Running Docling as a service"
-license = "MIT"
+license = {text = "MIT"}
 authors = [
-    "Michele Dolfi <dol@zurich.ibm.com>",
-    "Christoph Auer <cau@zurich.ibm.com>",
-    "Panos Vagenas <pva@zurich.ibm.com>",
-    "Cesar Berrospi Ramis <ceb@zurich.ibm.com>",
-   "Peter Staar <taa@zurich.ibm.com>",
+    {name="Michele Dolfi", email="dol@zurich.ibm.com"},
+    {name="Guillaume Moutier", email="gmoutier@redhat.com"},
+    {name="Anil Vishnoi", email="avishnoi@redhat.com"},
+    {name="Panos Vagenas", email="pva@zurich.ibm.com"},
+    {name="Panos Vagenas", email="pva@zurich.ibm.com"},
+    {name="Christoph Auer", email="cau@zurich.ibm.com"},
+    {name="Peter Staar", email="taa@zurich.ibm.com"},
 ]
 maintainers = [
-    "Peter Staar <taa@zurich.ibm.com>",
-    "Christoph Auer <cau@zurich.ibm.com>",
-    "Michele Dolfi <dol@zurich.ibm.com>",
-    "Cesar Berrospi Ramis <ceb@zurich.ibm.com>",
-    "Panos Vagenas <pva@zurich.ibm.com>",
+    {name="Michele Dolfi", email="dol@zurich.ibm.com"},
+    {name="Anil Vishnoi", email="avishnoi@redhat.com"},
+    {name="Panos Vagenas", email="pva@zurich.ibm.com"},
+    {name="Christoph Auer", email="cau@zurich.ibm.com"},
+    {name="Peter Staar", email="taa@zurich.ibm.com"},
 ]
 readme = "README.md"
-repository = "https://github.com/DS4SD/docling-serve"
-homepage = "https://github.com/DS4SD/docling-serve"
 classifiers = [
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
@@ -28,96 +28,145 @@ classifiers = [
    "Typing :: Typed",
    "Programming Language :: Python :: 3"
 ]
-
-[tool.poetry.dependencies]
-python = ">=3.10,<3.13" # 3.10 needed for Gradio, and no torchvision build for 3.13 yet
-docling = "^2.14.0"
-fastapi = {version = "^0.115.6", extras = ["standard"]}
-gradio = { version = "^5.9.1", optional = true }
-uvicorn = "~0.29.0"
-pydantic = "^2.10.3"
-pydantic-settings = "^2.4.0"
-python-multipart = "^0.0.19"
-httpx = "^0.28.1"
-tesserocr = { version = "^2.7.1", optional = true }
-rapidocr-onnxruntime = { version = "^1.4.0", optional = true, markers = "python_version < '3.13'" }
-onnxruntime = [
-  # 1.19.2 is the last version with python3.9 support,
-  # see https://github.com/microsoft/onnxruntime/releases/tag/v1.20.0
-  { version = ">=1.7.0,<1.20.0", optional = true, markers = "python_version < '3.10'" },
-  { version = "^1.7.0", optional = true, markers = "python_version >= '3.10'" }
+requires-python = ">=3.10"
+dependencies = [
+    "docling~=2.23",
+    "fastapi[standard]~=0.115",
+    "httpx~=0.28",
+    "pydantic~=2.10",
+    "pydantic-settings~=2.4",
+    "python-multipart>=0.0.14,<0.1.0",
+    "typer~=0.12",
+    "uvicorn[standard]>=0.29.0,<1.0.0",
 ]

+[project.optional-dependencies]
+ui = [
+    "gradio~=5.9"
+]
+tesserocr = [
+    "tesserocr~=2.7"
+]
+rapidocr = [
+    "rapidocr-onnxruntime~=1.4; python_version<'3.13'",
+    "onnxruntime~=1.7",
+]
+cpu = [
+  "torch>=2.6.0",
+  "torchvision>=0.21.0",
+]
+cu124 = [
+  "torch>=2.6.0",
+  "torchvision>=0.21.0",
+]

-[tool.poetry.extras]
-ui = ["gradio"]
-tesserocr = ["tesserocr"]
-rapidocr = ["rapidocr-onnxruntime", "onnxruntime"]
+[dependency-groups]
+dev = [
+    "mypy~=1.11",
+    "pre-commit~=3.8",
+    "pytest~=8.3",
+    "pytest-asyncio~=0.24",
+    "pytest-check~=2.4",
+    "python-semantic-release~=7.32",
+    "ruff>=0.9.6",
+]

+[tool.uv]
+package = true
+conflicts = [
+  [
+    { extra = "cpu" },
+    { extra = "cu124" },
+  ],
+]

-[tool.poetry.group.pypi-torch]
-optional = false
-
-[tool.poetry.group.pypi-torch.dependencies]
+[tool.uv.sources]
 torch = [
-  {version = "!=2.4.1+cpu" },
+  { index = "pytorch-cpu", extra = "cpu" },
+  { index = "pytorch-cu124", extra = "cu124" },
 ]
 torchvision = [
-  {version = "!=0.19.1+cpu" },
+  { index = "pytorch-cpu", extra = "cpu" },
+  { index = "pytorch-cu124", extra = "cu124" },
 ]

-[tool.poetry.group.cpu]
-optional = true
+[[tool.uv.index]]
+name = "pytorch-cpu"
+url = "https://download.pytorch.org/whl/cpu"
+explicit = true

-[tool.poetry.group.cpu.dependencies]
-torch = [
-    {markers = 'platform_machine=="x86_64" and sys_platform=="linux" and python_version == "3.10"', url="https://download.pytorch.org/whl/cpu/torch-2.4.1%2Bcpu-cp310-cp310-linux_x86_64.whl"},
-    {markers = 'platform_machine=="x86_64" and sys_platform=="linux" and python_version == "3.11"', url="https://download.pytorch.org/whl/cpu/torch-2.4.1%2Bcpu-cp311-cp311-linux_x86_64.whl"},
-    {markers = 'platform_machine=="x86_64" and sys_platform=="linux" and python_version == "3.12"', url="https://download.pytorch.org/whl/cpu/torch-2.4.1%2Bcpu-cp312-cp312-linux_x86_64.whl"},
-]
-torchvision = [
-    {markers = 'platform_machine=="x86_64" and sys_platform=="linux" and python_version == "3.10"', url="https://download.pytorch.org/whl/cpu/torchvision-0.19.1%2Bcpu-cp310-cp310-linux_x86_64.whl"},
-    {markers = 'platform_machine=="x86_64" and sys_platform=="linux" and python_version == "3.11"', url="https://download.pytorch.org/whl/cpu/torchvision-0.19.1%2Bcpu-cp311-cp311-linux_x86_64.whl"},
-    {markers = 'platform_machine=="x86_64" and sys_platform=="linux" and python_version == "3.12"', url="https://download.pytorch.org/whl/cpu/torchvision-0.19.1%2Bcpu-cp312-cp312-linux_x86_64.whl"},
-]
+[[tool.uv.index]]
+name = "pytorch-cu124"
+url = "https://download.pytorch.org/whl/cu124"
+explicit = true

-[tool.poetry.group.constraints.dependencies]
-numpy = [
-    { version = "^2.1.0", markers = 'python_version >= "3.13"' },
-    { version = "^1.24.4", markers = 'python_version < "3.13"' },
-]
+[tool.setuptools.packages.find]
+include = ["docling_serve"]

-[tool.poetry.group.dev.dependencies]
-black = "^24.8.0"
-isort = "^5.13.2"
-pre-commit = "^3.8.0"
-autoflake = "^2.3.1"
-flake8 = "^7.1.1"
-pytest = "^8.3.4"
-pytest-asyncio = "^0.24.0"
-pytest-check = "^2.4.1"
-mypy = "^1.11.2"
+[project.scripts]
+docling-serve = "docling_serve.__main__:main"

-[build-system]
-requires = ["poetry-core"]
-build-backend = "poetry.core.masonry.api"
+[project.urls]
+Homepage = "https://github.com/DS4SD/docling-serve"
+# Documentation = "https://ds4sd.github.io/docling"
+Repository = "https://github.com/DS4SD/docling-serve"
+Issues = "https://github.com/DS4SD/docling-serve/issues"
+Changelog = "https://github.com/DS4SD/docling-serve/blob/main/CHANGELOG.md"

-[tool.black]
+[tool.ruff]
+target-version = "py310"
 line-length = 88
-target-version = ["py310"]
-include = '\.pyi?$'
+respect-gitignore = true

-[tool.isort]
-profile = "black"
-line_length = 88
-py_version=311
+# extend-exclude = [
+#     "tests",
+# ]

-[tool.autoflake]
-in-place = true
-remove-all-unused-imports = true
-remove-unused-variables = true
-expand-star-imports = true
-recursive = true
+[tool.ruff.format]
+skip-magic-trailing-comma = false
+
+[tool.ruff.lint]
+select = [
+    # "B",  # flake8-bugbear
+    "C",  # flake8-comprehensions
+    "C9",  # mccabe
+    # "D",  # flake8-docstrings
+    "E",  # pycodestyle errors (default)
+    "F",  # pyflakes (default)
+    "I",  # isort
+    "PD", # pandas-vet
+    "PIE", # pie
+    # "PTH", # pathlib
+    "Q",  # flake8-quotes
+    # "RET", # return
+    "RUF", # Enable all ruff-specific checks
+    # "SIM", # simplify
+    "S307", # eval
+    # "T20",  # (disallow print statements) keep debugging statements out of the codebase
+    "W",  # pycodestyle warnings
+    "ASYNC" # async
+]
+
+ignore = [
+    "E501",  # Line too long, handled by ruff formatter
+    "D107", # "Missing docstring in __init__",
+    "F811", # "redefinition of the same function"
+    "PL", # Pylint
+    "RUF012", # Mutable Class Attributes
+]
+
+#extend-select = []
+
+[tool.ruff.lint.per-file-ignores]
+"__init__.py" = ["E402", "F401"]
+"tests/*.py" = ["ASYNC"] # Disable ASYNC check for tests
+
+[tool.ruff.lint.mccabe]
+max-complexity = 15
+
+[tool.ruff.lint.isort]
+combine-as-imports = true
+known-third-party = ["docling", "docling_core"]

 [tool.mypy]
 pretty = true
@@ -150,3 +199,16 @@ addopts = "-rA --color=yes --tb=short --maxfail=5"
 markers = [
 "asyncio",
 ]
+
+[tool.semantic_release]
+# for default values check:
+# https://github.com/python-semantic-release/python-semantic-release/blob/v7.32.2/semantic_release/defaults.cfg
+
+version_source = "tag_only"
+branch = "main"
+
+# configure types which should trigger minor and patch version bumps respectively
+# (note that they must be a subset of the configured allowed types):
+parser_angular_allowed_types = "build,chore,ci,docs,feat,fix,perf,style,refactor,test"
+parser_angular_minor_types = "feat"
+parser_angular_patch_types = "fix,perf"
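
The effect of `parser_angular_minor_types` and `parser_angular_patch_types` can be illustrated with a small sketch. It is a simplified model of the rule, not python-semantic-release's implementation: `bump_level` and `apply_bump` are hypothetical helpers, and major bumps from breaking changes are left out.

```python
import re
from typing import Iterable, Optional

MINOR_TYPES = {"feat"}
PATCH_TYPES = {"fix", "perf"}

# Matches "type: subject" or "type(scope): subject".
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?:\s")


def bump_level(messages: Iterable[str]) -> Optional[str]:
    """Return "minor", "patch" or None for a batch of commit messages."""
    level = None
    for message in messages:
        match = HEADER.match(message)
        if match is None:
            continue  # not a conventional commit header, ignored here
        commit_type = match.group("type")
        if commit_type in MINOR_TYPES:
            return "minor"  # highest level this sketch handles
        if commit_type in PATCH_TYPES:
            level = "patch"
    return level


def apply_bump(version: str, level: Optional[str]) -> str:
    """Apply a bump level to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version


# Commit subjects taken from the log of this change set, after 0.3.0.
since_last_release = [
    "feat: New container images (#68 )",
    "ci: Use release event to trigger the image publishing job for releases (#63 )",
    "refactor: Use bytes as options key (#58 )",
    "chore(config): replace black,isort,flake and autoflake with ruff (#55 )",
]
next_version = apply_bump("0.3.0", bump_level(since_last_release))
```

Under these assumptions a single `feat` commit is enough to produce a minor bump, which is consistent with the 0.3.0 to 0.4.0 step recorded in the commit log.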
--- a/start_server.sh
+++ b/start_server.sh
@@ -1,30 +0,0 @@
-#!/bin/bash
-set -Eeuo pipefail
-
-# Network settings
-export PORT="${PORT:-5001}"
-export HOST="${HOST:-"0.0.0.0"}"
-
-# Performance settings
-UVICORN_WORKERS="${UVICORN_WORKERS:-1}"
-
-# Development settings
-export WITH_UI="${WITH_UI:-"true"}"
-export RELOAD=${RELOAD:-"false"}
-
-# --------------------------------------
-# Process env settings
-
-EXTRA_ARGS=""
-if [ "$RELOAD" == "true" ]; then
-  EXTRA_ARGS="$EXTRA_ARGS --reload"
-fi
-
-# Launch
-exec poetry run uvicorn \
-    docling_serve.app:app \
-    --host=${HOST} \
-    --port=${PORT} \
-    --timeout-keep-alive=600 \
-    ${EXTRA_ARGS} \
-    --workers=${UVICORN_WORKERS}
--- a/uv.lock
+++ b/uv.lock
Author	SHA1	Message	Date
github-actions[bot]	cad1053e36	chore: bump version to 0.4.0 [skip ci]	2025-02-26 13:05:03 +00:00
Michele Dolfi	7e6d9cdef3	feat: New container images (#68 ) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-26 12:49:20 +01:00
Brent Salisbury	343b985287	Readme additions for running Readme additions for a quickstart running of docling-serve Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>	2025-02-25 14:49:50 -08:00
Kasper Dinkla	c430d9b1a1	feat: Render DoclingDocument with npm docling-components in the example UI (#65 ) Signed-off-by: DKL <dkl@zurich.ibm.com>	2025-02-25 11:27:42 +01:00
Anil Vishnoi	63141f1cc7	ci: Use release event to trigger the image publishing job for releases (#63 ) Signed-off-by: Anil Vishnoi <vishnoianil@gmail.com>	2025-02-24 08:21:17 +01:00
Eugene	d5557fad9f	refactor: Use bytes as options key (#58 ) Signed-off-by: Eugene <fogaprod@gmail.com>	2025-02-21 18:03:27 +01:00
İlker SIĞIRCI	36967f7f61	chore(config): replace black,isort,flake and autoflake with ruff (#55 ) Signed-off-by: ilker.sigirci <ilker.sigirci@data-boss.com.tr> Signed-off-by: ilkersigirci <sigirci.ilker@mgail.com> Co-authored-by: ilker.sigirci <ilker.sigirci@data-boss.com.tr>	2025-02-20 13:29:41 +01:00
github-actions[bot]	3b54d9b6ef	chore: bump version to 0.3.0 [skip ci]	2025-02-19 21:22:27 +00:00
Michele Dolfi	4877248368	fix: set DOCLING_SERVE_ARTIFACTS_PATH in images (#53 ) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-19 22:03:56 +01:00
Michele Dolfi	ec33a61faa	feat: Add new docling-serve cli (#50 ) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-19 20:54:13 +01:00
Eugene	663e03303a	chore: use uv in start_server.sh and update docs (#49 ) Signed-off-by: Eugene <fogaprod@gmail.com>	2025-02-19 19:25:00 +01:00
Guillaume Moutier	c64a450bf9	fix: Set root UI path when behind proxy (#38 ) Signed-off-by: Guillaume Moutier <3944034+guimou@users.noreply.github.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Guillaume Moutier <3944034+guimou@users.noreply.github.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-19 10:32:43 +01:00
Michele Dolfi	ae3b4906f1	fix: support python 3.13 and docling updates and switch to uv (#48 ) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-19 09:53:07 +01:00
Michele Dolfi	7a351fcdea	fix missing secrets inherit Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-13 17:02:01 +00:00
Michele Dolfi	1615f977a2	ci: add semantic release and build/publish python wheel (#41 ) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-13 16:49:43 +01:00
Guillaume Moutier	1bf487b18e	Fix main when workers > 1 (#35 ) Always load the app by using an import string Signed-off-by: Guillaume Moutier <3944034+guimou@users.noreply.github.com> Co-authored-by: Guillaume Moutier <3944034+guimou@users.noreply.github.com>	2025-02-12 09:54:49 +01:00