chore: bump version to 0.8.0 [skip ci]

feat: Add option for vlm pipeline (#143)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-11-29 16:43:24 +00:00 · 2025-04-22 13:04:33 +00:00 · 2025-04-22 14:46:33 +02:00 · 2025-04-22 10:41:47 +02:00 · 2025-04-19 19:59:07 +02:00 · 2025-04-19 18:46:28 +02:00
34 changed files with 4584 additions and 2473 deletions
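
To exercise the quick-start request that the README hunk in this diff adds, the same call can be sketched in Python. This is a minimal sketch, not part of the commit. The endpoint path `/v1alpha/convert/source`, port 5001, the headers, and the `http_sources` payload come from the README's curl example. The helper names `build_convert_request` and `convert`, the timeout value, and the assumption that the reply is JSON (inferred from the `accept: application/json` header) are illustrative only.

```python
import json
import urllib.request

# Endpoint path taken from the README curl example in this diff.
CONVERT_SOURCE_PATH = "/v1alpha/convert/source"


def build_convert_request(
    doc_url: str, base_url: str = "http://localhost:5001"
) -> urllib.request.Request:
    """Build (but do not send) the POST request from the README quick start."""
    payload = {"http_sources": [{"url": doc_url}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + CONVERT_SOURCE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def convert(doc_url: str, base_url: str = "http://localhost:5001", timeout: float = 300.0):
    """Send the request to a running server and parse the reply as JSON.

    Hypothetical helper: the timeout is an arbitrary choice, and the reply is
    assumed to be JSON because the request asks for application/json.
    """
    request = build_convert_request(doc_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds and prints the request, so no server is needed here.
    req = build_convert_request("https://arxiv.org/pdf/2501.17887")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

With a server started via `docling-serve run`, calling `convert("https://arxiv.org/pdf/2501.17887")` would send the same request as the curl command. The layout of the response is not shown in this diff, so the sketch returns the parsed body without interpreting it.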
--- a/.github/SECURITY.md
+++ b/.github/SECURITY.md
@@ -20,4 +20,4 @@ After the initial reply to your report, the security team will keep you informed

 ## Security Alerts

-We will send announcements of security vulnerabilities and steps to remediate on the [Docling announcements](https://github.com/DS4SD/docling/discussions/categories/announcements).
+We will send announcements of security vulnerabilities and steps to remediate on the [Docling announcements](https://github.com/docling-project/docling/discussions/categories/announcements).
--- a/.github/workflows/ci-images-dryrun.yml
+++ b/.github/workflows/ci-images-dryrun.yml
@@ -13,15 +13,15 @@ jobs:
    strategy:
      matrix:
        spec:
-          - name: ds4sd/docling-serve
+          - name: docling-project/docling-serve
            build_args: |
              UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu
            platforms: linux/amd64, linux/arm64
-          - name: ds4sd/docling-serve-cpu
+          - name: docling-project/docling-serve-cpu
            build_args: |
              UV_SYNC_EXTRA_ARGS=--no-extra cu124
            platforms: linux/amd64, linux/arm64
-          - name: ds4sd/docling-serve-cu124
+          - name: docling-project/docling-serve-cu124
            build_args: |
              UV_SYNC_EXTRA_ARGS=--no-extra cpu
            platforms: linux/amd64
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -8,7 +8,7 @@ on:

 jobs:
  code-checks:
-    # if: ${{ github.event_name == 'push' || (github.event.pull_request.head.repo.full_name != 'DS4SD/docling-serve' && github.event.pull_request.head.repo.full_name != 'ds4sd/docling-serve') }}
+    # if: ${{ github.event_name == 'push' || (github.event.pull_request.head.repo.full_name != 'docling-project/docling-serve' && github.event.pull_request.head.repo.full_name != 'docling-project/docling-serve') }}
    uses: ./.github/workflows/job-checks.yml
    permissions:
      packages: write
--- a/.github/workflows/images.yml
+++ b/.github/workflows/images.yml
@@ -17,15 +17,15 @@ jobs:
    strategy:
      matrix:
        spec:
-          - name: ds4sd/docling-serve
+          - name: docling-project/docling-serve
            build_args: |
              UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu
            platforms: linux/amd64, linux/arm64
-          - name: ds4sd/docling-serve-cpu
+          - name: docling-project/docling-serve-cpu
            build_args: |
              UV_SYNC_EXTRA_ARGS=--no-extra cu124
            platforms: linux/amd64, linux/arm64
-          - name: ds4sd/docling-serve-cu124
+          - name: docling-project/docling-serve-cu124
            build_args: |
              UV_SYNC_EXTRA_ARGS=--no-extra cpu
            platforms: linux/amd64
--- a/.markdownlint-cli2.yaml
+++ b/.markdownlint-cli2.yaml
@@ -3,7 +3,7 @@ config:
  no-emphasis-as-header: false
  first-line-heading: false
  MD033:
-    allowed_elements: ["details", "summary"]
+    allowed_elements: ["details", "summary", "br", "a", "b", "p", "img"]
  MD024:
    siblings_only: true
 globs:
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -5,10 +5,14 @@ repos:
    hooks:
      # Run the Ruff formatter.
      - id: ruff-format
+        name: "Ruff formatter"
        args: [--config=pyproject.toml]
+        files: '^(docling_serve|tests).*\.(py|ipynb)$'
      # Run the Ruff linter.
      - id: ruff
+        name: "Ruff linter"
        args: [--exit-non-zero-on-fix, --fix, --config=pyproject.toml]
+        files: '^(docling_serve|tests).*\.(py|ipynb)$'
  - repo: local
    hooks:
      - id: system
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,36 +1,86 @@
-## [v0.5.1](https://github.com/DS4SD/docling-serve/releases/tag/v0.5.1) - 2025-03-10
-
-### Fix
-
-* Submodules in wheels ([#85](https://github.com/DS4SD/docling-serve/issues/85)) ([`a92ad48`](https://github.com/DS4SD/docling-serve/commit/a92ad48b287bfcb134011dc0fc3f91ee04e067ee))
-
-## [v0.5.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.5.0) - 2025-03-07
+## [v0.8.0](https://github.com/docling-project/docling-serve/releases/tag/v0.8.0) - 2025-04-22

 ### Feature

-* Async api ([#60](https://github.com/DS4SD/docling-serve/issues/60)) ([`82f8900`](https://github.com/DS4SD/docling-serve/commit/82f890019745859699c1b01f9ccfb64cb7e37906))
-* Display version in fastapi docs ([#78](https://github.com/DS4SD/docling-serve/issues/78)) ([`ed851c9`](https://github.com/DS4SD/docling-serve/commit/ed851c95fee5f59305ddc3dcd5c09efce618470b))
+* Add option for vlm pipeline ([#143](https://github.com/docling-project/docling-serve/issues/143)) ([`ee89ee4`](https://github.com/docling-project/docling-serve/commit/ee89ee4daee5e916bd6a3bdb452f78934cd03f60))
+* Expose more conversion options ([#142](https://github.com/docling-project/docling-serve/issues/142)) ([`6b3d281`](https://github.com/docling-project/docling-serve/commit/6b3d281f02905c195ab75f25bb39f5c4d4e7b680))
+* **UI:** Change UI to use async endpoints ([#131](https://github.com/docling-project/docling-serve/issues/131)) ([`b598872`](https://github.com/docling-project/docling-serve/commit/b598872e5c48928ac44417a11bb7acc0e5c3f0c6))

 ### Fix

-* Remove uv from image, merge ARG and ENV declarations ([#57](https://github.com/DS4SD/docling-serve/issues/57)) ([`c95db36`](https://github.com/DS4SD/docling-serve/commit/c95db3643807a4dfb96d93c8e10d6eb486c49a30))
-* **docs:** Remove comma in convert/source curl example ([#73](https://github.com/DS4SD/docling-serve/issues/73)) ([`05df073`](https://github.com/DS4SD/docling-serve/commit/05df0735d35a589bdc2a11fcdd764a10f700cb6f))
+* **UI:** Use https when calling the api ([#139](https://github.com/docling-project/docling-serve/issues/139)) ([`57f9073`](https://github.com/docling-project/docling-serve/commit/57f9073bc0daf72428b068ea28e2bec7cd76c37b))
+* Fix permissions in docker image ([#136](https://github.com/docling-project/docling-serve/issues/136)) ([`c1ce471`](https://github.com/docling-project/docling-serve/commit/c1ce4719c933179ba3c59d73d0584853bbd6fa6a))
+* Picture caption visuals ([#129](https://github.com/docling-project/docling-serve/issues/129)) ([`5dfb75d`](https://github.com/docling-project/docling-serve/commit/5dfb75d3b9a7022d1daad12edbb8ec7bbf9aa264))

-## [v0.4.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.4.0) - 2025-02-26
+### Documentation
+
+* Fix required permissions for oauth2-proxy requests ([#141](https://github.com/docling-project/docling-serve/issues/141)) ([`087417e`](https://github.com/docling-project/docling-serve/commit/087417e5c2387d4ed95500222058f34d8a8702aa))
+* Update deployment examples ([#135](https://github.com/docling-project/docling-serve/issues/135)) ([`525a43f`](https://github.com/docling-project/docling-serve/commit/525a43ff6f04b7cc80f9dd6a0e653a8d8c4ab317))
+* Fix image tag ([#124](https://github.com/docling-project/docling-serve/issues/124)) ([`420162e`](https://github.com/docling-project/docling-serve/commit/420162e674cc38b4c3c13673ffbee4c20a1b15f1))
+
+## [v0.7.0](https://github.com/docling-project/docling-serve/releases/tag/v0.7.0) - 2025-03-31

 ### Feature

-* New container images ([#68](https://github.com/DS4SD/docling-serve/issues/68)) ([`7e6d9cd`](https://github.com/DS4SD/docling-serve/commit/7e6d9cdef398df70a5b4d626aeb523c428c10d56))
-* Render DoclingDocument with npm docling-components in the example UI ([#65](https://github.com/DS4SD/docling-serve/issues/65)) ([`c430d9b`](https://github.com/DS4SD/docling-serve/commit/c430d9b1a162ab29104d86ebaa1ac5a5488b1f09))
-
-## [v0.3.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.3.0) - 2025-02-19
-
-### Feature
-
-* Add new docling-serve cli ([#50](https://github.com/DS4SD/docling-serve/issues/50)) ([`ec33a61`](https://github.com/DS4SD/docling-serve/commit/ec33a61faa7846b9b7998fbf557ebe39a3b800f6))
+* Expose TLS settings and example deploy with oauth-proxy ([#112](https://github.com/docling-project/docling-serve/issues/112)) ([`7a0faba`](https://github.com/docling-project/docling-serve/commit/7a0fabae07020c2659dbb22c3b0359909051a74c))
+* Offline static files ([#109](https://github.com/docling-project/docling-serve/issues/109)) ([`68772bb`](https://github.com/docling-project/docling-serve/commit/68772bb6f0a87b71094a08ff851f5754c6ca6163))
+* Update to Docling 2.28 ([#106](https://github.com/docling-project/docling-serve/issues/106)) ([`20ec87a`](https://github.com/docling-project/docling-serve/commit/20ec87a63a99145bc0ad7931549af8a0c30db641))

 ### Fix

-* Set DOCLING_SERVE_ARTIFACTS_PATH in images ([#53](https://github.com/DS4SD/docling-serve/issues/53)) ([`4877248`](https://github.com/DS4SD/docling-serve/commit/487724836896576ca4f98e84abf15fd1c383bec8))
-* Set root UI path when behind proxy ([#38](https://github.com/DS4SD/docling-serve/issues/38)) ([`c64a450`](https://github.com/DS4SD/docling-serve/commit/c64a450bf9ba9947ab180e92bef2763ff710b210))
-* Support python 3.13 and docling updates and switch to uv ([#48](https://github.com/DS4SD/docling-serve/issues/48)) ([`ae3b490`](https://github.com/DS4SD/docling-serve/commit/ae3b4906f1c0829b1331ea491f3518741cabff71))
+* Move ARGs to prevent cache invalidation ([#104](https://github.com/docling-project/docling-serve/issues/104)) ([`e30f458`](https://github.com/docling-project/docling-serve/commit/e30f458923d34c169db7d5a5c296848716e8cac4))
+
+## [v0.6.0](https://github.com/docling-project/docling-serve/releases/tag/v0.6.0) - 2025-03-17
+
+### Feature
+
+* Expose options for new features ([#92](https://github.com/docling-project/docling-serve/issues/92)) ([`ec57b52`](https://github.com/docling-project/docling-serve/commit/ec57b528ed3f8e7b9604ff4cdf06da3d52c714dd))
+
+### Fix
+
+* Allow changes in CORS settings ([#100](https://github.com/docling-project/docling-serve/issues/100)) ([`422c402`](https://github.com/docling-project/docling-serve/commit/422c402bab7f05e46274ede11f234a19a62e093e))
+* Avoid exploding options cache using lru and expose size parameter ([#101](https://github.com/docling-project/docling-serve/issues/101)) ([`ea09028`](https://github.com/docling-project/docling-serve/commit/ea090288d3eec4ea8fbdcd32a6a497a99c89189d))
+* Increase timeout_keep_alive and allow parameter changes ([#98](https://github.com/docling-project/docling-serve/issues/98)) ([`07c48ed`](https://github.com/docling-project/docling-serve/commit/07c48edd5d9437219d9623e3d05bc5166c5bb85a))
+* Add warning when using incompatible parameters ([#99](https://github.com/docling-project/docling-serve/issues/99)) ([`a212547`](https://github.com/docling-project/docling-serve/commit/a212547d28d6588c65e52000dc7bc04f3f77e69e))
+* **ui:** Use --port parameter and avoid failing when image is not found ([#97](https://github.com/docling-project/docling-serve/issues/97)) ([`c76daac`](https://github.com/docling-project/docling-serve/commit/c76daac70c87da412f791666881e48b74688b060))
+
+### Documentation
+
+* Simplify README and move details to docs ([#102](https://github.com/docling-project/docling-serve/issues/102)) ([`fd8e40a`](https://github.com/docling-project/docling-serve/commit/fd8e40a00849771263d9b75b9a56f6caeccb8517))
+
+## [v0.5.1](https://github.com/docling-project/docling-serve/releases/tag/v0.5.1) - 2025-03-10
+
+### Fix
+
+* Submodules in wheels ([#85](https://github.com/docling-project/docling-serve/issues/85)) ([`a92ad48`](https://github.com/docling-project/docling-serve/commit/a92ad48b287bfcb134011dc0fc3f91ee04e067ee))
+
+## [v0.5.0](https://github.com/docling-project/docling-serve/releases/tag/v0.5.0) - 2025-03-07
+
+### Feature
+
+* Async api ([#60](https://github.com/docling-project/docling-serve/issues/60)) ([`82f8900`](https://github.com/docling-project/docling-serve/commit/82f890019745859699c1b01f9ccfb64cb7e37906))
+* Display version in fastapi docs ([#78](https://github.com/docling-project/docling-serve/issues/78)) ([`ed851c9`](https://github.com/docling-project/docling-serve/commit/ed851c95fee5f59305ddc3dcd5c09efce618470b))
+
+### Fix
+
+* Remove uv from image, merge ARG and ENV declarations ([#57](https://github.com/docling-project/docling-serve/issues/57)) ([`c95db36`](https://github.com/docling-project/docling-serve/commit/c95db3643807a4dfb96d93c8e10d6eb486c49a30))
+* **docs:** Remove comma in convert/source curl example ([#73](https://github.com/docling-project/docling-serve/issues/73)) ([`05df073`](https://github.com/docling-project/docling-serve/commit/05df0735d35a589bdc2a11fcdd764a10f700cb6f))
+
+## [v0.4.0](https://github.com/docling-project/docling-serve/releases/tag/v0.4.0) - 2025-02-26
+
+### Feature
+
+* New container images ([#68](https://github.com/docling-project/docling-serve/issues/68)) ([`7e6d9cd`](https://github.com/docling-project/docling-serve/commit/7e6d9cdef398df70a5b4d626aeb523c428c10d56))
+* Render DoclingDocument with npm docling-components in the example UI ([#65](https://github.com/docling-project/docling-serve/issues/65)) ([`c430d9b`](https://github.com/docling-project/docling-serve/commit/c430d9b1a162ab29104d86ebaa1ac5a5488b1f09))
+
+## [v0.3.0](https://github.com/docling-project/docling-serve/releases/tag/v0.3.0) - 2025-02-19
+
+### Feature
+
+* Add new docling-serve cli ([#50](https://github.com/docling-project/docling-serve/issues/50)) ([`ec33a61`](https://github.com/docling-project/docling-serve/commit/ec33a61faa7846b9b7998fbf557ebe39a3b800f6))
+
+### Fix
+
+* Set DOCLING_SERVE_ARTIFACTS_PATH in images ([#53](https://github.com/docling-project/docling-serve/issues/53)) ([`4877248`](https://github.com/docling-project/docling-serve/commit/487724836896576ca4f98e84abf15fd1c383bec8))
+* Set root UI path when behind proxy ([#38](https://github.com/docling-project/docling-serve/issues/38)) ([`c64a450`](https://github.com/docling-project/docling-serve/commit/c64a450bf9ba9947ab180e92bef2763ff710b210))
+* Support python 3.13 and docling updates and switch to uv ([#48](https://github.com/docling-project/docling-serve/issues/48)) ([`ae3b490`](https://github.com/docling-project/docling-serve/commit/ae3b4906f1c0829b1331ea491f3518741cabff71))
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -3,13 +3,13 @@
 Our project welcomes external contributions. If you have an itch, please feel
 free to scratch it.

-To contribute code or documentation, please submit a [pull request](https://github.com/DS4SD/docling-serve/pulls).
+To contribute code or documentation, please submit a [pull request](https://github.com/docling-project/docling-serve/pulls).

 A good way to familiarize yourself with the codebase and contribution process is
-to look for and tackle low-hanging fruit in the [issue tracker](https://github.com/DS4SD/docling-serve/issues).
+to look for and tackle low-hanging fruit in the [issue tracker](https://github.com/docling-project/docling-serve/issues).
 Before embarking on a more ambitious contribution, please quickly [get in touch](#communication) with us.

-For general questions or support requests, please refer to the [discussion section](https://github.com/DS4SD/docling-serve/discussions).
+For general questions or support requests, please refer to the [discussion section](https://github.com/docling-project/docling-serve/discussions).

 **Note: We appreciate your effort, and want to avoid a situation where a contribution
 requires extensive rework (by you or by us), sits in backlog for a long time, or
@@ -17,14 +17,14 @@ cannot be accepted at all!**

 ### Proposing new features

-If you would like to implement a new feature, please [raise an issue](https://github.com/DS4SD/docling-serve/issues)
+If you would like to implement a new feature, please [raise an issue](https://github.com/docling-project/docling-serve/issues)
 before sending a pull request so the feature can be discussed. This is to avoid
 you wasting your valuable time working on a feature that the project developers
 are not interested in accepting into the code base.

 ### Fixing bugs

-If you would like to fix a bug, please [raise an issue](https://github.com/DS4SD/docling-serve/issues) before sending a
+If you would like to fix a bug, please [raise an issue](https://github.com/docling-project/docling-serve/issues) before sending a
 pull request so it can be tracked.

 ### Merge approval
@@ -73,7 +73,7 @@ git commit -s

 ## Communication

-Please feel free to connect with us using the [discussion section](https://github.com/DS4SD/docling-serve/discussions).
+Please feel free to connect with us using the [discussion section](https://github.com/docling-project/docling-serve/discussions).

 ## Developing

--- a/Containerfile
+++ b/Containerfile
@@ -2,9 +2,6 @@ ARG BASE_IMAGE=quay.io/sclorg/python-312-c9s:c9s

 FROM ${BASE_IMAGE}

-ARG MODELS_LIST="layout tableformer picture_classifier easyocr" \
-    UV_SYNC_EXTRA_ARGS=""
-
 USER 0

 ###################################################################################################
@@ -20,6 +17,8 @@ RUN --mount=type=bind,source=os-packages.txt,target=/tmp/os-packages.txt \
    dnf -y clean all && \
    rm -rf /var/cache/dnf

+RUN /usr/bin/fix-permissions /opt/app-root/src/.cache
+
 ENV TESSDATA_PREFIX=/usr/share/tesseract/tessdata/

 ###################################################################################################
@@ -41,25 +40,29 @@ ENV \
    UV_PROJECT_ENVIRONMENT=/opt/app-root \
    DOCLING_SERVE_ARTIFACTS_PATH=/opt/app-root/src/.cache/docling/models

+ARG UV_SYNC_EXTRA_ARGS=""
+
 RUN --mount=from=ghcr.io/astral-sh/uv:0.6.1,source=/uv,target=/bin/uv \
    --mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
    --mount=type=bind,source=uv.lock,target=uv.lock \
    --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
-    uv sync --frozen --no-install-project --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
+    umask 002 && uv sync --frozen --no-install-project --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
+
+ARG MODELS_LIST="layout tableformer picture_classifier easyocr"

 RUN echo "Downloading models..." && \
    HF_HUB_DOWNLOAD_TIMEOUT="90" \
    HF_HUB_ETAG_TIMEOUT="90" \
    docling-tools models download -o "${DOCLING_SERVE_ARTIFACTS_PATH}" ${MODELS_LIST} && \
-    chown -R 1001:0 /opt/app-root/src/.cache && \
-    chmod -R g=u /opt/app-root/src/.cache
+    chown -R 1001:0 ${DOCLING_SERVE_ARTIFACTS_PATH} && \
+    chmod -R g=u ${DOCLING_SERVE_ARTIFACTS_PATH}

 COPY --chown=1001:0 ./docling_serve ./docling_serve
 RUN --mount=from=ghcr.io/astral-sh/uv:0.6.1,source=/uv,target=/bin/uv \
    --mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
    --mount=type=bind,source=uv.lock,target=uv.lock \
    --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
-    uv sync --frozen --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
+    umask 002 && uv sync --frozen --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}

 EXPOSE 5001

--- a/Makefile
+++ b/Makefile
@@ -17,6 +17,7 @@ else
 endif

 TAG=$(shell git rev-parse HEAD)
+BRANCH_TAG=$(shell git rev-parse --abbrev-ref HEAD)

 action-lint-file:
 	$(CMD_PREFIX) touch .action-lint
@@ -27,23 +28,23 @@ md-lint-file:
 .PHONY: docling-serve-image
 docling-serve-image: Containerfile
 	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve]"
-	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu" -f Containerfile -t ghcr.io/ds4sd/docling-serve:$(TAG) .
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) ghcr.io/ds4sd/docling-serve:main
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) quay.io/ds4sd/docling-serve:main
+	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu" -f Containerfile -t ghcr.io/docling-project/docling-serve:$(TAG) .
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve:$(TAG) ghcr.io/docling-project/docling-serve:$(BRANCH_TAG)
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve:$(TAG) quay.io/docling-project/docling-serve:$(BRANCH_TAG)

 .PHONY: docling-serve-cpu-image
 docling-serve-cpu-image: Containerfile ## Build docling-serve "cpu only" container image
 	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve CPU]"
-	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124" -f Containerfile -t ghcr.io/ds4sd/docling-serve-cpu:$(TAG) .
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cpu:$(TAG) ghcr.io/ds4sd/docling-serve-cpu:main
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cpu:$(TAG) quay.io/ds4sd/docling-serve-cpu:main
+	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124" -f Containerfile -t ghcr.io/docling-project/docling-serve-cpu:$(TAG) .
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cpu:$(TAG) ghcr.io/docling-project/docling-serve-cpu:$(BRANCH_TAG)
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cpu:$(TAG) quay.io/docling-project/docling-serve-cpu:$(BRANCH_TAG)

 .PHONY: docling-serve-cu124-image
 docling-serve-cu124-image: Containerfile ## Build docling-serve container image with GPU support
 	$(ECHO_PREFIX) printf "  %-12s Containerfile\n" "[docling-serve with Cuda 12.4]"
-	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cpu" -f Containerfile --platform linux/amd64 -t ghcr.io/ds4sd/docling-serve-cu124:$(TAG) .
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cu124:$(TAG) ghcr.io/ds4sd/docling-serve-cu124:main
-	$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cu124:$(TAG) quay.io/ds4sd/docling-serve-cu124:main
+	$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cpu" -f Containerfile --platform linux/amd64 -t ghcr.io/docling-project/docling-serve-cu124:$(TAG) .
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cu124:$(TAG) ghcr.io/docling-project/docling-serve-cu124:$(BRANCH_TAG)
+	$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cu124:$(TAG) quay.io/docling-project/docling-serve-cu124:$(BRANCH_TAG)

 .PHONY: action-lint
 action-lint: .action-lint ##      Lint GitHub Action workflows
@@ -66,7 +67,7 @@ action-lint: .action-lint ##      Lint GitHub Action workflows
 md-lint: .md-lint ##      Lint markdown files
 .md-lint: $(wildcard */**/*.md) | md-lint-file
 	$(ECHO_PREFIX) printf "  %-12s ./...\n" "[MD LINT]"
-	$(CMD_PREFIX) docker run --rm -v $$(pwd):/workdir davidanson/markdownlint-cli2:v0.14.0 "**/*.md"
+	$(CMD_PREFIX) docker run --rm -v $$(pwd):/workdir davidanson/markdownlint-cli2:v0.16.0 "**/*.md" "#.venv"
 	$(CMD_PREFIX) touch $@

 .PHONY: py-Lint
@@ -84,11 +85,11 @@ run-docling-cpu: ## Run the docling-serve container with CPU support and assign
 	$(ECHO_PREFIX) printf "  %-12s Removing existing container if it exists...\n" "[CLEANUP]"
 	$(CMD_PREFIX) docker rm -f docling-serve-cpu 2>/dev/null || true
 	$(ECHO_PREFIX) printf "  %-12s Running docling-serve container with CPU support on port 5001...\n" "[RUN CPU]"
-	$(CMD_PREFIX) docker run -it --name docling-serve-cpu -p 5001:5001 ghcr.io/ds4sd/docling-serve-cpu:main
+	$(CMD_PREFIX) docker run -it --name docling-serve-cpu -p 5001:5001 ghcr.io/docling-project/docling-serve-cpu:main

 .PHONY: run-docling-gpu
 run-docling-gpu: ## Run the docling-serve container with GPU support and assign a container name
 	$(ECHO_PREFIX) printf "  %-12s Removing existing container if it exists...\n" "[CLEANUP]"
 	$(CMD_PREFIX) docker rm -f docling-serve-gpu 2>/dev/null || true
 	$(ECHO_PREFIX) printf "  %-12s Running docling-serve container with GPU support on port 5001...\n" "[RUN GPU]"
-	$(CMD_PREFIX) docker run -it --name docling-serve-gpu -p 5001:5001 ghcr.io/ds4sd/docling-serve:main
+	$(CMD_PREFIX) docker run -it --name docling-serve-gpu -p 5001:5001 ghcr.io/docling-project/docling-serve:main
--- a/README.md
+++ b/README.md
@@ -1,431 +1,84 @@
+<p align="center">
+  <a href="https://github.com/docling-project/docling-serve">
+    <img loading="lazy" alt="Docling" src="https://github.com/docling-project/docling-serve/raw/main/docs/assets/docling-serve-pic.png" width="30%"/>
+  </a>
+</p>
+
 # Docling Serve

- Running [Docling](https://github.com/DS4SD/docling) as an API service.
+Running [Docling](https://github.com/docling-project/docling) as an API service.

-## Usage
+## Getting started

-The API provides two endpoints: one for urls, one for files. This is necessary to send files directly in binary format instead of base64-encoded strings.
+Install the `docling-serve` package and run the server.

-### Common parameters
+```bash
+# Using the python package
+pip install "docling-serve"
+docling-serve run

-On top of the source of file (see below), both endpoints support the same parameters, which are almost the same as the Docling CLI.
-
- `from_format` (List[str]): Input format(s) to convert from. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`. Defaults to all formats.
- `to_formats` (List[str]): Output format(s) to convert to. Allowed values: `md`, `json`, `html`, `text`, `doctags`. Defaults to `md`.
- `do_ocr` (bool): If enabled, the bitmap content will be processed using OCR. Defaults to `True`.
- `image_export_mode`: Image export mode for the document (only in case of JSON, Markdown or HTML). Allowed values: embedded, placeholder, referenced. Optional, defaults to `embedded`.
- `force_ocr` (bool): If enabled, replace any existing text with OCR-generated text over the full content. Defaults to `False`.
- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesseract_cli`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`.
- `ocr_lang` (List[str]): List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. Defaults to empty.
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`. Defaults to `dlparse_v2`.
- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
- `return_as_file` (boo): If enabled, return the output as a file. Defaults to false.
- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
- `include_images` (bool): If enabled, images will be extracted from the document. Defaults to true.
- `images_scale` (float): Scale factor for images. Defaults to 2.0.
-
-### URL endpoint
-
-The endpoint is `/v1alpha/convert/source`, listening for POST requests of JSON payloads.
-
-On top of the above parameters, you must send the URL(s) of the document you want process with either the `http_sources` or `file_sources` fields.
-The first is fetching URL(s) (optionally using with extra headers), the second allows to provide documents as base64-encoded strings.
-No `options` is required, they can be partially or completely omitted.
-
-Simple payload example:
-
-```json
-{
-  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
-}
+# Using container images, e.g. with Podman
+podman run -p 5001:5001 quay.io/docling-project/docling-serve
 ```

-<details>
+The server is available at

-<summary>Complete payload example:</summary>
+- API <http://127.0.0.1:5001>
+- API documentation <http://127.0.0.1:5001/docs>
+  ![swagger.png](img/swagger.png)

-```json
-{
-  "options": {
-    "from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
-    "to_formats": ["md", "json", "html", "text", "doctags"],
-    "image_export_mode": "placeholder",
-    "do_ocr": true,
-    "force_ocr": false,
-    "ocr_engine": "easyocr",
-    "ocr_lang": ["en"],
-    "pdf_backend": "dlparse_v2",
-    "table_mode": "fast",
-    "abort_on_error": false,
-    "return_as_file": false,
-  },
-  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
-}
-```
+Try it out with a simple conversion:

-</details>
-
-<details>
-
-<summary>CURL example:</summary>
-
-```sh
+```bash
 curl -X 'POST' \
  'http://localhost:5001/v1alpha/convert/source' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
-  "options": {
-    "from_formats": [
-      "docx",
-      "pptx",
-      "html",
-      "image",
-      "pdf",
-      "asciidoc",
-      "md",
-      "xlsx"
-    ],
-    "to_formats": ["md", "json", "html", "text", "doctags"],
-    "image_export_mode": "placeholder",
-    "do_ocr": true,
-    "force_ocr": false,
-    "ocr_engine": "easyocr",
-    "ocr_lang": [
-      "fr",
-      "de",
-      "es",
-      "en"
-    ],
-    "pdf_backend": "dlparse_v2",
-    "table_mode": "fast",
-    "abort_on_error": false,
-    "return_as_file": false,
-    "do_table_structure": true,
-    "include_images": true,
-    "images_scale": 2
-  },
-  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
-}'
+    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
+  }'
 ```

-</details>
+### Container images

-<details>
-<summary>Python example:</summary>
+Available container images:

-```python
-import httpx
+| Name | Description | Arch | Size |
+| -----|-------------|------|------|
+| [`ghcr.io/docling-project/docling-serve`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve) <br /> [`quay.io/docling-project/docling-serve`](https://quay.io/repository/docling-project/docling-serve) | Simple image for Docling Serve, installing all packages from the official pypi.org index. | `linux/amd64`, `linux/arm64` | 3.6 GB |
+| [`ghcr.io/docling-project/docling-serve-cpu`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cpu) <br /> [`quay.io/docling-project/docling-serve-cpu`](https://quay.io/repository/docling-project/docling-serve-cpu) | Cpu-only image which installs `torch` from the pytorch cpu index. | `linux/amd64`, `linux/arm64` | 3.6 GB |
+| [`ghcr.io/docling-project/docling-serve-cu124`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cu124) <br /> [`quay.io/docling-project/docling-serve-cu124`](https://quay.io/repository/docling-project/docling-serve-cu124) | Cuda 12.4 image which installs `torch` from the pytorch cu124 index. | `linux/amd64` | 8.7 GB |

-async_client = httpx.AsyncClient(timeout=60.0)
-url = "http://localhost:5001/v1alpha/convert/source"
-payload = {
-  "options": {
-    "from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
-    "to_formats": ["md", "json", "html", "text", "doctags"],
-    "image_export_mode": "placeholder",
-    "do_ocr": True,
-    "force_ocr": False,
-    "ocr_engine": "easyocr",
-    "ocr_lang": "en",
-    "pdf_backend": "dlparse_v2",
-    "table_mode": "fast",
-    "abort_on_error": False,
-    "return_as_file": False,
-  },
-  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
-}
+Coming soon: `docling-serve-slim` images will reduce the size by skipping the model weights download.

-response = await async_client_client.post(url, json=payload)
-
-data = response.json()
-```
-
-</details>
-
-#### File as base64
-
-The `file_sources` argument in the endpoint allows to send files as base64-encoded strings.
-When your PDF or other file type is too large, encoding it and passing it inline to curl
-can lead to an “Argument list too long” error on some systems. To avoid this, we write
-the JSON request body to a file and have curl read from that file.
-
-<details>
-<summary>CURL steps:</summary>
-
-```sh
-# 1. Base64-encode the file
-B64_DATA=$(base64 -w 0 /path/to/file/pdf-to-convert.pdf)
-
-# 2. Build the JSON with your options
-cat <<EOF > /tmp/request_body.json
-{
-  "options": {
-  },
-  "file_sources": [{
-    "base64_string": "${B64_DATA}",
-    "filename": "pdf-to-convert.pdf"
-  }]
-}
-EOF
-
-# 3. POST the request to the docling service
-curl -X POST "localhost:5001/v1alpha/convert/source" \
-     -H "Content-Type: application/json" \
-     -d @/tmp/request_body.json
-```
-
-</details>
-
-### File endpoint
-
-The endpoint is: `/v1alpha/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as multipart/form data). You can send one or multiple files.
-
-<details>
-<summary>CURL example:</summary>
-
-```sh
-curl -X 'POST' \
-  'http://127.0.0.1:5001/v1alpha/convert/file' \
-  -H 'accept: application/json' \
-  -H 'Content-Type: multipart/form-data' \
-  -F 'ocr_engine=easyocr' \
-  -F 'pdf_backend=dlparse_v2' \
-  -F 'from_formats=pdf' \
-  -F 'from_formats=docx' \
-  -F 'force_ocr=false' \
-  -F 'image_export_mode=embedded' \
-  -F 'ocr_lang=en' \
-  -F 'ocr_lang=pl' \
-  -F 'table_mode=fast' \
-  -F 'files=@2206.01062v1.pdf;type=application/pdf' \
-  -F 'abort_on_error=false' \
-  -F 'to_formats=md' \
-  -F 'to_formats=text' \
-  -F 'return_as_file=false' \
-  -F 'do_ocr=true'
-```
-
-</details>
-
-<details>
-<summary>Python example:</summary>
-
-```python
-import httpx
-
-async_client = httpx.AsyncClient(timeout=60.0)
-url = "http://localhost:5001/v1alpha/convert/file"
-parameters = {
-"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
-"to_formats": ["md", "json", "html", "text", "doctags"],
-"image_export_mode": "placeholder",
-"do_ocr": True,
-"force_ocr": False,
-"ocr_engine": "easyocr",
-"ocr_lang": ["en"],
-"pdf_backend": "dlparse_v2",
-"table_mode": "fast",
-"abort_on_error": False,
-"return_as_file": False
-}
-
-current_dir = os.path.dirname(__file__)
-file_path = os.path.join(current_dir, '2206.01062v1.pdf')
-
-files = {
-    'files': ('2206.01062v1.pdf', open(file_path, 'rb'), 'application/pdf'),
-}
-
-response = await async_client.post(url, files=files, data={"parameters": json.dumps(parameters)})
-assert response.status_code == 200, "Response should be 200 OK"
-
-data = response.json()
-```
-
-</details>
-
-### Response format
-
-The response can be a JSON Document or a File.
-
- If you process only one file, the response will be a JSON document with the following format:
-
-  ```jsonc
-  {
-    "document": {
-      "md_content": "",
-      "json_content": {},
-      "html_content": "",
-      "text_content": "",
-      "doctags_content": ""
-      },
-    "status": "<success|partial_success|skipped|failure>",
-    "processing_time": 0.0,
-    "timings": {},
-    "errors": []
-  }
-  ```
-
-  Depending on the value you set in `output_formats`, the different items will be populated with their respective results or empty.
-
-  `processing_time` is the Docling processing time in seconds, and `timings` (when enabled in the backend) provides the detailed
-  timing of all the internal Docling components.
-
- If you set the parameter `return_as_file` to True, the response will be a zip file.
- If multiple files are generated (multiple inputs, or one input but multiple outputs with `return_as_file` True), the response will be a zip file.
-
-## Run docling-serve
-
-Clone the repository and run the following from within the cloned directory root.
+### Demonstration UI

 ```bash
-python -m venv venv
-source venv/bin/activate
+# Install the Python package with the extra dependencies
 pip install "docling-serve[ui]"
 docling-serve run --enable-ui
+
+# Run the container image with the extra env parameters
+podman run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
 ```

-## Helpers
-
- A full Swagger UI is available at the `/docs` endpoint.
-
-![swagger.png](img/swagger.png)
-
- An easy to use UI is available at the `/ui` endpoint.
+An easy-to-use UI is available at the `/ui` endpoint.

 ![ui-input.png](img/ui-input.png)

 ![ui-output.png](img/ui-output.png)

-## Development
+## Documentation and advanced usage

-### CPU only
-
-```sh
-# Install uv if not already available
-curl -LsSf https://astral.sh/uv/install.sh | sh
-
-# Install dependencies
-uv sync --extra cpu
-```
-
-### Cuda GPU
-
-For GPU support use the following command:
-
-```sh
-# Install dependencies
-uv sync
-```
-
-### Gradio UI and different OCR backends
-
-`/ui` endpoint using `gradio` and different OCR backends can be enabled via package extras:
-
-```sh
-# Enable ui and rapidocr
-uv sync --extra ui --extra rapidocr
-```
-
-```sh
-# Enable tesserocr
-uv sync --extra tesserocr
-```
-
-See `[project.optional-dependencies]` section in `pyproject.toml` for full list of options and runtime options with `uv run docling-serve --help`.
-
-### Run the server
-
-The `docling-serve` executable is a convenient script for launching the webserver both in
-development and production mode.
-
-```sh
-# Run the server in development mode
-# - reload is enabled by default
-# - listening on the 127.0.0.1 address
-# - ui is enabled by default
-docling-serve dev
-
-# Run the server in production mode
-# - reload is disabled by default
-# - listening on the 0.0.0.0 address
-# - ui is disabled by default
-docling-serve run
-```
-
-### Options
-
-The `docling-serve` executable allows is controlled with both command line
-options and environment variables.
-
-<details>
-<summary>`docling-serve` help message</summary>
-
-```sh
-$ docling-serve dev --help
-                                                                                                              
- Usage: docling-serve dev [OPTIONS]                                                                           
-                                                                                                              
- Run a Docling Serve app in development mode. 🧪                                                              
- This is equivalent to docling-serve run but with reload                                                      
- enabled and listening on the 127.0.0.1 address.                                                              
-                                                                                                              
- Options can be set also with the corresponding ENV variable, with the exception                              
- of --enable-ui, --host and --reload.                                                                         
-                                                                                                              
-╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────╮
-│ --host                                   TEXT     The host to serve on. For local development in localhost │
-│                                                   use 127.0.0.1. To enable public access, e.g. in a        │
-│                                                   container, use all the IP addresses available with       │
-│                                                   0.0.0.0.                                                 │
-│                                                   [default: 127.0.0.1]                                     │
-│ --port                                   INTEGER  The port to serve on. [default: 5001]                    │
-│ --reload           --no-reload                    Enable auto-reload of the server when (code) files       │
-│                                                   change. This is resource intensive, use it only during   │
-│                                                   development.                                             │
-│                                                   [default: reload]                                        │
-│ --root-path                              TEXT     The root path is used to tell your app that it is being  │
-│                                                   served to the outside world with some path prefix set up │
-│                                                   in some termination proxy or similar.                    │
-│ --proxy-headers    --no-proxy-headers             Enable/Disable X-Forwarded-Proto, X-Forwarded-For,       │
-│                                                   X-Forwarded-Port to populate remote address info.        │
-│                                                   [default: proxy-headers]                                 │
-│ --artifacts-path                          PATH     If set to a valid directory, the model weights will be  │
-│                                                    loaded from this path.                                  │
-│                                                    [default: None]                                         │
-│ --enable-ui        --no-enable-ui                 Enable the development UI. [default: enable-ui]          │
-│ --help                                            Show this message and exit.                              │
-╰────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
-```
-
-</details>
-
-#### Environment variables
-
-The environment variables controlling the `uvicorn` execution can be specified with the `UVICORN_` prefix:
-
- `UVICORN_WORKERS`: Number of workers to use.
- `UVICORN_RELOAD`: If `True`, this will enable auto-reload when you modify files, useful for development.
-
-The environment variables controlling specifics of the Docling Serve app can be specified with the
-`DOCLING_SERVE_` prefix:
-
- `DOCLING_SERVE_ARTIFACTS_PATH`: if set Docling will use only the local weights of models, for example `/opt/app-root/src/.cache/docling/models`.
- `DOCLING_SERVE_ENABLE_UI`: If `True`, The Gradio UI will be available at `/ui`.
-
-Others:
-
- `TESSDATA_PREFIX`: Tesseract data location, example `/usr/share/tesseract/tessdata/`.
+Visit the [Docling Serve documentation](./docs/README.md) to learn how to [configure the webserver](./docs/configuration.md), use all the [runtime options](./docs/usage.md) of the API, and explore the [deployment examples](./docs/deployment.md).

 ## Get help and support

-Please feel free to connect with us using the [discussion section](https://github.com/DS4SD/docling/discussions).
+Please feel free to connect with us using the [discussion section](https://github.com/docling-project/docling/discussions).

 ## Contributing

-Please read [Contributing to Docling Serve](https://github.com/DS4SD/docling-serve/blob/main/CONTRIBUTING.md) for details.
+Please read [Contributing to Docling Serve](https://github.com/docling-project/docling-serve/blob/main/CONTRIBUTING.md) for details.

 ## References

@@ -433,14 +86,14 @@ If you use Docling in your projects, please consider citing the following:

 ```bib
 @techreport{Docling,
-  author = {Deep Search Team},
-  month = {8},
-  title = {Docling Technical Report},
-  url = {https://arxiv.org/abs/2408.09869},
-  eprint = {2408.09869},
-  doi = {10.48550/arXiv.2408.09869},
-  version = {1.0.0},
-  year = {2024}
+  author = {Docling Contributors},
+  month = {1},
+  title = {Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion},
+  url = {https://arxiv.org/abs/2501.17887},
+  eprint = {2501.17887},
+  doi = {10.48550/arXiv.2501.17887},
+  version = {2.0.0},
+  year = {2025}
 }
 ```

--- a/docling_serve/main.py
+++ b/docling_serve/main.py
@@ -74,12 +74,44 @@ def callback(
 def _run(
    *,
    command: str,
+    # Docling serve parameters
+    artifacts_path: Path | None,
+    enable_ui: bool,
 ) -> None:
    server_type = "development" if command == "dev" else "production"

    console.print(f"Starting {server_type} server 🚀")

-    url = f"http://{uvicorn_settings.host}:{uvicorn_settings.port}"
+    run_subprocess = (
+        uvicorn_settings.workers is not None and uvicorn_settings.workers > 1
+    ) or uvicorn_settings.reload
+
+    run_ssl = (
+        uvicorn_settings.ssl_certfile is not None
+        and uvicorn_settings.ssl_keyfile is not None
+    )
+
+    if run_subprocess and docling_serve_settings.artifacts_path != artifacts_path:
+        err_console.print(
+            "\n[yellow]:warning: The server will run with reload or multiple workers. \n"
+            "The argument [bold]--artifacts-path[/bold] will be ignored, please set the value \n"
+            "using the environment variable [bold]DOCLING_SERVE_ARTIFACTS_PATH[/bold].[/yellow]"
+        )
+
+    if run_subprocess and docling_serve_settings.enable_ui != enable_ui:
+        err_console.print(
+            "\n[yellow]:warning: The server will run with reload or multiple workers. \n"
+            "The argument [bold]--enable-ui[/bold] will be ignored, please set the value \n"
+            "using the environment variable [bold]DOCLING_SERVE_ENABLE_UI[/bold].[/yellow]"
+        )
+
+    # Propagate the settings to the app settings
+    docling_serve_settings.artifacts_path = artifacts_path
+    docling_serve_settings.enable_ui = enable_ui
+
+    # Print documentation
+    protocol = "https" if run_ssl else "http"
+    url = f"{protocol}://{uvicorn_settings.host}:{uvicorn_settings.port}"
    url_docs = f"{url}/docs"
    url_ui = f"{url}/ui"

@@ -99,6 +131,7 @@ def _run(
    console.print("")
    console.print("Logs:")

+    # Launch the server
    uvicorn.run(
        app="docling_serve.app:create_app",
        factory=True,
@@ -108,6 +141,10 @@ def _run(
        workers=uvicorn_settings.workers,
        root_path=uvicorn_settings.root_path,
        proxy_headers=uvicorn_settings.proxy_headers,
+        timeout_keep_alive=uvicorn_settings.timeout_keep_alive,
+        ssl_certfile=uvicorn_settings.ssl_certfile,
+        ssl_keyfile=uvicorn_settings.ssl_keyfile,
+        ssl_keyfile_password=uvicorn_settings.ssl_keyfile_password,
    )


@@ -159,6 +196,18 @@ def dev(
            )
        ),
    ] = uvicorn_settings.proxy_headers,
+    timeout_keep_alive: Annotated[
+        int, typer.Option(help="Timeout in seconds for keep-alive connections.")
+    ] = uvicorn_settings.timeout_keep_alive,
+    ssl_certfile: Annotated[
+        Optional[Path], typer.Option(help="SSL certificate file")
+    ] = uvicorn_settings.ssl_certfile,
+    ssl_keyfile: Annotated[
+        Optional[Path], typer.Option(help="SSL key file")
+    ] = uvicorn_settings.ssl_keyfile,
+    ssl_keyfile_password: Annotated[
+        Optional[str], typer.Option(help="SSL keyfile password")
+    ] = uvicorn_settings.ssl_keyfile_password,
    # docling options
    artifacts_path: Annotated[
        Optional[Path],
@@ -186,12 +235,15 @@ def dev(
    uvicorn_settings.reload = reload
    uvicorn_settings.root_path = root_path
    uvicorn_settings.proxy_headers = proxy_headers
-
-    docling_serve_settings.artifacts_path = artifacts_path
-    docling_serve_settings.enable_ui = enable_ui
+    uvicorn_settings.timeout_keep_alive = timeout_keep_alive
+    uvicorn_settings.ssl_certfile = ssl_certfile
+    uvicorn_settings.ssl_keyfile = ssl_keyfile
+    uvicorn_settings.ssl_keyfile_password = ssl_keyfile_password

    _run(
        command="dev",
+        artifacts_path=artifacts_path,
+        enable_ui=enable_ui,
    )


@@ -251,6 +303,18 @@ def run(
            )
        ),
    ] = uvicorn_settings.proxy_headers,
+    timeout_keep_alive: Annotated[
+        int, typer.Option(help="Timeout in seconds for keep-alive connections.")
+    ] = uvicorn_settings.timeout_keep_alive,
+    ssl_certfile: Annotated[
+        Optional[Path], typer.Option(help="SSL certificate file")
+    ] = uvicorn_settings.ssl_certfile,
+    ssl_keyfile: Annotated[
+        Optional[Path], typer.Option(help="SSL key file")
+    ] = uvicorn_settings.ssl_keyfile,
+    ssl_keyfile_password: Annotated[
+        Optional[str], typer.Option(help="SSL keyfile password")
+    ] = uvicorn_settings.ssl_keyfile_password,
    # docling options
    artifacts_path: Annotated[
        Optional[Path],
@@ -281,12 +345,15 @@ def run(
    uvicorn_settings.workers = workers
    uvicorn_settings.root_path = root_path
    uvicorn_settings.proxy_headers = proxy_headers
-
-    docling_serve_settings.artifacts_path = artifacts_path
-    docling_serve_settings.enable_ui = enable_ui
+    uvicorn_settings.timeout_keep_alive = timeout_keep_alive
+    uvicorn_settings.ssl_certfile = ssl_certfile
+    uvicorn_settings.ssl_keyfile = ssl_keyfile
+    uvicorn_settings.ssl_keyfile_password = ssl_keyfile_password

    _run(
        command="run",
+        artifacts_path=artifacts_path,
+        enable_ui=enable_ui,
    )


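Reviewer note on the `main.py` changes above: `_run` now derives two flags from the uvicorn settings, one for whether the server runs in a subprocess (reload or more than one worker) and one for whether both SSL files are set, and uses the latter to pick the URL scheme. The sketch below reproduces only that decision logic in isolation. `UvicornSettingsSketch`, `runs_in_subprocess` and `server_url` are hypothetical names for illustration, not part of `docling_serve`.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class UvicornSettingsSketch:
    """Hypothetical stand-in for the `uvicorn_settings` object in the diff."""

    host: str = "0.0.0.0"
    port: int = 5001
    reload: bool = False
    workers: Optional[int] = None
    ssl_certfile: Optional[Path] = None
    ssl_keyfile: Optional[Path] = None


def runs_in_subprocess(settings: UvicornSettingsSketch) -> bool:
    # Mirrors `run_subprocess` in the diff: reload or multiple workers means the
    # app is created in another process, so CLI-only values do not reach it.
    return (settings.workers is not None and settings.workers > 1) or settings.reload


def server_url(settings: UvicornSettingsSketch) -> str:
    # Mirrors `run_ssl`: HTTPS is advertised only when both files are provided.
    run_ssl = settings.ssl_certfile is not None and settings.ssl_keyfile is not None
    protocol = "https" if run_ssl else "http"
    return f"{protocol}://{settings.host}:{settings.port}"
```

When `runs_in_subprocess` is true, the diff warns that `--artifacts-path` and `--enable-ui` are ignored and asks for `DOCLING_SERVE_ARTIFACTS_PATH` and `DOCLING_SERVE_ENABLE_UI` to be set instead.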
--- a/docling_serve/app.py
+++ b/docling_serve/app.py
@@ -18,10 +18,15 @@ from fastapi import (
    WebSocketDisconnect,
 )
 from fastapi.middleware.cors import CORSMiddleware
+from fastapi.openapi.docs import (
+    get_redoc_html,
+    get_swagger_ui_html,
+    get_swagger_ui_oauth2_redirect_html,
+)
 from fastapi.responses import RedirectResponse
+from fastapi.staticfiles import StaticFiles

-from docling.datamodel.base_models import DocumentStream, InputFormat
-from docling.document_converter import DocumentConverter
+from docling.datamodel.base_models import DocumentStream

 from docling_serve.datamodel.convert import ConvertDocumentsOptions
 from docling_serve.datamodel.requests import (
@@ -37,7 +42,7 @@ from docling_serve.datamodel.responses import (
 )
 from docling_serve.docling_conversion import (
    convert_documents,
-    converters,
+    get_converter,
    get_pdf_pipeline_opts,
 )
 from docling_serve.engines import get_orchestrator
@@ -86,15 +91,8 @@ _log = logging.getLogger(__name__)
 @asynccontextmanager
 async def lifespan(app: FastAPI):
    # Converter with default options
-    pdf_format_option, options_hash = get_pdf_pipeline_opts(ConvertDocumentsOptions())
-    converters[options_hash] = DocumentConverter(
-        format_options={
-            InputFormat.PDF: pdf_format_option,
-            InputFormat.IMAGE: pdf_format_option,
-        }
-    )
-
-    converters[options_hash].initialize_pipeline(InputFormat.PDF)
+    pdf_format_option = get_pdf_pipeline_opts(ConvertDocumentsOptions())
+    get_converter(pdf_format_option)

    orchestrator = get_orchestrator()

@@ -110,11 +108,6 @@ async def lifespan(app: FastAPI):
    except asyncio.CancelledError:
        _log.info("Queue processor cancelled.")

-    converters.clear()
-
-    # if WITH_UI:
-    #     gradio_ui.close()
-

 ##################################
 # App creation and configuration #
@@ -129,15 +122,25 @@ def create_app():  # noqa: C901

        version = "0.0.0"

+    offline_docs_assets = False
+    if (
+        docling_serve_settings.static_path is not None
+        and (docling_serve_settings.static_path).is_dir()
+    ):
+        offline_docs_assets = True
+        _log.info("Found static assets.")
+
    app = FastAPI(
        title="Docling Serve",
+        docs_url=None if offline_docs_assets else "/docs",
+        redoc_url=None if offline_docs_assets else "/redoc",
        lifespan=lifespan,
        version=version,
    )

-    origins = ["*"]
-    methods = ["*"]
-    headers = ["*"]
+    origins = docling_serve_settings.cors_origins
+    methods = docling_serve_settings.cors_methods
+    headers = docling_serve_settings.cors_headers

    app.add_middleware(
        CORSMiddleware,
@@ -170,6 +173,38 @@ def create_app():  # noqa: C901
                "or `pip install gradio`"
            )

+    #############################
+    # Offline assets definition #
+    #############################
+    if offline_docs_assets:
+        app.mount(
+            "/static",
+            StaticFiles(directory=docling_serve_settings.static_path),
+            name="static",
+        )
+
+        @app.get("/docs", include_in_schema=False)
+        async def custom_swagger_ui_html():
+            return get_swagger_ui_html(
+                openapi_url=app.openapi_url,
+                title=app.title + " - Swagger UI",
+                oauth2_redirect_url=app.swagger_ui_oauth2_redirect_url,
+                swagger_js_url="/static/swagger-ui-bundle.js",
+                swagger_css_url="/static/swagger-ui.css",
+            )
+
+        @app.get(app.swagger_ui_oauth2_redirect_url, include_in_schema=False)
+        async def swagger_ui_redirect():
+            return get_swagger_ui_oauth2_redirect_html()
+
+        @app.get("/redoc", include_in_schema=False)
+        async def redoc_html():
+            return get_redoc_html(
+                openapi_url=app.openapi_url,
+                title=app.title + " - ReDoc",
+                redoc_js_url="/static/redoc.standalone.js",
+            )
+
    #############################
    # API Endpoints definitions #
    #############################
@@ -177,9 +212,10 @@ def create_app():  # noqa: C901
    # Favicon
    @app.get("/favicon.ico", include_in_schema=False)
    async def favicon():
-        response = RedirectResponse(
-            url="https://ds4sd.github.io/docling/assets/logo.png"
-        )
+        logo_url = "https://raw.githubusercontent.com/docling-project/docling/refs/heads/main/docs/assets/logo.svg"
+        if offline_docs_assets:
+            logo_url = "/static/logo.svg"
+        response = RedirectResponse(url=logo_url)
        return response

    @app.get("/health")
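Reviewer note on the `app.py` changes above: when `static_path` points at an existing directory, the built-in docs routes are disabled and re-registered against files under `/static`, and the favicon redirect switches to the local logo. The sketch below captures only that switch as a plain function. `docs_asset_plan` and the dictionary keys are hypothetical; the URLs are the ones that appear in the diff.

```python
from pathlib import Path
from typing import Optional

REMOTE_LOGO = (
    "https://raw.githubusercontent.com/docling-project/docling/"
    "refs/heads/main/docs/assets/logo.svg"
)


def docs_asset_plan(static_path: Optional[Path]) -> dict:
    """Return which docs URL and assets would be used for a given static path."""
    # Same condition as `offline_docs_assets` in the diff.
    offline = static_path is not None and static_path.is_dir()
    return {
        "offline": offline,
        # FastAPI's automatic /docs route is turned off in offline mode because
        # a custom route serving local assets is registered instead.
        "builtin_docs_url": None if offline else "/docs",
        "swagger_js_url": "/static/swagger-ui-bundle.js" if offline else None,
        "swagger_css_url": "/static/swagger-ui.css" if offline else None,
        "logo_url": "/static/logo.svg" if offline else REMOTE_LOGO,
    }
```

A `None` value for the Swagger assets here only means "use FastAPI's defaults"; the real application never passes `None` to `get_swagger_ui_html`.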
--- a/docling_serve/datamodel/convert.py
+++ b/docling_serve/datamodel/convert.py
@@ -4,9 +4,27 @@ from typing import Annotated, Optional
 from pydantic import BaseModel, Field

 from docling.datamodel.base_models import InputFormat, OutputFormat
-from docling.datamodel.pipeline_options import OcrEngine, PdfBackend, TableFormerMode
+from docling.datamodel.pipeline_options import (
+    EasyOcrOptions,
+    PdfBackend,
+    PdfPipeline,
+    TableFormerMode,
+    TableStructureOptions,
+)
+from docling.datamodel.settings import (
+    DEFAULT_PAGE_RANGE,
+    PageRange,
+)
+from docling.models.factories import get_ocr_factory
 from docling_core.types.doc import ImageRefMode

+from docling_serve.settings import docling_serve_settings
+
+ocr_factory = get_ocr_factory(
+    allow_external_plugins=docling_serve_settings.allow_external_plugins
+)
+ocr_engines_enum = ocr_factory.get_enum()
+

 class ConvertDocumentsOptions(BaseModel):
    from_formats: Annotated[
@@ -69,18 +87,17 @@ class ConvertDocumentsOptions(BaseModel):
        ),
    ] = False

-    # TODO: use a restricted list based on what is installed on the system
-    ocr_engine: Annotated[
-        OcrEngine,
+    ocr_engine: Annotated[  # type: ignore
+        ocr_engines_enum,
        Field(
            description=(
                "The OCR engine to use. String. "
-                "Allowed values: easyocr, tesseract, rapidocr. "
+                f"Allowed values: {', '.join([v.value for v in ocr_engines_enum])}. "
                "Optional, defaults to easyocr."
            ),
-            examples=[OcrEngine.EASYOCR],
+            examples=[EasyOcrOptions.kind],
        ),
-    ] = OcrEngine.EASYOCR
+    ] = ocr_engines_enum(EasyOcrOptions.kind)  # type: ignore

    ocr_lang: Annotated[
        Optional[list[str]],
@@ -101,25 +118,46 @@ class ConvertDocumentsOptions(BaseModel):
            description=(
                "The PDF backend to use. String. "
                f"Allowed values: {', '.join([v.value for v in PdfBackend])}. "
-                f"Optional, defaults to {PdfBackend.DLPARSE_V2.value}."
+                f"Optional, defaults to {PdfBackend.DLPARSE_V4.value}."
            ),
-            examples=[PdfBackend.DLPARSE_V2],
+            examples=[PdfBackend.DLPARSE_V4],
        ),
-    ] = PdfBackend.DLPARSE_V2
+    ] = PdfBackend.DLPARSE_V4

    table_mode: Annotated[
        TableFormerMode,
        Field(
-            TableFormerMode.FAST,
            description=(
                "Mode to use for table structure, String. "
                f"Allowed values: {', '.join([v.value for v in TableFormerMode])}. "
                "Optional, defaults to fast."
            ),
-            examples=[TableFormerMode.FAST],
+            examples=[TableStructureOptions().mode],
            # pattern="fast|accurate",
        ),
-    ] = TableFormerMode.FAST
+    ] = TableStructureOptions().mode
+
+    pipeline: Annotated[
+        PdfPipeline,
+        Field(description="Choose the pipeline to process PDF or image files."),
+    ] = PdfPipeline.STANDARD
+
+    page_range: Annotated[
+        PageRange,
+        Field(
+            description="Only convert a range of pages. The page number starts at 1.",
+            examples=[(1, 4)],
+        ),
+    ] = DEFAULT_PAGE_RANGE
+
+    document_timeout: Annotated[
+        float,
+        Field(
+            description="The timeout for processing each document, in seconds.",
+            gt=0,
+            le=docling_serve_settings.max_document_timeout,
+        ),
+    ] = docling_serve_settings.max_document_timeout

    abort_on_error: Annotated[
        bool,
@@ -172,3 +210,47 @@ class ConvertDocumentsOptions(BaseModel):
            examples=[2.0],
        ),
    ] = 2.0
+
+    do_code_enrichment: Annotated[
+        bool,
+        Field(
+            description=(
+                "If enabled, perform OCR code enrichment. "
+                "Boolean. Optional, defaults to false."
+            ),
+            examples=[False],
+        ),
+    ] = False
+
+    do_formula_enrichment: Annotated[
+        bool,
+        Field(
+            description=(
+                "If enabled, perform formula OCR and return LaTeX code. "
+                "Boolean. Optional, defaults to false."
+            ),
+            examples=[False],
+        ),
+    ] = False
+
+    do_picture_classification: Annotated[
+        bool,
+        Field(
+            description=(
+                "If enabled, classify pictures in documents. "
+                "Boolean. Optional, defaults to false."
+            ),
+            examples=[False],
+        ),
+    ] = False
+
+    do_picture_description: Annotated[
+        bool,
+        Field(
+            description=(
+                "If enabled, describe pictures in documents. "
+                "Boolean. Optional, defaults to false."
+            ),
+            examples=[False],
+        ),
+    ] = False
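Reviewer note on the `convert.py` changes above: the new `pipeline`, `page_range`, `document_timeout` and `do_*` fields extend the request options. The sketch below builds an options dictionary using those field names and applies the same bounds the model declares for `document_timeout` (`gt=0`, `le=max_document_timeout`). Several things are assumptions: `build_convert_options` is a hypothetical helper, `"standard"` is assumed to be the string value of `PdfPipeline.STANDARD`, the `last >= first` check on the page range is not taken from the diff, and the maximum timeout passed in is whatever the server is configured with.

```python
import json
from typing import Any


def build_convert_options(
    page_range: tuple[int, int],
    document_timeout: float,
    max_document_timeout: float,
    pipeline: str = "standard",  # assumed string value of PdfPipeline.STANDARD
) -> dict[str, Any]:
    first, last = page_range
    # The field description states that page numbers start at 1.
    if first < 1 or last < first:
        raise ValueError(f"invalid page_range: {page_range}")
    # Same bounds as Field(gt=0, le=docling_serve_settings.max_document_timeout).
    if not (0 < document_timeout <= max_document_timeout):
        raise ValueError(f"document_timeout out of bounds: {document_timeout}")
    return {
        "pipeline": pipeline,
        "page_range": [first, last],
        "document_timeout": document_timeout,
        # The enrichment flags all default to false in the model.
        "do_code_enrichment": False,
        "do_formula_enrichment": False,
        "do_picture_classification": False,
        "do_picture_description": False,
    }


options = build_convert_options((1, 4), document_timeout=60.0, max_document_timeout=3600.0)
request_body = json.dumps({"options": options})
```

The resulting `request_body` would be posted as the JSON payload next to the document sources; the endpoint path and source fields are left out because this part of the diff does not define them.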
--- a/docling_serve/datamodel/engines.py
+++ b/docling_serve/datamodel/engines.py
@@ -1,10 +1,4 @@
 import enum
-from typing import Optional
-
-from pydantic import BaseModel
-
-from docling_serve.datamodel.requests import ConvertDocumentsRequest
-from docling_serve.datamodel.responses import ConvertDocumentResponse


 class TaskStatus(str, enum.Enum):
@@ -16,15 +10,3 @@ class TaskStatus(str, enum.Enum):

 class AsyncEngine(str, enum.Enum):
    LOCAL = "local"
-
-
-class Task(BaseModel):
-    task_id: str
-    task_status: TaskStatus = TaskStatus.PENDING
-    request: Optional[ConvertDocumentsRequest]
-    result: Optional[ConvertDocumentResponse] = None
-
-    def is_completed(self) -> bool:
-        if self.task_status in [TaskStatus.SUCCESS, TaskStatus.FAILURE]:
-            return True
-        return False
--- a/docling_serve/datamodel/task.py
+++ b/docling_serve/datamodel/task.py
@@ -0,0 +1,19 @@
+from typing import Optional
+
+from pydantic import BaseModel
+
+from docling_serve.datamodel.engines import TaskStatus
+from docling_serve.datamodel.requests import ConvertDocumentsRequest
+from docling_serve.datamodel.responses import ConvertDocumentResponse
+
+
+class Task(BaseModel):
+    task_id: str
+    task_status: TaskStatus = TaskStatus.PENDING
+    request: Optional[ConvertDocumentsRequest]
+    result: Optional[ConvertDocumentResponse] = None
+
+    def is_completed(self) -> bool:
+        if self.task_status in [TaskStatus.SUCCESS, TaskStatus.FAILURE]:
+            return True
+        return False
--- a/docling_serve/docling_conversion.py
+++ b/docling_serve/docling_conversion.py
@@ -1,7 +1,9 @@
 import hashlib
 import json
 import logging
+import sys
 from collections.abc import Iterable, Iterator
+from functools import lru_cache
 from pathlib import Path
 from typing import Any, Optional, Union

@@ -9,37 +11,35 @@ from fastapi import HTTPException

 from docling.backend.docling_parse_backend import DoclingParseDocumentBackend
 from docling.backend.docling_parse_v2_backend import DoclingParseV2DocumentBackend
+from docling.backend.docling_parse_v4_backend import DoclingParseV4DocumentBackend
 from docling.backend.pdf_backend import PdfDocumentBackend
 from docling.backend.pypdfium2_backend import PyPdfiumDocumentBackend
 from docling.datamodel.base_models import DocumentStream, InputFormat
 from docling.datamodel.document import ConversionResult
 from docling.datamodel.pipeline_options import (
-    EasyOcrOptions,
-    OcrEngine,
    OcrOptions,
    PdfBackend,
+    PdfPipeline,
    PdfPipelineOptions,
-    RapidOcrOptions,
    TableFormerMode,
-    TesseractOcrOptions,
+    VlmPipelineOptions,
+    smoldocling_vlm_conversion_options,
+    smoldocling_vlm_mlx_conversion_options,
 )
 from docling.document_converter import DocumentConverter, FormatOption, PdfFormatOption
+from docling.pipeline.vlm_pipeline import VlmPipeline
 from docling_core.types.doc import ImageRefMode

-from docling_serve.datamodel.convert import ConvertDocumentsOptions
+from docling_serve.datamodel.convert import ConvertDocumentsOptions, ocr_factory
 from docling_serve.helper_functions import _to_list_of_strings
 from docling_serve.settings import docling_serve_settings

 _log = logging.getLogger(__name__)


-# Document converters will be preloaded and stored in a dictionary
-converters: dict[bytes, DocumentConverter] = {}
-
-
 # Custom serializer for PdfFormatOption
 # (model_dump_json does not work with some classes)
-def _serialize_pdf_format_option(pdf_format_option: PdfFormatOption) -> str:
+def _hash_pdf_format_option(pdf_format_option: PdfFormatOption) -> bytes:
    data = pdf_format_option.model_dump()

    # pipeline_options are not fully serialized by model_dump, dedicated pass
@@ -64,51 +64,49 @@ def _serialize_pdf_format_option(pdf_format_option: PdfFormatOption) -> str:
        )

    # Serialize the dictionary to JSON with sorted keys to have consistent hashes
-    return json.dumps(data, sort_keys=True)
+    serialized_data = json.dumps(data, sort_keys=True)
+    options_hash = hashlib.sha1(serialized_data.encode()).digest()
+    return options_hash


-# Computes the PDF pipeline options and returns the PdfFormatOption and its hash
-def get_pdf_pipeline_opts(  # noqa: C901
-    request: ConvertDocumentsOptions,
-) -> tuple[PdfFormatOption, bytes]:
-    if request.ocr_engine == OcrEngine.EASYOCR:
-        try:
-            import easyocr  # noqa: F401
-        except ImportError:
-            raise HTTPException(
-                status_code=400,
-                detail="The requested OCR engine"
-                f" (ocr_engine={request.ocr_engine.value})"
-                " is not available on this system. Please choose another OCR engine "
-                "or contact your system administrator.",
-            )
-        ocr_options: OcrOptions = EasyOcrOptions(force_full_page_ocr=request.force_ocr)
-    elif request.ocr_engine == OcrEngine.TESSERACT:
-        try:
-            import tesserocr  # noqa: F401
-        except ImportError:
-            raise HTTPException(
-                status_code=400,
-                detail="The requested OCR engine"
-                f" (ocr_engine={request.ocr_engine.value})"
-                " is not available on this system. Please choose another OCR engine "
-                "or contact your system administrator.",
-            )
-        ocr_options = TesseractOcrOptions(force_full_page_ocr=request.force_ocr)
-    elif request.ocr_engine == OcrEngine.RAPIDOCR:
-        try:
-            from rapidocr_onnxruntime import RapidOCR  # noqa: F401
-        except ImportError:
-            raise HTTPException(
-                status_code=400,
-                detail="The requested OCR engine"
-                f" (ocr_engine={request.ocr_engine.value})"
-                " is not available on this system. Please choose another OCR engine "
-                "or contact your system administrator.",
-            )
-        ocr_options = RapidOcrOptions(force_full_page_ocr=request.force_ocr)
-    else:
-        raise RuntimeError(f"Unexpected OCR engine type {request.ocr_engine}")
+# Cache of DocumentConverter objects
+_options_map: dict[bytes, PdfFormatOption] = {}
+
+
+@lru_cache(maxsize=docling_serve_settings.options_cache_size)
+def _get_converter_from_hash(options_hash: bytes) -> DocumentConverter:
+    pdf_format_option = _options_map[options_hash]
+    format_options: dict[InputFormat, FormatOption] = {
+        InputFormat.PDF: pdf_format_option,
+        InputFormat.IMAGE: pdf_format_option,
+    }
+
+    return DocumentConverter(format_options=format_options)
+
+
+def get_converter(pdf_format_option: PdfFormatOption) -> DocumentConverter:
+    options_hash = _hash_pdf_format_option(pdf_format_option)
+    _options_map[options_hash] = pdf_format_option
+    return _get_converter_from_hash(options_hash)
+
+
+def _parse_standard_pdf_opts(
+    request: ConvertDocumentsOptions, artifacts_path: Optional[Path]
+) -> PdfPipelineOptions:
+    try:
+        ocr_options: OcrOptions = ocr_factory.create_options(
+            kind=request.ocr_engine.value,  # type: ignore
+            force_full_page_ocr=request.force_ocr,
+        )
+    except ImportError as err:
+        raise HTTPException(
+            status_code=400,
+            detail="The requested OCR engine"
+            f" (ocr_engine={request.ocr_engine.value})"  # type: ignore
+            " is not available on this system. Please choose another OCR engine "
+            "or contact your system administrator.\n"
+            f"{err}",
+        )

    if request.ocr_lang is not None:
        if isinstance(request.ocr_lang, str):
@@ -117,11 +115,16 @@ def get_pdf_pipeline_opts(  # noqa: C901
            ocr_options.lang = request.ocr_lang

    pipeline_options = PdfPipelineOptions(
+        artifacts_path=artifacts_path,
+        document_timeout=request.document_timeout,
        do_ocr=request.do_ocr,
        ocr_options=ocr_options,
        do_table_structure=request.do_table_structure,
+        do_code_enrichment=request.do_code_enrichment,
+        do_formula_enrichment=request.do_formula_enrichment,
+        do_picture_classification=request.do_picture_classification,
+        do_picture_description=request.do_picture_description,
    )
-    pipeline_options.table_structure_options.do_cell_matching = True  # do_cell_matching
    pipeline_options.table_structure_options.mode = TableFormerMode(request.table_mode)

    if request.image_export_mode != ImageRefMode.PLACEHOLDER:
@@ -129,50 +132,95 @@ def get_pdf_pipeline_opts(  # noqa: C901
        if request.images_scale:
            pipeline_options.images_scale = request.images_scale

+    return pipeline_options
+
+
+def _parse_backend(request: ConvertDocumentsOptions) -> type[PdfDocumentBackend]:
    if request.pdf_backend == PdfBackend.DLPARSE_V1:
        backend: type[PdfDocumentBackend] = DoclingParseDocumentBackend
    elif request.pdf_backend == PdfBackend.DLPARSE_V2:
        backend = DoclingParseV2DocumentBackend
+    elif request.pdf_backend == PdfBackend.DLPARSE_V4:
+        backend = DoclingParseV4DocumentBackend
    elif request.pdf_backend == PdfBackend.PYPDFIUM2:
        backend = PyPdfiumDocumentBackend
    else:
        raise RuntimeError(f"Unexpected PDF backend type {request.pdf_backend}")

+    return backend
+
+
+def _parse_vlm_pdf_opts(
+    request: ConvertDocumentsOptions, artifacts_path: Optional[Path]
+) -> VlmPipelineOptions:
+    pipeline_options = VlmPipelineOptions(
+        artifacts_path=artifacts_path,
+        document_timeout=request.document_timeout,
+    )
+    pipeline_options.vlm_options = smoldocling_vlm_conversion_options
+    if sys.platform == "darwin":
+        try:
+            import mlx_vlm  # noqa: F401
+
+            pipeline_options.vlm_options = smoldocling_vlm_mlx_conversion_options
+        except ImportError:
+            _log.warning(
+                "To run SmolDocling faster, please install mlx-vlm:\n"
+                "pip install mlx-vlm"
+            )
+    return pipeline_options
+
+
+# Computes the PDF pipeline options and returns the PdfFormatOption
+def get_pdf_pipeline_opts(
+    request: ConvertDocumentsOptions,
+) -> PdfFormatOption:
+    artifacts_path: Optional[Path] = None
    if docling_serve_settings.artifacts_path is not None:
        if str(docling_serve_settings.artifacts_path.absolute()) == "":
            _log.info(
                "artifacts_path is an empty path, model weights will be dowloaded "
                "at runtime."
            )
-            pipeline_options.artifacts_path = None
+            artifacts_path = None
        elif docling_serve_settings.artifacts_path.is_dir():
            _log.info(
                "artifacts_path is set to a valid directory. "
                "No model weights will be downloaded at runtime."
            )
-            pipeline_options.artifacts_path = docling_serve_settings.artifacts_path
+            artifacts_path = docling_serve_settings.artifacts_path
        else:
            _log.warning(
                "artifacts_path is set to an invalid directory. "
                "The system will download the model weights at runtime."
            )
-            pipeline_options.artifacts_path = None
+            artifacts_path = None
    else:
        _log.info(
            "artifacts_path is unset. "
            "The system will download the model weights at runtime."
        )

-    pdf_format_option = PdfFormatOption(
-        pipeline_options=pipeline_options,
-        backend=backend,
-    )
+    pipeline_options: Union[PdfPipelineOptions, VlmPipelineOptions]
+    if request.pipeline == PdfPipeline.STANDARD:
+        pipeline_options = _parse_standard_pdf_opts(request, artifacts_path)
+        backend = _parse_backend(request)
+        pdf_format_option = PdfFormatOption(
+            pipeline_options=pipeline_options,
+            backend=backend,
+        )

-    serialized_data = _serialize_pdf_format_option(pdf_format_option)
+    elif request.pipeline == PdfPipeline.VLM:
+        pipeline_options = _parse_vlm_pdf_opts(request, artifacts_path)
+        pdf_format_option = PdfFormatOption(
+            pipeline_cls=VlmPipeline, pipeline_options=pipeline_options
+        )
+    else:
+        raise NotImplementedError(
+            f"The pipeline {request.pipeline} is not implemented."
+        )

-    options_hash = hashlib.sha1(serialized_data.encode()).digest()
-
-    return pdf_format_option, options_hash
+    return pdf_format_option


 def convert_documents(
@@ -180,20 +228,14 @@ def convert_documents(
    options: ConvertDocumentsOptions,
    headers: Optional[dict[str, Any]] = None,
 ):
-    pdf_format_option, options_hash = get_pdf_pipeline_opts(options)
-
-    if options_hash not in converters:
-        format_options: dict[InputFormat, FormatOption] = {
-            InputFormat.PDF: pdf_format_option,
-            InputFormat.IMAGE: pdf_format_option,
-        }
-
-        converters[options_hash] = DocumentConverter(format_options=format_options)
-        _log.info(f"We now have {len(converters)} converters in memory.")
-
-    results: Iterator[ConversionResult] = converters[options_hash].convert_all(
+    pdf_format_option = get_pdf_pipeline_opts(options)
+    converter = get_converter(pdf_format_option)
+    results: Iterator[ConversionResult] = converter.convert_all(
        sources,
        headers=headers,
+        page_range=options.page_range,
+        max_file_size=docling_serve_settings.max_file_size,
+        max_num_pages=docling_serve_settings.max_num_pages,
    )

    return results
--- a/docling_serve/engines/async_local/orchestrator.py
+++ b/docling_serve/engines/async_local/orchestrator.py
@@ -5,13 +5,14 @@ from typing import Optional

 from fastapi import WebSocket

-from docling_serve.datamodel.engines import Task, TaskStatus
+from docling_serve.datamodel.engines import TaskStatus
 from docling_serve.datamodel.requests import ConvertDocumentsRequest
 from docling_serve.datamodel.responses import (
    MessageKind,
    TaskStatusResponse,
    WebsocketMessage,
 )
+from docling_serve.datamodel.task import Task
 from docling_serve.engines.async_local.worker import AsyncLocalWorker
 from docling_serve.engines.base_orchestrator import BaseOrchestrator
 from docling_serve.settings import docling_serve_settings
--- a/docling_serve/engines/base_orchestrator.py
+++ b/docling_serve/engines/base_orchestrator.py
@@ -1,6 +1,6 @@
 from abc import ABC, abstractmethod

-from docling_serve.datamodel.engines import Task
+from docling_serve.datamodel.task import Task


 class BaseOrchestrator(ABC):
--- a/docling_serve/gradio_ui.py
+++ b/docling_serve/gradio_ui.py
@@ -1,22 +1,47 @@
+import base64
 import importlib
 import json
 import logging
-import os
+import ssl
 import tempfile
+import time
 from pathlib import Path

+import certifi
 import gradio as gr
-import requests
+import httpx
+
+from docling.datamodel.pipeline_options import (
+    PdfBackend,
+    PdfPipeline,
+    TableFormerMode,
+    TableStructureOptions,
+)

 from docling_serve.helper_functions import _to_list_of_strings
+from docling_serve.settings import docling_serve_settings, uvicorn_settings

 logger = logging.getLogger(__name__)

+############################
+# Path of static artifacts #
+############################
+
+logo_path = "https://raw.githubusercontent.com/docling-project/docling/refs/heads/main/docs/assets/logo.svg"
+js_components_url = "https://unpkg.com/@docling/docling-components@0.0.6"
+if (
+    docling_serve_settings.static_path is not None
+    and docling_serve_settings.static_path.is_dir()
+):
+    logo_path = str(docling_serve_settings.static_path / "logo.svg")
+    js_components_url = "/static/docling-components.js"
+
+
 ##############################
 # Head JS for web components #
 ##############################
-head = """
-    <script src="https://unpkg.com/@docling/docling-components@0.0.3" type="module"></script>
+head = f"""
+    <script src="{js_components_url}" type="module"></script>
 """

 #################
@@ -95,8 +120,29 @@ file_output_path = None  # Will be set when a new file is generated
 #############


+def get_api_endpoint() -> str:
+    protocol = "http"
+    if uvicorn_settings.ssl_keyfile is not None:
+        protocol = "https"
+    return f"{protocol}://{docling_serve_settings.api_host}:{uvicorn_settings.port}"
+
+
+def get_ssl_context() -> ssl.SSLContext:
+    ctx = ssl.create_default_context(cafile=certifi.where())
+    kube_sa_ca_cert_path = Path(
+        "/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
+    )
+    if (
+        uvicorn_settings.ssl_keyfile is not None
+        and ".svc." in docling_serve_settings.api_host
+        and kube_sa_ca_cert_path.exists()
+    ):
+        ctx.load_verify_locations(cafile=kube_sa_ca_cert_path)
+    return ctx
+
+
 def health_check():
-    response = requests.get(f"http://localhost:{int(os.getenv('PORT', '5001'))}/health")
+    response = httpx.get(f"{get_api_endpoint()}/health")
    if response.status_code == 200:
        return "Healthy"
    return "Unhealthy"
@@ -112,6 +158,11 @@ def set_outputs_visibility_direct(x, y):
    return content, file


+def set_task_id_visibility(x):
+    task_id_row = gr.Row(visible=x)
+    return task_id_row
+
+
 def set_outputs_visibility_process(x):
    content = gr.Row(visible=not x)
    file = gr.Row(visible=x)
@@ -123,6 +174,7 @@ def set_download_button_label(label_text: gr.State):


 def clear_outputs():
+    task_id_rendered = ""
    markdown_content = ""
    json_content = ""
    json_rendered_content = ""
@@ -131,6 +183,7 @@ def clear_outputs():
    doctags_content = ""

    return (
+        task_id_rendered,
        markdown_content,
        markdown_content,
        json_content,
@@ -173,10 +226,56 @@ def change_ocr_lang(ocr_engine):
        return "english,chinese"


+def wait_task_finish(task_id: str, return_as_file: bool):
+    conversion_sucess = False
+    task_finished = False
+    task_status = ""
+    ssl_ctx = get_ssl_context()
+    while not task_finished:
+        try:
+            response = httpx.get(
+                f"{get_api_endpoint()}/v1alpha/status/poll/{task_id}?wait=5",
+                verify=ssl_ctx,
+                timeout=15,
+            )
+            task_status = response.json()["task_status"]
+            if task_status == "success":
+                conversion_sucess = True
+                task_finished = True
+
+            if task_status in ("failure", "revoked"):
+                conversion_sucess = False
+                task_finished = True
+                raise RuntimeError(f"Task failed with status {task_status!r}")
+            time.sleep(5)
+        except Exception as e:
+            logger.error(f"Error processing file(s): {e}")
+            conversion_sucess = False
+            task_finished = True
+            raise gr.Error(f"Error processing file(s): {e}", print_exception=False)
+
+    if conversion_sucess:
+        try:
+            response = httpx.get(
+                f"{get_api_endpoint()}/v1alpha/result/{task_id}",
+                timeout=15,
+                verify=ssl_ctx,
+            )
+            output = response_to_output(response, return_as_file)
+            return output
+        except Exception as e:
+            logger.error(f"Error getting task result: {e}")
+
+    raise gr.Error(
+        f"Error getting task result, conversion finished with status: {task_status}"
+    )
+
+
 def process_url(
    input_sources,
    to_formats,
    image_export_mode,
+    pipeline,
    ocr,
    force_ocr,
    ocr_engine,
@@ -185,12 +284,17 @@ def process_url(
    table_mode,
    abort_on_error,
    return_as_file,
+    do_code_enrichment,
+    do_formula_enrichment,
+    do_picture_classification,
+    do_picture_description,
 ):
    parameters = {
        "http_sources": [{"url": source} for source in input_sources.split(",")],
        "options": {
            "to_formats": to_formats,
            "image_export_mode": image_export_mode,
+            "pipeline": pipeline,
            "ocr": ocr,
            "force_ocr": force_ocr,
            "ocr_engine": ocr_engine,
@@ -199,6 +303,10 @@ def process_url(
            "table_mode": table_mode,
            "abort_on_error": abort_on_error,
            "return_as_file": return_as_file,
+            "do_code_enrichment": do_code_enrichment,
+            "do_formula_enrichment": do_formula_enrichment,
+            "do_picture_classification": do_picture_classification,
+            "do_picture_description": do_picture_description,
        },
    }
    if (
@@ -209,9 +317,12 @@ def process_url(
        logger.error("No input sources provided.")
        raise gr.Error("No input sources provided.", print_exception=False)
    try:
-        response = requests.post(
-            f"http://localhost:{int(os.getenv('PORT', '5001'))}/v1alpha/convert/source",
+        ssl_ctx = get_ssl_context()
+        response = httpx.post(
+            f"{get_api_endpoint()}/v1alpha/convert/source/async",
            json=parameters,
+            verify=ssl_ctx,
+            timeout=60,
        )
    except Exception as e:
        logger.error(f"Error processing URL: {e}")
@@ -221,14 +332,22 @@ def process_url(
        error_message = data.get("detail", "An unknown error occurred.")
        logger.error(f"Error processing file: {error_message}")
        raise gr.Error(f"Error processing file: {error_message}", print_exception=False)
-    output = response_to_output(response, return_as_file)
-    return output
+
+    task_id_rendered = response.json()["task_id"]
+    return task_id_rendered
+
+
+def file_to_base64(file):
+    with open(file.name, "rb") as f:
+        encoded_string = base64.b64encode(f.read()).decode("utf-8")
+    return encoded_string


 def process_file(
-    files,
+    file,
    to_formats,
    image_export_mode,
+    pipeline,
    ocr,
    force_ocr,
    ocr_engine,
@@ -237,30 +356,44 @@ def process_file(
    table_mode,
    abort_on_error,
    return_as_file,
+    do_code_enrichment,
+    do_formula_enrichment,
+    do_picture_classification,
+    do_picture_description,
 ):
-    if not files or len(files) == 0 or files[0] == "":
+    if not file or file == "":
        logger.error("No files provided.")
        raise gr.Error("No files provided.", print_exception=False)
-    files_data = [("files", (file.name, open(file.name, "rb"))) for file in files]
+    files_data = [{"base64_string": file_to_base64(file), "filename": file.name}]

    parameters = {
-        "to_formats": to_formats,
-        "image_export_mode": image_export_mode,
-        "ocr": str(ocr).lower(),
-        "force_ocr": str(force_ocr).lower(),
-        "ocr_engine": ocr_engine,
-        "ocr_lang": _to_list_of_strings(ocr_lang),
-        "pdf_backend": pdf_backend,
-        "table_mode": table_mode,
-        "abort_on_error": str(abort_on_error).lower(),
-        "return_as_file": str(return_as_file).lower(),
+        "file_sources": files_data,
+        "options": {
+            "to_formats": to_formats,
+            "image_export_mode": image_export_mode,
+            "pipeline": pipeline,
+            "ocr": ocr,
+            "force_ocr": force_ocr,
+            "ocr_engine": ocr_engine,
+            "ocr_lang": _to_list_of_strings(ocr_lang),
+            "pdf_backend": pdf_backend,
+            "table_mode": table_mode,
+            "abort_on_error": abort_on_error,
+            "return_as_file": return_as_file,
+            "do_code_enrichment": do_code_enrichment,
+            "do_formula_enrichment": do_formula_enrichment,
+            "do_picture_classification": do_picture_classification,
+            "do_picture_description": do_picture_description,
+        },
    }

    try:
-        response = requests.post(
-            f"http://localhost:{int(os.getenv('PORT', '5001'))}/v1alpha/convert/file",
-            files=files_data,
-            data=parameters,
+        ssl_ctx = get_ssl_context()
+        response = httpx.post(
+            f"{get_api_endpoint()}/v1alpha/convert/source/async",
+            json=parameters,
+            verify=ssl_ctx,
+            timeout=60,
        )
    except Exception as e:
        logger.error(f"Error processing file(s): {e}")
@@ -270,8 +403,9 @@ def process_file(
        error_message = data.get("detail", "An unknown error occurred.")
        logger.error(f"Error processing file: {error_message}")
        raise gr.Error(f"Error processing file: {error_message}", print_exception=False)
-    output = response_to_output(response, return_as_file)
-    return output
+
+    task_id_rendered = response.json()["task_id"]
+    return task_id_rendered


 def response_to_output(response, return_as_file):
@@ -342,17 +476,21 @@ with gr.Blocks(
    with gr.Row(elem_id="check_health"):
        # Logo
        with gr.Column(scale=1, min_width=90):
-            gr.Image(
-                "https://ds4sd.github.io/docling/assets/logo.png",
-                height=80,
-                width=80,
-                show_download_button=False,
-                show_label=False,
-                show_fullscreen_button=False,
-                container=False,
-                elem_id="logo",
-                scale=0,
-            )
+            try:
+                gr.Image(
+                    logo_path,
+                    height=80,
+                    width=80,
+                    show_download_button=False,
+                    show_label=False,
+                    show_fullscreen_button=False,
+                    container=False,
+                    elem_id="logo",
+                    scale=0,
+                )
+            except Exception:
+                logger.warning("Logo not found.")
+
        # Title
        with gr.Column(scale=1, min_width=200):
            gr.Markdown(
@@ -381,30 +519,31 @@ with gr.Blocks(
            )

    # URL Processing Tab
-    with gr.Tab("Convert URL(s)"):
+    with gr.Tab("Convert URL"):
        with gr.Row():
            with gr.Column(scale=4):
                url_input = gr.Textbox(
-                    label="Input Sources (comma-separated URLs)",
-                    placeholder="https://arxiv.org/pdf/2206.01062",
+                    label="URL Input Source",
+                    placeholder="https://arxiv.org/pdf/2501.17887",
                )
            with gr.Column(scale=1):
-                url_process_btn = gr.Button("Process URL(s)", scale=1)
+                url_process_btn = gr.Button("Process URL", scale=1)
                url_reset_btn = gr.Button("Reset", scale=1)

    # File Processing Tab
-    with gr.Tab("Convert File(s)"):
+    with gr.Tab("Convert File"):
        with gr.Row():
            with gr.Column(scale=4):
                file_input = gr.File(
                    elem_id="file_input_zone",
-                    label="Upload Files",
+                    label="Upload File",
                    file_types=[
                        ".pdf",
                        ".docx",
                        ".pptx",
                        ".html",
                        ".xlsx",
+                        ".json",
                        ".asciidoc",
                        ".txt",
                        ".md",
@@ -413,11 +552,11 @@ with gr.Blocks(
                        ".png",
                        ".gif",
                    ],
-                    file_count="multiple",
+                    file_count="single",
                    scale=4,
                )
            with gr.Column(scale=1):
-                file_process_btn = gr.Button("Process File(s)", scale=1)
+                file_process_btn = gr.Button("Process File", scale=1)
                file_reset_btn = gr.Button("Reset", scale=1)

    # Options
@@ -426,14 +565,14 @@ with gr.Blocks(
            with gr.Column(scale=1):
                to_formats = gr.CheckboxGroup(
                    [
-                        ("Markdown", "md"),
                        ("Docling (JSON)", "json"),
+                        ("Markdown", "md"),
                        ("HTML", "html"),
                        ("Plain Text", "text"),
                        ("Doc Tags", "doctags"),
                    ],
                    label="To Formats",
-                    value=["md"],
+                    value=["json", "md"],
                )
            with gr.Column(scale=1):
                image_export_mode = gr.Radio(
@@ -445,6 +584,13 @@ with gr.Blocks(
                    label="Image Export Mode",
                    value="embedded",
                )
+        with gr.Row():
+            with gr.Column(scale=1, min_width=200):
+                pipeline = gr.Radio(
+                    [(v.value.capitalize(), v.value) for v in PdfPipeline],
+                    label="Pipeline type",
+                    value=PdfPipeline.STANDARD.value,
+                )
        with gr.Row():
            with gr.Column(scale=1, min_width=200):
                ocr = gr.Checkbox(label="Enable OCR", value=True)
@@ -465,32 +611,55 @@ with gr.Blocks(
                )
            ocr_engine.change(change_ocr_lang, inputs=[ocr_engine], outputs=[ocr_lang])
        with gr.Row():
-            with gr.Column(scale=2):
+            with gr.Column(scale=4):
                pdf_backend = gr.Radio(
-                    ["pypdfium2", "dlparse_v1", "dlparse_v2"],
+                    [v.value for v in PdfBackend],
                    label="PDF Backend",
-                    value="dlparse_v2",
+                    value=PdfBackend.DLPARSE_V4.value,
                )
            with gr.Column(scale=2):
                table_mode = gr.Radio(
-                    ["fast", "accurate"], label="Table Mode", value="fast"
+                    [(v.value.capitalize(), v.value) for v in TableFormerMode],
+                    label="Table Mode",
+                    value=TableStructureOptions().mode.value,
                )
            with gr.Column(scale=1):
                abort_on_error = gr.Checkbox(label="Abort on Error", value=False)
-                return_as_file = gr.Checkbox(label="Return as File", value=False)
+                return_as_file = gr.Checkbox(
+                    label="Return as File", visible=False, value=False
+                )  # Disabled until the async flow handles output as file
+        with gr.Row():
+            with gr.Column():
+                do_code_enrichment = gr.Checkbox(
+                    label="Enable code enrichment", value=False
+                )
+                do_formula_enrichment = gr.Checkbox(
+                    label="Enable formula enrichment", value=False
+                )
+            with gr.Column():
+                do_picture_classification = gr.Checkbox(
+                    label="Enable picture classification", value=False
+                )
+                do_picture_description = gr.Checkbox(
+                    label="Enable picture description", value=False
+                )
+
+    # Task id output
+    with gr.Row(visible=False) as task_id_output:
+        task_id_rendered = gr.Textbox(label="Task id", interactive=False)

    # Document output
    with gr.Row(visible=False) as content_output:
+        with gr.Tab("Docling (JSON)"):
+            output_json = gr.Code(language="json", wrap_lines=True, show_label=False)
+        with gr.Tab("Docling-Rendered"):
+            output_json_rendered = gr.HTML(label="Response")
        with gr.Tab("Markdown"):
            output_markdown = gr.Code(
                language="markdown", wrap_lines=True, show_label=False
            )
        with gr.Tab("Markdown-Rendered"):
            output_markdown_rendered = gr.Markdown(label="Response")
-        with gr.Tab("Docling (JSON)"):
-            output_json = gr.Code(language="json", wrap_lines=True, show_label=False)
-        with gr.Tab("Docling-Rendered"):
-            output_json_rendered = gr.HTML()
        with gr.Tab("HTML"):
            output_html = gr.Code(language="html", wrap_lines=True, show_label=False)
        with gr.Tab("HTML-Rendered"):
@@ -508,36 +677,34 @@ with gr.Blocks(
    # UI Actions #
    ##############

+    # Disabled until the async flow handles output as file
    # Handle Return as File
-    url_input.change(
-        auto_set_return_as_file,
-        inputs=[url_input, file_input, image_export_mode],
-        outputs=[return_as_file],
-    )
-    file_input.change(
-        auto_set_return_as_file,
-        inputs=[url_input, file_input, image_export_mode],
-        outputs=[return_as_file],
-    )
-    image_export_mode.change(
-        auto_set_return_as_file,
-        inputs=[url_input, file_input, image_export_mode],
-        outputs=[return_as_file],
-    )
+    # url_input.change(
+    #     auto_set_return_as_file,
+    #     inputs=[url_input, file_input, image_export_mode],
+    #     outputs=[return_as_file],
+    # )
+    # file_input.change(
+    #     auto_set_return_as_file,
+    #     inputs=[url_input, file_input, image_export_mode],
+    #     outputs=[return_as_file],
+    # )
+    # image_export_mode.change(
+    #     auto_set_return_as_file,
+    #     inputs=[url_input, file_input, image_export_mode],
+    #     outputs=[return_as_file],
+    # )

    # URL processing
    url_process_btn.click(
        set_options_visibility, inputs=[false_bool], outputs=[options]
    ).then(
        set_download_button_label, inputs=[processing_text], outputs=[download_file_btn]
-    ).then(
-        set_outputs_visibility_process,
-        inputs=[return_as_file],
-        outputs=[content_output, file_output],
    ).then(
        clear_outputs,
        inputs=None,
        outputs=[
+            task_id_rendered,
            output_markdown,
            output_markdown_rendered,
            output_json,
@@ -547,12 +714,17 @@ with gr.Blocks(
            output_text,
            output_doctags,
        ],
+    ).then(
+        set_task_id_visibility,
+        inputs=[true_bool],
+        outputs=[task_id_output],
    ).then(
        process_url,
        inputs=[
            url_input,
            to_formats,
            image_export_mode,
+            pipeline,
            ocr,
            force_ocr,
            ocr_engine,
@@ -561,7 +733,21 @@ with gr.Blocks(
            table_mode,
            abort_on_error,
            return_as_file,
+            do_code_enrichment,
+            do_formula_enrichment,
+            do_picture_classification,
+            do_picture_description,
        ],
+        outputs=[
+            task_id_rendered,
+        ],
+    ).then(
+        set_outputs_visibility_process,
+        inputs=[return_as_file],
+        outputs=[content_output, file_output],
+    ).then(
+        wait_task_finish,
+        inputs=[task_id_rendered, return_as_file],
        outputs=[
            output_markdown,
            output_markdown_rendered,
@@ -592,21 +778,20 @@ with gr.Blocks(
        set_outputs_visibility_direct,
        inputs=[false_bool, false_bool],
        outputs=[content_output, file_output],
-    ).then(clear_url_input, inputs=None, outputs=[url_input])
+    ).then(set_task_id_visibility, inputs=[false_bool], outputs=[task_id_output]).then(
+        clear_url_input, inputs=None, outputs=[url_input]
+    )

    # File processing
    file_process_btn.click(
        set_options_visibility, inputs=[false_bool], outputs=[options]
    ).then(
        set_download_button_label, inputs=[processing_text], outputs=[download_file_btn]
-    ).then(
-        set_outputs_visibility_process,
-        inputs=[return_as_file],
-        outputs=[content_output, file_output],
    ).then(
        clear_outputs,
        inputs=None,
        outputs=[
+            task_id_rendered,
            output_markdown,
            output_markdown_rendered,
            output_json,
@@ -616,12 +801,17 @@ with gr.Blocks(
            output_text,
            output_doctags,
        ],
+    ).then(
+        set_task_id_visibility,
+        inputs=[true_bool],
+        outputs=[task_id_output],
    ).then(
        process_file,
        inputs=[
            file_input,
            to_formats,
            image_export_mode,
+            pipeline,
            ocr,
            force_ocr,
            ocr_engine,
@@ -630,7 +820,21 @@ with gr.Blocks(
            table_mode,
            abort_on_error,
            return_as_file,
+            do_code_enrichment,
+            do_formula_enrichment,
+            do_picture_classification,
+            do_picture_description,
        ],
+        outputs=[
+            task_id_rendered,
+        ],
+    ).then(
+        set_outputs_visibility_process,
+        inputs=[return_as_file],
+        outputs=[content_output, file_output],
+    ).then(
+        wait_task_finish,
+        inputs=[task_id_rendered, return_as_file],
        outputs=[
            output_markdown,
            output_markdown_rendered,
@@ -661,4 +865,6 @@ with gr.Blocks(
        set_outputs_visibility_direct,
        inputs=[false_bool, false_bool],
        outputs=[content_output, file_output],
-    ).then(clear_file_input, inputs=None, outputs=[file_input])
+    ).then(set_task_id_visibility, inputs=[false_bool], outputs=[task_id_output]).then(
+        clear_file_input, inputs=None, outputs=[file_input]
+    )
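
The submit-then-poll flow that `wait_task_finish` implements above can be reduced to a small loop. The following is a sketch under stated assumptions rather than the UI's actual code: `poll_until_done`, `fetch_status`, and `max_attempts` are hypothetical names, and the status fetcher is injected so the loop can be exercised without a running server. The status strings (`success`, `failure`, `revoked`) are the ones checked in the diff.

```python
import time
from typing import Callable

# Terminal failure states, as checked in wait_task_finish.
TERMINAL_FAILURES = ("failure", "revoked")


def poll_until_done(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    max_attempts: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until the task succeeds, fails, or attempts run out."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status == "success":
            # Return immediately; there is no need to sleep after success.
            return status
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"Task failed with status {status!r}")
        # Task is still pending or running: wait before asking again.
        sleep(interval)
    raise TimeoutError(f"Task did not finish after {max_attempts} polls")
```

In a real client, `fetch_status` would wrap the HTTP call to the status endpoint and return the `task_status` field of the JSON response. The attempt limit is an addition of this sketch; the loop in the diff has no upper bound on polling.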
--- a/docling_serve/settings.py
+++ b/docling_serve/settings.py
@@ -1,3 +1,4 @@
+import sys
 from pathlib import Path
 from typing import Optional, Union

@@ -16,6 +17,10 @@ class UvicornSettings(BaseSettings):
    reload: bool = False
    root_path: str = ""
    proxy_headers: bool = True
+    timeout_keep_alive: int = 60
+    ssl_certfile: Optional[Path] = None
+    ssl_keyfile: Optional[Path] = None
+    ssl_keyfile_password: Optional[str] = None
    workers: Union[int, None] = None


@@ -28,7 +33,19 @@ class DoclingServeSettings(BaseSettings):
    )

    enable_ui: bool = False
+    api_host: str = "localhost"
    artifacts_path: Optional[Path] = None
+    static_path: Optional[Path] = None
+    options_cache_size: int = 2
+    allow_external_plugins: bool = False
+
+    max_document_timeout: float = 3_600 * 24 * 7  # 7 days
+    max_num_pages: int = sys.maxsize
+    max_file_size: int = sys.maxsize
+
+    cors_origins: list[str] = ["*"]
+    cors_methods: list[str] = ["*"]
+    cors_headers: list[str] = ["*"]

    eng_kind: AsyncEngine = AsyncEngine.LOCAL
    eng_loc_num_workers: int = 2
--- a/docs/README.md
+++ b/docs/README.md
@@ -0,0 +1,8 @@
+# Docling Serve documentation
+
+These documentation pages cover the webserver configuration, runtime options, deployment examples, and development best practices.
+
+- [Configuration](./configuration.md)
+- [Advanced usage](./usage.md)
+- [Deployment](./deployment.md)
+- [Development](./development.md)
--- a/docs/assets/docling-serve-pic.png
+++ b/docs/assets/docling-serve-pic.png
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -0,0 +1,44 @@
+# Configuration
+
+The `docling-serve` executable can be configured via command line
+options as well as environment variables.
+Configurations are divided between the settings used for the `uvicorn` ASGI
+server and the actual app-specific configurations.
+
+> [!WARNING]
+> When the server is running with `reload` or with multiple `workers`, uvicorn
+> will spawn multiple subprocesses. This invalidates all the values configured
+> via the CLI options. Please use environment variables in this
+> type of deployment.
+
+## Webserver configuration
+
+The following table shows the options which are propagated directly to the
+`uvicorn` webserver runtime.
+
+| CLI option | ENV | Default | Description |
+| -----------|-----|---------|-------------|
+| `--host` | `UVICORN_HOST` | `0.0.0.0` for `run`, `localhost` for `dev` | The host to serve on. |
+| `--port` | `UVICORN_PORT` | `5001` | The port to serve on. |
+| `--reload` | `UVICORN_RELOAD` | `false` for `run`, `true` for `dev` | Enable auto-reload of the server when (code) files change. |
+| `--workers` | `UVICORN_WORKERS` | `1` | Use multiple worker processes. |
+| `--root-path` | `UVICORN_ROOT_PATH` | `""` | The root path is used to tell your app that it is being served to the outside world with some |
+| `--proxy-headers` | `UVICORN_PROXY_HEADERS` | `true` | Enable/Disable X-Forwarded-Proto, X-Forwarded-For, X-Forwarded-Port to populate remote address info. |
+| `--timeout-keep-alive` | `UVICORN_TIMEOUT_KEEP_ALIVE` | `60` | Timeout (in seconds) after which idle Keep-Alive connections are closed. |
+| `--ssl-certfile` | `UVICORN_SSL_CERTFILE` |  | SSL certificate file. |
+| `--ssl-keyfile` | `UVICORN_SSL_KEYFILE` |  | SSL key file. |
+| `--ssl-keyfile-password` | `UVICORN_SSL_KEYFILE_PASSWORD` |  | SSL keyfile password. |
+
+## Docling Serve configuration
+
+The following table describes the options to configure the Docling Serve app.
+
+| CLI option | ENV | Default | Description |
+| -----------|-----|---------|-------------|
+| `--artifacts-path` | `DOCLING_SERVE_ARTIFACTS_PATH` | unset | If set to a valid directory, the model weights will be loaded from this path |
+|  | `DOCLING_SERVE_STATIC_PATH` | unset | If set to a valid directory, the static assets for the docs and ui will be loaded from this path |
+| `--enable-ui` | `DOCLING_SERVE_ENABLE_UI` | `false` | Enable the demonstrator UI. |
+|  | `DOCLING_SERVE_OPTIONS_CACHE_SIZE` | `2` | How many DocumentConverter objects (including their loaded models) to keep in the cache. |
+|  | `DOCLING_SERVE_CORS_ORIGINS` | `["*"]` | A list of origins that should be permitted to make cross-origin requests. |
+|  | `DOCLING_SERVE_CORS_METHODS` | `["*"]` | A list of HTTP methods that should be allowed for cross-origin requests. |
+|  | `DOCLING_SERVE_CORS_HEADERS` | `["*"]` | A list of HTTP request headers that should be supported for cross-origin requests. |
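
The list-valued options in the table above (the three `DOCLING_SERVE_CORS_*` variables) have defaults written as `["*"]`. The sketch below shows one way such a value could be decoded, assuming the variable holds a JSON array of strings, which is how pydantic-settings decodes list fields from the environment by default. The helper `read_list_env` and the variable name `EXAMPLE_UNSET_LIST` are hypothetical and are not part of docling-serve.

```python
import json
import os


def read_list_env(name: str, default: list[str]) -> list[str]:
    """Return a list-valued setting, decoding the env value as a JSON array."""
    raw = os.environ.get(name)
    if raw is None:
        # Variable not set: fall back to a copy of the default
        return list(default)
    value = json.loads(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"{name} must be a JSON array of strings, got: {raw!r}")
    return value


# Illustrative values only
os.environ["DOCLING_SERVE_CORS_ORIGINS"] = '["https://example.org"]'
origins = read_list_env("DOCLING_SERVE_CORS_ORIGINS", ["*"])
fallback = read_list_env("EXAMPLE_UNSET_LIST", ["*"])
```

In a shell, the same value would be exported with single quotes around the JSON array so that the inner double quotes survive.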
--- a/docs/deploy-examples/compose-gpu.yaml
+++ b/docs/deploy-examples/compose-gpu.yaml
@@ -0,0 +1,15 @@
+services:
+  docling:
+    image: ghcr.io/docling-project/docling-serve-cu124
+    container_name: docling-serve
+    ports:
+      - 5001:5001
+    environment:
+      - DOCLING_SERVE_ENABLE_UI=true
+    deploy:
+      resources:
+        reservations:
+          devices:
+          - driver: nvidia
+            count: all # nvidia-smi 
+            capabilities: [gpu]
--- a/docs/deploy-examples/docling-serve-oauth.yaml
+++ b/docs/deploy-examples/docling-serve-oauth.yaml
@@ -0,0 +1,192 @@
+# This example deployment configures Docling Serve with an OAuth-Proxy sidecar and TLS termination
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+  annotations:
+    serviceaccounts.openshift.io/oauth-redirectreference.primary: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"docling-serve"}}'
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: docling-serve-oauth
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: system:auth-delegator
+subjects:
+- kind: ServiceAccount
+  name: docling-serve
+  namespace: docling
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+    component: docling-serve-api
+spec:
+  to:
+    kind: Service
+    name: docling-serve
+  port:
+    targetPort: oauth
+  tls:
+    termination: Reencrypt
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+    component: docling-serve-api
+  annotations:
+    service.alpha.openshift.io/serving-cert-secret-name: docling-serve-tls
+spec:
+  ports:
+  - name: oauth
+    port: 8443
+    targetPort: oauth
+  - name: http
+    port: 5001
+    targetPort: http
+  selector:
+    app: docling-serve
+    component: docling-serve-api
+---
+kind: Deployment
+apiVersion: apps/v1
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+    component: docling-serve-api
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: docling-serve
+      component: docling-serve-api
+  template:
+    metadata:
+      labels:
+        app: docling-serve
+        component: docling-serve-api
+    spec:
+      restartPolicy: Always
+      serviceAccountName: docling-serve
+      containers:
+        - name: api
+          resources:
+            limits:
+              cpu: 2000m
+              memory: 2Gi
+            requests:
+              cpu: 800m
+              memory: 1Gi
+          readinessProbe:
+            httpGet:
+              path: /health
+              port: http
+              scheme: HTTPS
+            initialDelaySeconds: 10
+            timeoutSeconds: 2
+            periodSeconds: 5
+            successThreshold: 1
+            failureThreshold: 3
+          livenessProbe:
+            httpGet:
+              path: /health
+              port: http
+              scheme: HTTPS
+            initialDelaySeconds: 3
+            timeoutSeconds: 4
+            periodSeconds: 10
+            successThreshold: 1
+            failureThreshold: 5
+          env:
+            - name: NAMESPACE
+              valueFrom:
+                fieldRef:
+                  fieldPath: metadata.namespace
+            - name: DOCLING_SERVE_ENABLE_UI
+              value: 'true'
+            - name: DOCLING_SERVE_API_HOST
+              value: 'docling-serve.$(NAMESPACE).svc.cluster.local'
+            - name: UVICORN_SSL_CERTFILE
+              value: '/etc/tls/private/tls.crt'
+            - name: UVICORN_SSL_KEYFILE
+              value: '/etc/tls/private/tls.key'
+          ports:
+            - name: http
+              containerPort: 5001
+              protocol: TCP
+          volumeMounts:
+            - name: proxy-tls
+              mountPath: /etc/tls/private
+          imagePullPolicy: Always
+          image: 'ghcr.io/docling-project/docling-serve-cpu:fix-ui-with-https'
+        - name: oauth-proxy
+          resources:
+            limits:
+              cpu: 100m
+              memory: 256Mi
+            requests:
+              cpu: 100m
+              memory: 256Mi
+          readinessProbe:
+            httpGet:
+              path: /oauth/healthz
+              port: oauth
+              scheme: HTTPS
+            initialDelaySeconds: 5
+            timeoutSeconds: 1
+            periodSeconds: 5
+            successThreshold: 1
+            failureThreshold: 3
+          livenessProbe:
+            httpGet:
+              path: /oauth/healthz
+              port: oauth
+              scheme: HTTPS
+            initialDelaySeconds: 30
+            timeoutSeconds: 1
+            periodSeconds: 5
+            successThreshold: 1
+            failureThreshold: 3
+          ports:
+            - name: oauth
+              containerPort: 8443
+              protocol: TCP
+          imagePullPolicy: IfNotPresent
+          volumeMounts:
+            - name: proxy-tls
+              mountPath: /etc/tls/private
+          env:
+            - name: NAMESPACE
+              valueFrom:
+                fieldRef:
+                  fieldPath: metadata.namespace
+          image: 'registry.redhat.io/openshift4/ose-oauth-proxy:v4.13'
+          args:
+            - '--https-address=:8443'
+            - '--provider=openshift'
+            - '--openshift-service-account=docling-serve'
+            - '--upstream=https://docling-serve.$(NAMESPACE).svc.cluster.local:5001'
+            - '--upstream-ca=/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt'
+            - '--tls-cert=/etc/tls/private/tls.crt'
+            - '--tls-key=/etc/tls/private/tls.key'
+            - '--cookie-secret=SECRET'
+            - '--openshift-delegate-urls={"/": {"group":"route.openshift.io","resource":"routes","verb":"get","name":"docling-serve","namespace":"$(NAMESPACE)"}}'
+            - '--openshift-sar={"namespace":"$(NAMESPACE)","resource":"routes","resourceName":"docling-serve","verb":"get","resourceAPIGroup":"route.openshift.io"}'
+            - '--skip-auth-regex=''(^/health|^/docs)'''
+      volumes:
+        - name: proxy-tls
+          secret:
+            secretName: docling-serve-tls
+            defaultMode: 420
--- a/docs/deploy-examples/docling-serve-simple.yaml
+++ b/docs/deploy-examples/docling-serve-simple.yaml
@@ -0,0 +1,58 @@
+# This example deployment configures Docling Serve with a Service and cuda image
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+    component: docling-serve-api
+spec:
+  ports:
+  - name: http
+    port: 5001
+    targetPort: http
+  selector:
+    app: docling-serve
+    component: docling-serve-api
+---
+kind: Deployment
+apiVersion: apps/v1
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+    component: docling-serve-api
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: docling-serve
+      component: docling-serve-api
+  template:
+    metadata:
+      labels:
+        app: docling-serve
+        component: docling-serve-api
+    spec:
+      restartPolicy: Always
+      containers:
+        - name: api
+          resources:
+            limits:
+              cpu: 500m
+              memory: 2Gi
+              nvidia.com/gpu: 1  # Limit to one GPU
+            requests:
+              cpu: 250m
+              memory: 1Gi
+              nvidia.com/gpu: 1  # Limit to one GPU
+          env:
+            - name: DOCLING_SERVE_ENABLE_UI
+              value: 'true'
+          ports:
+            - name: http
+              containerPort: 5001
+              protocol: TCP
+          imagePullPolicy: Always
+          image: 'ghcr.io/docling-project/docling-serve-cu124'
--- a/docs/deployment.md
+++ b/docs/deployment.md
@@ -0,0 +1,194 @@
+# Deployment Examples
+
+This document provides deployment examples for running the application in different environments.
+
+Choose the deployment option that best fits your setup.
+
+- **[Local GPU](#local-gpu)**: For deploying the application locally on a machine with an NVIDIA GPU (using Docker Compose).
+- **[OpenShift](#openshift)**: For deploying the application on an OpenShift cluster, designed for cloud-native environments.
+
+---
+
+## Local GPU
+
+### Docker compose
+
+Manifest example: [compose-gpu.yaml](./deploy-examples/compose-gpu.yaml)
+
+This deployment has the following features:
+
+- NVIDIA cuda enabled
+
+Install the app with:
+
+```sh
+docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
+```
+
+For using the API:
+
+```sh
+# Make a test query
+curl -X 'POST' \
+  "localhost:5001/v1alpha/convert/source/async" \
+  -H "accept: application/json" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
+  }'
+```
+
+<details>
+<summary><b>Requirements</b></summary>
+
+- debian/ubuntu/rhel/fedora/opensuse
+- docker
+- nvidia drivers >=550.54.14
+- nvidia-container-toolkit
+
+Docs:
+
+- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/supported-platforms.html)
+- [CUDA Toolkit Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id6)
+
+</details>
+
+<details>
+<summary><b>Steps</b></summary>
+
+1. Check the driver version and which GPU you want to use (0/1/2/3...), then update the [compose-gpu.yaml](./deploy-examples/compose-gpu.yaml) file or use `count: all`.
+
+    ```sh
+    nvidia-smi
+    ```
+
+2. Check if the NVIDIA Container Toolkit is installed/updated
+
+    ```sh
+    # debian
+    dpkg -l | grep nvidia-container-toolkit
+    ```
+
+    ```sh
+    # rhel
+    rpm -q nvidia-container-toolkit
+    ```
+
+    NVIDIA Container Toolkit install steps can be found here:
+
+    <https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html>
+
+3. Check which runtime is being used by Docker
+
+    ```sh
+    # docker
+    docker info | grep -i runtime
+    ```
+
+4. If the default Docker runtime changes back from 'nvidia' to 'default' after restarting the Docker service (optional):
+
+    Backup the daemon.json file:
+
+    ```sh
+    sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
+    ```
+
+    Update the daemon.json file:
+
+    ```sh
+    echo '{
+      "runtimes": {
+        "nvidia": {
+          "path": "nvidia-container-runtime"
+        }
+      },
+      "default-runtime": "nvidia"
+    }' | sudo tee /etc/docker/daemon.json > /dev/null
+    ```
+
+    Restart the Docker service:
+
+    ```sh
+    sudo systemctl restart docker
+    ```
+
+    Confirm 'nvidia' is the default runtime used by Docker by repeating step 3.
+
+5. Run the container:
+
+    ```sh
+    docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
+    ```
+
+</details>
+
+## OpenShift
+
+### Simple deployment
+
+Manifest example: [docling-serve-simple.yaml](./deploy-examples/docling-serve-simple.yaml)
+
+This deployment example has the following features:
+
+- Deployment configuration
+- Service configuration
+- NVIDIA cuda enabled
+
+Install the app with:
+
+```sh
+oc apply -f docs/deploy-examples/docling-serve-simple.yaml
+```
+
+For using the API:
+
+```sh
+# Port-forward the service
+oc port-forward svc/docling-serve 5001:5001
+
+# Make a test query
+curl -X 'POST' \
+  "localhost:5001/v1alpha/convert/source/async" \
+  -H "accept: application/json" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
+  }'
+```
+
+### Secure deployment with `oauth-proxy`
+
+Manifest example: [docling-serve-oauth.yaml](./deploy-examples/docling-serve-oauth.yaml)
+
+This deployment has the following features:
+
+- TLS encryption between all components (using the cluster-internal CA).
+- Authentication via a secure `oauth-proxy` sidecar.
+- Service exposed through a secure OpenShift `Route`.
+
+Install the app with:
+
+```sh
+oc apply -f docs/deploy-examples/docling-serve-oauth.yaml
+```
+
+For using the API:
+
+```sh
+# Retrieve the endpoint
+DOCLING_NAME=docling-serve
+DOCLING_ROUTE="https://$(oc get routes ${DOCLING_NAME} --template={{.spec.host}})"
+
+# Retrieve the authentication token
+OCP_AUTH_TOKEN=$(oc whoami --show-token)
+
+# Make a test query
+curl -X 'POST' \
+  "${DOCLING_ROUTE}/v1alpha/convert/source/async" \
+  -H "Authorization: Bearer ${OCP_AUTH_TOKEN}" \
+  -H "accept: application/json" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
+  }'
+```
--- a/docs/development.md
+++ b/docs/development.md
@@ -0,0 +1,57 @@
+# Development
+
+## Install dependencies
+
+### CPU only
+
+```sh
+# Install uv if not already available
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# Install dependencies
+uv sync --extra cpu
+```
+
+### Cuda GPU
+
+For GPU support use the following command:
+
+```sh
+# Install dependencies
+uv sync
+```
+
+### Gradio UI and different OCR backends
+
+The `/ui` endpoint (built with `gradio`) and different OCR backends can be enabled via package extras:
+
+```sh
+# Enable ui and rapidocr
+uv sync --extra ui --extra rapidocr
+```
+
+```sh
+# Enable tesserocr
+uv sync --extra tesserocr
+```
+
+See the `[project.optional-dependencies]` section in `pyproject.toml` for the full list of extras, and run `uv run docling-serve --help` for the runtime options.
+
+### Run the server
+
+The `docling-serve` executable is a convenient script for launching the webserver both in
+development and production mode.
+
+```sh
+# Run the server in development mode
+# - reload is enabled by default
+# - listening on the 127.0.0.1 address
+# - ui is enabled by default
+docling-serve dev
+
+# Run the server in production mode
+# - reload is disabled by default
+# - listening on the 0.0.0.0 address
+# - ui is disabled by default
+docling-serve run
+```
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -0,0 +1,279 @@
+# Usage
+
+The API provides two endpoints: one for URLs, one for files. This is necessary to send files directly in binary format instead of base64-encoded strings.
+
+## Common parameters
+
+In addition to the file source (see below), both endpoints support the same parameters, which are almost the same as those of the Docling CLI.
+
+- `from_formats` (List[str]): Input format(s) to convert from. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`. Defaults to all formats.
+- `to_formats` (List[str]): Output format(s) to convert to. Allowed values: `md`, `json`, `html`, `text`, `doctags`. Defaults to `md`.
+- `do_ocr` (bool): If enabled, the bitmap content will be processed using OCR. Defaults to `True`.
+- `image_export_mode`: Image export mode for the document (only in case of JSON, Markdown or HTML). Allowed values: embedded, placeholder, referenced. Optional, defaults to `embedded`.
+- `force_ocr` (bool): If enabled, replace any existing text with OCR-generated text over the full content. Defaults to `False`.
+- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesseract_cli`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`.
+- `ocr_lang` (List[str]): List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. Defaults to empty.
+- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`. Defaults to `dlparse_v2`.
+- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
+- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
+- `return_as_file` (bool): If enabled, return the output as a file. Defaults to false.
+- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
+- `include_images` (bool): If enabled, images will be extracted from the document. Defaults to true.
+- `images_scale` (float): Scale factor for images. Defaults to 2.0.
+
+## Convert endpoints
+
+### Source endpoint
+
+The endpoint is `/v1alpha/convert/source`, listening for POST requests of JSON payloads.
+
+In addition to the above parameters, you must provide the document(s) you want to process with either the `http_sources` or the `file_sources` field.
+The first fetches the document(s) from URL(s) (optionally with extra headers), the second allows providing documents as base64-encoded strings.
+The `options` field is not required; its entries can be partially or completely omitted.
+
+Simple payload example:
+
+```json
+{
+  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
+}
+```
+
+<details>
+
+<summary>Complete payload example:</summary>
+
+```json
+{
+  "options": {
+    "from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
+    "to_formats": ["md", "json", "html", "text", "doctags"],
+    "image_export_mode": "placeholder",
+    "do_ocr": true,
+    "force_ocr": false,
+    "ocr_engine": "easyocr",
+    "ocr_lang": ["en"],
+    "pdf_backend": "dlparse_v2",
+    "table_mode": "fast",
+    "abort_on_error": false,
+    "return_as_file": false
+  },
+  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
+}
+```
+
+</details>
+
+<details>
+
+<summary>CURL example:</summary>
+
+```sh
+curl -X 'POST' \
+  'http://localhost:5001/v1alpha/convert/source' \
+  -H 'accept: application/json' \
+  -H 'Content-Type: application/json' \
+  -d '{
+  "options": {
+    "from_formats": [
+      "docx",
+      "pptx",
+      "html",
+      "image",
+      "pdf",
+      "asciidoc",
+      "md",
+      "xlsx"
+    ],
+    "to_formats": ["md", "json", "html", "text", "doctags"],
+    "image_export_mode": "placeholder",
+    "do_ocr": true,
+    "force_ocr": false,
+    "ocr_engine": "easyocr",
+    "ocr_lang": [
+      "fr",
+      "de",
+      "es",
+      "en"
+    ],
+    "pdf_backend": "dlparse_v2",
+    "table_mode": "fast",
+    "abort_on_error": false,
+    "return_as_file": false,
+    "do_table_structure": true,
+    "include_images": true,
+    "images_scale": 2
+  },
+  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
+}'
+```
+
+</details>
+
+<details>
+<summary>Python example:</summary>
+
+```python
+import asyncio
+
+import httpx
+
+url = "http://localhost:5001/v1alpha/convert/source"
+payload = {
+  "options": {
+    "from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
+    "to_formats": ["md", "json", "html", "text", "doctags"],
+    "image_export_mode": "placeholder",
+    "do_ocr": True,
+    "force_ocr": False,
+    "ocr_engine": "easyocr",
+    "ocr_lang": ["en"],
+    "pdf_backend": "dlparse_v2",
+    "table_mode": "fast",
+    "abort_on_error": False,
+    "return_as_file": False,
+  },
+  "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
+}
+
+
+async def main() -> dict:
+    # The client is closed automatically when the block exits
+    async with httpx.AsyncClient(timeout=60.0) as async_client:
+        response = await async_client.post(url, json=payload)
+    return response.json()
+
+
+data = asyncio.run(main())
+```
+
+</details>
+
+#### File as base64
+
+The `file_sources` argument of the endpoint allows sending files as base64-encoded strings.
+When your PDF or other file type is too large, encoding it and passing it inline to curl
+can lead to an “Argument list too long” error on some systems. To avoid this, we write
+the JSON request body to a file and have curl read from that file.
+
+<details>
+<summary>CURL steps:</summary>
+
+```sh
+# 1. Base64-encode the file
+B64_DATA=$(base64 -w 0 /path/to/file/pdf-to-convert.pdf)
+
+# 2. Build the JSON with your options
+cat <<EOF > /tmp/request_body.json
+{
+  "options": {
+  },
+  "file_sources": [{
+    "base64_string": "${B64_DATA}",
+    "filename": "pdf-to-convert.pdf"
+  }]
+}
+EOF
+
+# 3. POST the request to the docling service
+curl -X POST "localhost:5001/v1alpha/convert/source" \
+     -H "Content-Type: application/json" \
+     -d @/tmp/request_body.json
+```
+
+</details>
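
The request body built in the CURL steps above can also be assembled in Python. This is a minimal sketch that uses only the standard library and the `file_sources` fields shown above (`base64_string`, `filename`); the helper name `build_file_source_body` and the sample bytes are hypothetical, and sending the body to the server is left out.

```python
import base64
import json
from typing import Optional


def build_file_source_body(
    filename: str, content: bytes, options: Optional[dict] = None
) -> str:
    """Serialize a convert/source request body carrying one base64-encoded file."""
    body = {
        "options": options or {},
        "file_sources": [
            {
                # base64 output is ASCII, so it can be embedded in JSON as-is
                "base64_string": base64.b64encode(content).decode("ascii"),
                "filename": filename,
            }
        ],
    }
    return json.dumps(body)


# Placeholder bytes stand in for the real file content
request_body = build_file_source_body("pdf-to-convert.pdf", b"%PDF-1.4 example bytes")
```

The resulting string can be written to a file and passed to curl with `-d @file`, as in step 3 above.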
+
+### File endpoint
+
+The endpoint is `/v1alpha/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as `multipart/form-data`). You can send one or multiple files.
+
+<details>
+<summary>CURL example:</summary>
+
+```sh
+curl -X 'POST' \
+  'http://127.0.0.1:5001/v1alpha/convert/file' \
+  -H 'accept: application/json' \
+  -H 'Content-Type: multipart/form-data' \
+  -F 'ocr_engine=easyocr' \
+  -F 'pdf_backend=dlparse_v2' \
+  -F 'from_formats=pdf' \
+  -F 'from_formats=docx' \
+  -F 'force_ocr=false' \
+  -F 'image_export_mode=embedded' \
+  -F 'ocr_lang=en' \
+  -F 'ocr_lang=pl' \
+  -F 'table_mode=fast' \
+  -F 'files=@2206.01062v1.pdf;type=application/pdf' \
+  -F 'abort_on_error=false' \
+  -F 'to_formats=md' \
+  -F 'to_formats=text' \
+  -F 'return_as_file=false' \
+  -F 'do_ocr=true'
+```
+
+</details>
+
+<details>
+<summary>Python example:</summary>
+
+```python
+import asyncio
+import json
+import os
+
+import httpx
+
+url = "http://localhost:5001/v1alpha/convert/file"
+parameters = {
+    "from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
+    "to_formats": ["md", "json", "html", "text", "doctags"],
+    "image_export_mode": "placeholder",
+    "do_ocr": True,
+    "force_ocr": False,
+    "ocr_engine": "easyocr",
+    "ocr_lang": ["en"],
+    "pdf_backend": "dlparse_v2",
+    "table_mode": "fast",
+    "abort_on_error": False,
+    "return_as_file": False,
+}
+
+current_dir = os.path.dirname(__file__)
+file_path = os.path.join(current_dir, "2206.01062v1.pdf")
+
+
+async def main() -> dict:
+    async with httpx.AsyncClient(timeout=60.0) as async_client:
+        # Keep the file open only for the duration of the upload
+        with open(file_path, "rb") as f:
+            files = {"files": ("2206.01062v1.pdf", f, "application/pdf")}
+            response = await async_client.post(
+                url, files=files, data={"parameters": json.dumps(parameters)}
+            )
+    assert response.status_code == 200, "Response should be 200 OK"
+    return response.json()
+
+
+data = asyncio.run(main())
+```
+
+</details>
+
+## Response format
+
+The response can be a JSON Document or a File.
+
+- If you process only one file, the response will be a JSON document with the following format:
+
+  ```jsonc
+  {
+    "document": {
+      "md_content": "",
+      "json_content": {},
+      "html_content": "",
+      "text_content": "",
+      "doctags_content": ""
+    },
+    "status": "<success|partial_success|skipped|failure>",
+    "processing_time": 0.0,
+    "timings": {},
+    "errors": []
+  }
+  ```
+
+  Depending on the value you set in `to_formats`, the different items will be populated with their respective results or left empty.
+
+  `processing_time` is the Docling processing time in seconds, and `timings` (when enabled in the backend) provides the detailed
+  timing of all the internal Docling components.
+
+- If you set the parameter `return_as_file` to True, the response will be a zip file.
+- If multiple files are generated (multiple inputs, or one input but multiple outputs with `return_as_file` True), the response will be a zip file.
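
As an illustration of the JSON response format above, the following sketch keeps only the export formats that were actually populated. It assumes the response has already been decoded into a Python dict with the layout shown above; `populated_outputs` and the sample values are hypothetical.

```python
def populated_outputs(response: dict) -> dict:
    """Return only the `*_content` items of the document that carry a result."""
    document = response.get("document") or {}
    return {
        key: value
        for key, value in document.items()
        # Empty strings, empty dicts and None all count as "not populated"
        if key.endswith("_content") and value
    }


# Sample response mirroring the documented layout, with made-up content
example_response = {
    "document": {
        "md_content": "# Title",
        "json_content": {},
        "html_content": "",
        "text_content": "Title",
        "doctags_content": "",
    },
    "status": "success",
    "processing_time": 0.0,
    "timings": {},
    "errors": [],
}
outputs = populated_outputs(example_response)
```

Filtering on truthiness rather than on key presence also covers servers that return a missing format as `null` instead of an empty value.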
+
+## Asynchronous API
+
+TBA
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "docling-serve"
-version = "0.5.1"  # DO NOT EDIT, updated automatically
+version = "0.8.0"  # DO NOT EDIT, updated automatically
 description = "Running Docling as a service"
 license = {text = "MIT"}
 authors = [
@@ -30,7 +30,8 @@ classifiers = [
 ]
 requires-python = ">=3.10"
 dependencies = [
-    "docling~=2.25.1",
+    "docling[vlm]~=2.28",
+    "mlx-vlm~=0.1.12; sys_platform == 'darwin' and platform_machine == 'arm64'",
    "fastapi[standard]~=0.115",
    "httpx~=0.28",
    "pydantic~=2.10",
@@ -43,7 +44,8 @@ dependencies = [

 [project.optional-dependencies]
 ui = [
-    "gradio~=5.9"
+    "gradio~=5.9",
+    "pydantic<2.11.0",  # fix compatibility between gradio and new pydantic 2.11
 ]
 tesserocr = [
    "tesserocr~=2.7"
@@ -109,11 +111,11 @@ namespaces = true
 docling-serve = "docling_serve.__main__:main"

 [project.urls]
-Homepage = "https://github.com/DS4SD/docling-serve"
+Homepage = "https://github.com/docling-project/docling-serve"
 # Documentation = "https://ds4sd.github.io/docling"
-Repository = "https://github.com/DS4SD/docling-serve"
-Issues = "https://github.com/DS4SD/docling-serve/issues"
-Changelog = "https://github.com/DS4SD/docling-serve/blob/main/CHANGELOG.md"
+Repository = "https://github.com/docling-project/docling-serve"
+Issues = "https://github.com/docling-project/docling-serve/issues"
+Changelog = "https://github.com/docling-project/docling-serve/blob/main/CHANGELOG.md"

 [tool.ruff]
 target-version = "py310"
@@ -195,6 +197,7 @@ module = [
    "tesserocr.*",
    "rapidocr_onnxruntime.*",
    "requests.*",
+    "mlx_vlm.*",
 ]
 ignore_missing_imports = true

--- a/tests/test_1-file-all-outputs.py
+++ b/tests/test_1-file-all-outputs.py
@@ -92,16 +92,11 @@ async def test_convert_file(async_client):
            msg=f'JSON document should contain \'{{\\n  "schema_name": "DoclingDocument\'". Received: {safe_slice(data["document"]["json_content"])}',
        )
    # HTML check
-    check.is_in(
-        "html_content",
-        data.get("document", {}),
-        msg=f"Response should contain 'html_content' key. Received keys: {list(data.get('document', {}).keys())}",
-    )
    if data.get("document", {}).get("html_content") is not None:
        check.is_in(
-            '<!DOCTYPE html>\n<html lang="en">\n<head>',
+            "<!DOCTYPE html>\n<html>\n<head>",
            data["document"]["html_content"],
-            msg=f"HTML document should contain '<!DOCTYPE html>\\n<html lang=\"en'>. Received: {safe_slice(data['document']['html_content'])}",
+            msg=f"HTML document should contain '<!DOCTYPE html>\\n<html>'. Received: {safe_slice(data['document']['html_content'])}",
        )
    # Text check
    check.is_in(
@@ -123,7 +118,7 @@ async def test_convert_file(async_client):
    )
    if data.get("document", {}).get("doctags_content") is not None:
        check.is_in(
-            "<document>\n<section_header_level_1><location>",
+            "<doctag><page_header><loc",
            data["document"]["doctags_content"],
-            msg=f"DocTags document should contain '<document>\\n<section_header_level_1><location>'. Received: {safe_slice(data['document']['doctags_content'])}",
+            msg=f"DocTags document should contain '<doctag><page_header><loc'. Received: {safe_slice(data['document']['doctags_content'])}",
        )
--- a/tests/test_1-url-all-outputs.py
+++ b/tests/test_1-url-all-outputs.py
@@ -93,9 +93,9 @@ async def test_convert_url(async_client):
    )
    if data.get("document", {}).get("html_content") is not None:
        check.is_in(
-            '<!DOCTYPE html>\n<html lang="en">\n<head>',
+            "<!DOCTYPE html>\n<html>\n<head>",
            data["document"]["html_content"],
-            msg=f"HTML document should contain '<!DOCTYPE html>\\n<html lang=\"en'>. Received: {safe_slice(data['document']['html_content'])}",
+            msg=f"HTML document should contain '<!DOCTYPE html>\\n<html>'. Received: {safe_slice(data['document']['html_content'])}",
        )
    # Text check
    check.is_in(
@@ -117,7 +117,7 @@ async def test_convert_url(async_client):
    )
    if data.get("document", {}).get("doctags_content") is not None:
        check.is_in(
-            "<document>\n<section_header_level_1><location>",
+            "<doctag><page_header><loc",
            data["document"]["doctags_content"],
-            msg=f"DocTags document should contain '<document>\\n<section_header_level_1><location>'. Received: {safe_slice(data['document']['doctags_content'])}",
+            msg=f"DocTags document should contain '<doctag><page_header><loc'. Received: {safe_slice(data['document']['doctags_content'])}",
        )
--- a/uv.lock
+++ b/uv.lock
Author	SHA1	Message	Date
github-actions[bot]	40bb21d347	chore: bump version to 0.8.0 [skip ci]	2025-04-22 13:04:33 +00:00
Michele Dolfi	ee89ee4dae	feat: Add option for vlm pipeline (#143) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-22 14:46:33 +02:00
Michele Dolfi	6b3d281f02	feat: Expose more conversion options (#142) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-22 10:41:47 +02:00
Tiago Santana	b598872e5c	feat(UI): change UI to use async endpoints (#131) Signed-off-by: Tiago Santana <54704492+SantanaTiago@users.noreply.github.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-19 19:59:07 +02:00
Michele Dolfi	087417e5c2	docs: fix required permissions for oauth2-proxy requests (#141) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-19 18:46:28 +02:00
Michele Dolfi	57f9073bc0	fix(UI): use https when calling the api (#139) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-19 17:35:54 +02:00
Rui Dias Gomes	525a43ff6f	docs: update deployment examples (#135) Signed-off-by: rmdg88 <rmdg88@gmail.com> Signed-off-by: Rui Dias Gomes <66125272+rmdg88@users.noreply.github.com>	2025-04-17 14:29:34 +02:00
Michele Dolfi	c1ce4719c9	fix: fix permissions in docker image (#136) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-17 14:27:43 +02:00
Kasper Dinkla	5dfb75d3b9	fix: picture caption visuals (#129) Signed-off-by: DKL <dkl@zurich.ibm.com>	2025-04-15 13:17:00 +02:00
Michele Dolfi	420162e674	docs: fix image tag (#124) Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>	2025-04-11 16:19:39 +02:00
github-actions[bot]	ff75bab21b	chore: bump version to 0.7.0 [skip ci]	2025-03-31 13:44:01 +00:00
Michele Dolfi	7a0fabae07	feat: Expose TLS settings and example deploy with oauth-proxy (#112) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-31 14:51:30 +02:00
Maxim Lysak	9ffe49a359	chore: Readme picture (#108) Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>	2025-03-31 08:29:09 -04:00
Michele Dolfi	68772bb6f0	feat: Offline static files (#109) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-26 18:54:54 -04:00
Michele Dolfi	20ec87a63a	feat: Update to Docling 2.28 (#106) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-24 20:00:25 -04:00
Eugene	e30f458923	fix: Move ARGs to prevent cache invalidation (#104) Signed-off-by: Eugene <fogaprod@gmail.com>	2025-03-22 12:31:42 +01:00
github-actions[bot]	03e405638f	chore: bump version to 0.6.0 [skip ci]	2025-03-17 12:43:23 +00:00
Michele Dolfi	fd8e40a008	docs: simplify README and move details to docs (#102) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-17 13:40:12 +01:00
Michele Dolfi	422c402bab	fix: allow changes in CORS settings (#100) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-17 09:49:17 +01:00
Michele Dolfi	ea090288d3	fix: avoid exploding options cache using lru and expose size parameter (#101) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-17 08:52:29 +01:00
Michele Dolfi	07c48edd5d	fix: increase timeout_keep_alive and allow parameter changes (#98) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-16 09:03:40 +01:00
Michele Dolfi	a212547d28	fix: add warning when using incompatible parameters (#99) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-16 09:03:22 +01:00
Michele Dolfi	c76daac70c	fix(ui): use --port parameter and avoid failing when image is not found (#97) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-16 09:02:53 +01:00
Michele Dolfi	7994b19b9f	chore: move to docling-project gh org (#95) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-14 14:04:31 +01:00
Tiago Santana	ec57b528ed	feat: expose options for new features (#92) Signed-off-by: Tiago Santana <54704492+SantanaTiago@users.noreply.github.com>	2025-03-13 17:09:59 +01:00