25 Commits

Author SHA1 Message Date
github-actions[bot]
40bb21d347 chore: bump version to 0.8.0 [skip ci] 2025-04-22 13:04:33 +00:00
Michele Dolfi
ee89ee4dae feat: Add option for vlm pipeline (#143)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-22 14:46:33 +02:00
Michele Dolfi
6b3d281f02 feat: Expose more conversion options (#142)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-22 10:41:47 +02:00
Tiago Santana
b598872e5c feat(UI): change UI to use async endpoints (#131)
Signed-off-by: Tiago Santana <54704492+SantanaTiago@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-19 19:59:07 +02:00
Michele Dolfi
087417e5c2 docs: fix required permissions for oauth2-proxy requests (#141)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-19 18:46:28 +02:00
Michele Dolfi
57f9073bc0 fix(UI): use https when calling the api (#139)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-19 17:35:54 +02:00
Rui Dias Gomes
525a43ff6f docs: update deployment examples (#135)
Signed-off-by: rmdg88 <rmdg88@gmail.com>
Signed-off-by: Rui Dias Gomes <66125272+rmdg88@users.noreply.github.com>
2025-04-17 14:29:34 +02:00
Michele Dolfi
c1ce4719c9 fix: fix permissions in docker image (#136)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-17 14:27:43 +02:00
Kasper Dinkla
5dfb75d3b9 fix: picture caption visuals (#129)
Signed-off-by: DKL <dkl@zurich.ibm.com>
2025-04-15 13:17:00 +02:00
Michele Dolfi
420162e674 docs: fix image tag (#124)
Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
2025-04-11 16:19:39 +02:00
github-actions[bot]
ff75bab21b chore: bump version to 0.7.0 [skip ci] 2025-03-31 13:44:01 +00:00
Michele Dolfi
7a0fabae07 feat: Expose TLS settings and example deploy with oauth-proxy (#112)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-31 14:51:30 +02:00
Maxim Lysak
9ffe49a359 chore: Readme picture (#108)
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
2025-03-31 08:29:09 -04:00
Michele Dolfi
68772bb6f0 feat: Offline static files (#109)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-26 18:54:54 -04:00
Michele Dolfi
20ec87a63a feat: Update to Docling 2.28 (#106)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-24 20:00:25 -04:00
Eugene
e30f458923 fix: Move ARGs to prevent cache invalidation (#104)
Signed-off-by: Eugene <fogaprod@gmail.com>
2025-03-22 12:31:42 +01:00
github-actions[bot]
03e405638f chore: bump version to 0.6.0 [skip ci] 2025-03-17 12:43:23 +00:00
Michele Dolfi
fd8e40a008 docs: simplify README and move details to docs (#102)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-17 13:40:12 +01:00
Michele Dolfi
422c402bab fix: allow changes in CORS settings (#100)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-17 09:49:17 +01:00
Michele Dolfi
ea090288d3 fix: avoid exploding options cache using lru and expose size parameter (#101)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-17 08:52:29 +01:00
Michele Dolfi
07c48edd5d fix: increase timeout_keep_alive and allow parameter changes (#98)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-16 09:03:40 +01:00
Michele Dolfi
a212547d28 fix: add warning when using incompatible parameters (#99)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-16 09:03:22 +01:00
Michele Dolfi
c76daac70c fix(ui): use --port parameter and avoid failing when image is not found (#97)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-16 09:02:53 +01:00
Michele Dolfi
7994b19b9f chore: move to docling-project gh org (#95)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-03-14 14:04:31 +01:00
Tiago Santana
ec57b528ed feat: expose options for new features (#92)
Signed-off-by: Tiago Santana <54704492+SantanaTiago@users.noreply.github.com>
2025-03-13 17:09:59 +01:00
34 changed files with 4584 additions and 2473 deletions

2
.github/SECURITY.md vendored
View File

@@ -20,4 +20,4 @@ After the initial reply to your report, the security team will keep you informed
## Security Alerts
We will send announcements of security vulnerabilities and steps to remediate on the [Docling announcements](https://github.com/DS4SD/docling/discussions/categories/announcements).
We will send announcements of security vulnerabilities and steps to remediate on the [Docling announcements](https://github.com/docling-project/docling/discussions/categories/announcements).

View File

@@ -13,15 +13,15 @@ jobs:
strategy:
matrix:
spec:
- name: ds4sd/docling-serve
- name: docling-project/docling-serve
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu
platforms: linux/amd64, linux/arm64
- name: ds4sd/docling-serve-cpu
- name: docling-project/docling-serve-cpu
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cu124
platforms: linux/amd64, linux/arm64
- name: ds4sd/docling-serve-cu124
- name: docling-project/docling-serve-cu124
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cpu
platforms: linux/amd64

View File

@@ -8,7 +8,7 @@ on:
jobs:
code-checks:
# if: ${{ github.event_name == 'push' || (github.event.pull_request.head.repo.full_name != 'DS4SD/docling-serve' && github.event.pull_request.head.repo.full_name != 'ds4sd/docling-serve') }}
# if: ${{ github.event_name == 'push' || (github.event.pull_request.head.repo.full_name != 'docling-project/docling-serve' && github.event.pull_request.head.repo.full_name != 'docling-project/docling-serve') }}
uses: ./.github/workflows/job-checks.yml
permissions:
packages: write

View File

@@ -17,15 +17,15 @@ jobs:
strategy:
matrix:
spec:
- name: ds4sd/docling-serve
- name: docling-project/docling-serve
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu
platforms: linux/amd64, linux/arm64
- name: ds4sd/docling-serve-cpu
- name: docling-project/docling-serve-cpu
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cu124
platforms: linux/amd64, linux/arm64
- name: ds4sd/docling-serve-cu124
- name: docling-project/docling-serve-cu124
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cpu
platforms: linux/amd64

View File

@@ -3,7 +3,7 @@ config:
no-emphasis-as-header: false
first-line-heading: false
MD033:
allowed_elements: ["details", "summary"]
allowed_elements: ["details", "summary", "br", "a", "b", "p", "img"]
MD024:
siblings_only: true
globs:

View File

@@ -5,10 +5,14 @@ repos:
hooks:
# Run the Ruff formatter.
- id: ruff-format
name: "Ruff formatter"
args: [--config=pyproject.toml]
files: '^(docling_serve|tests).*\.(py|ipynb)$'
# Run the Ruff linter.
- id: ruff
name: "Ruff linter"
args: [--exit-non-zero-on-fix, --fix, --config=pyproject.toml]
files: '^(docling_serve|tests).*\.(py|ipynb)$'
- repo: local
hooks:
- id: system

View File

@@ -1,36 +1,86 @@
## [v0.5.1](https://github.com/DS4SD/docling-serve/releases/tag/v0.5.1) - 2025-03-10
### Fix
* Submodules in wheels ([#85](https://github.com/DS4SD/docling-serve/issues/85)) ([`a92ad48`](https://github.com/DS4SD/docling-serve/commit/a92ad48b287bfcb134011dc0fc3f91ee04e067ee))
## [v0.5.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.5.0) - 2025-03-07
## [v0.8.0](https://github.com/docling-project/docling-serve/releases/tag/v0.8.0) - 2025-04-22
### Feature
* Async api ([#60](https://github.com/DS4SD/docling-serve/issues/60)) ([`82f8900`](https://github.com/DS4SD/docling-serve/commit/82f890019745859699c1b01f9ccfb64cb7e37906))
* Display version in fastapi docs ([#78](https://github.com/DS4SD/docling-serve/issues/78)) ([`ed851c9`](https://github.com/DS4SD/docling-serve/commit/ed851c95fee5f59305ddc3dcd5c09efce618470b))
* Add option for vlm pipeline ([#143](https://github.com/docling-project/docling-serve/issues/143)) ([`ee89ee4`](https://github.com/docling-project/docling-serve/commit/ee89ee4daee5e916bd6a3bdb452f78934cd03f60))
* Expose more conversion options ([#142](https://github.com/docling-project/docling-serve/issues/142)) ([`6b3d281`](https://github.com/docling-project/docling-serve/commit/6b3d281f02905c195ab75f25bb39f5c4d4e7b680))
* **UI:** Change UI to use async endpoints ([#131](https://github.com/docling-project/docling-serve/issues/131)) ([`b598872`](https://github.com/docling-project/docling-serve/commit/b598872e5c48928ac44417a11bb7acc0e5c3f0c6))
### Fix
* Remove uv from image, merge ARG and ENV declarations ([#57](https://github.com/DS4SD/docling-serve/issues/57)) ([`c95db36`](https://github.com/DS4SD/docling-serve/commit/c95db3643807a4dfb96d93c8e10d6eb486c49a30))
* **docs:** Remove comma in convert/source curl example ([#73](https://github.com/DS4SD/docling-serve/issues/73)) ([`05df073`](https://github.com/DS4SD/docling-serve/commit/05df0735d35a589bdc2a11fcdd764a10f700cb6f))
* **UI:** Use https when calling the api ([#139](https://github.com/docling-project/docling-serve/issues/139)) ([`57f9073`](https://github.com/docling-project/docling-serve/commit/57f9073bc0daf72428b068ea28e2bec7cd76c37b))
* Fix permissions in docker image ([#136](https://github.com/docling-project/docling-serve/issues/136)) ([`c1ce471`](https://github.com/docling-project/docling-serve/commit/c1ce4719c933179ba3c59d73d0584853bbd6fa6a))
* Picture caption visuals ([#129](https://github.com/docling-project/docling-serve/issues/129)) ([`5dfb75d`](https://github.com/docling-project/docling-serve/commit/5dfb75d3b9a7022d1daad12edbb8ec7bbf9aa264))
## [v0.4.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.4.0) - 2025-02-26
### Documentation
* Fix required permissions for oauth2-proxy requests ([#141](https://github.com/docling-project/docling-serve/issues/141)) ([`087417e`](https://github.com/docling-project/docling-serve/commit/087417e5c2387d4ed95500222058f34d8a8702aa))
* Update deployment examples ([#135](https://github.com/docling-project/docling-serve/issues/135)) ([`525a43f`](https://github.com/docling-project/docling-serve/commit/525a43ff6f04b7cc80f9dd6a0e653a8d8c4ab317))
* Fix image tag ([#124](https://github.com/docling-project/docling-serve/issues/124)) ([`420162e`](https://github.com/docling-project/docling-serve/commit/420162e674cc38b4c3c13673ffbee4c20a1b15f1))
## [v0.7.0](https://github.com/docling-project/docling-serve/releases/tag/v0.7.0) - 2025-03-31
### Feature
* New container images ([#68](https://github.com/DS4SD/docling-serve/issues/68)) ([`7e6d9cd`](https://github.com/DS4SD/docling-serve/commit/7e6d9cdef398df70a5b4d626aeb523c428c10d56))
* Render DoclingDocument with npm docling-components in the example UI ([#65](https://github.com/DS4SD/docling-serve/issues/65)) ([`c430d9b`](https://github.com/DS4SD/docling-serve/commit/c430d9b1a162ab29104d86ebaa1ac5a5488b1f09))
## [v0.3.0](https://github.com/DS4SD/docling-serve/releases/tag/v0.3.0) - 2025-02-19
### Feature
* Add new docling-serve cli ([#50](https://github.com/DS4SD/docling-serve/issues/50)) ([`ec33a61`](https://github.com/DS4SD/docling-serve/commit/ec33a61faa7846b9b7998fbf557ebe39a3b800f6))
* Expose TLS settings and example deploy with oauth-proxy ([#112](https://github.com/docling-project/docling-serve/issues/112)) ([`7a0faba`](https://github.com/docling-project/docling-serve/commit/7a0fabae07020c2659dbb22c3b0359909051a74c))
* Offline static files ([#109](https://github.com/docling-project/docling-serve/issues/109)) ([`68772bb`](https://github.com/docling-project/docling-serve/commit/68772bb6f0a87b71094a08ff851f5754c6ca6163))
* Update to Docling 2.28 ([#106](https://github.com/docling-project/docling-serve/issues/106)) ([`20ec87a`](https://github.com/docling-project/docling-serve/commit/20ec87a63a99145bc0ad7931549af8a0c30db641))
### Fix
* Set DOCLING_SERVE_ARTIFACTS_PATH in images ([#53](https://github.com/DS4SD/docling-serve/issues/53)) ([`4877248`](https://github.com/DS4SD/docling-serve/commit/487724836896576ca4f98e84abf15fd1c383bec8))
* Set root UI path when behind proxy ([#38](https://github.com/DS4SD/docling-serve/issues/38)) ([`c64a450`](https://github.com/DS4SD/docling-serve/commit/c64a450bf9ba9947ab180e92bef2763ff710b210))
* Support python 3.13 and docling updates and switch to uv ([#48](https://github.com/DS4SD/docling-serve/issues/48)) ([`ae3b490`](https://github.com/DS4SD/docling-serve/commit/ae3b4906f1c0829b1331ea491f3518741cabff71))
* Move ARGs to prevent cache invalidation ([#104](https://github.com/docling-project/docling-serve/issues/104)) ([`e30f458`](https://github.com/docling-project/docling-serve/commit/e30f458923d34c169db7d5a5c296848716e8cac4))
## [v0.6.0](https://github.com/docling-project/docling-serve/releases/tag/v0.6.0) - 2025-03-17
### Feature
* Expose options for new features ([#92](https://github.com/docling-project/docling-serve/issues/92)) ([`ec57b52`](https://github.com/docling-project/docling-serve/commit/ec57b528ed3f8e7b9604ff4cdf06da3d52c714dd))
### Fix
* Allow changes in CORS settings ([#100](https://github.com/docling-project/docling-serve/issues/100)) ([`422c402`](https://github.com/docling-project/docling-serve/commit/422c402bab7f05e46274ede11f234a19a62e093e))
* Avoid exploding options cache using lru and expose size parameter ([#101](https://github.com/docling-project/docling-serve/issues/101)) ([`ea09028`](https://github.com/docling-project/docling-serve/commit/ea090288d3eec4ea8fbdcd32a6a497a99c89189d))
* Increase timeout_keep_alive and allow parameter changes ([#98](https://github.com/docling-project/docling-serve/issues/98)) ([`07c48ed`](https://github.com/docling-project/docling-serve/commit/07c48edd5d9437219d9623e3d05bc5166c5bb85a))
* Add warning when using incompatible parameters ([#99](https://github.com/docling-project/docling-serve/issues/99)) ([`a212547`](https://github.com/docling-project/docling-serve/commit/a212547d28d6588c65e52000dc7bc04f3f77e69e))
* **ui:** Use --port parameter and avoid failing when image is not found ([#97](https://github.com/docling-project/docling-serve/issues/97)) ([`c76daac`](https://github.com/docling-project/docling-serve/commit/c76daac70c87da412f791666881e48b74688b060))
### Documentation
* Simplify README and move details to docs ([#102](https://github.com/docling-project/docling-serve/issues/102)) ([`fd8e40a`](https://github.com/docling-project/docling-serve/commit/fd8e40a00849771263d9b75b9a56f6caeccb8517))
## [v0.5.1](https://github.com/docling-project/docling-serve/releases/tag/v0.5.1) - 2025-03-10
### Fix
* Submodules in wheels ([#85](https://github.com/docling-project/docling-serve/issues/85)) ([`a92ad48`](https://github.com/docling-project/docling-serve/commit/a92ad48b287bfcb134011dc0fc3f91ee04e067ee))
## [v0.5.0](https://github.com/docling-project/docling-serve/releases/tag/v0.5.0) - 2025-03-07
### Feature
* Async api ([#60](https://github.com/docling-project/docling-serve/issues/60)) ([`82f8900`](https://github.com/docling-project/docling-serve/commit/82f890019745859699c1b01f9ccfb64cb7e37906))
* Display version in fastapi docs ([#78](https://github.com/docling-project/docling-serve/issues/78)) ([`ed851c9`](https://github.com/docling-project/docling-serve/commit/ed851c95fee5f59305ddc3dcd5c09efce618470b))
### Fix
* Remove uv from image, merge ARG and ENV declarations ([#57](https://github.com/docling-project/docling-serve/issues/57)) ([`c95db36`](https://github.com/docling-project/docling-serve/commit/c95db3643807a4dfb96d93c8e10d6eb486c49a30))
* **docs:** Remove comma in convert/source curl example ([#73](https://github.com/docling-project/docling-serve/issues/73)) ([`05df073`](https://github.com/docling-project/docling-serve/commit/05df0735d35a589bdc2a11fcdd764a10f700cb6f))
## [v0.4.0](https://github.com/docling-project/docling-serve/releases/tag/v0.4.0) - 2025-02-26
### Feature
* New container images ([#68](https://github.com/docling-project/docling-serve/issues/68)) ([`7e6d9cd`](https://github.com/docling-project/docling-serve/commit/7e6d9cdef398df70a5b4d626aeb523c428c10d56))
* Render DoclingDocument with npm docling-components in the example UI ([#65](https://github.com/docling-project/docling-serve/issues/65)) ([`c430d9b`](https://github.com/docling-project/docling-serve/commit/c430d9b1a162ab29104d86ebaa1ac5a5488b1f09))
## [v0.3.0](https://github.com/docling-project/docling-serve/releases/tag/v0.3.0) - 2025-02-19
### Feature
* Add new docling-serve cli ([#50](https://github.com/docling-project/docling-serve/issues/50)) ([`ec33a61`](https://github.com/docling-project/docling-serve/commit/ec33a61faa7846b9b7998fbf557ebe39a3b800f6))
### Fix
* Set DOCLING_SERVE_ARTIFACTS_PATH in images ([#53](https://github.com/docling-project/docling-serve/issues/53)) ([`4877248`](https://github.com/docling-project/docling-serve/commit/487724836896576ca4f98e84abf15fd1c383bec8))
* Set root UI path when behind proxy ([#38](https://github.com/docling-project/docling-serve/issues/38)) ([`c64a450`](https://github.com/docling-project/docling-serve/commit/c64a450bf9ba9947ab180e92bef2763ff710b210))
* Support python 3.13 and docling updates and switch to uv ([#48](https://github.com/docling-project/docling-serve/issues/48)) ([`ae3b490`](https://github.com/docling-project/docling-serve/commit/ae3b4906f1c0829b1331ea491f3518741cabff71))

View File

@@ -3,13 +3,13 @@
Our project welcomes external contributions. If you have an itch, please feel
free to scratch it.
To contribute code or documentation, please submit a [pull request](https://github.com/DS4SD/docling-serve/pulls).
To contribute code or documentation, please submit a [pull request](https://github.com/docling-project/docling-serve/pulls).
A good way to familiarize yourself with the codebase and contribution process is
to look for and tackle low-hanging fruit in the [issue tracker](https://github.com/DS4SD/docling-serve/issues).
to look for and tackle low-hanging fruit in the [issue tracker](https://github.com/docling-project/docling-serve/issues).
Before embarking on a more ambitious contribution, please quickly [get in touch](#communication) with us.
For general questions or support requests, please refer to the [discussion section](https://github.com/DS4SD/docling-serve/discussions).
For general questions or support requests, please refer to the [discussion section](https://github.com/docling-project/docling-serve/discussions).
**Note: We appreciate your effort, and want to avoid a situation where a contribution
requires extensive rework (by you or by us), sits in backlog for a long time, or
@@ -17,14 +17,14 @@ cannot be accepted at all!**
### Proposing new features
If you would like to implement a new feature, please [raise an issue](https://github.com/DS4SD/docling-serve/issues)
If you would like to implement a new feature, please [raise an issue](https://github.com/docling-project/docling-serve/issues)
before sending a pull request so the feature can be discussed. This is to avoid
you wasting your valuable time working on a feature that the project developers
are not interested in accepting into the code base.
### Fixing bugs
If you would like to fix a bug, please [raise an issue](https://github.com/DS4SD/docling-serve/issues) before sending a
If you would like to fix a bug, please [raise an issue](https://github.com/docling-project/docling-serve/issues) before sending a
pull request so it can be tracked.
### Merge approval
@@ -73,7 +73,7 @@ git commit -s
## Communication
Please feel free to connect with us using the [discussion section](https://github.com/DS4SD/docling-serve/discussions).
Please feel free to connect with us using the [discussion section](https://github.com/docling-project/docling-serve/discussions).
## Developing

View File

@@ -2,9 +2,6 @@ ARG BASE_IMAGE=quay.io/sclorg/python-312-c9s:c9s
FROM ${BASE_IMAGE}
ARG MODELS_LIST="layout tableformer picture_classifier easyocr" \
UV_SYNC_EXTRA_ARGS=""
USER 0
###################################################################################################
@@ -20,6 +17,8 @@ RUN --mount=type=bind,source=os-packages.txt,target=/tmp/os-packages.txt \
dnf -y clean all && \
rm -rf /var/cache/dnf
RUN /usr/bin/fix-permissions /opt/app-root/src/.cache
ENV TESSDATA_PREFIX=/usr/share/tesseract/tessdata/
###################################################################################################
@@ -41,25 +40,29 @@ ENV \
UV_PROJECT_ENVIRONMENT=/opt/app-root \
DOCLING_SERVE_ARTIFACTS_PATH=/opt/app-root/src/.cache/docling/models
ARG UV_SYNC_EXTRA_ARGS=""
RUN --mount=from=ghcr.io/astral-sh/uv:0.6.1,source=/uv,target=/bin/uv \
--mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
--mount=type=bind,source=uv.lock,target=uv.lock \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
uv sync --frozen --no-install-project --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
umask 002 && uv sync --frozen --no-install-project --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
ARG MODELS_LIST="layout tableformer picture_classifier easyocr"
RUN echo "Downloading models..." && \
HF_HUB_DOWNLOAD_TIMEOUT="90" \
HF_HUB_ETAG_TIMEOUT="90" \
docling-tools models download -o "${DOCLING_SERVE_ARTIFACTS_PATH}" ${MODELS_LIST} && \
chown -R 1001:0 /opt/app-root/src/.cache && \
chmod -R g=u /opt/app-root/src/.cache
chown -R 1001:0 ${DOCLING_SERVE_ARTIFACTS_PATH} && \
chmod -R g=u ${DOCLING_SERVE_ARTIFACTS_PATH}
COPY --chown=1001:0 ./docling_serve ./docling_serve
RUN --mount=from=ghcr.io/astral-sh/uv:0.6.1,source=/uv,target=/bin/uv \
--mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
--mount=type=bind,source=uv.lock,target=uv.lock \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
uv sync --frozen --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
umask 002 && uv sync --frozen --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
EXPOSE 5001

View File

@@ -17,6 +17,7 @@ else
endif
TAG=$(shell git rev-parse HEAD)
BRANCH_TAG=$(shell git rev-parse --abbrev-ref HEAD)
action-lint-file:
$(CMD_PREFIX) touch .action-lint
@@ -27,23 +28,23 @@ md-lint-file:
.PHONY: docling-serve-image
docling-serve-image: Containerfile
$(ECHO_PREFIX) printf " %-12s Containerfile\n" "[docling-serve]"
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu" -f Containerfile -t ghcr.io/ds4sd/docling-serve:$(TAG) .
$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) ghcr.io/ds4sd/docling-serve:main
$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve:$(TAG) quay.io/ds4sd/docling-serve:main
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra cpu" -f Containerfile -t ghcr.io/docling-project/docling-serve:$(TAG) .
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve:$(TAG) ghcr.io/docling-project/docling-serve:$(BRANCH_TAG)
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve:$(TAG) quay.io/docling-project/docling-serve:$(BRANCH_TAG)
.PHONY: docling-serve-cpu-image
docling-serve-cpu-image: Containerfile ## Build docling-serve "cpu only" container image
$(ECHO_PREFIX) printf " %-12s Containerfile\n" "[docling-serve CPU]"
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124" -f Containerfile -t ghcr.io/ds4sd/docling-serve-cpu:$(TAG) .
$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cpu:$(TAG) ghcr.io/ds4sd/docling-serve-cpu:main
$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cpu:$(TAG) quay.io/ds4sd/docling-serve-cpu:main
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124" -f Containerfile -t ghcr.io/docling-project/docling-serve-cpu:$(TAG) .
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cpu:$(TAG) ghcr.io/docling-project/docling-serve-cpu:$(BRANCH_TAG)
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cpu:$(TAG) quay.io/docling-project/docling-serve-cpu:$(BRANCH_TAG)
.PHONY: docling-serve-cu124-image
docling-serve-cu124-image: Containerfile ## Build docling-serve container image with GPU support
$(ECHO_PREFIX) printf " %-12s Containerfile\n" "[docling-serve with Cuda 12.4]"
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cpu" -f Containerfile --platform linux/amd64 -t ghcr.io/ds4sd/docling-serve-cu124:$(TAG) .
$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cu124:$(TAG) ghcr.io/ds4sd/docling-serve-cu124:main
$(CMD_PREFIX) docker tag ghcr.io/ds4sd/docling-serve-cu124:$(TAG) quay.io/ds4sd/docling-serve-cu124:main
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cpu" -f Containerfile --platform linux/amd64 -t ghcr.io/docling-project/docling-serve-cu124:$(TAG) .
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cu124:$(TAG) ghcr.io/docling-project/docling-serve-cu124:$(BRANCH_TAG)
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cu124:$(TAG) quay.io/docling-project/docling-serve-cu124:$(BRANCH_TAG)
.PHONY: action-lint
action-lint: .action-lint ## Lint GitHub Action workflows
@@ -66,7 +67,7 @@ action-lint: .action-lint ## Lint GitHub Action workflows
md-lint: .md-lint ## Lint markdown files
.md-lint: $(wildcard */**/*.md) | md-lint-file
$(ECHO_PREFIX) printf " %-12s ./...\n" "[MD LINT]"
$(CMD_PREFIX) docker run --rm -v $$(pwd):/workdir davidanson/markdownlint-cli2:v0.14.0 "**/*.md"
$(CMD_PREFIX) docker run --rm -v $$(pwd):/workdir davidanson/markdownlint-cli2:v0.16.0 "**/*.md" "#.venv"
$(CMD_PREFIX) touch $@
.PHONY: py-Lint
@@ -84,11 +85,11 @@ run-docling-cpu: ## Run the docling-serve container with CPU support and assign
$(ECHO_PREFIX) printf " %-12s Removing existing container if it exists...\n" "[CLEANUP]"
$(CMD_PREFIX) docker rm -f docling-serve-cpu 2>/dev/null || true
$(ECHO_PREFIX) printf " %-12s Running docling-serve container with CPU support on port 5001...\n" "[RUN CPU]"
$(CMD_PREFIX) docker run -it --name docling-serve-cpu -p 5001:5001 ghcr.io/ds4sd/docling-serve-cpu:main
$(CMD_PREFIX) docker run -it --name docling-serve-cpu -p 5001:5001 ghcr.io/docling-project/docling-serve-cpu:main
.PHONY: run-docling-gpu
run-docling-gpu: ## Run the docling-serve container with GPU support and assign a container name
$(ECHO_PREFIX) printf " %-12s Removing existing container if it exists...\n" "[CLEANUP]"
$(CMD_PREFIX) docker rm -f docling-serve-gpu 2>/dev/null || true
$(ECHO_PREFIX) printf " %-12s Running docling-serve container with GPU support on port 5001...\n" "[RUN GPU]"
$(CMD_PREFIX) docker run -it --name docling-serve-gpu -p 5001:5001 ghcr.io/ds4sd/docling-serve:main
$(CMD_PREFIX) docker run -it --name docling-serve-gpu -p 5001:5001 ghcr.io/docling-project/docling-serve:main

445
README.md
View File

@@ -1,431 +1,84 @@
<p align="center">
<a href="https://github.com/docling-project/docling-serve">
<img loading="lazy" alt="Docling" src="https://github.com/docling-project/docling-serve/raw/main/docs/assets/docling-serve-pic.png" width="30%"/>
</a>
</p>
# Docling Serve
Running [Docling](https://github.com/DS4SD/docling) as an API service.
Running [Docling](https://github.com/docling-project/docling) as an API service.
## Usage
## Getting started
The API provides two endpoints: one for urls, one for files. This is necessary to send files directly in binary format instead of base64-encoded strings.
Install the `docling-serve` package and run the server.
### Common parameters
```bash
# Using the python package
pip install "docling-serve"
docling-serve run
On top of the source of file (see below), both endpoints support the same parameters, which are almost the same as the Docling CLI.
- `from_format` (List[str]): Input format(s) to convert from. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`. Defaults to all formats.
- `to_formats` (List[str]): Output format(s) to convert to. Allowed values: `md`, `json`, `html`, `text`, `doctags`. Defaults to `md`.
- `do_ocr` (bool): If enabled, the bitmap content will be processed using OCR. Defaults to `True`.
- `image_export_mode`: Image export mode for the document (only in case of JSON, Markdown or HTML). Allowed values: embedded, placeholder, referenced. Optional, defaults to `embedded`.
- `force_ocr` (bool): If enabled, replace any existing text with OCR-generated text over the full content. Defaults to `False`.
- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesseract_cli`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`.
- `ocr_lang` (List[str]): List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. Defaults to empty.
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`. Defaults to `dlparse_v2`.
- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
- `return_as_file` (boo): If enabled, return the output as a file. Defaults to false.
- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
- `include_images` (bool): If enabled, images will be extracted from the document. Defaults to true.
- `images_scale` (float): Scale factor for images. Defaults to 2.0.
### URL endpoint
The endpoint is `/v1alpha/convert/source`, listening for POST requests of JSON payloads.
On top of the above parameters, you must send the URL(s) of the document you want process with either the `http_sources` or `file_sources` fields.
The first is fetching URL(s) (optionally using with extra headers), the second allows to provide documents as base64-encoded strings.
No `options` is required, they can be partially or completely omitted.
Simple payload example:
```json
{
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
# Using container images, e.g. with Podman
podman run -p 5001:5001 quay.io/docling-project/docling-serve
```
<details>
The server is available at
<summary>Complete payload example:</summary>
- API <http://127.0.0.1:5001>
- API documentation <http://127.0.0.1:5001/docs>
![swagger.png](img/swagger.png)
```json
{
"options": {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": true,
"force_ocr": false,
"ocr_engine": "easyocr",
"ocr_lang": ["en"],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
```
Try it out with a simple conversion:
</details>
<details>
<summary>CURL example:</summary>
```sh
```bash
curl -X 'POST' \
'http://localhost:5001/v1alpha/convert/source' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"options": {
"from_formats": [
"docx",
"pptx",
"html",
"image",
"pdf",
"asciidoc",
"md",
"xlsx"
],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": true,
"force_ocr": false,
"ocr_engine": "easyocr",
"ocr_lang": [
"fr",
"de",
"es",
"en"
],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
"do_table_structure": true,
"include_images": true,
"images_scale": 2
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}'
"http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
}'
```
</details>
### Container images
<details>
<summary>Python example:</summary>
Available container images:
```python
import httpx
| Name | Description | Arch | Size |
| -----|-------------|------|------|
| [`ghcr.io/docling-project/docling-serve`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve) <br /> [`quay.io/docling-project/docling-serve`](https://quay.io/repository/docling-project/docling-serve) | Simple image for Docling Serve, installing all packages from the official pypi.org index. | `linux/amd64`, `linux/arm64` | 3.6 GB |
| [`ghcr.io/docling-project/docling-serve-cpu`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cpu) <br /> [`quay.io/docling-project/docling-serve-cpu`](https://quay.io/repository/docling-project/docling-serve-cpu) | Cpu-only image which installs `torch` from the pytorch cpu index. | `linux/amd64`, `linux/arm64` | 3.6 GB |
| [`ghcr.io/docling-project/docling-serve-cu124`](https://github.com/docling-project/docling-serve/pkgs/container/docling-serve-cu124) <br /> [`quay.io/docling-project/docling-serve-cu124`](https://quay.io/repository/docling-project/docling-serve-cu124) | Cuda 12.4 image which installs `torch` from the pytorch cu124 index. | `linux/amd64` | 8.7 GB |
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/source"
payload = {
"options": {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": True,
"force_ocr": False,
"ocr_engine": "easyocr",
"ocr_lang": "en",
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
Coming soon: `docling-serve-slim` images will reduce the size by skipping the model weights download.
response = await async_client_client.post(url, json=payload)
data = response.json()
```
</details>
#### File as base64
The `file_sources` argument in the endpoint allows to send files as base64-encoded strings.
When your PDF or other file type is too large, encoding it and passing it inline to curl
can lead to an “Argument list too long” error on some systems. To avoid this, we write
the JSON request body to a file and have curl read from that file.
<details>
<summary>CURL steps:</summary>
```sh
# 1. Base64-encode the file
B64_DATA=$(base64 -w 0 /path/to/file/pdf-to-convert.pdf)
# 2. Build the JSON with your options
cat <<EOF > /tmp/request_body.json
{
"options": {
},
"file_sources": [{
"base64_string": "${B64_DATA}",
"filename": "pdf-to-convert.pdf"
}]
}
EOF
# 3. POST the request to the docling service
curl -X POST "localhost:5001/v1alpha/convert/source" \
-H "Content-Type: application/json" \
-d @/tmp/request_body.json
```
</details>
### File endpoint
The endpoint is: `/v1alpha/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as multipart/form data). You can send one or multiple files.
<details>
<summary>CURL example:</summary>
```sh
curl -X 'POST' \
'http://127.0.0.1:5001/v1alpha/convert/file' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'ocr_engine=easyocr' \
-F 'pdf_backend=dlparse_v2' \
-F 'from_formats=pdf' \
-F 'from_formats=docx' \
-F 'force_ocr=false' \
-F 'image_export_mode=embedded' \
-F 'ocr_lang=en' \
-F 'ocr_lang=pl' \
-F 'table_mode=fast' \
-F 'files=@2206.01062v1.pdf;type=application/pdf' \
-F 'abort_on_error=false' \
-F 'to_formats=md' \
-F 'to_formats=text' \
-F 'return_as_file=false' \
-F 'do_ocr=true'
```
</details>
<details>
<summary>Python example:</summary>
```python
import httpx
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/file"
parameters = {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": True,
"force_ocr": False,
"ocr_engine": "easyocr",
"ocr_lang": ["en"],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False
}
current_dir = os.path.dirname(__file__)
file_path = os.path.join(current_dir, '2206.01062v1.pdf')
files = {
'files': ('2206.01062v1.pdf', open(file_path, 'rb'), 'application/pdf'),
}
response = await async_client.post(url, files=files, data={"parameters": json.dumps(parameters)})
assert response.status_code == 200, "Response should be 200 OK"
data = response.json()
```
</details>
### Response format
The response can be a JSON Document or a File.
- If you process only one file, the response will be a JSON document with the following format:
```jsonc
{
"document": {
"md_content": "",
"json_content": {},
"html_content": "",
"text_content": "",
"doctags_content": ""
},
"status": "<success|partial_success|skipped|failure>",
"processing_time": 0.0,
"timings": {},
"errors": []
}
```
Depending on the value you set in `output_formats`, the different items will be populated with their respective results or empty.
`processing_time` is the Docling processing time in seconds, and `timings` (when enabled in the backend) provides the detailed
timing of all the internal Docling components.
- If you set the parameter `return_as_file` to True, the response will be a zip file.
- If multiple files are generated (multiple inputs, or one input but multiple outputs with `return_as_file` True), the response will be a zip file.
## Run docling-serve
Clone the repository and run the following from within the cloned directory root.
### Demonstration UI
```bash
python -m venv venv
source venv/bin/activate
# Install the Python package with the extra dependencies
pip install "docling-serve[ui]"
docling-serve run --enable-ui
# Run the container image with the extra env parameters
podman run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
```
## Helpers
- A full Swagger UI is available at the `/docs` endpoint.
![swagger.png](img/swagger.png)
- An easy to use UI is available at the `/ui` endpoint.
An easy to use UI is available at the `/ui` endpoint.
![ui-input.png](img/ui-input.png)
![ui-output.png](img/ui-output.png)
## Development
## Documentation and advance usages
### CPU only
```sh
# Install uv if not already available
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync --extra cpu
```
### Cuda GPU
For GPU support use the following command:
```sh
# Install dependencies
uv sync
```
### Gradio UI and different OCR backends
`/ui` endpoint using `gradio` and different OCR backends can be enabled via package extras:
```sh
# Enable ui and rapidocr
uv sync --extra ui --extra rapidocr
```
```sh
# Enable tesserocr
uv sync --extra tesserocr
```
See `[project.optional-dependencies]` section in `pyproject.toml` for full list of options and runtime options with `uv run docling-serve --help`.
### Run the server
The `docling-serve` executable is a convenient script for launching the webserver both in
development and production mode.
```sh
# Run the server in development mode
# - reload is enabled by default
# - listening on the 127.0.0.1 address
# - ui is enabled by default
docling-serve dev
# Run the server in production mode
# - reload is disabled by default
# - listening on the 0.0.0.0 address
# - ui is disabled by default
docling-serve run
```
### Options
The `docling-serve` executable allows is controlled with both command line
options and environment variables.
<details>
<summary>`docling-serve` help message</summary>
```sh
$ docling-serve dev --help
Usage: docling-serve dev [OPTIONS]
Run a Docling Serve app in development mode. 🧪
This is equivalent to docling-serve run but with reload
enabled and listening on the 127.0.0.1 address.
Options can be set also with the corresponding ENV variable, with the exception
of --enable-ui, --host and --reload.
╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --host TEXT The host to serve on. For local development in localhost │
│ use 127.0.0.1. To enable public access, e.g. in a │
│ container, use all the IP addresses available with │
│ 0.0.0.0. │
│ [default: 127.0.0.1] │
│ --port INTEGER The port to serve on. [default: 5001] │
│ --reload --no-reload Enable auto-reload of the server when (code) files │
│ change. This is resource intensive, use it only during │
│ development. │
│ [default: reload] │
│ --root-path TEXT The root path is used to tell your app that it is being │
│ served to the outside world with some path prefix set up │
│ in some termination proxy or similar. │
│ --proxy-headers --no-proxy-headers Enable/Disable X-Forwarded-Proto, X-Forwarded-For, │
│ X-Forwarded-Port to populate remote address info. │
│ [default: proxy-headers] │
│ --artifacts-path PATH If set to a valid directory, the model weights will be │
│ loaded from this path. │
│ [default: None] │
│ --enable-ui --no-enable-ui Enable the development UI. [default: enable-ui] │
│ --help Show this message and exit. │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
</details>
#### Environment variables
The environment variables controlling the `uvicorn` execution can be specified with the `UVICORN_` prefix:
- `UVICORN_WORKERS`: Number of workers to use.
- `UVICORN_RELOAD`: If `True`, this will enable auto-reload when you modify files, useful for development.
The environment variables controlling specifics of the Docling Serve app can be specified with the
`DOCLING_SERVE_` prefix:
- `DOCLING_SERVE_ARTIFACTS_PATH`: if set Docling will use only the local weights of models, for example `/opt/app-root/src/.cache/docling/models`.
- `DOCLING_SERVE_ENABLE_UI`: If `True`, The Gradio UI will be available at `/ui`.
Others:
- `TESSDATA_PREFIX`: Tesseract data location, example `/usr/share/tesseract/tessdata/`.
Visit the [Docling Serve documentation](./docs/README.md) for learning how to [configure the webserver](./docs/configuration.md), use all the [runtime options](./docs/usage.md) of the API and [deployment examples](./docs/deployment.md).
## Get help and support
Please feel free to connect with us using the [discussion section](https://github.com/DS4SD/docling/discussions).
Please feel free to connect with us using the [discussion section](https://github.com/docling-project/docling/discussions).
## Contributing
Please read [Contributing to Docling Serve](https://github.com/DS4SD/docling-serve/blob/main/CONTRIBUTING.md) for details.
Please read [Contributing to Docling Serve](https://github.com/docling-project/docling-serve/blob/main/CONTRIBUTING.md) for details.
## References
@@ -433,14 +86,14 @@ If you use Docling in your projects, please consider citing the following:
```bib
@techreport{Docling,
author = {Deep Search Team},
month = {8},
title = {Docling Technical Report},
url = {https://arxiv.org/abs/2408.09869},
eprint = {2408.09869},
doi = {10.48550/arXiv.2408.09869},
version = {1.0.0},
year = {2024}
author = {Docling Contributors},
month = {1},
title = {Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion},
url = {https://arxiv.org/abs/2501.17887},
eprint = {2501.17887},
doi = {10.48550/arXiv.2501.17887},
version = {2.0.0},
year = {2025}
}
```

View File

@@ -74,12 +74,44 @@ def callback(
def _run(
*,
command: str,
# Docling serve parameters
artifacts_path: Path | None,
enable_ui: bool,
) -> None:
server_type = "development" if command == "dev" else "production"
console.print(f"Starting {server_type} server 🚀")
url = f"http://{uvicorn_settings.host}:{uvicorn_settings.port}"
run_subprocess = (
uvicorn_settings.workers is not None and uvicorn_settings.workers > 1
) or uvicorn_settings.reload
run_ssl = (
uvicorn_settings.ssl_certfile is not None
and uvicorn_settings.ssl_keyfile is not None
)
if run_subprocess and docling_serve_settings.artifacts_path != artifacts_path:
err_console.print(
"\n[yellow]:warning: The server will run with reload or multiple workers. \n"
"The argument [bold]--artifacts-path[/bold] will be ignored, please set the value \n"
"using the environment variable [bold]DOCLING_SERVE_ARTIFACTS_PATH[/bold].[/yellow]"
)
if run_subprocess and docling_serve_settings.enable_ui != enable_ui:
err_console.print(
"\n[yellow]:warning: The server will run with reload or multiple workers. \n"
"The argument [bold]--enable-ui[/bold] will be ignored, please set the value \n"
"using the environment variable [bold]DOCLING_SERVE_ENABLE_UI[/bold].[/yellow]"
)
# Propagate the settings to the app settings
docling_serve_settings.artifacts_path = artifacts_path
docling_serve_settings.enable_ui = enable_ui
# Print documentation
protocol = "https" if run_ssl else "http"
url = f"{protocol}://{uvicorn_settings.host}:{uvicorn_settings.port}"
url_docs = f"{url}/docs"
url_ui = f"{url}/ui"
@@ -99,6 +131,7 @@ def _run(
console.print("")
console.print("Logs:")
# Launch the server
uvicorn.run(
app="docling_serve.app:create_app",
factory=True,
@@ -108,6 +141,10 @@ def _run(
workers=uvicorn_settings.workers,
root_path=uvicorn_settings.root_path,
proxy_headers=uvicorn_settings.proxy_headers,
timeout_keep_alive=uvicorn_settings.timeout_keep_alive,
ssl_certfile=uvicorn_settings.ssl_certfile,
ssl_keyfile=uvicorn_settings.ssl_keyfile,
ssl_keyfile_password=uvicorn_settings.ssl_keyfile_password,
)
@@ -159,6 +196,18 @@ def dev(
)
),
] = uvicorn_settings.proxy_headers,
timeout_keep_alive: Annotated[
int, typer.Option(help="Timeout for the server response.")
] = uvicorn_settings.timeout_keep_alive,
ssl_certfile: Annotated[
Optional[Path], typer.Option(help="SSL certificate file")
] = uvicorn_settings.ssl_certfile,
ssl_keyfile: Annotated[
Optional[Path], typer.Option(help="SSL key file")
] = uvicorn_settings.ssl_keyfile,
ssl_keyfile_password: Annotated[
Optional[str], typer.Option(help="SSL keyfile password")
] = uvicorn_settings.ssl_keyfile_password,
# docling options
artifacts_path: Annotated[
Optional[Path],
@@ -186,12 +235,15 @@ def dev(
uvicorn_settings.reload = reload
uvicorn_settings.root_path = root_path
uvicorn_settings.proxy_headers = proxy_headers
docling_serve_settings.artifacts_path = artifacts_path
docling_serve_settings.enable_ui = enable_ui
uvicorn_settings.timeout_keep_alive = timeout_keep_alive
uvicorn_settings.ssl_certfile = ssl_certfile
uvicorn_settings.ssl_keyfile = ssl_keyfile
uvicorn_settings.ssl_keyfile_password = ssl_keyfile_password
_run(
command="dev",
artifacts_path=artifacts_path,
enable_ui=enable_ui,
)
@@ -251,6 +303,18 @@ def run(
)
),
] = uvicorn_settings.proxy_headers,
timeout_keep_alive: Annotated[
int, typer.Option(help="Timeout for the server response.")
] = uvicorn_settings.timeout_keep_alive,
ssl_certfile: Annotated[
Optional[Path], typer.Option(help="SSL certificate file")
] = uvicorn_settings.ssl_certfile,
ssl_keyfile: Annotated[
Optional[Path], typer.Option(help="SSL key file")
] = uvicorn_settings.ssl_keyfile,
ssl_keyfile_password: Annotated[
Optional[str], typer.Option(help="SSL keyfile password")
] = uvicorn_settings.ssl_keyfile_password,
# docling options
artifacts_path: Annotated[
Optional[Path],
@@ -281,12 +345,15 @@ def run(
uvicorn_settings.workers = workers
uvicorn_settings.root_path = root_path
uvicorn_settings.proxy_headers = proxy_headers
docling_serve_settings.artifacts_path = artifacts_path
docling_serve_settings.enable_ui = enable_ui
uvicorn_settings.timeout_keep_alive = timeout_keep_alive
uvicorn_settings.ssl_certfile = ssl_certfile
uvicorn_settings.ssl_keyfile = ssl_keyfile
uvicorn_settings.ssl_keyfile_password = ssl_keyfile_password
_run(
command="run",
artifacts_path=artifacts_path,
enable_ui=enable_ui,
)

View File

@@ -18,10 +18,15 @@ from fastapi import (
WebSocketDisconnect,
)
from fastapi.middleware.cors import CORSMiddleware
from fastapi.openapi.docs import (
get_redoc_html,
get_swagger_ui_html,
get_swagger_ui_oauth2_redirect_html,
)
from fastapi.responses import RedirectResponse
from fastapi.staticfiles import StaticFiles
from docling.datamodel.base_models import DocumentStream, InputFormat
from docling.document_converter import DocumentConverter
from docling.datamodel.base_models import DocumentStream
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.requests import (
@@ -37,7 +42,7 @@ from docling_serve.datamodel.responses import (
)
from docling_serve.docling_conversion import (
convert_documents,
converters,
get_converter,
get_pdf_pipeline_opts,
)
from docling_serve.engines import get_orchestrator
@@ -86,15 +91,8 @@ _log = logging.getLogger(__name__)
@asynccontextmanager
async def lifespan(app: FastAPI):
# Converter with default options
pdf_format_option, options_hash = get_pdf_pipeline_opts(ConvertDocumentsOptions())
converters[options_hash] = DocumentConverter(
format_options={
InputFormat.PDF: pdf_format_option,
InputFormat.IMAGE: pdf_format_option,
}
)
converters[options_hash].initialize_pipeline(InputFormat.PDF)
pdf_format_option = get_pdf_pipeline_opts(ConvertDocumentsOptions())
get_converter(pdf_format_option)
orchestrator = get_orchestrator()
@@ -110,11 +108,6 @@ async def lifespan(app: FastAPI):
except asyncio.CancelledError:
_log.info("Queue processor cancelled.")
converters.clear()
# if WITH_UI:
# gradio_ui.close()
##################################
# App creation and configuration #
@@ -129,15 +122,25 @@ def create_app(): # noqa: C901
version = "0.0.0"
offline_docs_assets = False
if (
docling_serve_settings.static_path is not None
and (docling_serve_settings.static_path).is_dir()
):
offline_docs_assets = True
_log.info("Found static assets.")
app = FastAPI(
title="Docling Serve",
docs_url=None if offline_docs_assets else "/docs",
redoc_url=None if offline_docs_assets else "/redocs",
lifespan=lifespan,
version=version,
)
origins = ["*"]
methods = ["*"]
headers = ["*"]
origins = docling_serve_settings.cors_origins
methods = docling_serve_settings.cors_methods
headers = docling_serve_settings.cors_headers
app.add_middleware(
CORSMiddleware,
@@ -170,6 +173,38 @@ def create_app(): # noqa: C901
"or `pip install gradio`"
)
#############################
# Offline assets definition #
#############################
if offline_docs_assets:
app.mount(
"/static",
StaticFiles(directory=docling_serve_settings.static_path),
name="static",
)
@app.get("/docs", include_in_schema=False)
async def custom_swagger_ui_html():
return get_swagger_ui_html(
openapi_url=app.openapi_url,
title=app.title + " - Swagger UI",
oauth2_redirect_url=app.swagger_ui_oauth2_redirect_url,
swagger_js_url="/static/swagger-ui-bundle.js",
swagger_css_url="/static/swagger-ui.css",
)
@app.get(app.swagger_ui_oauth2_redirect_url, include_in_schema=False)
async def swagger_ui_redirect():
return get_swagger_ui_oauth2_redirect_html()
@app.get("/redoc", include_in_schema=False)
async def redoc_html():
return get_redoc_html(
openapi_url=app.openapi_url,
title=app.title + " - ReDoc",
redoc_js_url="/static/redoc.standalone.js",
)
#############################
# API Endpoints definitions #
#############################
@@ -177,9 +212,10 @@ def create_app(): # noqa: C901
# Favicon
@app.get("/favicon.ico", include_in_schema=False)
async def favicon():
response = RedirectResponse(
url="https://ds4sd.github.io/docling/assets/logo.png"
)
logo_url = "https://raw.githubusercontent.com/docling-project/docling/refs/heads/main/docs/assets/logo.svg"
if offline_docs_assets:
logo_url = "/static/logo.svg"
response = RedirectResponse(url=logo_url)
return response
@app.get("/health")

View File

@@ -4,9 +4,27 @@ from typing import Annotated, Optional
from pydantic import BaseModel, Field
from docling.datamodel.base_models import InputFormat, OutputFormat
from docling.datamodel.pipeline_options import OcrEngine, PdfBackend, TableFormerMode
from docling.datamodel.pipeline_options import (
EasyOcrOptions,
PdfBackend,
PdfPipeline,
TableFormerMode,
TableStructureOptions,
)
from docling.datamodel.settings import (
DEFAULT_PAGE_RANGE,
PageRange,
)
from docling.models.factories import get_ocr_factory
from docling_core.types.doc import ImageRefMode
from docling_serve.settings import docling_serve_settings
ocr_factory = get_ocr_factory(
allow_external_plugins=docling_serve_settings.allow_external_plugins
)
ocr_engines_enum = ocr_factory.get_enum()
class ConvertDocumentsOptions(BaseModel):
from_formats: Annotated[
@@ -69,18 +87,17 @@ class ConvertDocumentsOptions(BaseModel):
),
] = False
# TODO: use a restricted list based on what is installed on the system
ocr_engine: Annotated[
OcrEngine,
ocr_engine: Annotated[ # type: ignore
ocr_engines_enum,
Field(
description=(
"The OCR engine to use. String. "
"Allowed values: easyocr, tesseract, rapidocr. "
f"Allowed values: {', '.join([v.value for v in ocr_engines_enum])}. "
"Optional, defaults to easyocr."
),
examples=[OcrEngine.EASYOCR],
examples=[EasyOcrOptions.kind],
),
] = OcrEngine.EASYOCR
] = ocr_engines_enum(EasyOcrOptions.kind) # type: ignore
ocr_lang: Annotated[
Optional[list[str]],
@@ -101,25 +118,46 @@ class ConvertDocumentsOptions(BaseModel):
description=(
"The PDF backend to use. String. "
f"Allowed values: {', '.join([v.value for v in PdfBackend])}. "
f"Optional, defaults to {PdfBackend.DLPARSE_V2.value}."
f"Optional, defaults to {PdfBackend.DLPARSE_V4.value}."
),
examples=[PdfBackend.DLPARSE_V2],
examples=[PdfBackend.DLPARSE_V4],
),
] = PdfBackend.DLPARSE_V2
] = PdfBackend.DLPARSE_V4
table_mode: Annotated[
TableFormerMode,
Field(
TableFormerMode.FAST,
description=(
"Mode to use for table structure, String. "
f"Allowed values: {', '.join([v.value for v in TableFormerMode])}. "
"Optional, defaults to fast."
),
examples=[TableFormerMode.FAST],
examples=[TableStructureOptions().mode],
# pattern="fast|accurate",
),
] = TableFormerMode.FAST
] = TableStructureOptions().mode
pipeline: Annotated[
PdfPipeline,
Field(description="Choose the pipeline to process PDF or image files."),
] = PdfPipeline.STANDARD
page_range: Annotated[
PageRange,
Field(
description="Only convert a range of pages. The page number starts at 1.",
examples=[(1, 4)],
),
] = DEFAULT_PAGE_RANGE
document_timeout: Annotated[
float,
Field(
description="The timeout for processing each document, in seconds.",
gt=0,
le=docling_serve_settings.max_document_timeout,
),
] = docling_serve_settings.max_document_timeout
abort_on_error: Annotated[
bool,
@@ -172,3 +210,47 @@ class ConvertDocumentsOptions(BaseModel):
examples=[2.0],
),
] = 2.0
do_code_enrichment: Annotated[
bool,
Field(
description=(
"If enabled, perform OCR code enrichment. "
"Boolean. Optional, defaults to false."
),
examples=[False],
),
] = False
do_formula_enrichment: Annotated[
bool,
Field(
description=(
"If enabled, perform formula OCR, return Latex code. "
"Boolean. Optional, defaults to false."
),
examples=[False],
),
] = False
do_picture_classification: Annotated[
bool,
Field(
description=(
"If enabled, classify pictures in documents. "
"Boolean. Optional, defaults to false."
),
examples=[False],
),
] = False
do_picture_description: Annotated[
bool,
Field(
description=(
"If enabled, describe pictures in documents. "
"Boolean. Optional, defaults to false."
),
examples=[False],
),
] = False

View File

@@ -1,10 +1,4 @@
import enum
from typing import Optional
from pydantic import BaseModel
from docling_serve.datamodel.requests import ConvertDocumentsRequest
from docling_serve.datamodel.responses import ConvertDocumentResponse
class TaskStatus(str, enum.Enum):
@@ -16,15 +10,3 @@ class TaskStatus(str, enum.Enum):
class AsyncEngine(str, enum.Enum):
LOCAL = "local"
class Task(BaseModel):
task_id: str
task_status: TaskStatus = TaskStatus.PENDING
request: Optional[ConvertDocumentsRequest]
result: Optional[ConvertDocumentResponse] = None
def is_completed(self) -> bool:
if self.task_status in [TaskStatus.SUCCESS, TaskStatus.FAILURE]:
return True
return False

View File

@@ -0,0 +1,19 @@
from typing import Optional
from pydantic import BaseModel
from docling_serve.datamodel.engines import TaskStatus
from docling_serve.datamodel.requests import ConvertDocumentsRequest
from docling_serve.datamodel.responses import ConvertDocumentResponse
class Task(BaseModel):
task_id: str
task_status: TaskStatus = TaskStatus.PENDING
request: Optional[ConvertDocumentsRequest]
result: Optional[ConvertDocumentResponse] = None
def is_completed(self) -> bool:
if self.task_status in [TaskStatus.SUCCESS, TaskStatus.FAILURE]:
return True
return False

View File

@@ -1,7 +1,9 @@
import hashlib
import json
import logging
import sys
from collections.abc import Iterable, Iterator
from functools import lru_cache
from pathlib import Path
from typing import Any, Optional, Union
@@ -9,37 +11,35 @@ from fastapi import HTTPException
from docling.backend.docling_parse_backend import DoclingParseDocumentBackend
from docling.backend.docling_parse_v2_backend import DoclingParseV2DocumentBackend
from docling.backend.docling_parse_v4_backend import DoclingParseV4DocumentBackend
from docling.backend.pdf_backend import PdfDocumentBackend
from docling.backend.pypdfium2_backend import PyPdfiumDocumentBackend
from docling.datamodel.base_models import DocumentStream, InputFormat
from docling.datamodel.document import ConversionResult
from docling.datamodel.pipeline_options import (
EasyOcrOptions,
OcrEngine,
OcrOptions,
PdfBackend,
PdfPipeline,
PdfPipelineOptions,
RapidOcrOptions,
TableFormerMode,
TesseractOcrOptions,
VlmPipelineOptions,
smoldocling_vlm_conversion_options,
smoldocling_vlm_mlx_conversion_options,
)
from docling.document_converter import DocumentConverter, FormatOption, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline
from docling_core.types.doc import ImageRefMode
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.convert import ConvertDocumentsOptions, ocr_factory
from docling_serve.helper_functions import _to_list_of_strings
from docling_serve.settings import docling_serve_settings
_log = logging.getLogger(__name__)
# Document converters will be preloaded and stored in a dictionary
converters: dict[bytes, DocumentConverter] = {}
# Custom serializer for PdfFormatOption
# (model_dump_json does not work with some classes)
def _serialize_pdf_format_option(pdf_format_option: PdfFormatOption) -> str:
def _hash_pdf_format_option(pdf_format_option: PdfFormatOption) -> bytes:
data = pdf_format_option.model_dump()
# pipeline_options are not fully serialized by model_dump, dedicated pass
@@ -64,51 +64,49 @@ def _serialize_pdf_format_option(pdf_format_option: PdfFormatOption) -> str:
)
# Serialize the dictionary to JSON with sorted keys to have consistent hashes
return json.dumps(data, sort_keys=True)
serialized_data = json.dumps(data, sort_keys=True)
options_hash = hashlib.sha1(serialized_data.encode()).digest()
return options_hash
# Computes the PDF pipeline options and returns the PdfFormatOption and its hash
def get_pdf_pipeline_opts( # noqa: C901
request: ConvertDocumentsOptions,
) -> tuple[PdfFormatOption, bytes]:
if request.ocr_engine == OcrEngine.EASYOCR:
try:
import easyocr # noqa: F401
except ImportError:
raise HTTPException(
status_code=400,
detail="The requested OCR engine"
f" (ocr_engine={request.ocr_engine.value})"
" is not available on this system. Please choose another OCR engine "
"or contact your system administrator.",
)
ocr_options: OcrOptions = EasyOcrOptions(force_full_page_ocr=request.force_ocr)
elif request.ocr_engine == OcrEngine.TESSERACT:
try:
import tesserocr # noqa: F401
except ImportError:
raise HTTPException(
status_code=400,
detail="The requested OCR engine"
f" (ocr_engine={request.ocr_engine.value})"
" is not available on this system. Please choose another OCR engine "
"or contact your system administrator.",
)
ocr_options = TesseractOcrOptions(force_full_page_ocr=request.force_ocr)
elif request.ocr_engine == OcrEngine.RAPIDOCR:
try:
from rapidocr_onnxruntime import RapidOCR # noqa: F401
except ImportError:
raise HTTPException(
status_code=400,
detail="The requested OCR engine"
f" (ocr_engine={request.ocr_engine.value})"
" is not available on this system. Please choose another OCR engine "
"or contact your system administrator.",
)
ocr_options = RapidOcrOptions(force_full_page_ocr=request.force_ocr)
else:
raise RuntimeError(f"Unexpected OCR engine type {request.ocr_engine}")
# Cache of DocumentConverter objects
_options_map: dict[bytes, PdfFormatOption] = {}
@lru_cache(maxsize=docling_serve_settings.options_cache_size)
def _get_converter_from_hash(options_hash: bytes) -> DocumentConverter:
pdf_format_option = _options_map[options_hash]
format_options: dict[InputFormat, FormatOption] = {
InputFormat.PDF: pdf_format_option,
InputFormat.IMAGE: pdf_format_option,
}
return DocumentConverter(format_options=format_options)
def get_converter(pdf_format_option: PdfFormatOption) -> DocumentConverter:
options_hash = _hash_pdf_format_option(pdf_format_option)
_options_map[options_hash] = pdf_format_option
return _get_converter_from_hash(options_hash)
def _parse_standard_pdf_opts(
request: ConvertDocumentsOptions, artifacts_path: Optional[Path]
) -> PdfPipelineOptions:
try:
ocr_options: OcrOptions = ocr_factory.create_options(
kind=request.ocr_engine.value, # type: ignore
force_full_page_ocr=request.force_ocr,
)
except ImportError as err:
raise HTTPException(
status_code=400,
detail="The requested OCR engine"
f" (ocr_engine={request.ocr_engine.value})" # type: ignore
" is not available on this system. Please choose another OCR engine "
"or contact your system administrator.\n"
f"{err}",
)
if request.ocr_lang is not None:
if isinstance(request.ocr_lang, str):
@@ -117,11 +115,16 @@ def get_pdf_pipeline_opts( # noqa: C901
ocr_options.lang = request.ocr_lang
pipeline_options = PdfPipelineOptions(
artifacts_path=artifacts_path,
document_timeout=request.document_timeout,
do_ocr=request.do_ocr,
ocr_options=ocr_options,
do_table_structure=request.do_table_structure,
do_code_enrichment=request.do_code_enrichment,
do_formula_enrichment=request.do_formula_enrichment,
do_picture_classification=request.do_picture_classification,
do_picture_description=request.do_picture_description,
)
pipeline_options.table_structure_options.do_cell_matching = True # do_cell_matching
pipeline_options.table_structure_options.mode = TableFormerMode(request.table_mode)
if request.image_export_mode != ImageRefMode.PLACEHOLDER:
@@ -129,50 +132,95 @@ def get_pdf_pipeline_opts( # noqa: C901
if request.images_scale:
pipeline_options.images_scale = request.images_scale
return pipeline_options
def _parse_backend(request: ConvertDocumentsOptions) -> type[PdfDocumentBackend]:
if request.pdf_backend == PdfBackend.DLPARSE_V1:
backend: type[PdfDocumentBackend] = DoclingParseDocumentBackend
elif request.pdf_backend == PdfBackend.DLPARSE_V2:
backend = DoclingParseV2DocumentBackend
elif request.pdf_backend == PdfBackend.DLPARSE_V4:
backend = DoclingParseV4DocumentBackend
elif request.pdf_backend == PdfBackend.PYPDFIUM2:
backend = PyPdfiumDocumentBackend
else:
raise RuntimeError(f"Unexpected PDF backend type {request.pdf_backend}")
return backend
def _parse_vlm_pdf_opts(
request: ConvertDocumentsOptions, artifacts_path: Optional[Path]
) -> VlmPipelineOptions:
pipeline_options = VlmPipelineOptions(
artifacts_path=artifacts_path,
document_timeout=request.document_timeout,
)
pipeline_options.vlm_options = smoldocling_vlm_conversion_options
if sys.platform == "darwin":
try:
import mlx_vlm # noqa: F401
pipeline_options.vlm_options = smoldocling_vlm_mlx_conversion_options
except ImportError:
_log.warning(
"To run SmolDocling faster, please install mlx-vlm:\n"
"pip install mlx-vlm"
)
return pipeline_options
# Computes the PDF pipeline options and returns the PdfFormatOption and its hash
def get_pdf_pipeline_opts(
request: ConvertDocumentsOptions,
) -> PdfFormatOption:
artifacts_path: Optional[Path] = None
if docling_serve_settings.artifacts_path is not None:
if str(docling_serve_settings.artifacts_path.absolute()) == "":
_log.info(
"artifacts_path is an empty path, model weights will be dowloaded "
"at runtime."
)
pipeline_options.artifacts_path = None
artifacts_path = None
elif docling_serve_settings.artifacts_path.is_dir():
_log.info(
"artifacts_path is set to a valid directory. "
"No model weights will be downloaded at runtime."
)
pipeline_options.artifacts_path = docling_serve_settings.artifacts_path
artifacts_path = docling_serve_settings.artifacts_path
else:
_log.warning(
"artifacts_path is set to an invalid directory. "
"The system will download the model weights at runtime."
)
pipeline_options.artifacts_path = None
artifacts_path = None
else:
_log.info(
"artifacts_path is unset. "
"The system will download the model weights at runtime."
)
pdf_format_option = PdfFormatOption(
pipeline_options=pipeline_options,
backend=backend,
)
pipeline_options: Union[PdfPipelineOptions, VlmPipelineOptions]
if request.pipeline == PdfPipeline.STANDARD:
pipeline_options = _parse_standard_pdf_opts(request, artifacts_path)
backend = _parse_backend(request)
pdf_format_option = PdfFormatOption(
pipeline_options=pipeline_options,
backend=backend,
)
serialized_data = _serialize_pdf_format_option(pdf_format_option)
elif request.pipeline == PdfPipeline.VLM:
pipeline_options = _parse_vlm_pdf_opts(request, artifacts_path)
pdf_format_option = PdfFormatOption(
pipeline_cls=VlmPipeline, pipeline_options=pipeline_options
)
else:
raise NotImplementedError(
f"The pipeline {request.pipeline} is not implemented."
)
options_hash = hashlib.sha1(serialized_data.encode()).digest()
return pdf_format_option, options_hash
return pdf_format_option
def convert_documents(
@@ -180,20 +228,14 @@ def convert_documents(
options: ConvertDocumentsOptions,
headers: Optional[dict[str, Any]] = None,
):
pdf_format_option, options_hash = get_pdf_pipeline_opts(options)
if options_hash not in converters:
format_options: dict[InputFormat, FormatOption] = {
InputFormat.PDF: pdf_format_option,
InputFormat.IMAGE: pdf_format_option,
}
converters[options_hash] = DocumentConverter(format_options=format_options)
_log.info(f"We now have {len(converters)} converters in memory.")
results: Iterator[ConversionResult] = converters[options_hash].convert_all(
pdf_format_option = get_pdf_pipeline_opts(options)
converter = get_converter(pdf_format_option)
results: Iterator[ConversionResult] = converter.convert_all(
sources,
headers=headers,
page_range=options.page_range,
max_file_size=docling_serve_settings.max_file_size,
max_num_pages=docling_serve_settings.max_num_pages,
)
return results

View File

@@ -5,13 +5,14 @@ from typing import Optional
from fastapi import WebSocket
from docling_serve.datamodel.engines import Task, TaskStatus
from docling_serve.datamodel.engines import TaskStatus
from docling_serve.datamodel.requests import ConvertDocumentsRequest
from docling_serve.datamodel.responses import (
MessageKind,
TaskStatusResponse,
WebsocketMessage,
)
from docling_serve.datamodel.task import Task
from docling_serve.engines.async_local.worker import AsyncLocalWorker
from docling_serve.engines.base_orchestrator import BaseOrchestrator
from docling_serve.settings import docling_serve_settings

View File

@@ -1,6 +1,6 @@
from abc import ABC, abstractmethod
from docling_serve.datamodel.engines import Task
from docling_serve.datamodel.task import Task
class BaseOrchestrator(ABC):

View File

@@ -1,22 +1,47 @@
import base64
import importlib
import json
import logging
import os
import ssl
import tempfile
import time
from pathlib import Path
import certifi
import gradio as gr
import requests
import httpx
from docling.datamodel.pipeline_options import (
PdfBackend,
PdfPipeline,
TableFormerMode,
TableStructureOptions,
)
from docling_serve.helper_functions import _to_list_of_strings
from docling_serve.settings import docling_serve_settings, uvicorn_settings
logger = logging.getLogger(__name__)
############################
# Path of static artifacts #
############################
logo_path = "https://raw.githubusercontent.com/docling-project/docling/refs/heads/main/docs/assets/logo.svg"
js_components_url = "https://unpkg.com/@docling/docling-components@0.0.6"
if (
docling_serve_settings.static_path is not None
and docling_serve_settings.static_path.is_dir()
):
logo_path = str(docling_serve_settings.static_path / "logo.svg")
js_components_url = "/static/docling-components.js"
##############################
# Head JS for web components #
##############################
head = """
<script src="https://unpkg.com/@docling/docling-components@0.0.3" type="module"></script>
head = f"""
<script src="{js_components_url}" type="module"></script>
"""
#################
@@ -95,8 +120,29 @@ file_output_path = None # Will be set when a new file is generated
#############
def get_api_endpoint() -> str:
protocol = "http"
if uvicorn_settings.ssl_keyfile is not None:
protocol = "https"
return f"{protocol}://{docling_serve_settings.api_host}:{uvicorn_settings.port}"
def get_ssl_context() -> ssl.SSLContext:
ctx = ssl.create_default_context(cafile=certifi.where())
kube_sa_ca_cert_path = Path(
"/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
)
if (
uvicorn_settings.ssl_keyfile is not None
and ".svc." in docling_serve_settings.api_host
and kube_sa_ca_cert_path.exists()
):
ctx.load_verify_locations(cafile=kube_sa_ca_cert_path)
return ctx
def health_check():
response = requests.get(f"http://localhost:{int(os.getenv('PORT', '5001'))}/health")
response = httpx.get(f"{get_api_endpoint()}/health")
if response.status_code == 200:
return "Healthy"
return "Unhealthy"
@@ -112,6 +158,11 @@ def set_outputs_visibility_direct(x, y):
return content, file
def set_task_id_visibility(x):
task_id_row = gr.Row(visible=x)
return task_id_row
def set_outputs_visibility_process(x):
content = gr.Row(visible=not x)
file = gr.Row(visible=x)
@@ -123,6 +174,7 @@ def set_download_button_label(label_text: gr.State):
def clear_outputs():
task_id_rendered = ""
markdown_content = ""
json_content = ""
json_rendered_content = ""
@@ -131,6 +183,7 @@ def clear_outputs():
doctags_content = ""
return (
task_id_rendered,
markdown_content,
markdown_content,
json_content,
@@ -173,10 +226,56 @@ def change_ocr_lang(ocr_engine):
return "english,chinese"
def wait_task_finish(task_id: str, return_as_file: bool):
conversion_sucess = False
task_finished = False
task_status = ""
ssl_ctx = get_ssl_context()
while not task_finished:
try:
response = httpx.get(
f"{get_api_endpoint()}/v1alpha/status/poll/{task_id}?wait=5",
verify=ssl_ctx,
timeout=15,
)
task_status = response.json()["task_status"]
if task_status == "success":
conversion_sucess = True
task_finished = True
if task_status in ("failure", "revoked"):
conversion_sucess = False
task_finished = True
raise RuntimeError(f"Task failed with status {task_status!r}")
time.sleep(5)
except Exception as e:
logger.error(f"Error processing file(s): {e}")
conversion_sucess = False
task_finished = True
raise gr.Error(f"Error processing file(s): {e}", print_exception=False)
if conversion_sucess:
try:
response = httpx.get(
f"{get_api_endpoint()}/v1alpha/result/{task_id}",
timeout=15,
verify=ssl_ctx,
)
output = response_to_output(response, return_as_file)
return output
except Exception as e:
logger.error(f"Error getting task result: {e}")
raise gr.Error(
f"Error getting task result, conversion finished with status: {task_status}"
)
def process_url(
input_sources,
to_formats,
image_export_mode,
pipeline,
ocr,
force_ocr,
ocr_engine,
@@ -185,12 +284,17 @@ def process_url(
table_mode,
abort_on_error,
return_as_file,
do_code_enrichment,
do_formula_enrichment,
do_picture_classification,
do_picture_description,
):
parameters = {
"http_sources": [{"url": source} for source in input_sources.split(",")],
"options": {
"to_formats": to_formats,
"image_export_mode": image_export_mode,
"pipeline": pipeline,
"ocr": ocr,
"force_ocr": force_ocr,
"ocr_engine": ocr_engine,
@@ -199,6 +303,10 @@ def process_url(
"table_mode": table_mode,
"abort_on_error": abort_on_error,
"return_as_file": return_as_file,
"do_code_enrichment": do_code_enrichment,
"do_formula_enrichment": do_formula_enrichment,
"do_picture_classification": do_picture_classification,
"do_picture_description": do_picture_description,
},
}
if (
@@ -209,9 +317,12 @@ def process_url(
logger.error("No input sources provided.")
raise gr.Error("No input sources provided.", print_exception=False)
try:
response = requests.post(
f"http://localhost:{int(os.getenv('PORT', '5001'))}/v1alpha/convert/source",
ssl_ctx = get_ssl_context()
response = httpx.post(
f"{get_api_endpoint()}/v1alpha/convert/source/async",
json=parameters,
verify=ssl_ctx,
timeout=60,
)
except Exception as e:
logger.error(f"Error processing URL: {e}")
@@ -221,14 +332,22 @@ def process_url(
error_message = data.get("detail", "An unknown error occurred.")
logger.error(f"Error processing file: {error_message}")
raise gr.Error(f"Error processing file: {error_message}", print_exception=False)
output = response_to_output(response, return_as_file)
return output
task_id_rendered = response.json()["task_id"]
return task_id_rendered
def file_to_base64(file):
with open(file.name, "rb") as f:
encoded_string = base64.b64encode(f.read()).decode("utf-8")
return encoded_string
def process_file(
files,
file,
to_formats,
image_export_mode,
pipeline,
ocr,
force_ocr,
ocr_engine,
@@ -237,30 +356,44 @@ def process_file(
table_mode,
abort_on_error,
return_as_file,
do_code_enrichment,
do_formula_enrichment,
do_picture_classification,
do_picture_description,
):
if not files or len(files) == 0 or files[0] == "":
if not file or file == "":
logger.error("No files provided.")
raise gr.Error("No files provided.", print_exception=False)
files_data = [("files", (file.name, open(file.name, "rb"))) for file in files]
files_data = [{"base64_string": file_to_base64(file), "filename": file.name}]
parameters = {
"to_formats": to_formats,
"image_export_mode": image_export_mode,
"ocr": str(ocr).lower(),
"force_ocr": str(force_ocr).lower(),
"ocr_engine": ocr_engine,
"ocr_lang": _to_list_of_strings(ocr_lang),
"pdf_backend": pdf_backend,
"table_mode": table_mode,
"abort_on_error": str(abort_on_error).lower(),
"return_as_file": str(return_as_file).lower(),
"file_sources": files_data,
"options": {
"to_formats": to_formats,
"image_export_mode": image_export_mode,
"pipeline": pipeline,
"ocr": ocr,
"force_ocr": force_ocr,
"ocr_engine": ocr_engine,
"ocr_lang": _to_list_of_strings(ocr_lang),
"pdf_backend": pdf_backend,
"table_mode": table_mode,
"abort_on_error": abort_on_error,
"return_as_file": return_as_file,
"do_code_enrichment": do_code_enrichment,
"do_formula_enrichment": do_formula_enrichment,
"do_picture_classification": do_picture_classification,
"do_picture_description": do_picture_description,
},
}
try:
response = requests.post(
f"http://localhost:{int(os.getenv('PORT', '5001'))}/v1alpha/convert/file",
files=files_data,
data=parameters,
ssl_ctx = get_ssl_context()
response = httpx.post(
f"{get_api_endpoint()}/v1alpha/convert/source/async",
json=parameters,
verify=ssl_ctx,
timeout=60,
)
except Exception as e:
logger.error(f"Error processing file(s): {e}")
@@ -270,8 +403,9 @@ def process_file(
error_message = data.get("detail", "An unknown error occurred.")
logger.error(f"Error processing file: {error_message}")
raise gr.Error(f"Error processing file: {error_message}", print_exception=False)
output = response_to_output(response, return_as_file)
return output
task_id_rendered = response.json()["task_id"]
return task_id_rendered
def response_to_output(response, return_as_file):
@@ -342,17 +476,21 @@ with gr.Blocks(
with gr.Row(elem_id="check_health"):
# Logo
with gr.Column(scale=1, min_width=90):
gr.Image(
"https://ds4sd.github.io/docling/assets/logo.png",
height=80,
width=80,
show_download_button=False,
show_label=False,
show_fullscreen_button=False,
container=False,
elem_id="logo",
scale=0,
)
try:
gr.Image(
logo_path,
height=80,
width=80,
show_download_button=False,
show_label=False,
show_fullscreen_button=False,
container=False,
elem_id="logo",
scale=0,
)
except Exception:
logger.warning("Logo not found.")
# Title
with gr.Column(scale=1, min_width=200):
gr.Markdown(
@@ -381,30 +519,31 @@ with gr.Blocks(
)
# URL Processing Tab
with gr.Tab("Convert URL(s)"):
with gr.Tab("Convert URL"):
with gr.Row():
with gr.Column(scale=4):
url_input = gr.Textbox(
label="Input Sources (comma-separated URLs)",
placeholder="https://arxiv.org/pdf/2206.01062",
label="URL Input Source",
placeholder="https://arxiv.org/pdf/2501.17887",
)
with gr.Column(scale=1):
url_process_btn = gr.Button("Process URL(s)", scale=1)
url_process_btn = gr.Button("Process URL", scale=1)
url_reset_btn = gr.Button("Reset", scale=1)
# File Processing Tab
with gr.Tab("Convert File(s)"):
with gr.Tab("Convert File"):
with gr.Row():
with gr.Column(scale=4):
file_input = gr.File(
elem_id="file_input_zone",
label="Upload Files",
label="Upload File",
file_types=[
".pdf",
".docx",
".pptx",
".html",
".xlsx",
".json",
".asciidoc",
".txt",
".md",
@@ -413,11 +552,11 @@ with gr.Blocks(
".png",
".gif",
],
file_count="multiple",
file_count="single",
scale=4,
)
with gr.Column(scale=1):
file_process_btn = gr.Button("Process File(s)", scale=1)
file_process_btn = gr.Button("Process File", scale=1)
file_reset_btn = gr.Button("Reset", scale=1)
# Options
@@ -426,14 +565,14 @@ with gr.Blocks(
with gr.Column(scale=1):
to_formats = gr.CheckboxGroup(
[
("Markdown", "md"),
("Docling (JSON)", "json"),
("Markdown", "md"),
("HTML", "html"),
("Plain Text", "text"),
("Doc Tags", "doctags"),
],
label="To Formats",
value=["md"],
value=["json", "md"],
)
with gr.Column(scale=1):
image_export_mode = gr.Radio(
@@ -445,6 +584,13 @@ with gr.Blocks(
label="Image Export Mode",
value="embedded",
)
with gr.Row():
with gr.Column(scale=1, min_width=200):
pipeline = gr.Radio(
[(v.value.capitalize(), v.value) for v in PdfPipeline],
label="Pipeline type",
value=PdfPipeline.STANDARD.value,
)
with gr.Row():
with gr.Column(scale=1, min_width=200):
ocr = gr.Checkbox(label="Enable OCR", value=True)
@@ -465,32 +611,55 @@ with gr.Blocks(
)
ocr_engine.change(change_ocr_lang, inputs=[ocr_engine], outputs=[ocr_lang])
with gr.Row():
with gr.Column(scale=2):
with gr.Column(scale=4):
pdf_backend = gr.Radio(
["pypdfium2", "dlparse_v1", "dlparse_v2"],
[v.value for v in PdfBackend],
label="PDF Backend",
value="dlparse_v2",
value=PdfBackend.DLPARSE_V4.value,
)
with gr.Column(scale=2):
table_mode = gr.Radio(
["fast", "accurate"], label="Table Mode", value="fast"
[(v.value.capitalize(), v.value) for v in TableFormerMode],
label="Table Mode",
value=TableStructureOptions().mode.value,
)
with gr.Column(scale=1):
abort_on_error = gr.Checkbox(label="Abort on Error", value=False)
return_as_file = gr.Checkbox(label="Return as File", value=False)
return_as_file = gr.Checkbox(
label="Return as File", visible=False, value=False
) # Disable until async handle output as file
with gr.Row():
with gr.Column():
do_code_enrichment = gr.Checkbox(
label="Enable code enrichment", value=False
)
do_formula_enrichment = gr.Checkbox(
label="Enable formula enrichment", value=False
)
with gr.Column():
do_picture_classification = gr.Checkbox(
label="Enable picture classification", value=False
)
do_picture_description = gr.Checkbox(
label="Enable picture description", value=False
)
# Task id output
with gr.Row(visible=False) as task_id_output:
task_id_rendered = gr.Textbox(label="Task id", interactive=False)
# Document output
with gr.Row(visible=False) as content_output:
with gr.Tab("Docling (JSON)"):
output_json = gr.Code(language="json", wrap_lines=True, show_label=False)
with gr.Tab("Docling-Rendered"):
output_json_rendered = gr.HTML(label="Response")
with gr.Tab("Markdown"):
output_markdown = gr.Code(
language="markdown", wrap_lines=True, show_label=False
)
with gr.Tab("Markdown-Rendered"):
output_markdown_rendered = gr.Markdown(label="Response")
with gr.Tab("Docling (JSON)"):
output_json = gr.Code(language="json", wrap_lines=True, show_label=False)
with gr.Tab("Docling-Rendered"):
output_json_rendered = gr.HTML()
with gr.Tab("HTML"):
output_html = gr.Code(language="html", wrap_lines=True, show_label=False)
with gr.Tab("HTML-Rendered"):
@@ -508,36 +677,34 @@ with gr.Blocks(
# UI Actions #
##############
# Disable until async handle output as file
# Handle Return as File
url_input.change(
auto_set_return_as_file,
inputs=[url_input, file_input, image_export_mode],
outputs=[return_as_file],
)
file_input.change(
auto_set_return_as_file,
inputs=[url_input, file_input, image_export_mode],
outputs=[return_as_file],
)
image_export_mode.change(
auto_set_return_as_file,
inputs=[url_input, file_input, image_export_mode],
outputs=[return_as_file],
)
# url_input.change(
# auto_set_return_as_file,
# inputs=[url_input, file_input, image_export_mode],
# outputs=[return_as_file],
# )
# file_input.change(
# auto_set_return_as_file,
# inputs=[url_input, file_input, image_export_mode],
# outputs=[return_as_file],
# )
# image_export_mode.change(
# auto_set_return_as_file,
# inputs=[url_input, file_input, image_export_mode],
# outputs=[return_as_file],
# )
# URL processing
url_process_btn.click(
set_options_visibility, inputs=[false_bool], outputs=[options]
).then(
set_download_button_label, inputs=[processing_text], outputs=[download_file_btn]
).then(
set_outputs_visibility_process,
inputs=[return_as_file],
outputs=[content_output, file_output],
).then(
clear_outputs,
inputs=None,
outputs=[
task_id_rendered,
output_markdown,
output_markdown_rendered,
output_json,
@@ -547,12 +714,17 @@ with gr.Blocks(
output_text,
output_doctags,
],
).then(
set_task_id_visibility,
inputs=[true_bool],
outputs=[task_id_output],
).then(
process_url,
inputs=[
url_input,
to_formats,
image_export_mode,
pipeline,
ocr,
force_ocr,
ocr_engine,
@@ -561,7 +733,21 @@ with gr.Blocks(
table_mode,
abort_on_error,
return_as_file,
do_code_enrichment,
do_formula_enrichment,
do_picture_classification,
do_picture_description,
],
outputs=[
task_id_rendered,
],
).then(
set_outputs_visibility_process,
inputs=[return_as_file],
outputs=[content_output, file_output],
).then(
wait_task_finish,
inputs=[task_id_rendered, return_as_file],
outputs=[
output_markdown,
output_markdown_rendered,
@@ -592,21 +778,20 @@ with gr.Blocks(
set_outputs_visibility_direct,
inputs=[false_bool, false_bool],
outputs=[content_output, file_output],
).then(clear_url_input, inputs=None, outputs=[url_input])
).then(set_task_id_visibility, inputs=[false_bool], outputs=[task_id_output]).then(
clear_url_input, inputs=None, outputs=[url_input]
)
# File processing
file_process_btn.click(
set_options_visibility, inputs=[false_bool], outputs=[options]
).then(
set_download_button_label, inputs=[processing_text], outputs=[download_file_btn]
).then(
set_outputs_visibility_process,
inputs=[return_as_file],
outputs=[content_output, file_output],
).then(
clear_outputs,
inputs=None,
outputs=[
task_id_rendered,
output_markdown,
output_markdown_rendered,
output_json,
@@ -616,12 +801,17 @@ with gr.Blocks(
output_text,
output_doctags,
],
).then(
set_task_id_visibility,
inputs=[true_bool],
outputs=[task_id_output],
).then(
process_file,
inputs=[
file_input,
to_formats,
image_export_mode,
pipeline,
ocr,
force_ocr,
ocr_engine,
@@ -630,7 +820,21 @@ with gr.Blocks(
table_mode,
abort_on_error,
return_as_file,
do_code_enrichment,
do_formula_enrichment,
do_picture_classification,
do_picture_description,
],
outputs=[
task_id_rendered,
],
).then(
set_outputs_visibility_process,
inputs=[return_as_file],
outputs=[content_output, file_output],
).then(
wait_task_finish,
inputs=[task_id_rendered, return_as_file],
outputs=[
output_markdown,
output_markdown_rendered,
@@ -661,4 +865,6 @@ with gr.Blocks(
set_outputs_visibility_direct,
inputs=[false_bool, false_bool],
outputs=[content_output, file_output],
).then(clear_file_input, inputs=None, outputs=[file_input])
).then(set_task_id_visibility, inputs=[false_bool], outputs=[task_id_output]).then(
clear_file_input, inputs=None, outputs=[file_input]
)

View File

@@ -1,3 +1,4 @@
import sys
from pathlib import Path
from typing import Optional, Union
@@ -16,6 +17,10 @@ class UvicornSettings(BaseSettings):
reload: bool = False
root_path: str = ""
proxy_headers: bool = True
timeout_keep_alive: int = 60
ssl_certfile: Optional[Path] = None
ssl_keyfile: Optional[Path] = None
ssl_keyfile_password: Optional[str] = None
workers: Union[int, None] = None
@@ -28,7 +33,19 @@ class DoclingServeSettings(BaseSettings):
)
enable_ui: bool = False
api_host: str = "localhost"
artifacts_path: Optional[Path] = None
static_path: Optional[Path] = None
options_cache_size: int = 2
allow_external_plugins: bool = False
max_document_timeout: float = 3_600 * 24 * 7 # 7 days
max_num_pages: int = sys.maxsize
max_file_size: int = sys.maxsize
cors_origins: list[str] = ["*"]
cors_methods: list[str] = ["*"]
cors_headers: list[str] = ["*"]
eng_kind: AsyncEngine = AsyncEngine.LOCAL
eng_loc_num_workers: int = 2

8
docs/README.md Normal file
View File

@@ -0,0 +1,8 @@
# Dolcing Serve documentation
This documentation pages explore the webserver configurations, runtime options, deployment examples as well as development best practices.
- [Configuration](./configuration.md)
- [Advance usage](./usage.md)
- [Deployment](./deployment.md)
- [Development](./development.md)

Binary file not shown.

After

Width:  |  Height:  |  Size: 504 KiB

44
docs/configuration.md Normal file
View File

@@ -0,0 +1,44 @@
# Configuration
The `docling-serve` executable allows to configure the server via command line
options as well as environment variables.
Configurations are divided between the settings used for the `uvicorn` asgi
server and the actual app-specific configurations.
> [!WARNING]
> When the server is running with `reload` or with multiple `workers`, uvicorn
> will spawn multiple subprocessed. This invalides all the values configured
> via the CLI command line options. Please use environment variables in this
> type of deployments.
## Webserver configuration
The following table shows the options which are propagated directly to the
`uvicorn` webserver runtime.
| CLI option | ENV | Default | Description |
| -----------|-----|---------|-------------|
| `--host` | `UVICORN_HOST` | `0.0.0.0` for `run`, `localhost` for `dev` | THe host to serve on. |
| `--port` | `UVICORN_PORT` | `5001` | The port to serve on. |
| `--reload` | `UVICORN_RELOAD` | `false` for `run`, `true` for `dev` | Enable auto-reload of the server when (code) files change. |
| `--workers` | `UVICORN_WORKERS` | `1` | Use multiple worker processes. |
| `--root-path` | `UVICORN_ROOT_PATH` | `""` | The root path is used to tell your app that it is being served to the outside world with some |
| `--proxy-headers` | `UVICORN_PROXY_HEADERS` | `true` | Enable/Disable X-Forwarded-Proto, X-Forwarded-For, X-Forwarded-Port to populate remote address info. |
| `--timeout-keep-alive` | `UVICORN_TIMEOUT_KEEP_ALIVE` | `60` | Timeout for the server response. |
| `--ssl-certfile` | `UVICORN_SSL_CERTFILE` | | SSL certificate file. |
| `--ssl-keyfile` | `UVICORN_SSL_KEYFILE` | | SSL key file. |
| `--ssl-keyfile-password` | `UVICORN_SSL_KEYFILE_PASSWORD` | | SSL keyfile password. |
## Docling Serve configuration
THe following table describes the options to configure the Docling Serve app.
| CLI option | ENV | Default | Description |
| -----------|-----|---------|-------------|
| `--artifacts-path` | `DOCLING_SERVE_ARTIFACTS_PATH` | unset | If set to a valid directory, the model weights will be loaded from this path |
| | `DOCLING_SERVE_STATIC_PATH` | unset | If set to a valid directory, the static assets for the docs and ui will be loaded from this path |
| `--enable-ui` | `DOCLING_SERVE_ENABLE_UI` | `false` | Enable the demonstrator UI. |
| | `DOCLING_SERVE_OPTIONS_CACHE_SIZE` | `2` | How many DocumentConveter objects (including their loaded models) to keep in the cache. |
| | `DOCLING_SERVE_CORS_ORIGINS` | `["*"]` | A list of origins that should be permitted to make cross-origin requests. |
| | `DOCLING_SERVE_CORS_METHODS` | `["*"]` | A list of HTTP methods that should be allowed for cross-origin requests. |
| | `DOCLING_SERVE_CORS_HEADERS` | `["*"]` | A list of HTTP request headers that should be supported for cross-origin requests. |

View File

@@ -0,0 +1,15 @@
services:
docling:
image: ghcr.io/docling-project/docling-serve-cu124
container_name: docling-serve
ports:
- 5001:5001
environment:
- DOCLING_SERVE_ENABLE_UI=true
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all # nvidia-smi
capabilities: [gpu]

View File

@@ -0,0 +1,192 @@
# This example deployment configures Docling Serve with a OAuth-Proxy sidecar and TLS termination
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: docling-serve
labels:
app: docling-serve
annotations:
serviceaccounts.openshift.io/oauth-redirectreference.primary: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"docling-serve"}}'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: docling-serve-oauth
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: docling-serve
namespace: docling
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
spec:
to:
kind: Service
name: docling-serve
port:
targetPort: oauth
tls:
termination: Reencrypt
---
apiVersion: v1
kind: Service
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
annotations:
service.alpha.openshift.io/serving-cert-secret-name: docling-serve-tls
spec:
ports:
- name: oauth
port: 8443
targetPort: oauth
- name: http
port: 5001
targetPort: http
selector:
app: docling-serve
component: docling-serve-api
---
kind: Deployment
apiVersion: apps/v1
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
spec:
replicas: 1
selector:
matchLabels:
app: docling-serve
component: docling-serve-api
template:
metadata:
labels:
app: docling-serve
component: docling-serve-api
spec:
restartPolicy: Always
serviceAccountName: docling-serve
containers:
- name: api
resources:
limits:
cpu: 2000m
memory: 2Gi
requests:
cpu: 800m
memory: 1Gi
readinessProbe:
httpGet:
path: /health
port: http
scheme: HTTPS
initialDelaySeconds: 10
timeoutSeconds: 2
periodSeconds: 5
successThreshold: 1
failureThreshold: 3
livenessProbe:
httpGet:
path: /health
port: http
scheme: HTTPS
initialDelaySeconds: 3
timeoutSeconds: 4
periodSeconds: 10
successThreshold: 1
failureThreshold: 5
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: DOCLING_SERVE_ENABLE_UI
value: 'true'
- name: DOCLING_SERVE_API_HOST
value: 'docling-serve.$(NAMESPACE).svc.cluster.local'
- name: UVICORN_SSL_CERTFILE
value: '/etc/tls/private/tls.crt'
- name: UVICORN_SSL_KEYFILE
value: '/etc/tls/private/tls.key'
ports:
- name: http
containerPort: 5001
protocol: TCP
volumeMounts:
- name: proxy-tls
mountPath: /etc/tls/private
imagePullPolicy: Always
image: 'ghcr.io/docling-project/docling-serve-cpu:fix-ui-with-https'
- name: oauth-proxy
resources:
limits:
cpu: 100m
memory: 256Mi
requests:
cpu: 100m
memory: 256Mi
readinessProbe:
httpGet:
path: /oauth/healthz
port: oauth
scheme: HTTPS
initialDelaySeconds: 5
timeoutSeconds: 1
periodSeconds: 5
successThreshold: 1
failureThreshold: 3
livenessProbe:
httpGet:
path: /oauth/healthz
port: oauth
scheme: HTTPS
initialDelaySeconds: 30
timeoutSeconds: 1
periodSeconds: 5
successThreshold: 1
failureThreshold: 3
ports:
- name: oauth
containerPort: 8443
protocol: TCP
imagePullPolicy: IfNotPresent
volumeMounts:
- name: proxy-tls
mountPath: /etc/tls/private
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
image: 'registry.redhat.io/openshift4/ose-oauth-proxy:v4.13'
args:
- '--https-address=:8443'
- '--provider=openshift'
- '--openshift-service-account=docling-serve'
- '--upstream=https://docling-serve.$(NAMESPACE).svc.cluster.local:5001'
- '--upstream-ca=/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt'
- '--tls-cert=/etc/tls/private/tls.crt'
- '--tls-key=/etc/tls/private/tls.key'
- '--cookie-secret=SECRET'
- '--openshift-delegate-urls={"/": {"group":"route.openshift.io","resource":"routes","verb":"get","name":"docling-serve","namespace":"$(NAMESPACE)"}}'
- '--openshift-sar={"namespace":"$(NAMESPACE)","resource":"routes","resourceName":"docling-serve","verb":"get","resourceAPIGroup":"route.openshift.io"}'
- '--skip-auth-regex=''(^/health|^/docs)'''
volumes:
- name: proxy-tls
secret:
secretName: docling-serve-tls
defaultMode: 420

View File

@@ -0,0 +1,58 @@
# This example deployment configures Docling Serve with a Service and cuda image
---
apiVersion: v1
kind: Service
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
spec:
ports:
- name: http
port: 5001
targetPort: http
selector:
app: docling-serve
component: docling-serve-api
---
kind: Deployment
apiVersion: apps/v1
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
spec:
replicas: 1
selector:
matchLabels:
app: docling-serve
component: docling-serve-api
template:
metadata:
labels:
app: docling-serve
component: docling-serve-api
spec:
restartPolicy: Always
containers:
- name: api
resources:
limits:
cpu: 500m
memory: 2Gi
nvidia.com/gpu: 1 # Limit to one GPU
requests:
cpu: 250m
memory: 1Gi
nvidia.com/gpu: 1 # Limit to one GPU
env:
- name: DOCLING_SERVE_ENABLE_UI
value: 'true'
ports:
- name: http
containerPort: 5001
protocol: TCP
imagePullPolicy: Always
image: 'ghcr.io/docling-project/docling-serve-cu124'

194
docs/deployment.md Normal file
View File

@@ -0,0 +1,194 @@
# Deployment Examples
This document provides deployment examples for running the application in different environments.
Choose the deployment option that best fits your setup.
- **[Local GPU](#local-gpu)**: For deploying the application locally on a machine with a NVIDIA GPU (using Docker Compose).
- **[OpenShift](#openshift)**: For deploying the application on an OpenShift cluster, designed for cloud-native environments.
---
## Local GPU
### Docker compose
Manifest example: [compose-gpu.yaml](./deploy-examples/compose-gpu.yaml)
This deployment has the following features:
- NVIDIA cuda enabled
Install the app with:
```sh
docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
```
For using the API:
```sh
# Make a test query
curl -X 'POST' \
"localhost:5001/v1alpha/convert/source/async" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
"http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
}'
```
<details>
<summary><b>Requirements</b></summary>
- debian/ubuntu/rhel/fedora/opensuse
- docker
- nvidia drivers >=550.54.14
- nvidia-container-toolkit
Docs:
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/supported-platforms.html)
- [CUDA Toolkit Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id6)
</details>
<details>
<summary><b>Steps</b></summary>
1. Check driver version and which GPU you want to use (0/1/2/3.. and update [compose-gpu.yaml](./deploy-examples/compose-gpu.yaml) file or use `count: all`)
```sh
nvidia-smi
```
2. Check if the NVIDIA Container Toolkit is installed/updated
```sh
# debian
dpkg -l | grep nvidia-container-toolkit
```
```sh
# rhel
rpm -q nvidia-container-toolkit
```
NVIDIA Container Toolkit install steps can be found here:
<https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html>
3. Check which runtime is being used by Docker
```sh
# docker
docker info | grep -i runtime
```
4. If the default Docker runtime changes back from 'nvidia' to 'default' after restarting the Docker service (optional):
Backup the daemon.json file:
```sh
sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
```
Update the daemon.json file:
```sh
echo '{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime"
}
},
"default-runtime": "nvidia"
}' | sudo tee /etc/docker/daemon.json > /dev/null
```
Restart the Docker service:
```sh
sudo systemctl restart docker
```
Confirm 'nvidia' is the default runtime used by Docker by repeating step 3.
5. Run the container:
```sh
docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
```
</details>
## OpenShift
### Simple deployment
Manifest example: [docling-serve-simple.yaml](./deploy-examples/docling-serve-simple.yaml)
This deployment example has the following features:
- Deployment configuration
- Service configuration
- NVIDIA cuda enabled
Install the app with:
```sh
oc apply -f docs/deploy-examples/docling-serve-simple.yaml
```
For using the API:
```sh
# Port-forward the service
oc port-forward svc/docling-serve 5001:5001
# Make a test query
curl -X 'POST' \
"localhost:5001/v1alpha/convert/source/async" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
"http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
}'
```
### Secure deployment with `oauth-proxy`
Manifest example: [docling-serve-oauth.yaml](./deploy-examples/docling-serve-oauth.yaml)
This deployment has the following features:
- TLS encryption between all components (using the cluster-internal CA authority).
- Authentication via a secure `oauth-proxy` sidecar.
- Expose the service using a secure OpenShift `Route`
Install the app with:
```sh
oc apply -f docs/deploy-examples/docling-serve-oauth.yaml
```
For using the API:
```sh
# Retrieve the endpoint
DOCLING_NAME=docling-serve
DOCLING_ROUTE="https://$(oc get routes ${DOCLING_NAME} --template={{.spec.host}})"
# Retrieve the authentication token
OCP_AUTH_TOKEN=$(oc whoami --show-token)
# Make a test query
curl -X 'POST' \
"${DOCLING_ROUTE}/v1alpha/convert/source/async" \
-H "Authorization: Bearer ${OCP_AUTH_TOKEN}" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
"http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
}'
```

57
docs/development.md Normal file
View File

@@ -0,0 +1,57 @@
# Development
## Install dependencies
### CPU only
```sh
# Install uv if not already available
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync --extra cpu
```
### Cuda GPU
For GPU support use the following command:
```sh
# Install dependencies
uv sync
```
### Gradio UI and different OCR backends
`/ui` endpoint using `gradio` and different OCR backends can be enabled via package extras:
```sh
# Enable ui and rapidocr
uv sync --extra ui --extra rapidocr
```
```sh
# Enable tesserocr
uv sync --extra tesserocr
```
See `[project.optional-dependencies]` section in `pyproject.toml` for full list of options and runtime options with `uv run docling-serve --help`.
### Run the server
The `docling-serve` executable is a convenient script for launching the webserver both in
development and production mode.
```sh
# Run the server in development mode
# - reload is enabled by default
# - listening on the 127.0.0.1 address
# - ui is enabled by default
docling-serve dev
# Run the server in production mode
# - reload is disabled by default
# - listening on the 0.0.0.0 address
# - ui is disabled by default
docling-serve run
```

279
docs/usage.md Normal file
View File

@@ -0,0 +1,279 @@
# Usage
The API provides two endpoints: one for urls, one for files. This is necessary to send files directly in binary format instead of base64-encoded strings.
## Common parameters
On top of the source of file (see below), both endpoints support the same parameters, which are almost the same as the Docling CLI.
- `from_format` (List[str]): Input format(s) to convert from. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`. Defaults to all formats.
- `to_formats` (List[str]): Output format(s) to convert to. Allowed values: `md`, `json`, `html`, `text`, `doctags`. Defaults to `md`.
- `do_ocr` (bool): If enabled, the bitmap content will be processed using OCR. Defaults to `True`.
- `image_export_mode`: Image export mode for the document (only in case of JSON, Markdown or HTML). Allowed values: embedded, placeholder, referenced. Optional, defaults to `embedded`.
- `force_ocr` (bool): If enabled, replace any existing text with OCR-generated text over the full content. Defaults to `False`.
- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesseract_cli`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`.
- `ocr_lang` (List[str]): List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. Defaults to empty.
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`. Defaults to `dlparse_v2`.
- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
- `return_as_file` (boo): If enabled, return the output as a file. Defaults to false.
- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
- `include_images` (bool): If enabled, images will be extracted from the document. Defaults to true.
- `images_scale` (float): Scale factor for images. Defaults to 2.0.
## Convert endpoints
### Source endpoint
The endpoint is `/v1alpha/convert/source`, listening for POST requests of JSON payloads.
On top of the above parameters, you must send the URL(s) of the document you want process with either the `http_sources` or `file_sources` fields.
The first is fetching URL(s) (optionally using with extra headers), the second allows to provide documents as base64-encoded strings.
No `options` is required, they can be partially or completely omitted.
Simple payload example:
```json
{
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
```
<details>
<summary>Complete payload example:</summary>
```json
{
"options": {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": true,
"force_ocr": false,
"ocr_engine": "easyocr",
"ocr_lang": ["en"],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
```
</details>
<details>
<summary>CURL example:</summary>
```sh
curl -X 'POST' \
'http://localhost:5001/v1alpha/convert/source' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"options": {
"from_formats": [
"docx",
"pptx",
"html",
"image",
"pdf",
"asciidoc",
"md",
"xlsx"
],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": true,
"force_ocr": false,
"ocr_engine": "easyocr",
"ocr_lang": [
"fr",
"de",
"es",
"en"
],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
"do_table_structure": true,
"include_images": true,
"images_scale": 2
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}'
```
</details>
<details>
<summary>Python example:</summary>
```python
import httpx
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/source"
payload = {
"options": {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": True,
"force_ocr": False,
"ocr_engine": "easyocr",
"ocr_lang": "en",
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
response = await async_client_client.post(url, json=payload)
data = response.json()
```
</details>
#### File as base64
The `file_sources` argument in the endpoint allows to send files as base64-encoded strings.
When your PDF or other file type is too large, encoding it and passing it inline to curl
can lead to an “Argument list too long” error on some systems. To avoid this, we write
the JSON request body to a file and have curl read from that file.
<details>
<summary>CURL steps:</summary>
```sh
# 1. Base64-encode the file
B64_DATA=$(base64 -w 0 /path/to/file/pdf-to-convert.pdf)
# 2. Build the JSON with your options
cat <<EOF > /tmp/request_body.json
{
"options": {
},
"file_sources": [{
"base64_string": "${B64_DATA}",
"filename": "pdf-to-convert.pdf"
}]
}
EOF
# 3. POST the request to the docling service
curl -X POST "localhost:5001/v1alpha/convert/source" \
-H "Content-Type: application/json" \
-d @/tmp/request_body.json
```
</details>
### File endpoint
The endpoint is: `/v1alpha/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as multipart/form data). You can send one or multiple files.
<details>
<summary>CURL example:</summary>
```sh
curl -X 'POST' \
'http://127.0.0.1:5001/v1alpha/convert/file' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'ocr_engine=easyocr' \
-F 'pdf_backend=dlparse_v2' \
-F 'from_formats=pdf' \
-F 'from_formats=docx' \
-F 'force_ocr=false' \
-F 'image_export_mode=embedded' \
-F 'ocr_lang=en' \
-F 'ocr_lang=pl' \
-F 'table_mode=fast' \
-F 'files=@2206.01062v1.pdf;type=application/pdf' \
-F 'abort_on_error=false' \
-F 'to_formats=md' \
-F 'to_formats=text' \
-F 'return_as_file=false' \
-F 'do_ocr=true'
```
</details>
<details>
<summary>Python example:</summary>
```python
import httpx
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/file"
parameters = {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"do_ocr": True,
"force_ocr": False,
"ocr_engine": "easyocr",
"ocr_lang": ["en"],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False
}
current_dir = os.path.dirname(__file__)
file_path = os.path.join(current_dir, '2206.01062v1.pdf')
files = {
'files': ('2206.01062v1.pdf', open(file_path, 'rb'), 'application/pdf'),
}
response = await async_client.post(url, files=files, data={"parameters": json.dumps(parameters)})
assert response.status_code == 200, "Response should be 200 OK"
data = response.json()
```
</details>
## Response format
The response can be a JSON Document or a File.
- If you process only one file, the response will be a JSON document with the following format:
```jsonc
{
"document": {
"md_content": "",
"json_content": {},
"html_content": "",
"text_content": "",
"doctags_content": ""
},
"status": "<success|partial_success|skipped|failure>",
"processing_time": 0.0,
"timings": {},
"errors": []
}
```
Depending on the value you set in `output_formats`, the different items will be populated with their respective results or empty.
`processing_time` is the Docling processing time in seconds, and `timings` (when enabled in the backend) provides the detailed
timing of all the internal Docling components.
- If you set the parameter `return_as_file` to True, the response will be a zip file.
- If multiple files are generated (multiple inputs, or one input but multiple outputs with `return_as_file` True), the response will be a zip file.
## Asynchronous API
TBA

View File

@@ -1,6 +1,6 @@
[project]
name = "docling-serve"
version = "0.5.1" # DO NOT EDIT, updated automatically
version = "0.8.0" # DO NOT EDIT, updated automatically
description = "Running Docling as a service"
license = {text = "MIT"}
authors = [
@@ -30,7 +30,8 @@ classifiers = [
]
requires-python = ">=3.10"
dependencies = [
"docling~=2.25.1",
"docling[vlm]~=2.28",
"mlx-vlm~=0.1.12; sys_platform == 'darwin' and platform_machine == 'arm64'",
"fastapi[standard]~=0.115",
"httpx~=0.28",
"pydantic~=2.10",
@@ -43,7 +44,8 @@ dependencies = [
[project.optional-dependencies]
ui = [
"gradio~=5.9"
"gradio~=5.9",
"pydantic<2.11.0", # fix compatibility between gradio and new pydantic 2.11
]
tesserocr = [
"tesserocr~=2.7"
@@ -109,11 +111,11 @@ namespaces = true
docling-serve = "docling_serve.__main__:main"
[project.urls]
Homepage = "https://github.com/DS4SD/docling-serve"
Homepage = "https://github.com/docling-project/docling-serve"
# Documentation = "https://ds4sd.github.io/docling"
Repository = "https://github.com/DS4SD/docling-serve"
Issues = "https://github.com/DS4SD/docling-serve/issues"
Changelog = "https://github.com/DS4SD/docling-serve/blob/main/CHANGELOG.md"
Repository = "https://github.com/docling-project/docling-serve"
Issues = "https://github.com/docling-project/docling-serve/issues"
Changelog = "https://github.com/docling-project/docling-serve/blob/main/CHANGELOG.md"
[tool.ruff]
target-version = "py310"
@@ -195,6 +197,7 @@ module = [
"tesserocr.*",
"rapidocr_onnxruntime.*",
"requests.*",
"mlx_vlm.*",
]
ignore_missing_imports = true

View File

@@ -92,16 +92,11 @@ async def test_convert_file(async_client):
msg=f'JSON document should contain \'{{\\n "schema_name": "DoclingDocument\'". Received: {safe_slice(data["document"]["json_content"])}',
)
# HTML check
check.is_in(
"html_content",
data.get("document", {}),
msg=f"Response should contain 'html_content' key. Received keys: {list(data.get('document', {}).keys())}",
)
if data.get("document", {}).get("html_content") is not None:
check.is_in(
'<!DOCTYPE html>\n<html lang="en">\n<head>',
"<!DOCTYPE html>\n<html>\n<head>",
data["document"]["html_content"],
msg=f"HTML document should contain '<!DOCTYPE html>\\n<html lang=\"en'>. Received: {safe_slice(data['document']['html_content'])}",
msg=f"HTML document should contain '<!DOCTYPE html>\\n<html>'. Received: {safe_slice(data['document']['html_content'])}",
)
# Text check
check.is_in(
@@ -123,7 +118,7 @@ async def test_convert_file(async_client):
)
if data.get("document", {}).get("doctags_content") is not None:
check.is_in(
"<document>\n<section_header_level_1><location>",
"<doctag><page_header><loc",
data["document"]["doctags_content"],
msg=f"DocTags document should contain '<document>\\n<section_header_level_1><location>'. Received: {safe_slice(data['document']['doctags_content'])}",
msg=f"DocTags document should contain '<doctag><page_header><loc'. Received: {safe_slice(data['document']['doctags_content'])}",
)

View File

@@ -93,9 +93,9 @@ async def test_convert_url(async_client):
)
if data.get("document", {}).get("html_content") is not None:
check.is_in(
'<!DOCTYPE html>\n<html lang="en">\n<head>',
"<!DOCTYPE html>\n<html>\n<head>",
data["document"]["html_content"],
msg=f"HTML document should contain '<!DOCTYPE html>\\n<html lang=\"en'>. Received: {safe_slice(data['document']['html_content'])}",
msg=f"HTML document should contain '<!DOCTYPE html>\\n<html>'. Received: {safe_slice(data['document']['html_content'])}",
)
# Text check
check.is_in(
@@ -117,7 +117,7 @@ async def test_convert_url(async_client):
)
if data.get("document", {}).get("doctags_content") is not None:
check.is_in(
"<document>\n<section_header_level_1><location>",
"<doctag><page_header><loc",
data["document"]["doctags_content"],
msg=f"DocTags document should contain '<document>\\n<section_header_level_1><location>'. Received: {safe_slice(data['document']['doctags_content'])}",
msg=f"DocTags document should contain '<doctag><page_header><loc'. Received: {safe_slice(data['document']['doctags_content'])}",
)

4657
uv.lock generated

File diff suppressed because it is too large Load Diff