35 Commits

Author SHA1 Message Date
github-actions[bot]
b5c5f47892 chore: bump version to 0.14.0 [skip ci] 2025-06-17 13:10:27 +00:00
23Ro
d5455b7f66 fix: Typo in Headline (#220)
Signed-off-by: 23Ro <m.n@23ro.de>
2025-06-17 14:55:27 +02:00
Michele Dolfi
7a682494d6 chore: dco advisor (#224)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-17 09:38:56 +02:00
Eugene
524f6a8997 feat: Read supported file extensions from docling (#214)
Signed-off-by: Eugene <fogaprod@gmail.com>
2025-06-05 09:38:28 +02:00
github-actions[bot]
9ccf8e3b5e chore: bump version to 0.13.0 [skip ci] 2025-06-04 12:24:40 +00:00
Michele Dolfi
ffea34732b feat: upgrade docling to 2.36 (#212)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-04 14:20:34 +02:00
github-actions[bot]
b299af002b chore: bump version to 0.12.0 [skip ci] 2025-06-03 16:30:28 +00:00
Michele Dolfi
c4c41f16df feat: Export annotations in markdown and html (Docling upgrade) (#202)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-03 18:24:27 +02:00
Michele Dolfi
7066f3520a fix: processing complex params in multipart-form (#210)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-03 18:24:05 +02:00
Rui Dias Gomes
6a8190c315 docs: add openshift replicasets examples (#209)
Signed-off-by: Rui-Dias-Gomes <rui.dias.gomes@ibm.com>
Co-authored-by: Rui-Dias-Gomes <rui.dias.gomes@ibm.com>
2025-06-03 17:43:41 +02:00
github-actions[bot]
060ecd8b0e chore: bump version to 0.11.0 [skip ci] 2025-05-23 13:45:54 +00:00
Michele Dolfi
32b8a809f3 feat: page break placeholder in markdown exports options (#194)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-05-23 15:26:27 +02:00
Michele Dolfi
de002dfcdc feat: clear results registry (#192)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-05-23 14:30:57 +02:00
Michele Dolfi
abe5aa03f5 feat: Upgrade to Docling 2.33.0 (#198)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-05-22 17:00:29 +02:00
VIktor Kuropiantnyk
3f090b7d15 docs: Example and instructions on how to load model weights to persistent volume (#197)
Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com>
2025-05-21 13:04:46 +02:00
Michele Dolfi
21c1791e42 docs: async api usage and fixes (#195)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-05-19 13:57:35 +02:00
Michele Dolfi
00be428490 feat: api to trigger offloading the models (#188)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-05-14 15:02:18 +02:00
Kasper Dinkla
3ff1b2f983 feat: Figure annotations @ docling components 0.0.7 (#181)
Signed-off-by: DKL <dkl@zurich.ibm.com>
2025-05-08 16:31:10 +02:00
Michele Dolfi
8406fb9b59 fix: usage of hashlib for FIPS (#171)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-05-02 15:00:10 +02:00
github-actions[bot]
a2dcb0a20f chore: bump version to 0.10.1 [skip ci] 2025-04-30 16:04:30 +00:00
Michele Dolfi
36787bc061 fix: avoid missing specialized keys in the options hash (#166)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-30 13:14:34 +02:00
Michele Dolfi
509f4889f8 fix: allow users to set the area threshold for picture descriptions (#165)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-04-30 12:37:24 +02:00
Michele Dolfi
919cf5c041 fix: expose max wait time in sync endpoints (#164)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-30 12:30:11 +02:00
Michele Dolfi
35c2630c61 fix: add flash-attn for cuda images (#161)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-29 16:58:33 +02:00
github-actions[bot]
382d675631 chore: bump version to 0.10.0 [skip ci] 2025-04-28 10:06:42 +00:00
Michele Dolfi
c65f3c654c feat: add support for file upload and return as file in async endpoints (#152)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-28 11:18:19 +02:00
nkh0472
829effec1a docs: fix new default pdf_backend (#158)
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
2025-04-28 09:46:13 +02:00
nkh0472
494d66f992 chore: typo fix (#156)
Signed-off-by: nkh0472 <67589323+nkh0472@users.noreply.github.com>
2025-04-28 08:41:26 +02:00
Quang Nam Ta
14bafb2628 docs: fixing small typo in docs (#155)
Signed-off-by: Quang Nam Ta <work.quangnamta@gmail.com>
2025-04-28 08:35:40 +02:00
github-actions[bot]
37e2e1ad09 chore: bump version to 0.9.0 [skip ci] 2025-04-25 07:56:40 +00:00
Michele Dolfi
71c5fae505 fix: produce image artifacts in referenced mode (#151)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-24 17:33:36 +02:00
Michele Dolfi
91956cbf4e docs: vlm and picture description options (#149)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-24 14:42:06 +02:00
Michele Dolfi
4c9571a052 feat: expose picture description options (#148)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-04-24 13:49:44 +02:00
Tiago Santana
41624af09f test: add tests with fastapi client (#147)
Signed-off-by: Tiago Santana <54704492+SantanaTiago@users.noreply.github.com>
2025-04-24 10:25:29 +02:00
Michele Dolfi
26bef5bec0 feat: Add parameters for Kubeflow pipeline engine (WIP) (#107)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-23 14:59:53 +02:00
57 changed files with 6646 additions and 3519 deletions

2
.github/dco.yml vendored Normal file
View File

@@ -0,0 +1,2 @@
allowRemediationCommits:
individual: true

View File

@@ -19,7 +19,7 @@ jobs:
platforms: linux/amd64, linux/arm64
- name: docling-project/docling-serve-cpu
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cu124
UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra flash-attn
platforms: linux/amd64, linux/arm64
- name: docling-project/docling-serve-cu124
build_args: |

192
.github/workflows/dco-advisor.yml vendored Normal file
View File

@@ -0,0 +1,192 @@
name: DCO Advisor Bot
on:
pull_request_target:
types: [opened, reopened, synchronize]
permissions:
pull-requests: write
issues: write
jobs:
dco_advisor:
runs-on: ubuntu-latest
steps:
- name: Handle DCO check result
uses: actions/github-script@v7
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const pr = context.payload.pull_request || context.payload.check_run?.pull_requests?.[0];
if (!pr) return;
const prNumber = pr.number;
const baseRef = pr.base.ref;
const headSha =
context.payload.check_run?.head_sha ||
pr.head?.sha;
const username = pr.user.login;
console.log("HEAD SHA:", headSha);
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));
// Poll until DCO check has a conclusion (max 6 attempts, 30s)
let dcoCheck = null;
for (let attempt = 0; attempt < 6; attempt++) {
const { data: checks } = await github.rest.checks.listForRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: headSha
});
console.log("All check runs:");
checks.check_runs.forEach(run => {
console.log(`- ${run.name} (${run.status}/${run.conclusion}) @ ${run.head_sha}`);
});
dcoCheck = checks.check_runs.find(run =>
run.name.toLowerCase().includes("dco") &&
!run.name.toLowerCase().includes("dco_advisor") &&
run.head_sha === headSha
);
if (dcoCheck?.conclusion) break;
console.log(`Waiting for DCO check... (${attempt + 1})`);
await sleep(5000); // wait 5 seconds
}
if (!dcoCheck || !dcoCheck.conclusion) {
console.log("DCO check did not complete in time.");
return;
}
const isFailure = ["failure", "action_required"].includes(dcoCheck.conclusion);
console.log(`DCO check conclusion for ${headSha}: ${dcoCheck.conclusion} (treated as ${isFailure ? "failure" : "success"})`);
// Parse DCO output for commit SHAs and author
let badCommits = [];
let authorName = "";
let authorEmail = "";
let moreInfo = `More info: [DCO check report](${dcoCheck?.html_url})`;
if (isFailure) {
const { data: commits } = await github.rest.pulls.listCommits({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
});
for (const commit of commits) {
const commitMessage = commit.commit.message;
const signoffMatch = commitMessage.match(/^Signed-off-by:\s+.+<.+>$/m);
if (!signoffMatch) {
console.log(`Bad commit found ${commit.sha}`)
badCommits.push({
sha: commit.sha,
authorName: commit.commit.author.name,
authorEmail: commit.commit.author.email,
});
}
}
}
// If multiple authors are present, you could adapt the message accordingly
// For now, we'll just use the first one
if (badCommits.length > 0) {
authorName = badCommits[0].authorName;
authorEmail = badCommits[0].authorEmail;
}
// Generate remediation commit message if needed
let remediationSnippet = "";
if (badCommits.length && authorEmail) {
remediationSnippet = `git commit --allow-empty -s -m "DCO Remediation Commit for ${authorName} <${authorEmail}>\n\n` +
badCommits.map(c => `I, ${c.authorName} <${c.authorEmail}>, hereby add my Signed-off-by to this commit: ${c.sha}`).join('\n') +
`"`;
} else {
remediationSnippet = "# Unable to auto-generate remediation message. Please check the DCO check details.";
}
// Build comment
const commentHeader = '<!-- dco-advice-bot -->';
let body = "";
if (isFailure) {
body = [
commentHeader,
'❌ **DCO Check Failed**',
'',
`Hi @${username}, your pull request has failed the Developer Certificate of Origin (DCO) check.`,
'',
'This repository supports **remediation commits**, so you can fix this without rewriting history — but you must follow the required message format.',
'',
'---',
'',
'### 🛠 Quick Fix: Add a remediation commit',
'Run this command:',
'',
'```bash',
remediationSnippet,
'git push',
'```',
'',
'---',
'',
'<details>',
'<summary>🔧 Advanced: Sign off each commit directly</summary>',
'',
'**For the latest commit:**',
'```bash',
'git commit --amend --signoff',
'git push --force-with-lease',
'```',
'',
'**For multiple commits:**',
'```bash',
`git rebase --signoff origin/${baseRef}`,
'git push --force-with-lease',
'```',
'',
'</details>',
'',
moreInfo
].join('\n');
} else {
body = [
commentHeader,
'✅ **DCO Check Passed**',
'',
`Thanks @${username}, all your commits are properly signed off. 🎉`
].join('\n');
}
// Get existing comments on the PR
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber
});
// Look for a previous bot comment
const existingComment = comments.find(c =>
c.body.includes("<!-- dco-advice-bot -->")
);
if (existingComment) {
await github.rest.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: existingComment.id,
body: body
});
} else {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: body
});
}

View File

@@ -23,7 +23,7 @@ jobs:
platforms: linux/amd64, linux/arm64
- name: docling-project/docling-serve-cpu
build_args: |
UV_SYNC_EXTRA_ARGS=--no-extra cu124
UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra flash-attn
platforms: linux/amd64, linux/arm64
- name: docling-project/docling-serve-cu124
build_args: |

View File

@@ -17,7 +17,7 @@ jobs:
python-version: ${{ matrix.python-version }}
enable-cache: true
- name: Install dependencies
run: uv sync --all-extras --no-extra cu124
run: uv sync --all-extras --no-extra cu124 --no-extra flash-attn
- name: Build package
run: uv build
- name: Check content of wheel

View File

@@ -25,7 +25,7 @@ jobs:
key: pre-commit|${{ env.PY }}|${{ hashFiles('.pre-commit-config.yaml') }}
- name: Install dependencies
run: uv sync --frozen --all-extras --no-extra cu124
run: uv sync --frozen --all-extras --no-extra cu124 --no-extra flash-attn
- name: Run styling check
run: pre-commit run --all-files

2
.gitignore vendored
View File

@@ -444,3 +444,5 @@ pip-selfcheck.json
# Makefile
.action-lint
.markdown-lint
cookies.txt

View File

@@ -1,3 +1,87 @@
## [v0.14.0](https://github.com/docling-project/docling-serve/releases/tag/v0.14.0) - 2025-06-17
### Feature
* Read supported file extensions from docling ([#214](https://github.com/docling-project/docling-serve/issues/214)) ([`524f6a8`](https://github.com/docling-project/docling-serve/commit/524f6a8997b86d2f869ca491ec8fb40585b42ca4))
### Fix
* Typo in Headline ([#220](https://github.com/docling-project/docling-serve/issues/220)) ([`d5455b7`](https://github.com/docling-project/docling-serve/commit/d5455b7f66de39ea1f8b8927b5968d2baa23ca88))
## [v0.13.0](https://github.com/docling-project/docling-serve/releases/tag/v0.13.0) - 2025-06-04
### Feature
* Upgrade docling to 2.36 ([#212](https://github.com/docling-project/docling-serve/issues/212)) ([`ffea347`](https://github.com/docling-project/docling-serve/commit/ffea34732b24fdd438fabd6df02d3d9ce66b4534))
## [v0.12.0](https://github.com/docling-project/docling-serve/releases/tag/v0.12.0) - 2025-06-03
### Feature
* Export annotations in markdown and html (Docling upgrade) ([#202](https://github.com/docling-project/docling-serve/issues/202)) ([`c4c41f1`](https://github.com/docling-project/docling-serve/commit/c4c41f16dff83c5d2a0b8a4c625b5de19b36b7c5))
### Fix
* Processing complex params in multipart-form ([#210](https://github.com/docling-project/docling-serve/issues/210)) ([`7066f35`](https://github.com/docling-project/docling-serve/commit/7066f3520a88c07df1c80a0cc6c4339eaac4d6a7))
### Documentation
* Add openshift replicasets examples ([#209](https://github.com/docling-project/docling-serve/issues/209)) ([`6a8190c`](https://github.com/docling-project/docling-serve/commit/6a8190c315792bd1e0e2b0af310656baaa5551e5))
## [v0.11.0](https://github.com/docling-project/docling-serve/releases/tag/v0.11.0) - 2025-05-23
### Feature
* Page break placeholder in markdown exports options ([#194](https://github.com/docling-project/docling-serve/issues/194)) ([`32b8a80`](https://github.com/docling-project/docling-serve/commit/32b8a809f348bf9fbde657f93589a56935d3749d))
* Clear results registry ([#192](https://github.com/docling-project/docling-serve/issues/192)) ([`de002df`](https://github.com/docling-project/docling-serve/commit/de002dfcdc111c942a08b156c84b7fa22b3fbaf3))
* Upgrade to Docling 2.33.0 ([#198](https://github.com/docling-project/docling-serve/issues/198)) ([`abe5aa0`](https://github.com/docling-project/docling-serve/commit/abe5aa03f54d44ecf5c6d76e3258028997a53e68))
* Api to trigger offloading the models ([#188](https://github.com/docling-project/docling-serve/issues/188)) ([`00be428`](https://github.com/docling-project/docling-serve/commit/00be4284904d55b78c75c5475578ef11c2ade94c))
* Figure annotations @ docling components 0.0.7 ([#181](https://github.com/docling-project/docling-serve/issues/181)) ([`3ff1b2f`](https://github.com/docling-project/docling-serve/commit/3ff1b2f9834aca37472a895a0e3da47560457d77))
### Fix
* Usage of hashlib for FIPS ([#171](https://github.com/docling-project/docling-serve/issues/171)) ([`8406fb9`](https://github.com/docling-project/docling-serve/commit/8406fb9b59d83247b8379974cabed497703dfc4d))
### Documentation
* Example and instructions on how to load model weights to persistent volume ([#197](https://github.com/docling-project/docling-serve/issues/197)) ([`3f090b7`](https://github.com/docling-project/docling-serve/commit/3f090b7d15eaf696611d89bbbba5b98569610828))
* Async api usage and fixes ([#195](https://github.com/docling-project/docling-serve/issues/195)) ([`21c1791`](https://github.com/docling-project/docling-serve/commit/21c1791e427f5b1946ed46c68dfda03c957dca8f))
## [v0.10.1](https://github.com/docling-project/docling-serve/releases/tag/v0.10.1) - 2025-04-30
### Fix
* Avoid missing specialized keys in the options hash ([#166](https://github.com/docling-project/docling-serve/issues/166)) ([`36787bc`](https://github.com/docling-project/docling-serve/commit/36787bc0616356a6199da618d8646de51636b34e))
* Allow users to set the area threshold for picture descriptions ([#165](https://github.com/docling-project/docling-serve/issues/165)) ([`509f488`](https://github.com/docling-project/docling-serve/commit/509f4889f8ed4c0f0ce25bec4126ef1f1199797c))
* Expose max wait time in sync endpoints ([#164](https://github.com/docling-project/docling-serve/issues/164)) ([`919cf5c`](https://github.com/docling-project/docling-serve/commit/919cf5c0414f2f11eb8012f451fed7a8f582b7ad))
* Add flash-attn for cuda images ([#161](https://github.com/docling-project/docling-serve/issues/161)) ([`35c2630`](https://github.com/docling-project/docling-serve/commit/35c2630c613cf229393fc67b6938152b063ff498))
## [v0.10.0](https://github.com/docling-project/docling-serve/releases/tag/v0.10.0) - 2025-04-28
### Feature
* Add support for file upload and return as file in async endpoints ([#152](https://github.com/docling-project/docling-serve/issues/152)) ([`c65f3c6`](https://github.com/docling-project/docling-serve/commit/c65f3c654c76c6b64b6aada1f0a153d74789d629))
### Documentation
* Fix new default pdf_backend ([#158](https://github.com/docling-project/docling-serve/issues/158)) ([`829effe`](https://github.com/docling-project/docling-serve/commit/829effec1a1b80320ccaf2c501be8015169b6fa3))
* Fixing small typo in docs ([#155](https://github.com/docling-project/docling-serve/issues/155)) ([`14bafb2`](https://github.com/docling-project/docling-serve/commit/14bafb26286b94f80b56846c50d6e9a6d99a9763))
## [v0.9.0](https://github.com/docling-project/docling-serve/releases/tag/v0.9.0) - 2025-04-25
### Feature
* Expose picture description options ([#148](https://github.com/docling-project/docling-serve/issues/148)) ([`4c9571a`](https://github.com/docling-project/docling-serve/commit/4c9571a052d5ec0044e49225bc5615e13cdb0a56))
* Add parameters for Kubeflow pipeline engine (WIP) ([#107](https://github.com/docling-project/docling-serve/issues/107)) ([`26bef5b`](https://github.com/docling-project/docling-serve/commit/26bef5bec060f0afd8d358816b68c3f2c0dd4bc2))
### Fix
* Produce image artifacts in referenced mode ([#151](https://github.com/docling-project/docling-serve/issues/151)) ([`71c5fae`](https://github.com/docling-project/docling-serve/commit/71c5fae505366459fd481d2ecdabc5ebed94d49c))
### Documentation
* Vlm and picture description options ([#149](https://github.com/docling-project/docling-serve/issues/149)) ([`91956cb`](https://github.com/docling-project/docling-serve/commit/91956cbf4e91cf82bb4d54ace397cdbbfaf594ba))
## [v0.8.0](https://github.com/docling-project/docling-serve/releases/tag/v0.8.0) - 2025-04-22
### Feature

View File

@@ -46,7 +46,10 @@ RUN --mount=from=ghcr.io/astral-sh/uv:0.6.1,source=/uv,target=/bin/uv \
--mount=type=cache,target=/opt/app-root/src/.cache/uv,uid=1001 \
--mount=type=bind,source=uv.lock,target=uv.lock \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
umask 002 && uv sync --frozen --no-install-project --no-dev --all-extras ${UV_SYNC_EXTRA_ARGS}
umask 002 && \
UV_SYNC_ARGS="--frozen --no-install-project --no-dev --all-extras" && \
uv sync ${UV_SYNC_ARGS} ${UV_SYNC_EXTRA_ARGS} --no-extra flash-attn && \
FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE uv sync ${UV_SYNC_ARGS} ${UV_SYNC_EXTRA_ARGS} --no-build-isolation-package=flash-attn
ARG MODELS_LIST="layout tableformer picture_classifier easyocr"

View File

@@ -35,7 +35,7 @@ docling-serve-image: Containerfile
.PHONY: docling-serve-cpu-image
docling-serve-cpu-image: Containerfile ## Build docling-serve "cpu only" container image
$(ECHO_PREFIX) printf " %-12s Containerfile\n" "[docling-serve CPU]"
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124" -f Containerfile -t ghcr.io/docling-project/docling-serve-cpu:$(TAG) .
$(CMD_PREFIX) docker build --load --build-arg "UV_SYNC_EXTRA_ARGS=--no-extra cu124 --no-extra flash-attn" -f Containerfile -t ghcr.io/docling-project/docling-serve-cpu:$(TAG) .
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cpu:$(TAG) ghcr.io/docling-project/docling-serve-cpu:$(BRANCH_TAG)
$(CMD_PREFIX) docker tag ghcr.io/docling-project/docling-serve-cpu:$(TAG) quay.io/docling-project/docling-serve-cpu:$(BRANCH_TAG)

View File

@@ -70,7 +70,7 @@ An easy to use UI is available at the `/ui` endpoint.
## Documentation and advance usages
Visit the [Docling Serve documentation](./docs/README.md) for learning how to [configure the webserver](./docs/configuration.md), use all the [runtime options](./docs/usage.md) of the API and [deployment examples](./docs/deployment.md).
Visit the [Docling Serve documentation](./docs/README.md) for learning how to [configure the webserver](./docs/configuration.md), use all the [runtime options](./docs/usage.md) of the API and [deployment examples](./docs/deployment.md), pre-load model weights into a persistent volume [model weights on persistent volume](./docs/pre-loading-models.md)
## Get help and support

View File

@@ -1,11 +1,11 @@
import asyncio
import importlib.metadata
import logging
import tempfile
import shutil
import time
from contextlib import asynccontextmanager
from io import BytesIO
from pathlib import Path
from typing import Annotated, Any, Optional, Union
from typing import Annotated
from fastapi import (
BackgroundTasks,
@@ -28,31 +28,35 @@ from fastapi.staticfiles import StaticFiles
from docling.datamodel.base_models import DocumentStream
from docling_serve.datamodel.callback import (
ProgressCallbackRequest,
ProgressCallbackResponse,
)
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.requests import (
ConvertDocumentFileSourcesRequest,
ConvertDocumentHttpSourcesRequest,
ConvertDocumentsRequest,
)
from docling_serve.datamodel.responses import (
ClearResponse,
ConvertDocumentResponse,
HealthCheckResponse,
MessageKind,
TaskStatusResponse,
WebsocketMessage,
)
from docling_serve.docling_conversion import (
convert_documents,
get_converter,
get_pdf_pipeline_opts,
)
from docling_serve.engines import get_orchestrator
from docling_serve.engines.async_local.orchestrator import (
AsyncLocalOrchestrator,
TaskNotFoundError,
from docling_serve.datamodel.task import Task, TaskSource
from docling_serve.docling_conversion import _get_converter_from_hash
from docling_serve.engines.async_orchestrator import (
BaseAsyncOrchestrator,
ProgressInvalid,
)
from docling_serve.engines.async_orchestrator_factory import get_async_orchestrator
from docling_serve.engines.base_orchestrator import TaskNotFoundError
from docling_serve.helper_functions import FormDepends
from docling_serve.response_preparation import process_results
from docling_serve.settings import docling_serve_settings
from docling_serve.storage import get_scratch
# Set up custom logging as we'll be intermixes with FastAPI/Uvicorn's logging
@@ -90,11 +94,11 @@ _log = logging.getLogger(__name__)
# Context manager to initialize and clean up the lifespan of the FastAPI app
@asynccontextmanager
async def lifespan(app: FastAPI):
# Converter with default options
pdf_format_option = get_pdf_pipeline_opts(ConvertDocumentsOptions())
get_converter(pdf_format_option)
orchestrator = get_async_orchestrator()
scratch_dir = get_scratch()
orchestrator = get_orchestrator()
# Warm up processing cache
await orchestrator.warm_up_caches()
# Start the background queue processor
queue_task = asyncio.create_task(orchestrator.process_queue())
@@ -108,6 +112,10 @@ async def lifespan(app: FastAPI):
except asyncio.CancelledError:
_log.info("Queue processor cancelled.")
# Remove scratch directory in case it was a tempfile
if docling_serve_settings.scratch_path is not None:
shutil.rmtree(scratch_dir, ignore_errors=True)
##################################
# App creation and configuration #
@@ -157,7 +165,8 @@ def create_app(): # noqa: C901
from docling_serve.gradio_ui import ui as gradio_ui
tmp_output_dir = Path(tempfile.mkdtemp())
tmp_output_dir = get_scratch() / "gradio"
tmp_output_dir.mkdir(exist_ok=True, parents=True)
gradio_ui.gradio_output_dir = tmp_output_dir
app = gr.mount_gradio_app(
app,
@@ -205,6 +214,55 @@ def create_app(): # noqa: C901
redoc_js_url="/static/redoc.standalone.js",
)
########################
# Async / Sync helpers #
########################
async def _enque_source(
orchestrator: BaseAsyncOrchestrator, conversion_request: ConvertDocumentsRequest
) -> Task:
sources: list[TaskSource] = []
if isinstance(conversion_request, ConvertDocumentFileSourcesRequest):
sources.extend(conversion_request.file_sources)
if isinstance(conversion_request, ConvertDocumentHttpSourcesRequest):
sources.extend(conversion_request.http_sources)
task = await orchestrator.enqueue(
sources=sources, options=conversion_request.options
)
return task
async def _enque_file(
orchestrator: BaseAsyncOrchestrator,
files: list[UploadFile],
options: ConvertDocumentsOptions,
) -> Task:
_log.info(f"Received {len(files)} files for processing.")
# Load the uploaded files to Docling DocumentStream
file_sources: list[TaskSource] = []
for i, file in enumerate(files):
buf = BytesIO(file.file.read())
suffix = "" if len(file_sources) == 1 else f"_{i}"
name = file.filename if file.filename else f"file{suffix}.pdf"
file_sources.append(DocumentStream(name=name, stream=buf))
task = await orchestrator.enqueue(sources=file_sources, options=options)
return task
async def _wait_task_complete(
orchestrator: BaseAsyncOrchestrator, task_id: str
) -> bool:
start_time = time.monotonic()
while True:
task = await orchestrator.task_status(task_id=task_id)
if task.is_completed():
return True
await asyncio.sleep(5)
elapsed_time = time.monotonic() - start_time
if elapsed_time > docling_serve_settings.max_sync_wait:
return False
#############################
# API Endpoints definitions #
#############################
@@ -238,33 +296,34 @@ def create_app(): # noqa: C901
}
},
)
def process_url(
background_tasks: BackgroundTasks, conversion_request: ConvertDocumentsRequest
async def process_url(
background_tasks: BackgroundTasks,
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
conversion_request: ConvertDocumentsRequest,
):
sources: list[Union[str, DocumentStream]] = []
headers: Optional[dict[str, Any]] = None
if isinstance(conversion_request, ConvertDocumentFileSourcesRequest):
for file_source in conversion_request.file_sources:
sources.append(file_source.to_document_stream())
else:
for http_source in conversion_request.http_sources:
sources.append(http_source.url)
if headers is None and http_source.headers:
headers = http_source.headers
# Note: results are only an iterator->lazy evaluation
results = convert_documents(
sources=sources, options=conversion_request.options, headers=headers
task = await _enque_source(
orchestrator=orchestrator, conversion_request=conversion_request
)
success = await _wait_task_complete(
orchestrator=orchestrator, task_id=task.task_id
)
# The real processing will happen here
response = process_results(
background_tasks=background_tasks,
conversion_options=conversion_request.options,
conv_results=results,
)
if not success:
# TODO: abort task!
return HTTPException(
status_code=504,
detail=f"Conversion is taking too long. The maximum wait time is configure as DOCLING_SERVE_MAX_SYNC_WAIT={docling_serve_settings.max_sync_wait}.",
)
return response
result = await orchestrator.task_result(
task_id=task.task_id, background_tasks=background_tasks
)
if result is None:
raise HTTPException(
status_code=404,
detail="Task result not found. Please wait for a completion status.",
)
return result
# Convert a document from file(s)
@app.post(
@@ -278,29 +337,35 @@ def create_app(): # noqa: C901
)
async def process_file(
background_tasks: BackgroundTasks,
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
files: list[UploadFile],
options: Annotated[
ConvertDocumentsOptions, FormDepends(ConvertDocumentsOptions)
],
):
_log.info(f"Received {len(files)} files for processing.")
# Load the uploaded files to Docling DocumentStream
file_sources = []
for file in files:
buf = BytesIO(file.file.read())
name = file.filename if file.filename else "file.pdf"
file_sources.append(DocumentStream(name=name, stream=buf))
results = convert_documents(sources=file_sources, options=options)
response = process_results(
background_tasks=background_tasks,
conversion_options=options,
conv_results=results,
task = await _enque_file(
orchestrator=orchestrator, files=files, options=options
)
success = await _wait_task_complete(
orchestrator=orchestrator, task_id=task.task_id
)
return response
if not success:
# TODO: abort task!
return HTTPException(
status_code=504,
detail=f"Conversion is taking too long. The maximum wait time is configure as DOCLING_SERVE_MAX_SYNC_WAIT={docling_serve_settings.max_sync_wait}.",
)
result = await orchestrator.task_result(
task_id=task.task_id, background_tasks=background_tasks
)
if result is None:
raise HTTPException(
status_code=404,
detail="Task result not found. Please wait for a completion status.",
)
return result
# Convert a document from URL(s) using the async api
@app.post(
@@ -308,10 +373,12 @@ def create_app(): # noqa: C901
response_model=TaskStatusResponse,
)
async def process_url_async(
orchestrator: Annotated[AsyncLocalOrchestrator, Depends(get_orchestrator)],
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
conversion_request: ConvertDocumentsRequest,
):
task = await orchestrator.enqueue(request=conversion_request)
task = await _enque_source(
orchestrator=orchestrator, conversion_request=conversion_request
)
task_queue_position = await orchestrator.get_queue_position(
task_id=task.task_id
)
@@ -319,6 +386,33 @@ def create_app(): # noqa: C901
task_id=task.task_id,
task_status=task.task_status,
task_position=task_queue_position,
task_meta=task.processing_meta,
)
# Convert a document from file(s) using the async api
@app.post(
"/v1alpha/convert/file/async",
response_model=TaskStatusResponse,
)
async def process_file_async(
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
background_tasks: BackgroundTasks,
files: list[UploadFile],
options: Annotated[
ConvertDocumentsOptions, FormDepends(ConvertDocumentsOptions)
],
):
task = await _enque_file(
orchestrator=orchestrator, files=files, options=options
)
task_queue_position = await orchestrator.get_queue_position(
task_id=task.task_id
)
return TaskStatusResponse(
task_id=task.task_id,
task_status=task.task_status,
task_position=task_queue_position,
task_meta=task.processing_meta,
)
# Task status poll
@@ -327,7 +421,7 @@ def create_app(): # noqa: C901
response_model=TaskStatusResponse,
)
async def task_status_poll(
orchestrator: Annotated[AsyncLocalOrchestrator, Depends(get_orchestrator)],
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
task_id: str,
wait: Annotated[
float, Query(help="Number of seconds to wait for a completed status.")
@@ -342,6 +436,7 @@ def create_app(): # noqa: C901
task_id=task.task_id,
task_status=task.task_status,
task_position=task_queue_position,
task_meta=task.processing_meta,
)
# Task status websocket
@@ -350,7 +445,7 @@ def create_app(): # noqa: C901
)
async def task_status_ws(
websocket: WebSocket,
orchestrator: Annotated[AsyncLocalOrchestrator, Depends(get_orchestrator)],
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
task_id: str,
):
await websocket.accept()
@@ -375,6 +470,7 @@ def create_app(): # noqa: C901
task_id=task.task_id,
task_status=task.task_status,
task_position=task_queue_position,
task_meta=task.processing_meta,
)
await websocket.send_text(
WebsocketMessage(
@@ -389,6 +485,7 @@ def create_app(): # noqa: C901
task_id=task.task_id,
task_status=task.task_status,
task_position=task_queue_position,
task_meta=task.processing_meta,
)
await websocket.send_text(
WebsocketMessage(
@@ -416,10 +513,13 @@ def create_app(): # noqa: C901
},
)
async def task_result(
orchestrator: Annotated[AsyncLocalOrchestrator, Depends(get_orchestrator)],
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
background_tasks: BackgroundTasks,
task_id: str,
):
result = await orchestrator.task_result(task_id=task_id)
result = await orchestrator.task_result(
task_id=task_id, background_tasks=background_tasks
)
if result is None:
raise HTTPException(
status_code=404,
@@ -427,4 +527,46 @@ def create_app(): # noqa: C901
)
return result
# Update task progress
@app.post(
"/v1alpha/callback/task/progress",
response_model=ProgressCallbackResponse,
)
async def callback_task_progress(
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
request: ProgressCallbackRequest,
):
try:
await orchestrator.receive_task_progress(request=request)
return ProgressCallbackResponse(status="ack")
except TaskNotFoundError:
raise HTTPException(status_code=404, detail="Task not found.")
except ProgressInvalid as err:
raise HTTPException(
status_code=400, detail=f"Invalid progress payload: {err}"
)
#### Clear requests
# Offload models
@app.get(
"/v1alpha/clear/converters",
response_model=ClearResponse,
)
async def clear_converters():
_get_converter_from_hash.cache_clear()
return ClearResponse()
# Clean results
@app.get(
"/v1alpha/clear/results",
response_model=ClearResponse,
)
async def clear_results(
orchestrator: Annotated[BaseAsyncOrchestrator, Depends(get_async_orchestrator)],
older_then: float = 3600,
):
await orchestrator.clear_results(older_than=older_then)
return ClearResponse()
return app

View File

@@ -0,0 +1,50 @@
import enum
from typing import Annotated, Literal
from pydantic import BaseModel, Field
class ProgressKind(str, enum.Enum):
SET_NUM_DOCS = "set_num_docs"
UPDATE_PROCESSED = "update_processed"
class BaseProgress(BaseModel):
kind: ProgressKind
class ProgressSetNumDocs(BaseProgress):
kind: Literal[ProgressKind.SET_NUM_DOCS] = ProgressKind.SET_NUM_DOCS
num_docs: int
class SucceededDocsItem(BaseModel):
source: str
class FailedDocsItem(BaseModel):
source: str
error: str
class ProgressUpdateProcessed(BaseProgress):
kind: Literal[ProgressKind.UPDATE_PROCESSED] = ProgressKind.UPDATE_PROCESSED
num_processed: int
num_succeeded: int
num_failed: int
docs_succeeded: list[SucceededDocsItem]
docs_failed: list[FailedDocsItem]
class ProgressCallbackRequest(BaseModel):
task_id: str
progress: Annotated[
ProgressSetNumDocs | ProgressUpdateProcessed, Field(discriminator="kind")
]
class ProgressCallbackResponse(BaseModel):
status: Literal["ack"] = "ack"

View File

@@ -1,13 +1,15 @@
# Define the input options for the API
from typing import Annotated, Optional
from typing import Annotated, Any, Optional
from pydantic import BaseModel, Field
from pydantic import AnyUrl, BaseModel, Field, model_validator
from typing_extensions import Self
from docling.datamodel.base_models import InputFormat, OutputFormat
from docling.datamodel.pipeline_options import (
EasyOcrOptions,
PdfBackend,
PdfPipeline,
PictureDescriptionBaseOptions,
TableFormerMode,
TableStructureOptions,
)
@@ -26,6 +28,89 @@ ocr_factory = get_ocr_factory(
ocr_engines_enum = ocr_factory.get_enum()
class PictureDescriptionLocal(BaseModel):
repo_id: Annotated[
str,
Field(
description="Repository id from the Hugging Face Hub.",
examples=[
"HuggingFaceTB/SmolVLM-256M-Instruct",
"ibm-granite/granite-vision-3.2-2b",
],
),
]
prompt: Annotated[
str,
Field(
description="Prompt used when calling the vision-language model.",
examples=[
"Describe this image in a few sentences.",
"This is a figure from a document. Provide a detailed description of it.",
],
),
] = "Describe this image in a few sentences."
generation_config: Annotated[
dict[str, Any],
Field(
description="Config from https://huggingface.co/docs/transformers/en/main_classes/text_generation#transformers.GenerationConfig",
examples=[{"max_new_tokens": 200, "do_sample": False}],
),
] = {"max_new_tokens": 200, "do_sample": False}
class PictureDescriptionApi(BaseModel):
url: Annotated[
AnyUrl,
Field(
description="Endpoint which accepts openai-api compatible requests.",
examples=[
AnyUrl(
"http://localhost:8000/v1/chat/completions"
), # example of a local vllm api
AnyUrl(
"http://localhost:11434/v1/chat/completions"
), # example of ollama
],
),
]
headers: Annotated[
dict[str, str],
Field(
description="Headers used for calling the API endpoint. For example, it could include authentication headers."
),
] = {}
params: Annotated[
dict[str, Any],
Field(
description="Model parameters.",
examples=[
{ # on vllm
"model": "HuggingFaceTB/SmolVLM-256M-Instruct",
"max_completion_tokens": 200,
},
{ # on vllm
"model": "ibm-granite/granite-vision-3.2-2b",
"max_completion_tokens": 200,
},
{ # on ollama
"model": "granite3.2-vision:2b"
},
],
),
] = {}
timeout: Annotated[float, Field(description="Timeout for the API request.")] = 20
prompt: Annotated[
str,
Field(
description="Prompt used when calling the vision-language model.",
examples=[
"Describe this image in a few sentences.",
"This is a figures from a document. Provide a detailed description of it.",
],
),
] = "Describe this image in a few sentences."
class ConvertDocumentsOptions(BaseModel):
from_formats: Annotated[
list[InputFormat],
@@ -211,6 +296,14 @@ class ConvertDocumentsOptions(BaseModel):
),
] = 2.0
md_page_break_placeholder: Annotated[
str,
Field(
description="Add this placeholder betweek pages in the markdown output.",
examples=["<!-- page-break -->", ""],
),
] = ""
do_code_enrichment: Annotated[
bool,
Field(
@@ -226,7 +319,7 @@ class ConvertDocumentsOptions(BaseModel):
bool,
Field(
description=(
"If enabled, perform formula OCR, return Latex code. "
"If enabled, perform formula OCR, return LaTeX code. "
"Boolean. Optional, defaults to false."
),
examples=[False],
@@ -254,3 +347,48 @@ class ConvertDocumentsOptions(BaseModel):
examples=[False],
),
] = False
picture_description_area_threshold: Annotated[
float,
Field(
description="Minimum percentage of the area for a picture to be processed with the models.",
examples=[PictureDescriptionBaseOptions().picture_area_threshold],
),
] = PictureDescriptionBaseOptions().picture_area_threshold
picture_description_local: Annotated[
Optional[PictureDescriptionLocal],
Field(
description="Options for running a local vision-language model in the picture description. The parameters refer to a model hosted on Hugging Face. This parameter is mutually exclusive with picture_description_api.",
examples=[
PictureDescriptionLocal(repo_id="ibm-granite/granite-vision-3.2-2b"),
PictureDescriptionLocal(repo_id="HuggingFaceTB/SmolVLM-256M-Instruct"),
],
),
] = None
picture_description_api: Annotated[
Optional[PictureDescriptionApi],
Field(
description="API details for using a vision-language model in the picture description. This parameter is mutually exclusive with picture_description_local.",
examples=[
PictureDescriptionApi(
url="http://localhost:11434/v1/chat/completions",
params={"model": "granite3.2-vision:2b"},
)
],
),
] = None
@model_validator(mode="after")
def picture_description_exclusivity(self) -> Self:
# Validate picture description options
if (
self.picture_description_local is not None
and self.picture_description_api is not None
):
raise ValueError(
"The parameters picture_description_local and picture_description_api are mutually exclusive, only one of them can be set."
)
return self

View File

@@ -10,3 +10,4 @@ class TaskStatus(str, enum.Enum):
class AsyncEngine(str, enum.Enum):
LOCAL = "local"
KFP = "kfp"

View File

@@ -0,0 +1,7 @@
from pydantic import AnyUrl, BaseModel
class CallbackSpec(BaseModel):
url: AnyUrl
headers: dict[str, str] = {}
ca_cert: str = ""

View File

@@ -2,7 +2,7 @@ import base64
from io import BytesIO
from typing import Annotated, Any, Union
from pydantic import BaseModel, Field
from pydantic import AnyHttpUrl, BaseModel, Field
from docling.datamodel.base_models import DocumentStream
@@ -15,7 +15,7 @@ class DocumentsConvertBase(BaseModel):
class HttpSource(BaseModel):
url: Annotated[
str,
AnyHttpUrl,
Field(
description="HTTP url to process",
examples=["https://arxiv.org/pdf/2206.01062"],

View File

@@ -7,12 +7,18 @@ from docling.datamodel.document import ConversionStatus, ErrorItem
from docling.utils.profiling import ProfilingItem
from docling_core.types.doc import DoclingDocument
from docling_serve.datamodel.task_meta import TaskProcessingMeta
# Status
class HealthCheckResponse(BaseModel):
status: str = "ok"
class ClearResponse(BaseModel):
status: str = "ok"
class DocumentResponse(BaseModel):
filename: str
md_content: Optional[str] = None
@@ -38,6 +44,7 @@ class TaskStatusResponse(BaseModel):
task_id: str
task_status: str
task_position: Optional[int] = None
task_meta: Optional[TaskProcessingMeta] = None
class MessageKind(str, enum.Enum):

View File

@@ -1,17 +1,53 @@
from typing import Optional
import datetime
from functools import partial
from pathlib import Path
from typing import Optional, Union
from pydantic import BaseModel
from fastapi.responses import FileResponse
from pydantic import BaseModel, ConfigDict, Field
from docling.datamodel.base_models import DocumentStream
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.engines import TaskStatus
from docling_serve.datamodel.requests import ConvertDocumentsRequest
from docling_serve.datamodel.requests import FileSource, HttpSource
from docling_serve.datamodel.responses import ConvertDocumentResponse
from docling_serve.datamodel.task_meta import TaskProcessingMeta
TaskSource = Union[HttpSource, FileSource, DocumentStream]
class Task(BaseModel):
model_config = ConfigDict(arbitrary_types_allowed=True)
task_id: str
task_status: TaskStatus = TaskStatus.PENDING
request: Optional[ConvertDocumentsRequest]
result: Optional[ConvertDocumentResponse] = None
sources: list[TaskSource] = []
options: Optional[ConvertDocumentsOptions]
result: Optional[Union[ConvertDocumentResponse, FileResponse]] = None
scratch_dir: Optional[Path] = None
processing_meta: Optional[TaskProcessingMeta] = None
created_at: datetime.datetime = Field(
default_factory=partial(datetime.datetime.now, datetime.timezone.utc)
)
started_at: Optional[datetime.datetime] = None
finished_at: Optional[datetime.datetime] = None
last_update_at: datetime.datetime = Field(
default_factory=partial(datetime.datetime.now, datetime.timezone.utc)
)
def set_status(self, status: TaskStatus):
now = datetime.datetime.now(datetime.timezone.utc)
if status == TaskStatus.STARTED and self.started_at is None:
self.started_at = now
if (
status in [TaskStatus.SUCCESS, TaskStatus.FAILURE]
and self.finished_at is None
):
self.finished_at = now
self.last_update_at = now
self.task_status = status
def is_completed(self) -> bool:
if self.task_status in [TaskStatus.SUCCESS, TaskStatus.FAILURE]:

View File

@@ -0,0 +1,8 @@
from pydantic import BaseModel
class TaskProcessingMeta(BaseModel):
num_docs: int
num_processed: int = 0
num_succeeded: int = 0
num_failed: int = 0

View File

@@ -21,6 +21,8 @@ from docling.datamodel.pipeline_options import (
PdfBackend,
PdfPipeline,
PdfPipelineOptions,
PictureDescriptionApiOptions,
PictureDescriptionVlmOptions,
TableFormerMode,
VlmPipelineOptions,
smoldocling_vlm_conversion_options,
@@ -40,15 +42,12 @@ _log = logging.getLogger(__name__)
# Custom serializer for PdfFormatOption
# (model_dump_json does not work with some classes)
def _hash_pdf_format_option(pdf_format_option: PdfFormatOption) -> bytes:
data = pdf_format_option.model_dump()
data = pdf_format_option.model_dump(serialize_as_any=True)
# pipeline_options are not fully serialized by model_dump, dedicated pass
if pdf_format_option.pipeline_options:
data["pipeline_options"] = pdf_format_option.pipeline_options.model_dump()
# Replace `artifacts_path` with a string representation
data["pipeline_options"]["artifacts_path"] = repr(
data["pipeline_options"]["artifacts_path"]
data["pipeline_options"] = pdf_format_option.pipeline_options.model_dump(
serialize_as_any=True, mode="json"
)
# Replace `pipeline_cls` with a string representation
@@ -57,15 +56,11 @@ def _hash_pdf_format_option(pdf_format_option: PdfFormatOption) -> bytes:
# Replace `backend` with a string representation
data["backend"] = repr(data["backend"])
# Handle `device` in `accelerator_options`
if "accelerator_options" in data and "device" in data["accelerator_options"]:
data["accelerator_options"]["device"] = repr(
data["accelerator_options"]["device"]
)
# Serialize the dictionary to JSON with sorted keys to have consistent hashes
serialized_data = json.dumps(data, sort_keys=True)
options_hash = hashlib.sha1(serialized_data.encode()).digest()
options_hash = hashlib.sha1(
serialized_data.encode(), usedforsecurity=False
).digest()
return options_hash
@@ -116,6 +111,7 @@ def _parse_standard_pdf_opts(
pipeline_options = PdfPipelineOptions(
artifacts_path=artifacts_path,
enable_remote_services=docling_serve_settings.enable_remote_services,
document_timeout=request.document_timeout,
do_ocr=request.do_ocr,
ocr_options=ocr_options,
@@ -129,9 +125,28 @@ def _parse_standard_pdf_opts(
if request.image_export_mode != ImageRefMode.PLACEHOLDER:
pipeline_options.generate_page_images = True
if request.image_export_mode == ImageRefMode.REFERENCED:
pipeline_options.generate_picture_images = True
if request.images_scale:
pipeline_options.images_scale = request.images_scale
if request.picture_description_local is not None:
pipeline_options.picture_description_options = (
PictureDescriptionVlmOptions.model_validate(
request.picture_description_local.model_dump()
)
)
if request.picture_description_api is not None:
pipeline_options.picture_description_options = (
PictureDescriptionApiOptions.model_validate(
request.picture_description_api.model_dump()
)
)
pipeline_options.picture_description_options.picture_area_threshold = (
request.picture_description_area_threshold
)
return pipeline_options
@@ -179,7 +194,7 @@ def get_pdf_pipeline_opts(
if docling_serve_settings.artifacts_path is not None:
if str(docling_serve_settings.artifacts_path.absolute()) == "":
_log.info(
"artifacts_path is an empty path, model weights will be dowloaded "
"artifacts_path is an empty path, model weights will be downloaded "
"at runtime."
)
artifacts_path = None

View File

@@ -1,8 +0,0 @@
from functools import lru_cache
from docling_serve.engines.async_local.orchestrator import AsyncLocalOrchestrator
@lru_cache
def get_orchestrator() -> AsyncLocalOrchestrator:
return AsyncLocalOrchestrator()

View File

@@ -0,0 +1,137 @@
# ruff: noqa: E402, UP006, UP035
from typing import Any, Dict, List
from kfp import dsl
PYTHON_BASE_IMAGE = "python:3.12"
@dsl.component(
base_image=PYTHON_BASE_IMAGE,
packages_to_install=[
"pydantic",
"docling-serve @ git+https://github.com/docling-project/docling-serve@feat-kfp-engine",
],
pip_index_urls=["https://download.pytorch.org/whl/cpu", "https://pypi.org/simple"],
)
def generate_chunks(
run_name: str,
request: Dict[str, Any],
batch_size: int,
callbacks: List[Dict[str, Any]],
) -> List[List[Dict[str, Any]]]:
from pydantic import TypeAdapter
from docling_serve.datamodel.callback import (
ProgressCallbackRequest,
ProgressSetNumDocs,
)
from docling_serve.datamodel.kfp import CallbackSpec
from docling_serve.engines.async_kfp.notify import notify_callbacks
CallbacksListType = TypeAdapter(list[CallbackSpec])
sources = request["http_sources"]
splits = [sources[i : i + batch_size] for i in range(0, len(sources), batch_size)]
total = sum(len(chunk) for chunk in splits)
payload = ProgressCallbackRequest(
task_id=run_name, progress=ProgressSetNumDocs(num_docs=total)
)
notify_callbacks(
payload=payload,
callbacks=CallbacksListType.validate_python(callbacks),
)
return splits
@dsl.component(
base_image=PYTHON_BASE_IMAGE,
packages_to_install=[
"pydantic",
"docling-serve @ git+https://github.com/docling-project/docling-serve@feat-kfp-engine",
],
pip_index_urls=["https://download.pytorch.org/whl/cpu", "https://pypi.org/simple"],
)
def convert_batch(
run_name: str,
data_splits: List[Dict[str, Any]],
request: Dict[str, Any],
callbacks: List[Dict[str, Any]],
output_path: dsl.OutputPath("Directory"), # type: ignore
):
from pathlib import Path
from pydantic import AnyUrl, TypeAdapter
from docling_serve.datamodel.callback import (
FailedDocsItem,
ProgressCallbackRequest,
ProgressUpdateProcessed,
SucceededDocsItem,
)
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.kfp import CallbackSpec
from docling_serve.datamodel.requests import HttpSource
from docling_serve.engines.async_kfp.notify import notify_callbacks
CallbacksListType = TypeAdapter(list[CallbackSpec])
convert_options = ConvertDocumentsOptions.model_validate(request["options"])
print(convert_options)
output_dir = Path(output_path)
output_dir.mkdir(exist_ok=True, parents=True)
docs_succeeded: list[SucceededDocsItem] = []
docs_failed: list[FailedDocsItem] = []
for source_dict in data_splits:
source = HttpSource.model_validate(source_dict)
filename = Path(str(AnyUrl(source.url).path)).name
output_filename = output_dir / filename
print(f"Writing {output_filename}")
with output_filename.open("w") as f:
f.write(source.model_dump_json())
docs_succeeded.append(SucceededDocsItem(source=source.url))
payload = ProgressCallbackRequest(
task_id=run_name,
progress=ProgressUpdateProcessed(
num_failed=len(docs_failed),
num_processed=len(docs_succeeded) + len(docs_failed),
num_succeeded=len(docs_succeeded),
docs_succeeded=docs_succeeded,
docs_failed=docs_failed,
),
)
print(payload)
notify_callbacks(
payload=payload,
callbacks=CallbacksListType.validate_python(callbacks),
)
@dsl.pipeline()
def process(
batch_size: int,
request: Dict[str, Any],
callbacks: List[Dict[str, Any]] = [],
run_name: str = "",
):
chunks_task = generate_chunks(
run_name=run_name,
request=request,
batch_size=batch_size,
callbacks=callbacks,
)
chunks_task.set_caching_options(False)
with dsl.ParallelFor(chunks_task.output, parallelism=4) as data_splits:
convert_batch(
run_name=run_name,
data_splits=data_splits,
request=request,
callbacks=callbacks,
)

View File

@@ -0,0 +1,32 @@
import ssl
import certifi
import httpx
from docling_serve.datamodel.callback import ProgressCallbackRequest
from docling_serve.datamodel.kfp import CallbackSpec
def notify_callbacks(
payload: ProgressCallbackRequest,
callbacks: list[CallbackSpec],
):
if len(callbacks) == 0:
return
for callback in callbacks:
# https://www.python-httpx.org/advanced/ssl/#configuring-client-instances
if callback.ca_cert:
ctx = ssl.create_default_context(cadata=callback.ca_cert)
else:
ctx = ssl.create_default_context(cafile=certifi.where())
try:
httpx.post(
str(callback.url),
headers=callback.headers,
json=payload.model_dump(mode="json"),
verify=ctx,
)
except httpx.HTTPError as err:
print(f"Error notifying callback {callback.url}: {err}")

View File

@@ -0,0 +1,235 @@
import datetime
import json
import logging
import uuid
from pathlib import Path
from typing import Optional
from kfp_server_api.models import V2beta1RuntimeState
from pydantic import BaseModel, TypeAdapter
from pydantic_settings import SettingsConfigDict
from docling_serve.datamodel.callback import (
ProgressCallbackRequest,
ProgressSetNumDocs,
ProgressUpdateProcessed,
)
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.engines import TaskStatus
from docling_serve.datamodel.kfp import CallbackSpec
from docling_serve.datamodel.requests import HttpSource
from docling_serve.datamodel.task import Task, TaskSource
from docling_serve.datamodel.task_meta import TaskProcessingMeta
from docling_serve.engines.async_kfp.kfp_pipeline import process
from docling_serve.engines.async_orchestrator import (
BaseAsyncOrchestrator,
ProgressInvalid,
)
from docling_serve.settings import docling_serve_settings
_log = logging.getLogger(__name__)
class _RunItem(BaseModel):
model_config = SettingsConfigDict(arbitrary_types_allowed=True)
run_id: str
state: str
created_at: datetime.datetime
scheduled_at: datetime.datetime
finished_at: datetime.datetime
class AsyncKfpOrchestrator(BaseAsyncOrchestrator):
def __init__(self):
super().__init__()
import kfp
kfp_endpoint = docling_serve_settings.eng_kfp_endpoint
if kfp_endpoint is None:
raise ValueError("KFP endpoint is required when using the KFP engine.")
kube_sa_token_path = Path("/run/secrets/kubernetes.io/serviceaccount/token")
kube_sa_ca_cert_path = Path(
"/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
)
ssl_ca_cert = docling_serve_settings.eng_kfp_ca_cert_path
token = docling_serve_settings.eng_kfp_token
if (
ssl_ca_cert is None
and ".svc" in kfp_endpoint.host
and kube_sa_ca_cert_path.exists()
):
ssl_ca_cert = str(kube_sa_ca_cert_path)
if token is None and kube_sa_token_path.exists():
token = kube_sa_token_path.read_text()
self._client = kfp.Client(
host=str(kfp_endpoint),
existing_token=token,
ssl_ca_cert=ssl_ca_cert,
# verify_ssl=False,
)
async def enqueue(
self, sources: list[TaskSource], options: ConvertDocumentsOptions
) -> Task:
callbacks = []
if docling_serve_settings.eng_kfp_self_callback_endpoint is not None:
headers = {}
if docling_serve_settings.eng_kfp_self_callback_token_path is not None:
token = (
docling_serve_settings.eng_kfp_self_callback_token_path.read_text()
)
headers["Authorization"] = f"Bearer {token}"
ca_cert = ""
if docling_serve_settings.eng_kfp_self_callback_ca_cert_path is not None:
ca_cert = docling_serve_settings.eng_kfp_self_callback_ca_cert_path.read_text()
callbacks.append(
CallbackSpec(
url=docling_serve_settings.eng_kfp_self_callback_endpoint,
headers=headers,
ca_cert=ca_cert,
)
)
CallbacksType = TypeAdapter(list[CallbackSpec])
SourcesListType = TypeAdapter(list[HttpSource])
http_sources = [s for s in sources if isinstance(s, HttpSource)]
# hack: since the current kfp backend is not resolving the job_id placeholder,
# we set the run_name and pass it as argument to the job itself.
run_name = f"docling-job-{uuid.uuid4()}"
kfp_run = self._client.create_run_from_pipeline_func(
process,
arguments={
"batch_size": 10,
"sources": SourcesListType.dump_python(http_sources, mode="json"),
"options": options.model_dump(mode="json"),
"callbacks": CallbacksType.dump_python(callbacks, mode="json"),
"run_name": run_name,
},
run_name=run_name,
)
task_id = kfp_run.run_id
task = Task(task_id=task_id, sources=sources, options=options)
await self.init_task_tracking(task)
return task
async def _update_task_from_run(self, task_id: str, wait: float = 0.0):
run_info = self._client.get_run(run_id=task_id)
task = await self.get_raw_task(task_id=task_id)
# RUNTIME_STATE_UNSPECIFIED = "RUNTIME_STATE_UNSPECIFIED"
# PENDING = "PENDING"
# RUNNING = "RUNNING"
# SUCCEEDED = "SUCCEEDED"
# SKIPPED = "SKIPPED"
# FAILED = "FAILED"
# CANCELING = "CANCELING"
# CANCELED = "CANCELED"
# PAUSED = "PAUSED"
if run_info.state == V2beta1RuntimeState.SUCCEEDED:
task.set_status(TaskStatus.SUCCESS)
elif run_info.state == V2beta1RuntimeState.PENDING:
task.set_status(TaskStatus.PENDING)
elif run_info.state == V2beta1RuntimeState.RUNNING:
task.set_status(TaskStatus.STARTED)
else:
task.set_status(TaskStatus.FAILURE)
async def task_status(self, task_id: str, wait: float = 0.0) -> Task:
await self._update_task_from_run(task_id=task_id, wait=wait)
return await self.get_raw_task(task_id=task_id)
async def _get_pending(self) -> list[_RunItem]:
runs: list[_RunItem] = []
next_page: Optional[str] = None
while True:
res = self._client.list_runs(
page_token=next_page,
page_size=20,
filter=json.dumps(
{
"predicates": [
{
"operation": "EQUALS",
"key": "state",
"stringValue": "PENDING",
}
]
}
),
)
if res.runs is not None:
for run in res.runs:
runs.append(
_RunItem(
run_id=run.run_id,
state=run.state,
created_at=run.created_at,
scheduled_at=run.scheduled_at,
finished_at=run.finished_at,
)
)
if res.next_page_token is None:
break
next_page = res.next_page_token
return runs
async def queue_size(self) -> int:
runs = await self._get_pending()
return len(runs)
async def get_queue_position(self, task_id: str) -> Optional[int]:
runs = await self._get_pending()
for pos, run in enumerate(runs, start=1):
if run.run_id == task_id:
return pos
return None
async def process_queue(self):
return
async def warm_up_caches(self):
return
async def _get_run_id(self, run_name: str) -> str:
res = self._client.list_runs(
filter=json.dumps(
{
"predicates": [
{
"operation": "EQUALS",
"key": "name",
"stringValue": run_name,
}
]
}
),
)
if res.runs is not None and len(res.runs) > 0:
return res.runs[0].run_id
raise RuntimeError(f"Run with {run_name=} not found.")
async def receive_task_progress(self, request: ProgressCallbackRequest):
task_id = await self._get_run_id(run_name=request.task_id)
progress = request.progress
task = await self.get_raw_task(task_id=task_id)
if isinstance(progress, ProgressSetNumDocs):
task.processing_meta = TaskProcessingMeta(num_docs=progress.num_docs)
task.task_status = TaskStatus.STARTED
elif isinstance(progress, ProgressUpdateProcessed):
if task.processing_meta is None:
raise ProgressInvalid(
"UpdateProcessed was called before setting the expected number of documents."
)
task.processing_meta.num_processed += progress.num_processed
task.processing_meta.num_succeeded += progress.num_succeeded
task.processing_meta.num_failed += progress.num_failed
task.task_status = TaskStatus.STARTED
# TODO: could be moved to BackgroundTask
await self.notify_task_subscribers(task_id=task_id)

View File

@@ -3,44 +3,30 @@ import logging
import uuid
from typing import Optional
from fastapi import WebSocket
from docling_serve.datamodel.engines import TaskStatus
from docling_serve.datamodel.requests import ConvertDocumentsRequest
from docling_serve.datamodel.responses import (
MessageKind,
TaskStatusResponse,
WebsocketMessage,
)
from docling_serve.datamodel.task import Task
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.task import Task, TaskSource
from docling_serve.docling_conversion import get_converter, get_pdf_pipeline_opts
from docling_serve.engines.async_local.worker import AsyncLocalWorker
from docling_serve.engines.base_orchestrator import BaseOrchestrator
from docling_serve.engines.async_orchestrator import BaseAsyncOrchestrator
from docling_serve.settings import docling_serve_settings
_log = logging.getLogger(__name__)
class OrchestratorError(Exception):
pass
class TaskNotFoundError(OrchestratorError):
pass
class AsyncLocalOrchestrator(BaseOrchestrator):
class AsyncLocalOrchestrator(BaseAsyncOrchestrator):
def __init__(self):
super().__init__()
self.task_queue = asyncio.Queue()
self.tasks: dict[str, Task] = {}
self.queue_list: list[str] = []
self.task_subscribers: dict[str, set[WebSocket]] = {}
async def enqueue(self, request: ConvertDocumentsRequest) -> Task:
async def enqueue(
self, sources: list[TaskSource], options: ConvertDocumentsOptions
) -> Task:
task_id = str(uuid.uuid4())
task = Task(task_id=task_id, request=request)
self.tasks[task_id] = task
task = Task(task_id=task_id, sources=sources, options=options)
await self.init_task_tracking(task)
self.queue_list.append(task_id)
self.task_subscribers[task_id] = set()
await self.task_queue.put(task_id)
return task
@@ -52,16 +38,6 @@ class AsyncLocalOrchestrator(BaseOrchestrator):
self.queue_list.index(task_id) + 1 if task_id in self.queue_list else None
)
async def task_status(self, task_id: str, wait: float = 0.0) -> Task:
if task_id not in self.tasks:
raise TaskNotFoundError()
return self.tasks[task_id]
async def task_result(self, task_id: str):
if task_id not in self.tasks:
raise TaskNotFoundError()
return self.tasks[task_id].result
async def process_queue(self):
# Create a pool of workers
workers = []
@@ -75,28 +51,7 @@ class AsyncLocalOrchestrator(BaseOrchestrator):
await asyncio.gather(*workers)
_log.debug("All workers completed.")
async def notify_task_subscribers(self, task_id: str):
if task_id not in self.task_subscribers:
raise RuntimeError(f"Task {task_id} does not have a subscribers list.")
task = self.tasks[task_id]
task_queue_position = await self.get_queue_position(task_id)
msg = TaskStatusResponse(
task_id=task.task_id,
task_status=task.task_status,
task_position=task_queue_position,
)
for websocket in self.task_subscribers[task_id]:
await websocket.send_text(
WebsocketMessage(message=MessageKind.UPDATE, task=msg).model_dump_json()
)
if task.is_completed():
await websocket.close()
async def notify_queue_positions(self):
for task_id in self.task_subscribers.keys():
# notify only pending tasks
if self.tasks[task_id].task_status != TaskStatus.PENDING:
continue
await self.notify_task_subscribers(task_id)
async def warm_up_caches(self):
# Converter with default options
pdf_format_option = get_pdf_pipeline_opts(ConvertDocumentsOptions())
get_converter(pdf_format_option)

View File

@@ -1,17 +1,18 @@
import asyncio
import logging
import shutil
import time
from typing import TYPE_CHECKING, Any, Optional, Union
from fastapi import BackgroundTasks
from fastapi.responses import FileResponse
from docling.datamodel.base_models import DocumentStream
from docling_serve.datamodel.engines import TaskStatus
from docling_serve.datamodel.requests import ConvertDocumentFileSourcesRequest
from docling_serve.datamodel.responses import ConvertDocumentResponse
from docling_serve.datamodel.requests import FileSource, HttpSource
from docling_serve.docling_conversion import convert_documents
from docling_serve.response_preparation import process_results
from docling_serve.storage import get_scratch
if TYPE_CHECKING:
from docling_serve.engines.async_local.orchestrator import AsyncLocalOrchestrator
@@ -35,7 +36,7 @@ class AsyncLocalWorker:
task = self.orchestrator.tasks[task_id]
try:
task.task_status = TaskStatus.STARTED
task.set_status(TaskStatus.STARTED)
_log.info(f"Worker {self.worker_id} processing task {task_id}")
# Notify clients about task updates
@@ -44,61 +45,68 @@ class AsyncLocalWorker:
# Notify clients about queue updates
await self.orchestrator.notify_queue_positions()
# Get the current event loop
asyncio.get_event_loop()
# Define a callback function to send progress updates to the client.
# TODO: send partial updates, e.g. when a document in the batch is done
def run_conversion():
sources: list[Union[str, DocumentStream]] = []
convert_sources: list[Union[str, DocumentStream]] = []
headers: Optional[dict[str, Any]] = None
if isinstance(task.request, ConvertDocumentFileSourcesRequest):
for file_source in task.request.file_sources:
sources.append(file_source.to_document_stream())
else:
for http_source in task.request.http_sources:
sources.append(http_source.url)
if headers is None and http_source.headers:
headers = http_source.headers
for source in task.sources:
if isinstance(source, DocumentStream):
convert_sources.append(source)
elif isinstance(source, FileSource):
convert_sources.append(source.to_document_stream())
elif isinstance(source, HttpSource):
convert_sources.append(str(source.url))
if headers is None and source.headers:
headers = source.headers
# Note: results are only an iterator->lazy evaluation
results = convert_documents(
sources=sources,
options=task.request.options,
sources=convert_sources,
options=task.options,
headers=headers,
)
# The real processing will happen here
work_dir = get_scratch() / task_id
response = process_results(
background_tasks=BackgroundTasks(),
conversion_options=task.request.options,
conversion_options=task.options,
conv_results=results,
work_dir=work_dir,
)
if work_dir.exists():
task.scratch_dir = work_dir
if not isinstance(response, FileResponse):
_log.warning(
f"Task {task_id=} produced content in {work_dir=} but the response is not a file."
)
shutil.rmtree(work_dir, ignore_errors=True)
return response
# Run the prediction in a thread to avoid blocking the event loop.
start_time = time.monotonic()
# Run the prediction in a thread to avoid blocking the event loop.
# Get the current event loop
# loop = asyncio.get_event_loop()
# future = asyncio.run_coroutine_threadsafe(
# run_conversion(),
# loop=loop
# )
# response = future.result()
# Run in a thread
response = await asyncio.to_thread(
run_conversion,
)
processing_time = time.monotonic() - start_time
if not isinstance(response, ConvertDocumentResponse):
_log.error(
f"Worker {self.worker_id} got un-processable "
"result for {task_id}: {type(response)}"
)
task.result = response
task.request = None
task.sources = []
task.options = None
task.task_status = TaskStatus.SUCCESS
task.set_status(TaskStatus.SUCCESS)
_log.info(
f"Worker {self.worker_id} completed job {task_id} "
f"in {processing_time:.2f} seconds"
@@ -108,7 +116,7 @@ class AsyncLocalWorker:
_log.error(
f"Worker {self.worker_id} failed to process job {task_id}: {e}"
)
task.task_status = TaskStatus.FAILURE
task.set_status(TaskStatus.FAILURE)
finally:
await self.orchestrator.notify_task_subscribers(task_id)

View File

@@ -0,0 +1,127 @@
import asyncio
import datetime
import logging
import shutil
from typing import Union
from fastapi import BackgroundTasks, WebSocket
from fastapi.responses import FileResponse
from docling_serve.datamodel.callback import ProgressCallbackRequest
from docling_serve.datamodel.engines import TaskStatus
from docling_serve.datamodel.responses import (
ConvertDocumentResponse,
MessageKind,
TaskStatusResponse,
WebsocketMessage,
)
from docling_serve.datamodel.task import Task
from docling_serve.engines.base_orchestrator import (
BaseOrchestrator,
OrchestratorError,
TaskNotFoundError,
)
from docling_serve.settings import docling_serve_settings
_log = logging.getLogger(__name__)
class ProgressInvalid(OrchestratorError):
pass
class BaseAsyncOrchestrator(BaseOrchestrator):
def __init__(self):
self.tasks: dict[str, Task] = {}
self.task_subscribers: dict[str, set[WebSocket]] = {}
async def init_task_tracking(self, task: Task):
task_id = task.task_id
self.tasks[task.task_id] = task
self.task_subscribers[task_id] = set()
async def get_raw_task(self, task_id: str) -> Task:
if task_id not in self.tasks:
raise TaskNotFoundError()
return self.tasks[task_id]
async def task_status(self, task_id: str, wait: float = 0.0) -> Task:
return await self.get_raw_task(task_id=task_id)
async def task_result(
self, task_id: str, background_tasks: BackgroundTasks
) -> Union[ConvertDocumentResponse, FileResponse, None]:
try:
task = await self.get_raw_task(task_id=task_id)
if task.is_completed() and docling_serve_settings.single_use_results:
if task.scratch_dir is not None:
background_tasks.add_task(
shutil.rmtree, task.scratch_dir, ignore_errors=True
)
async def _remove_task_impl():
await asyncio.sleep(docling_serve_settings.result_removal_delay)
await self.delete_task(task_id=task.task_id)
async def _remove_task():
asyncio.create_task(_remove_task_impl()) # noqa: RUF006
background_tasks.add_task(_remove_task)
return task.result
except TaskNotFoundError:
return None
async def delete_task(self, task_id: str):
_log.info(f"Deleting {task_id=}")
if task_id in self.task_subscribers:
for websocket in self.task_subscribers[task_id]:
await websocket.close()
del self.task_subscribers[task_id]
if task_id in self.tasks:
del self.tasks[task_id]
async def clear_results(self, older_than: float = 0.0):
cutoff_time = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(
seconds=older_than
)
tasks_to_delete = [
task_id
for task_id, task in self.tasks.items()
if task.finished_at is not None and task.finished_at < cutoff_time
]
for task_id in tasks_to_delete:
await self.delete_task(task_id=task_id)
async def notify_task_subscribers(self, task_id: str):
if task_id not in self.task_subscribers:
raise RuntimeError(f"Task {task_id} does not have a subscribers list.")
task = await self.get_raw_task(task_id=task_id)
task_queue_position = await self.get_queue_position(task_id)
msg = TaskStatusResponse(
task_id=task.task_id,
task_status=task.task_status,
task_position=task_queue_position,
task_meta=task.processing_meta,
)
for websocket in self.task_subscribers[task_id]:
await websocket.send_text(
WebsocketMessage(message=MessageKind.UPDATE, task=msg).model_dump_json()
)
if task.is_completed():
await websocket.close()
async def notify_queue_positions(self):
for task_id in self.task_subscribers.keys():
# notify only pending tasks
if self.tasks[task_id].task_status != TaskStatus.PENDING:
continue
await self.notify_task_subscribers(task_id)
async def receive_task_progress(self, request: ProgressCallbackRequest):
raise NotImplementedError()

View File

@@ -0,0 +1,21 @@
from functools import lru_cache
from docling_serve.datamodel.engines import AsyncEngine
from docling_serve.engines.async_orchestrator import BaseAsyncOrchestrator
from docling_serve.settings import docling_serve_settings
@lru_cache
def get_async_orchestrator() -> BaseAsyncOrchestrator:
if docling_serve_settings.eng_kind == AsyncEngine.LOCAL:
from docling_serve.engines.async_local.orchestrator import (
AsyncLocalOrchestrator,
)
return AsyncLocalOrchestrator()
elif docling_serve_settings.eng_kind == AsyncEngine.KFP:
from docling_serve.engines.async_kfp.orchestrator import AsyncKfpOrchestrator
return AsyncKfpOrchestrator()
raise RuntimeError(f"Engine {docling_serve_settings.eng_kind} not recognized.")

View File

@@ -1,11 +1,27 @@
from abc import ABC, abstractmethod
from typing import Optional, Union
from docling_serve.datamodel.task import Task
from fastapi import BackgroundTasks
from fastapi.responses import FileResponse
from docling_serve.datamodel.convert import ConvertDocumentsOptions
from docling_serve.datamodel.responses import ConvertDocumentResponse
from docling_serve.datamodel.task import Task, TaskSource
class OrchestratorError(Exception):
pass
class TaskNotFoundError(OrchestratorError):
pass
class BaseOrchestrator(ABC):
@abstractmethod
async def enqueue(self, task) -> Task:
async def enqueue(
self, sources: list[TaskSource], options: ConvertDocumentsOptions
) -> Task:
pass
@abstractmethod
@@ -13,9 +29,27 @@ class BaseOrchestrator(ABC):
pass
@abstractmethod
async def task_status(self, task_id: str) -> Task:
async def get_queue_position(self, task_id: str) -> Optional[int]:
pass
@abstractmethod
async def task_result(self, task_id: str):
async def task_status(self, task_id: str, wait: float = 0.0) -> Task:
pass
@abstractmethod
async def task_result(
self, task_id: str, background_tasks: BackgroundTasks
) -> Union[ConvertDocumentResponse, FileResponse, None]:
pass
@abstractmethod
async def clear_results(self, older_than: float = 0.0):
pass
@abstractmethod
async def process_queue(self):
pass
@abstractmethod
async def warm_up_caches(self):
pass

View File

@@ -1,16 +1,19 @@
import base64
import importlib
import itertools
import json
import logging
import ssl
import tempfile
import time
from pathlib import Path
from typing import Optional
import certifi
import gradio as gr
import httpx
from docling.datamodel.base_models import FormatToExtensions
from docling.datamodel.pipeline_options import (
PdfBackend,
PdfPipeline,
@@ -28,7 +31,7 @@ logger = logging.getLogger(__name__)
############################
logo_path = "https://raw.githubusercontent.com/docling-project/docling/refs/heads/main/docs/assets/logo.svg"
js_components_url = "https://unpkg.com/@docling/docling-components@0.0.6"
js_components_url = "https://unpkg.com/@docling/docling-components@0.0.7"
if (
docling_serve_settings.static_path is not None
and docling_serve_settings.static_path.is_dir()
@@ -82,7 +85,7 @@ css = """
height: 140px;
}
docling-img::part(pages) {
docling-img {
gap: 1rem;
}
@@ -203,12 +206,16 @@ def clear_file_input():
return None
def auto_set_return_as_file(url_input, file_input, image_export_mode):
def auto_set_return_as_file(
url_input_value: str,
file_input_value: Optional[list[str]],
image_export_mode_value: str,
):
# If more than one input source is provided, return as file
if (
(len(url_input.split(",")) > 1)
or (file_input and len(file_input) > 1)
or (image_export_mode == "referenced")
(len(url_input_value.split(",")) > 1)
or (file_input_value and len(file_input_value) > 1)
or (image_export_mode_value == "referenced")
):
return True
else:
@@ -344,7 +351,7 @@ def file_to_base64(file):
def process_file(
file,
files,
to_formats,
image_export_mode,
pipeline,
@@ -361,10 +368,12 @@ def process_file(
do_picture_classification,
do_picture_description,
):
if not file or file == "":
if not files or len(files) == 0:
logger.error("No files provided.")
raise gr.Error("No files provided.", print_exception=False)
files_data = [{"base64_string": file_to_base64(file), "filename": file.name}]
files_data = [
{"base64_string": file_to_base64(file), "filename": file.name} for file in files
]
parameters = {
"file_sources": files_data,
@@ -436,7 +445,7 @@ def response_to_output(response, return_as_file):
)
# Embed document JSON and trigger load at client via an image.
json_rendered_content = f"""
<docling-img id="dclimg" pagenumbers tooltip="parsed"></docling-img>
<docling-img id="dclimg" pagenumbers><docling-tooltip></docling-tooltip></docling-img>
<script id="dcljson" type="application/json" onload="document.getElementById('dclimg').src = JSON.parse(document.getElementById('dcljson').textContent);">{json_content}</script>
<img src onerror="document.getElementById('dclimg').src = JSON.parse(document.getElementById('dcljson').textContent);" />
"""
@@ -538,21 +547,12 @@ with gr.Blocks(
elem_id="file_input_zone",
label="Upload File",
file_types=[
".pdf",
".docx",
".pptx",
".html",
".xlsx",
".json",
".asciidoc",
".txt",
".md",
".jpg",
".jpeg",
".png",
".gif",
f".{v}"
for v in itertools.chain.from_iterable(
FormatToExtensions.values()
)
],
file_count="single",
file_count="multiple",
scale=4,
)
with gr.Column(scale=1):
@@ -625,9 +625,7 @@ with gr.Blocks(
)
with gr.Column(scale=1):
abort_on_error = gr.Checkbox(label="Abort on Error", value=False)
return_as_file = gr.Checkbox(
label="Return as File", visible=False, value=False
) # Disable until async handle output as file
return_as_file = gr.Checkbox(label="Return as File", value=False)
with gr.Row():
with gr.Column():
do_code_enrichment = gr.Checkbox(
@@ -677,23 +675,22 @@ with gr.Blocks(
# UI Actions #
##############
# Disable until async handle output as file
# Handle Return as File
# url_input.change(
# auto_set_return_as_file,
# inputs=[url_input, file_input, image_export_mode],
# outputs=[return_as_file],
# )
# file_input.change(
# auto_set_return_as_file,
# inputs=[url_input, file_input, image_export_mode],
# outputs=[return_as_file],
# )
# image_export_mode.change(
# auto_set_return_as_file,
# inputs=[url_input, file_input, image_export_mode],
# outputs=[return_as_file],
# )
url_input.change(
auto_set_return_as_file,
inputs=[url_input, file_input, image_export_mode],
outputs=[return_as_file],
)
file_input.change(
auto_set_return_as_file,
inputs=[url_input, file_input, image_export_mode],
outputs=[return_as_file],
)
image_export_mode.change(
auto_set_return_as_file,
inputs=[url_input, file_input, image_export_mode],
outputs=[return_as_file],
)
# URL processing
url_process_btn.click(

View File

@@ -1,9 +1,30 @@
import inspect
import json
import re
from typing import Union
from typing import Union, get_args, get_origin
from fastapi import Depends, Form
from pydantic import BaseModel
from pydantic import BaseModel, TypeAdapter
def is_pydantic_model(type_):
try:
if inspect.isclass(type_) and issubclass(type_, BaseModel):
return True
origin = get_origin(type_)
if origin is Union:
args = get_args(type_)
return any(
inspect.isclass(arg) and issubclass(arg, BaseModel)
for arg in args
if arg is not type(None)
)
except Exception:
pass
return False
# Adapted from
@@ -12,25 +33,62 @@ def FormDepends(cls: type[BaseModel]):
new_parameters = []
for field_name, model_field in cls.model_fields.items():
annotation = model_field.annotation
description = model_field.description
default = (
Form(..., description=description)
if model_field.is_required()
else Form(
model_field.default,
examples=model_field.examples,
description=description,
)
)
# Flatten nested Pydantic models by accepting them as JSON strings
if is_pydantic_model(annotation):
annotation = str
default = Form(
None
if model_field.default is None
else json.dumps(model_field.default.model_dump(mode="json")),
description=description,
examples=None
if not model_field.examples
else [
json.dumps(ex.model_dump(mode="json"))
for ex in model_field.examples
],
)
new_parameters.append(
inspect.Parameter(
name=field_name,
kind=inspect.Parameter.POSITIONAL_ONLY,
default=(
Form(...)
if model_field.is_required()
else Form(model_field.default)
),
annotation=model_field.annotation,
default=default,
annotation=annotation,
)
)
async def as_form_func(**data):
for field_name, model_field in cls.model_fields.items():
value = data.get(field_name)
annotation = model_field.annotation
# Parse nested models from JSON string
if value is not None and is_pydantic_model(annotation):
try:
validator = TypeAdapter(annotation)
data[field_name] = validator.validate_json(value)
except Exception as e:
raise ValueError(f"Invalid JSON for field '{field_name}': {e}")
return cls(**data)
sig = inspect.signature(as_form_func)
sig = sig.replace(parameters=new_parameters)
as_form_func.__signature__ = sig # type: ignore
return Depends(as_form_func)

View File

@@ -1,13 +1,12 @@
import logging
import os
import shutil
import tempfile
import time
from collections.abc import Iterable
from pathlib import Path
from typing import Union
from fastapi import BackgroundTasks, HTTPException
from fastapi import HTTPException
from fastapi.responses import FileResponse
from docling.datamodel.base_models import OutputFormat
@@ -28,6 +27,7 @@ def _export_document_as_content(
export_txt: bool,
export_doctags: bool,
image_mode: ImageRefMode,
md_page_break_placeholder: str,
):
document = DocumentResponse(filename=conv_res.input.file.name)
@@ -41,12 +41,16 @@ def _export_document_as_content(
document.html_content = new_doc.export_to_html(image_mode=image_mode)
if export_txt:
document.text_content = new_doc.export_to_markdown(
strict_text=True, image_mode=image_mode
strict_text=True,
image_mode=image_mode,
)
if export_md:
document.md_content = new_doc.export_to_markdown(image_mode=image_mode)
document.md_content = new_doc.export_to_markdown(
image_mode=image_mode,
page_break_placeholder=md_page_break_placeholder or None,
)
if export_doctags:
document.doctags_content = new_doc.export_to_document_tokens()
document.doctags_content = new_doc.export_to_doctags()
elif conv_res.status == ConversionStatus.SKIPPED:
raise HTTPException(status_code=400, detail=conv_res.errors)
else:
@@ -64,6 +68,7 @@ def _export_documents_as_files(
export_txt: bool,
export_doctags: bool,
image_export_mode: ImageRefMode,
md_page_break_placeholder: str,
):
success_count = 0
failure_count = 0
@@ -104,7 +109,9 @@ def _export_documents_as_files(
fname = output_dir / f"{doc_filename}.md"
_log.info(f"writing Markdown output to {fname}")
conv_res.document.save_as_markdown(
filename=fname, image_mode=image_export_mode
filename=fname,
image_mode=image_export_mode,
page_break_placeholder=md_page_break_placeholder or None,
)
# Export Document Tags format:
@@ -124,9 +131,9 @@ def _export_documents_as_files(
def process_results(
background_tasks: BackgroundTasks,
conversion_options: ConvertDocumentsOptions,
conv_results: Iterable[ConversionResult],
work_dir: Path,
) -> Union[ConvertDocumentResponse, FileResponse]:
# Let's start by processing the documents
try:
@@ -171,6 +178,7 @@ def process_results(
export_txt=export_txt,
export_doctags=export_doctags,
image_mode=conversion_options.image_export_mode,
md_page_break_placeholder=conversion_options.md_page_break_placeholder,
)
response = ConvertDocumentResponse(
@@ -183,7 +191,6 @@ def process_results(
# Multiple documents were processed, or we are forced returning as a file
else:
# Temporary directory to store the outputs
work_dir = Path(tempfile.mkdtemp(prefix="docling_"))
output_dir = work_dir / "output"
output_dir.mkdir(parents=True, exist_ok=True)
@@ -200,10 +207,10 @@ def process_results(
export_txt=export_txt,
export_doctags=export_doctags,
image_export_mode=conversion_options.image_export_mode,
md_page_break_placeholder=conversion_options.md_page_break_placeholder,
)
files = os.listdir(output_dir)
if len(files) == 0:
raise HTTPException(status_code=500, detail="No documents were exported.")
@@ -216,7 +223,7 @@ def process_results(
# Other cleanups after the response is sent
# Output directory
background_tasks.add_task(shutil.rmtree, work_dir, ignore_errors=True)
# background_tasks.add_task(shutil.rmtree, work_dir, ignore_errors=True)
response = FileResponse(
file_path, filename=file_path.name, media_type="application/zip"

View File

@@ -2,7 +2,9 @@ import sys
from pathlib import Path
from typing import Optional, Union
from pydantic import AnyUrl, model_validator
from pydantic_settings import BaseSettings, SettingsConfigDict
from typing_extensions import Self
from docling_serve.datamodel.engines import AsyncEngine
@@ -36,19 +38,50 @@ class DoclingServeSettings(BaseSettings):
api_host: str = "localhost"
artifacts_path: Optional[Path] = None
static_path: Optional[Path] = None
scratch_path: Optional[Path] = None
single_use_results: bool = True
result_removal_delay: float = 300 # 5 minutes
options_cache_size: int = 2
enable_remote_services: bool = False
allow_external_plugins: bool = False
max_document_timeout: float = 3_600 * 24 * 7 # 7 days
max_num_pages: int = sys.maxsize
max_file_size: int = sys.maxsize
max_sync_wait: int = 120 # 2 minutes
cors_origins: list[str] = ["*"]
cors_methods: list[str] = ["*"]
cors_headers: list[str] = ["*"]
eng_kind: AsyncEngine = AsyncEngine.LOCAL
# Local engine
eng_loc_num_workers: int = 2
# KFP engine
eng_kfp_endpoint: Optional[AnyUrl] = None
eng_kfp_token: Optional[str] = None
eng_kfp_ca_cert_path: Optional[str] = None
eng_kfp_self_callback_endpoint: Optional[str] = None
eng_kfp_self_callback_token_path: Optional[Path] = None
eng_kfp_self_callback_ca_cert_path: Optional[Path] = None
eng_kfp_experimental: bool = False
@model_validator(mode="after")
def engine_settings(self) -> Self:
# Validate KFP engine settings
if self.eng_kind == AsyncEngine.KFP:
if self.eng_kfp_endpoint is None:
raise ValueError("KFP endpoint is required when using the KFP engine.")
if self.eng_kind == AsyncEngine.KFP:
if not self.eng_kfp_experimental:
raise ValueError(
"KFP is not yet working. To enable the development version, you must set DOCLING_SERVE_ENG_KFP_EXPERIMENTAL=true."
)
return self
uvicorn_settings = UvicornSettings()

16
docling_serve/storage.py Normal file
View File

@@ -0,0 +1,16 @@
import tempfile
from functools import lru_cache
from pathlib import Path
from docling_serve.settings import docling_serve_settings
@lru_cache
def get_scratch() -> Path:
scratch_dir = (
docling_serve_settings.scratch_path
if docling_serve_settings.scratch_path is not None
else Path(tempfile.mkdtemp(prefix="docling_"))
)
scratch_dir.mkdir(exist_ok=True, parents=True)
return scratch_dir

View File

@@ -1,4 +1,4 @@
# Dolcing Serve documentation
# Docling Serve documentation
This documentation pages explore the webserver configurations, runtime options, deployment examples as well as development best practices.

View File

@@ -37,8 +37,44 @@ THe following table describes the options to configure the Docling Serve app.
| -----------|-----|---------|-------------|
| `--artifacts-path` | `DOCLING_SERVE_ARTIFACTS_PATH` | unset | If set to a valid directory, the model weights will be loaded from this path |
| | `DOCLING_SERVE_STATIC_PATH` | unset | If set to a valid directory, the static assets for the docs and ui will be loaded from this path |
| | `DOCLING_SERVE_SCRATCH_PATH` | | If set, this directory will be used as scratch workspace, e.g. storing the results before they get requested. If unset, a temporary created is created for this purpose. |
| `--enable-ui` | `DOCLING_SERVE_ENABLE_UI` | `false` | Enable the demonstrator UI. |
| | `DOCLING_SERVE_ENABLE_REMOTE_SERVICES` | `false` | Allow pipeline components making remote connections. For example, this is needed when using a vision-language model via APIs. |
| | `DOCLING_SERVE_ALLOW_EXTERNAL_PLUGINS` | `false` | Allow the selection of third-party plugins. |
| | `DOCLING_SERVE_SINGLE_USE_RESULTS` | `true` | If true, results can be accessed only once. If false, the results accumulate in the scratch directory. |
| | `DOCLING_SERVE_RESULT_REMOVAL_DELAY` | `300` | When `DOCLING_SERVE_SINGLE_USE_RESULTS` is active, this is the delay before results are removed from the task registry. |
| | `DOCLING_SERVE_MAX_DOCUMENT_TIMEOUT` | `604800` (7 days) | The maximum time for processing a document. |
| | `DOCLING_SERVE_MAX_NUM_PAGES` | | The maximum number of pages for a document to be processed. |
| | `DOCLING_SERVE_MAX_FILE_SIZE` | | The maximum file size for a document to be processed. |
| | `DOCLING_SERVE_MAX_SYNC_WAIT` | `120` | Max number of seconds a synchronous endpoint is waiting for the task completion. |
| | `DOCLING_SERVE_OPTIONS_CACHE_SIZE` | `2` | How many DocumentConveter objects (including their loaded models) to keep in the cache. |
| | `DOCLING_SERVE_CORS_ORIGINS` | `["*"]` | A list of origins that should be permitted to make cross-origin requests. |
| | `DOCLING_SERVE_CORS_METHODS` | `["*"]` | A list of HTTP methods that should be allowed for cross-origin requests. |
| | `DOCLING_SERVE_CORS_HEADERS` | `["*"]` | A list of HTTP request headers that should be supported for cross-origin requests. |
| | `DOCLING_SERVE_ENG_KIND` | `local` | The compute engine to use for the async tasks. Possible values are `local` and `kfp`. See below for more configurations of the engines. |
### Compute engine
Docling Serve can be deployed with several possible of compute engine.
The selected compute engine will be running all the async jobs.
#### Local engine
The following table describes the options to configure the Docling Serve KFP engine.
| ENV | Default | Description |
|-----|---------|-------------|
| `DOCLING_SERVE_ENG_LOC_NUM_WORKERS` | 2 | Number of workers/threads processing the incoming tasks. |
#### KFP engine
The following table describes the options to configure the Docling Serve KFP engine.
| ENV | Default | Description |
|-----|---------|-------------|
| `DOCLING_SERVE_ENG_KFP_ENDPOINT` | | Must be set to the Kubeflow Pipeline endpoint. When using the in-cluster deployment, make sure to use the cluster endpoint, e.g. `https://NAME.NAMESPACE.svc.cluster.local:8888` |
| `DOCLING_SERVE_ENG_KFP_TOKEN` | | The authentication token for KFP. For in-cluster deployment, the app will load automatically the token of the ServiceAccount. |
| `DOCLING_SERVE_ENG_KFP_CA_CERT_PATH` | | Path to the CA certificates for the KFP endpoint. For in-cluster deployment, the app will load automatically the internal CA. |
| `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_ENDPOINT` | | If set, it enables internal callbacks providing status update of the KFP job. Usually something like `https://NAME.NAMESPACE.svc.cluster.local:5001/v1alpha/callback/task/progress`. |
| `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_TOKEN_PATH` | | The token used for authenticating the progress callback. For cluster-internal workloads, use `/run/secrets/kubernetes.io/serviceaccount/token`. |
| `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_CA_CERT_PATH` | | The CA certificate for the progress callback. For cluster-inetrnal workloads, use `/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt`. |

View File

@@ -0,0 +1,47 @@
kind: Deployment
apiVersion: apps/v1
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
spec:
replicas: 1
selector:
matchLabels:
app: docling-serve
component: docling-serve-api
template:
metadata:
labels:
app: docling-serve
component: docling-serve-api
spec:
restartPolicy: Always
containers:
- name: api
resources:
limits:
cpu: 500m
memory: 2Gi
requests:
cpu: 250m
memory: 1Gi
env:
- name: DOCLING_SERVE_ENABLE_UI
value: 'true'
- name: DOCLING_SERVE_ARTIFACTS_PATH
value: '/modelcache'
ports:
- name: http
containerPort: 5001
protocol: TCP
imagePullPolicy: Always
image: 'ghcr.io/docling-project/docling-serve-cpu'
volumeMounts:
- name: docling-model-cache
mountPath: /modelcache
volumes:
- name: docling-model-cache
persistentVolumeClaim:
claimName: docling-model-cache-pvc

View File

@@ -0,0 +1,33 @@
apiVersion: batch/v1
kind: Job
metadata:
name: docling-model-cache-load
spec:
selector: {}
template:
metadata:
name: docling-model-load
spec:
containers:
- name: loader
image: ghcr.io/docling-project/docling-serve-cpu:main
command:
- docling-tools
- models
- download
- '--output-dir=/modelcache'
- 'layout'
- 'tableformer'
- 'code_formula'
- 'picture_classifier'
- 'smolvlm'
- 'granite_vision'
- 'easyocr'
volumeMounts:
- name: docling-model-cache
mountPath: /modelcache
volumes:
- name: docling-model-cache
persistentVolumeClaim:
claimName: docling-model-cache-pvc
restartPolicy: Never

View File

@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: docling-model-cache-pvc
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 10Gi

View File

@@ -0,0 +1,76 @@
# This example deployment configures Docling Serve with a Route + Sticky sessions, a Service and cpu image
---
kind: Route
apiVersion: route.openshift.io/v1
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
annotations:
haproxy.router.openshift.io/disable_cookies: "false" # this annotation enables the sticky sessions
spec:
path: /
to:
kind: Service
name: docling-serve
port:
targetPort: http
tls:
termination: edge
insecureEdgeTerminationPolicy: Redirect
---
apiVersion: v1
kind: Service
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
spec:
ports:
- name: http
port: 5001
targetPort: http
selector:
app: docling-serve
component: docling-serve-api
---
kind: Deployment
apiVersion: apps/v1
metadata:
name: docling-serve
labels:
app: docling-serve
component: docling-serve-api
spec:
replicas: 3
selector:
matchLabels:
app: docling-serve
component: docling-serve-api
template:
metadata:
labels:
app: docling-serve
component: docling-serve-api
spec:
restartPolicy: Always
containers:
- name: api
resources:
limits:
cpu: 500m
memory: 2Gi
requests:
cpu: 250m
memory: 1Gi
env:
- name: DOCLING_SERVE_ENABLE_UI
value: 'true'
ports:
- name: http
containerPort: 5001
protocol: TCP
imagePullPolicy: Always
image: 'ghcr.io/docling-project/docling-serve'

View File

@@ -192,3 +192,45 @@ curl -X 'POST' \
"http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
}'
```
### ReplicaSets with `sticky sessions`
Manifest example: [docling-serve-replicas-w-sticky-sessions.yaml](./deploy-examples/docling-serve-replicas-w-sticky-sessions.yaml)
This deployment has the following features:
- Deployment configuration with 3 replicas
- Service configuration
- Expose the service using a OpenShift `Route` and enables sticky sessions
Install the app with:
```sh
oc apply -f docs/deploy-examples/docling-serve-replicas-w-sticky-sessions.yaml
```
For using the API:
```sh
# Retrieve the endpoint
DOCLING_NAME=docling-serve
DOCLING_ROUTE="https://$(oc get routes $DOCLING_NAME --template={{.spec.host}})"
# Make a test query, store the cookie and taskid
task_id=$(curl -s -X 'POST' \
"${DOCLING_ROUTE}/v1alpha/convert/source/async" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
"http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
}' \
-c cookies.txt | grep -oP '"task_id":"\K[^"]+')
```
```sh
# Grab the taskid and cookie to check the task status
curl -v -X 'GET' \
"${DOCLING_ROUTE}/v1alpha/status/poll/$task_id?wait=0" \
-H "accept: application/json" \
-b "cookies.txt"
```

103
docs/pre-loading-models.md Normal file
View File

@@ -0,0 +1,103 @@
# Pre-loading models for docling
This document provides examples for pre-loading docling models to a persistent volume and re-using it for docling-serve deployments.
1. We need to create a persistent volume that will store models weights:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: docling-model-cache-pvc
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 10Gi
```
If you don't want to use default storage class, set your custom storage class with following:
```yaml
spec:
...
storageClassName: <Storage Class Name>
```
Manifest example: [docling-model-cache-pvc.yaml](./deploy-examples/docling-model-cache-pvc.yaml)
2. In order to load model weights, we can use docling-toolkit to download them, as this is a one time operation we can use kubernetes job for this:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: docling-model-cache-load
spec:
selector: {}
template:
metadata:
name: docling-model-load
spec:
containers:
- name: loader
image: ghcr.io/docling-project/docling-serve-cpu:main
command:
- docling-tools
- models
- download
- '--output-dir=/modelcache'
- 'layout'
- 'tableformer'
- 'code_formula'
- 'picture_classifier'
- 'smolvlm'
- 'granite_vision'
- 'easyocr'
volumeMounts:
- name: docling-model-cache
mountPath: /modelcache
volumes:
- name: docling-model-cache
persistentVolumeClaim:
claimName: docling-model-cache-pvc
restartPolicy: Never
```
The job will mount previously created persistent volume and execute command similar to how we would load models locally:
`docling-tools models download --output-dir <MOUNT-PATH> [LIST_OF_MODELS]`
In manifest, we specify desired models individually, or we can use `--all` parameter to download all models.
Manifest example: [docling-model-cache-job.yaml](./deploy-examples/docling-model-cache-job.yaml)
3. Now we can mount volume in the docling-serve deployment and set env `DOCLING_SERVE_ARTIFACTS_PATH` to point to it.
Following additions to deploymeny should be made:
```yaml
spec:
template:
spec:
containers:
- name: api
env:
...
- name: DOCLING_SERVE_ARTIFACTS_PATH
value: '/modelcache'
volumeMounts:
- name: docling-model-cache
mountPath: /modelcache
...
volumes:
- name: docling-model-cache
persistentVolumeClaim:
claimName: docling-model-cache-pvc
```
Make sure that value of `DOCLING_SERVE_ARTIFACTS_PATH` is the same as where models were downloaded and where volume is mounted.
Now when docling-serve is executing tasks, the underlying docling installation will load model weights from mouted volume.
Manifest example: [docling-model-cache-deployment.yaml](./deploy-examples/docling-model-cache-deployment.yaml)

View File

@@ -6,19 +6,29 @@ The API provides two endpoints: one for urls, one for files. This is necessary t
On top of the source of file (see below), both endpoints support the same parameters, which are almost the same as the Docling CLI.
- `from_format` (List[str]): Input format(s) to convert from. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`. Defaults to all formats.
- `from_formats` (List[str]): Input format(s) to convert from. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`. Defaults to all formats.
- `to_formats` (List[str]): Output format(s) to convert to. Allowed values: `md`, `json`, `html`, `text`, `doctags`. Defaults to `md`.
- `pipeline` (str). The choice of which pipeline to use. Allowed values are `standard` and `vlm`. Defaults to `standard`.
- `page_range` (tuple). If speficied, only convert a range of pages. The page number starts at 1.
- `do_ocr` (bool): If enabled, the bitmap content will be processed using OCR. Defaults to `True`.
- `image_export_mode`: Image export mode for the document (only in case of JSON, Markdown or HTML). Allowed values: embedded, placeholder, referenced. Optional, defaults to `embedded`.
- `force_ocr` (bool): If enabled, replace any existing text with OCR-generated text over the full content. Defaults to `False`.
- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesseract_cli`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`.
- `ocr_lang` (List[str]): List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. Defaults to empty.
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`. Defaults to `dlparse_v2`.
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`, `dlparse_v4`. Defaults to `dlparse_v4`.
- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
- `return_as_file` (boo): If enabled, return the output as a file. Defaults to false.
- `md_page_break_placeholder` (str): Add this placeholder betweek pages in the markdown output.
- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
- `include_images` (bool): If enabled, images will be extracted from the document. Defaults to true.
- `do_code_enrichment` (bool): If enabled, perform OCR code enrichment. Defaults to false.
- `do_formula_enrichment` (bool): If enabled, perform formula OCR, return LaTeX code. Defaults to false.
- `do_picture_classification` (bool): If enabled, classify pictures in documents. Defaults to false.
- `do_picture_description` (bool): If enabled, describe pictures in documents. Defaults to false.
- `picture_description_area_threshold` (float): Minimum percentage of the area for a picture to be processed with the models. Defaults to 0.05.
- `picture_description_local` (dict): Options for running a local vision-language model in the picture description. The parameters refer to a model hosted on Hugging Face. This parameter is mutually exclusive with picture_description_api.
- `picture_description_api` (dict): API details for using a vision-language model in the picture description. This parameter is mutually exclusive with picture_description_local.
- `include_images` (bool): If enabled, images will be extracted from the document. Defaults to false.
- `images_scale` (float): Scale factor for images. Defaults to 2.0.
## Convert endpoints
@@ -236,7 +246,7 @@ files = {
'files': ('2206.01062v1.pdf', open(file_path, 'rb'), 'application/pdf'),
}
response = await async_client.post(url, files=files, data={"parameters": json.dumps(parameters)})
response = await async_client.post(url, files=files, data=parameters)
assert response.status_code == 200, "Response should be 200 OK"
data = response.json()
@@ -244,6 +254,70 @@ data = response.json()
</details>
### Picture description options
When the picture description enrichment is activated, users may specify which model and which execution mode to use for this task. There are two choices for the execution mode: _local_ will run the vision-language model directly, _api_ will invoke an external API endpoint.
The local option is specified with:
```jsonc
{
"picture_description_local": {
"repo_id": "", // Repository id from the Hugging Face Hub.
"generation_config": {"max_new_tokens": 200, "do_sample": false}, // HF generation config.
"prompt": "Describe this image in a few sentences. ", // Prompt used when calling the vision-language model.
}
}
```
The possible values for `generation_config` are documented in the [Hugging Face text generation docs](https://huggingface.co/docs/transformers/en/main_classes/text_generation#transformers.GenerationConfig).
The api option is specified with:
```jsonc
{
"picture_description_api": {
"url": "", // Endpoint which accepts openai-api compatible requests.
"headers": {}, // Headers used for calling the API endpoint. For example, it could include authentication headers.
"params": {}, // Model parameters.
"timeout": 20, // Timeout for the API request.
"prompt": "Describe this image in a few sentences. ", // Prompt used when calling the vision-language model.
}
}
```
Example URLs are:
- `http://localhost:8000/v1/chat/completions` for the local vllm api, with example `params`:
- the `HuggingFaceTB/SmolVLM-256M-Instruct` model
```json
{
"model": "HuggingFaceTB/SmolVLM-256M-Instruct",
"max_completion_tokens": 200,
}
```
- the `ibm-granite/granite-vision-3.2-2b` model
```json
{
"model": "ibm-granite/granite-vision-3.2-2b",
"max_completion_tokens": 200,
}
```
- `http://localhost:11434/v1/chat/completions` for the local ollama api, with example `params`:
- the `granite3.2-vision:2b` model
```json
{
"model": "granite3.2-vision:2b"
}
```
Note that when using `picture_description_api`, the server must be launched with `DOCLING_SERVE_ENABLE_REMOTE_SERVICES=true`.
## Response format
The response can be a JSON Document or a File.
@@ -276,4 +350,92 @@ The response can be a JSON Document or a File.
## Asynchronous API
TBA
Both `/v1alpha/convert/source` and `/v1alpha/convert/file` endpoints are available as asynchronous variants.
The advantage of the asynchronous endpoints is the possible to interrupt the connection, check for the progress update and fetch the result.
This approach is more resilient against network stabilities and allows the client application logic to easily interleave conversion with other tasks.
Launch an asynchronous conversion with:
- `POST /v1alpha/convert/source/async` when providing the input as sources.
- `POST /v1alpha/convert/file/async` when providing the input as multipart-form files.
The response format is a task detail:
```jsonc
{
"task_id": "<task_id>", // the task_id which can be used for the next operations
"task_status": "pending|started|success|failure", // the task status
"task_position": 1, // the position in the queue
"task_meta": null, // metadata e.g. how many documents are in the total job and how many have been converted
}
```
### Polling status
For checking the progress of the conversion task and wait for its completion, use the endpoint:
- `GET /v1alpha/status/poll/{task_id}`
<details>
<summary>Example waiting loop:</summary>
```python
import time
import httpx
# ...
# response from the async task submission
task = response.json()
while task["task_status"] not in ("success", "failure"):
response = httpx.get(f"{base_url}/status/poll/{task['task_id']}")
task = response.json()
time.sleep(5)
```
<details>
### Subscribe with websockets
Using websocket you can get the client application being notified about updates of the conversion task.
To start the websocker connection, use the endpoint:
- `/v1alpha/status/ws/{task_id}`
Websocket messages are JSON object with the following structure:
```jsonc
{
"message": "connection|update|error", // type of message being sent
"task": {}, // the same content of the task description
"error": "", // description of the error
}
```
<details>
<summary>Example websocker usage:</summary>
```python
from websockets.sync.client import connect
uri = f"ws://{base_url}/v1alpha/status/ws/{task['task_id']}"
with connect(uri) as websocket:
for message in websocket:
try:
payload = json.loads(message)
if payload["message"] == "error":
break
if payload["message"] == "error" and payload["task"]["task_status"] in ("success", "failure"):
break
except:
break
```
</details>
### Fetch results
When the task is completed, the result can be fetched with the endpoint:
- `GET /v1alpha/result/{task_id}`

View File

@@ -1,6 +1,6 @@
[project]
name = "docling-serve"
version = "0.8.0" # DO NOT EDIT, updated automatically
version = "0.14.0" # DO NOT EDIT, updated automatically
description = "Running Docling as a service"
license = {text = "MIT"}
authors = [
@@ -31,9 +31,11 @@ classifiers = [
requires-python = ">=3.10"
dependencies = [
"docling[vlm]~=2.28",
"docling-core>=2.32.0",
"mlx-vlm~=0.1.12; sys_platform == 'darwin' and platform_machine == 'arm64'",
"fastapi[standard]~=0.115",
"httpx~=0.28",
"kfp[kubernetes]>=2.10.0",
"pydantic~=2.10",
"pydantic-settings~=2.4",
"python-multipart>=0.0.14,<0.1.0",
@@ -62,9 +64,13 @@ cu124 = [
"torch>=2.6.0",
"torchvision>=0.21.0",
]
flash-attn = [
"flash-attn~=2.7.0; sys_platform == 'linux' and platform_machine == 'x86_64'"
]
[dependency-groups]
dev = [
"asgi-lifespan~=2.0",
"mypy~=1.11",
"pre-commit-uv~=4.1",
"pytest~=8.3",
@@ -81,6 +87,13 @@ conflicts = [
{ extra = "cpu" },
{ extra = "cu124" },
],
[
{ extra = "cpu" },
{ extra = "flash-attn" },
],]
environments = ["sys_platform != 'darwin' or platform_machine != 'x86_64'"]
override-dependencies = [
"urllib3~=2.0"
]
[tool.uv.sources]
@@ -197,6 +210,8 @@ module = [
"tesserocr.*",
"rapidocr_onnxruntime.*",
"requests.*",
"kfp.*",
"kfp_server_api.*",
"mlx_vlm.*",
]
ignore_missing_imports = true

View File

@@ -47,9 +47,7 @@ async def test_convert_file(async_client):
"files": ("2206.01062v1.pdf", open(file_path, "rb"), "application/pdf"),
}
response = await async_client.post(
url, files=files, data={"options": json.dumps(options)}
)
response = await async_client.post(url, files=files, data=options)
assert response.status_code == 200, "Response should be 200 OK"
data = response.json()

View File

@@ -0,0 +1,71 @@
import json
import time
from pathlib import Path
import httpx
import pytest
import pytest_asyncio
@pytest_asyncio.fixture
async def async_client():
async with httpx.AsyncClient(timeout=60.0) as client:
yield client
@pytest.mark.asyncio
async def test_convert_url(async_client):
"""Test convert URL to all outputs"""
base_url = "http://localhost:5001/v1alpha"
payload = {
"to_formats": ["md", "json", "html"],
"image_export_mode": "placeholder",
"ocr": False,
"abort_on_error": False,
"return_as_file": False,
}
file_path = Path(__file__).parent / "2206.01062v1.pdf"
files = {
"files": (file_path.name, file_path.open("rb"), "application/pdf"),
}
for n in range(1):
response = await async_client.post(
f"{base_url}/convert/file/async", files=files, data=payload
)
assert response.status_code == 200, "Response should be 200 OK"
task = response.json()
print(json.dumps(task, indent=2))
while task["task_status"] not in ("success", "failure"):
response = await async_client.get(f"{base_url}/status/poll/{task['task_id']}")
assert response.status_code == 200, "Response should be 200 OK"
task = response.json()
print(f"{task['task_status']=}")
print(f"{task['task_position']=}")
time.sleep(2)
assert task["task_status"] == "success"
print(f"Task completed with status {task['task_status']=}")
result_resp = await async_client.get(f"{base_url}/result/{task['task_id']}")
assert result_resp.status_code == 200, "Response should be 200 OK"
result = result_resp.json()
print("Got result.")
assert "md_content" in result["document"]
assert result["document"]["md_content"] is not None
assert len(result["document"]["md_content"]) > 10
assert "html_content" in result["document"]
assert result["document"]["html_content"] is not None
assert len(result["document"]["html_content"]) > 10
assert "json_content" in result["document"]
assert result["document"]["json_content"] is not None
assert result["document"]["json_content"]["schema_name"] == "DoclingDocument"

View File

@@ -28,6 +28,16 @@ async def test_convert_url(async_client: httpx.AsyncClient):
"ocr": True,
"abort_on_error": False,
"return_as_file": False,
# "do_picture_description": True,
# "picture_description_api": {
# "url": "http://localhost:11434/v1/chat/completions",
# "params": {
# "model": "granite3.2-vision:2b",
# }
# },
# "picture_description_local": {
# "repo_id": "HuggingFaceTB/SmolVLM-256M-Instruct",
# },
},
# "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}],
"file_sources": [{"base64_string": encoded_doc, "filename": doc_filename.name}],

View File

@@ -38,7 +38,7 @@ async def test_convert_url(async_client):
}
print(json.dumps(payload, indent=2))
for n in range(5):
for n in range(3):
response = await async_client.post(
f"{base_url}/convert/source/async", json=payload
)

View File

@@ -1,4 +1,3 @@
import json
import os
import httpx
@@ -48,9 +47,7 @@ async def test_convert_file(async_client):
("files", ("2408.09869v5.pdf", open(file_path, "rb"), "application/pdf")),
]
response = await async_client.post(
url, files=files, data={"options": json.dumps(options)}
)
response = await async_client.post(url, files=files, data=options)
assert response.status_code == 200, "Response should be 200 OK"
# Check for zip file attachment

View File

@@ -0,0 +1,88 @@
import json
import time
import httpx
import pytest
import pytest_asyncio
from pytest_check import check
@pytest_asyncio.fixture
async def async_client():
async with httpx.AsyncClient(timeout=60.0) as client:
yield client
@pytest.mark.asyncio
async def test_convert_url(async_client):
"""Test convert URL to all outputs"""
base_url = "http://localhost:5001/v1alpha"
payload = {
"options": {
"from_formats": [
"docx",
"pptx",
"html",
"image",
"pdf",
"asciidoc",
"md",
"xlsx",
],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"ocr": True,
"force_ocr": False,
"ocr_engine": "easyocr",
"ocr_lang": ["en"],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False,
},
"http_sources": [
{"url": "https://arxiv.org/pdf/2206.01062"},
{"url": "https://arxiv.org/pdf/2408.09869"},
],
}
response = await async_client.post(f"{base_url}/convert/source/async", json=payload)
assert response.status_code == 200, "Response should be 200 OK"
task = response.json()
print(json.dumps(task, indent=2))
while task["task_status"] not in ("success", "failure"):
response = await async_client.get(f"{base_url}/status/poll/{task['task_id']}")
assert response.status_code == 200, "Response should be 200 OK"
task = response.json()
print(f"{task['task_status']=}")
print(f"{task['task_position']=}")
time.sleep(2)
assert task["task_status"] == "success"
result_resp = await async_client.get(f"{base_url}/result/{task['task_id']}")
assert result_resp.status_code == 200, "Response should be 200 OK"
# Check for zip file attachment
content_disposition = result_resp.headers.get("content-disposition")
with check:
assert content_disposition is not None, (
"Content-Disposition header should be present"
)
with check:
assert "attachment" in content_disposition, "Response should be an attachment"
with check:
assert 'filename="converted_docs.zip"' in content_disposition, (
"Attachment filename should be 'converted_docs.zip'"
)
content_type = result_resp.headers.get("content-type")
with check:
assert content_type == "application/zip", (
"Content-Type should be 'application/zip'"
)

View File

@@ -0,0 +1,156 @@
import asyncio
import json
import os
import pytest
import pytest_asyncio
from asgi_lifespan import LifespanManager
from httpx import ASGITransport, AsyncClient
from pytest_check import check
from docling_serve.app import create_app
@pytest.fixture(scope="session")
def event_loop():
return asyncio.get_event_loop()
@pytest_asyncio.fixture(scope="session")
async def app():
app = create_app()
async with LifespanManager(app) as manager:
print("Launching lifespan of app.")
yield manager.app
@pytest_asyncio.fixture(scope="session")
async def client(app):
async with AsyncClient(
transport=ASGITransport(app=app), base_url="http://app.io"
) as client:
print("Client is ready")
yield client
@pytest.mark.asyncio
async def test_health(client: AsyncClient):
response = await client.get("/health")
assert response.status_code == 200
assert response.json() == {"status": "ok"}
@pytest.mark.asyncio
async def test_convert_file(client: AsyncClient):
"""Test convert single file to all outputs"""
endpoint = "/v1alpha/convert/file"
options = {
"from_formats": [
"docx",
"pptx",
"html",
"image",
"pdf",
"asciidoc",
"md",
"xlsx",
],
"to_formats": ["md", "json", "html", "text", "doctags"],
"image_export_mode": "placeholder",
"ocr": True,
"force_ocr": False,
"ocr_engine": "easyocr",
"ocr_lang": ["en"],
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False,
}
current_dir = os.path.dirname(__file__)
file_path = os.path.join(current_dir, "2206.01062v1.pdf")
files = {
"files": ("2206.01062v1.pdf", open(file_path, "rb"), "application/pdf"),
}
response = await client.post(endpoint, files=files, data=options)
assert response.status_code == 200, "Response should be 200 OK"
data = response.json()
# Response content checks
# Helper function to safely slice strings
def safe_slice(value, length=100):
if isinstance(value, str):
return value[:length]
return str(value) # Convert non-string values to string for debug purposes
# Document check
check.is_in(
"document",
data,
msg=f"Response should contain 'document' key. Received keys: {list(data.keys())}",
)
# MD check
check.is_in(
"md_content",
data.get("document", {}),
msg=f"Response should contain 'md_content' key. Received keys: {list(data.get('document', {}).keys())}",
)
if data.get("document", {}).get("md_content") is not None:
check.is_in(
"## DocLayNet: ",
data["document"]["md_content"],
msg=f"Markdown document should contain 'DocLayNet: '. Received: {safe_slice(data['document']['md_content'])}",
)
# JSON check
check.is_in(
"json_content",
data.get("document", {}),
msg=f"Response should contain 'json_content' key. Received keys: {list(data.get('document', {}).keys())}",
)
if data.get("document", {}).get("json_content") is not None:
check.is_in(
'{"schema_name": "DoclingDocument"',
json.dumps(data["document"]["json_content"]),
msg=f'JSON document should contain \'{{\\n "schema_name": "DoclingDocument\'". Received: {safe_slice(data["document"]["json_content"])}',
)
# HTML check
check.is_in(
"html_content",
data.get("document", {}),
msg=f"Response should contain 'html_content' key. Received keys: {list(data.get('document', {}).keys())}",
)
if data.get("document", {}).get("html_content") is not None:
check.is_in(
"<!DOCTYPE html>\n<html>\n<head>",
data["document"]["html_content"],
msg=f"HTML document should contain '<!DOCTYPE html>\n<html>\n<head>'. Received: {safe_slice(data['document']['html_content'])}",
)
# Text check
check.is_in(
"text_content",
data.get("document", {}),
msg=f"Response should contain 'text_content' key. Received keys: {list(data.get('document', {}).keys())}",
)
if data.get("document", {}).get("text_content") is not None:
check.is_in(
"DocLayNet: A Large Human-Annotated Dataset",
data["document"]["text_content"],
msg=f"Text document should contain 'DocLayNet: A Large Human-Annotated Dataset'. Received: {safe_slice(data['document']['text_content'])}",
)
# DocTags check
check.is_in(
"doctags_content",
data.get("document", {}),
msg=f"Response should contain 'doctags_content' key. Received keys: {list(data.get('document', {}).keys())}",
)
if data.get("document", {}).get("doctags_content") is not None:
check.is_in(
"<doctag><page_header>",
data["document"]["doctags_content"],
msg=f"DocTags document should contain '<doctag><page_header>'. Received: {safe_slice(data['document']['doctags_content'])}",
)

77
tests/test_file_opts.py Normal file
View File

@@ -0,0 +1,77 @@
import asyncio
import json
import os
import pytest
import pytest_asyncio
from asgi_lifespan import LifespanManager
from httpx import ASGITransport, AsyncClient
from docling_core.types import DoclingDocument
from docling_core.types.doc.document import PictureDescriptionData
from docling_serve.app import create_app
@pytest.fixture(scope="session")
def event_loop():
return asyncio.get_event_loop()
@pytest_asyncio.fixture(scope="session")
async def app():
app = create_app()
async with LifespanManager(app) as manager:
print("Launching lifespan of app.")
yield manager.app
@pytest_asyncio.fixture(scope="session")
async def client(app):
async with AsyncClient(
transport=ASGITransport(app=app), base_url="http://app.io"
) as client:
print("Client is ready")
yield client
@pytest.mark.asyncio
async def test_convert_file(client: AsyncClient):
"""Test convert single file to all outputs"""
endpoint = "/v1alpha/convert/file"
options = {
"to_formats": ["md", "json"],
"image_export_mode": "placeholder",
"ocr": False,
"do_picture_description": True,
"picture_description_api": json.dumps(
{
"url": "http://localhost:11434/v1/chat/completions", # ollama
"params": {"model": "granite3.2-vision:2b"},
"timeout": 60,
"prompt": "Describe this image in a few sentences. ",
}
),
}
current_dir = os.path.dirname(__file__)
file_path = os.path.join(current_dir, "2206.01062v1.pdf")
files = {
"files": ("2206.01062v1.pdf", open(file_path, "rb"), "application/pdf"),
}
response = await client.post(endpoint, files=files, data=options)
assert response.status_code == 200, "Response should be 200 OK"
data = response.json()
doc = DoclingDocument.model_validate(data["document"]["json_content"])
for pic in doc.pictures:
for ann in pic.annotations:
if isinstance(ann, PictureDescriptionData):
print(f"{pic.self_ref}")
print(ann.text)

View File

@@ -0,0 +1,54 @@
from docling_serve.datamodel.convert import (
ConvertDocumentsOptions,
PictureDescriptionApi,
)
from docling_serve.docling_conversion import (
_hash_pdf_format_option,
get_pdf_pipeline_opts,
)
def test_options_cache_key():
hashes = set()
opts = ConvertDocumentsOptions()
pipeline_opts = get_pdf_pipeline_opts(opts)
hash = _hash_pdf_format_option(pipeline_opts)
assert hash not in hashes
hashes.add(hash)
opts.do_picture_description = True
pipeline_opts = get_pdf_pipeline_opts(opts)
hash = _hash_pdf_format_option(pipeline_opts)
# pprint(pipeline_opts.pipeline_options.model_dump(serialize_as_any=True))
assert hash not in hashes
hashes.add(hash)
opts.picture_description_api = PictureDescriptionApi(
url="http://localhost",
params={"model": "mymodel"},
prompt="Hello 1",
)
pipeline_opts = get_pdf_pipeline_opts(opts)
hash = _hash_pdf_format_option(pipeline_opts)
# pprint(pipeline_opts.pipeline_options.model_dump(serialize_as_any=True))
assert hash not in hashes
hashes.add(hash)
opts.picture_description_api = PictureDescriptionApi(
url="http://localhost",
params={"model": "your-model"},
prompt="Hello 1",
)
pipeline_opts = get_pdf_pipeline_opts(opts)
hash = _hash_pdf_format_option(pipeline_opts)
# pprint(pipeline_opts.pipeline_options.model_dump(serialize_as_any=True))
assert hash not in hashes
hashes.add(hash)
opts.picture_description_api.prompt = "World"
pipeline_opts = get_pdf_pipeline_opts(opts)
hash = _hash_pdf_format_option(pipeline_opts)
# pprint(pipeline_opts.pipeline_options.model_dump(serialize_as_any=True))
assert hash not in hashes
hashes.add(hash)

127
tests/test_results_clear.py Normal file
View File

@@ -0,0 +1,127 @@
import asyncio
import base64
import json
from pathlib import Path
import pytest
import pytest_asyncio
from asgi_lifespan import LifespanManager
from httpx import ASGITransport, AsyncClient
from docling_serve.app import create_app
from docling_serve.settings import docling_serve_settings
@pytest.fixture(scope="session")
def event_loop():
return asyncio.get_event_loop()
@pytest_asyncio.fixture(scope="session")
async def app():
app = create_app()
async with LifespanManager(app) as manager:
print("Launching lifespan of app.")
yield manager.app
@pytest_asyncio.fixture(scope="session")
async def client(app):
async with AsyncClient(
transport=ASGITransport(app=app), base_url="http://app.io"
) as client:
print("Client is ready")
yield client
async def convert_file(client: AsyncClient):
doc_filename = Path("tests/2408.09869v5.pdf")
encoded_doc = base64.b64encode(doc_filename.read_bytes()).decode()
payload = {
"options": {
"to_formats": ["json"],
},
"file_sources": [{"base64_string": encoded_doc, "filename": doc_filename.name}],
}
response = await client.post("/v1alpha/convert/source/async", json=payload)
assert response.status_code == 200, "Response should be 200 OK"
task = response.json()
print(json.dumps(task, indent=2))
while task["task_status"] not in ("success", "failure"):
response = await client.get(f"/v1alpha/status/poll/{task['task_id']}")
assert response.status_code == 200, "Response should be 200 OK"
task = response.json()
print(f"{task['task_status']=}")
print(f"{task['task_position']=}")
await asyncio.sleep(2)
assert task["task_status"] == "success"
return task
@pytest.mark.asyncio
async def test_clear_results(client: AsyncClient):
"""Test removal of task."""
# Set long delay deletion
docling_serve_settings.result_removal_delay = 100
# Convert and wait for completion
task = await convert_file(client)
# Get result once
result_response = await client.get(f"/v1alpha/result/{task['task_id']}")
assert result_response.status_code == 200, "Response should be 200 OK"
print("Result 1 ok.")
result = result_response.json()
assert result["document"]["json_content"]["schema_name"] == "DoclingDocument"
# Get result twice
result_response = await client.get(f"/v1alpha/result/{task['task_id']}")
assert result_response.status_code == 200, "Response should be 200 OK"
print("Result 2 ok.")
result = result_response.json()
assert result["document"]["json_content"]["schema_name"] == "DoclingDocument"
# Clear
clear_response = await client.get("/v1alpha/clear/results?older_then=0")
assert clear_response.status_code == 200, "Response should be 200 OK"
print("Clear ok.")
# Get deleted result
result_response = await client.get(f"/v1alpha/result/{task['task_id']}")
assert result_response.status_code == 404, "Response should be removed"
print("Result was no longer found.")
@pytest.mark.asyncio
async def test_delay_remove(client: AsyncClient):
"""Test automatic removal of task with delay."""
# Set short delay deletion
docling_serve_settings.result_removal_delay = 5
# Convert and wait for completion
task = await convert_file(client)
# Get result once
result_response = await client.get(f"/v1alpha/result/{task['task_id']}")
assert result_response.status_code == 200, "Response should be 200 OK"
print("Result ok.")
result = result_response.json()
assert result["document"]["json_content"]["schema_name"] == "DoclingDocument"
print("Sleeping to wait the automatic task deletion.")
await asyncio.sleep(10)
# Get deleted result
result_response = await client.get(f"/v1alpha/result/{task['task_id']}")
assert result_response.status_code == 404, "Response should be removed"

7179
uv.lock generated

File diff suppressed because it is too large Load Diff