feat!: v1 api with list of sources and target (#249)

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Michele Dolfi
2025-07-14 13:19:49 +02:00
committed by GitHub
parent daa924a77e
commit 56e328baf7
23 changed files with 556 additions and 367 deletions


@@ -6,3 +6,4 @@ This documentation pages explore the webserver configurations, runtime options,
- [Advanced usage](./usage.md)
- [Deployment](./deployment.md)
- [Development](./development.md)
- [`v1` migration](./v1_migration.md)


@@ -76,6 +76,6 @@ The following table describes the options to configure the Docling Serve KFP eng
| `DOCLING_SERVE_ENG_KFP_ENDPOINT` | | Must be set to the Kubeflow Pipeline endpoint. When using the in-cluster deployment, make sure to use the cluster endpoint, e.g. `https://NAME.NAMESPACE.svc.cluster.local:8888` |
| `DOCLING_SERVE_ENG_KFP_TOKEN` | | The authentication token for KFP. For in-cluster deployment, the app automatically loads the token of the ServiceAccount. |
| `DOCLING_SERVE_ENG_KFP_CA_CERT_PATH` | | Path to the CA certificates for the KFP endpoint. For in-cluster deployment, the app automatically loads the internal CA. |
| `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_ENDPOINT` | | If set, it enables internal callbacks providing status update of the KFP job. Usually something like `https://NAME.NAMESPACE.svc.cluster.local:5001/v1alpha/callback/task/progress`. |
| `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_ENDPOINT` | | If set, it enables internal callbacks providing status update of the KFP job. Usually something like `https://NAME.NAMESPACE.svc.cluster.local:5001/v1/callback/task/progress`. |
| `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_TOKEN_PATH` | | The token used for authenticating the progress callback. For cluster-internal workloads, use `/run/secrets/kubernetes.io/serviceaccount/token`. |
| `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_CA_CERT_PATH` | | The CA certificate for the progress callback. For cluster-internal workloads, use `/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt`. |


@@ -30,7 +30,7 @@ For using the API:
```sh
# Make a test query
curl -X 'POST' \
"localhost:5001/v1alpha/convert/source/async" \
"localhost:5001/v1/convert/source/async" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
@@ -148,7 +148,7 @@ oc port-forward svc/docling-serve 5001:5001
# Make a test query
curl -X 'POST' \
"localhost:5001/v1alpha/convert/source/async" \
"localhost:5001/v1/convert/source/async" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
@@ -184,7 +184,7 @@ OCP_AUTH_TOKEN=$(oc whoami --show-token)
# Make a test query
curl -X 'POST' \
"${DOCLING_ROUTE}/v1alpha/convert/source/async" \
"${DOCLING_ROUTE}/v1/convert/source/async" \
-H "Authorization: Bearer ${OCP_AUTH_TOKEN}" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
@@ -218,7 +218,7 @@ DOCLING_ROUTE="https://$(oc get routes $DOCLING_NAME --template={{.spec.host}})"
# Make a test query, store the cookie and taskid
task_id=$(curl -s -X 'POST' \
"${DOCLING_ROUTE}/v1alpha/convert/source/async" \
"${DOCLING_ROUTE}/v1/convert/source/async" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
@@ -230,7 +230,7 @@ task_id=$(curl -s -X 'POST' \
```sh
# Grab the taskid and cookie to check the task status
curl -v -X 'GET' \
"${DOCLING_ROUTE}/v1alpha/status/poll/$task_id?wait=0" \
"${DOCLING_ROUTE}/v1/status/poll/$task_id?wait=0" \
-H "accept: application/json" \
-b "cookies.txt"
```


@@ -18,7 +18,6 @@ On top of the source of file (see below), both endpoints support the same parame
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`, `dlparse_v4`. Defaults to `dlparse_v4`.
- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
- `return_as_file` (boo): If enabled, return the output as a file. Defaults to false.
- `md_page_break_placeholder` (str): Add this placeholder between pages in the markdown output.
- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
- `do_code_enrichment` (bool): If enabled, perform OCR code enrichment. Defaults to false.
@@ -35,7 +34,7 @@ On top of the source of file (see below), both endpoints support the same parame
### Source endpoint
The endpoint is `/v1alpha/convert/source`, listening for POST requests of JSON payloads.
The endpoint is `/v1/convert/source`, listening for POST requests of JSON payloads.
On top of the above parameters, you must send the document(s) you want to process with either the `http_sources` or `file_sources` fields.
The first fetches the document(s) from URL(s), optionally with extra headers; the second provides the documents as base64-encoded strings.
@@ -66,7 +65,6 @@ Simple payload example:
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
@@ -80,7 +78,7 @@ Simple payload example:
```sh
curl -X 'POST' \
'http://localhost:5001/v1alpha/convert/source' \
'http://localhost:5001/v1/convert/source' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
@@ -109,7 +107,6 @@ curl -X 'POST' \
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
"do_table_structure": true,
"include_images": true,
"images_scale": 2
@@ -127,7 +124,7 @@ curl -X 'POST' \
import httpx
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/source"
url = "http://localhost:5001/v1/convert/source"
payload = {
"options": {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
@@ -140,7 +137,6 @@ payload = {
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
@@ -179,7 +175,7 @@ cat <<EOF > /tmp/request_body.json
EOF
# 3. POST the request to the docling service
curl -X POST "localhost:5001/v1alpha/convert/source" \
curl -X POST "localhost:5001/v1/convert/source" \
-H "Content-Type: application/json" \
-d @/tmp/request_body.json
```
@@ -188,14 +184,14 @@ curl -X POST "localhost:5001/v1alpha/convert/source" \
### File endpoint
The endpoint is: `/v1alpha/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as multipart/form data). You can send one or multiple files.
The endpoint is: `/v1/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as multipart/form data). You can send one or multiple files.
<details>
<summary>CURL example:</summary>
```sh
curl -X 'POST' \
'http://127.0.0.1:5001/v1alpha/convert/file' \
'http://127.0.0.1:5001/v1/convert/file' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'ocr_engine=easyocr' \
@@ -211,7 +207,6 @@ curl -X 'POST' \
-F 'abort_on_error=false' \
-F 'to_formats=md' \
-F 'to_formats=text' \
-F 'return_as_file=false' \
-F 'do_ocr=true'
```
@@ -224,7 +219,7 @@ curl -X 'POST' \
import httpx
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/file"
url = "http://localhost:5001/v1/convert/file"
parameters = {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
@@ -236,7 +231,6 @@ parameters = {
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False
}
current_dir = os.path.dirname(__file__)
@@ -354,19 +348,19 @@ The response can be a JSON Document or a File.
`processing_time` is the Docling processing time in seconds, and `timings` (when enabled in the backend) provides the detailed
timing of all the internal Docling components.
- If you set the parameter `return_as_file` to True, the response will be a zip file.
- If multiple files are generated (multiple inputs, or one input but multiple outputs with `return_as_file` True), the response will be a zip file.
- If you set the parameter `target` to the zip mode, the response will be a zip file.
- If multiple files are generated (multiple inputs, or one input but multiple outputs with the zip target mode), the response will be a zip file.
## Asynchronous API
Both `/v1alpha/convert/source` and `/v1alpha/convert/file` endpoints are available as asynchronous variants.
Both `/v1/convert/source` and `/v1/convert/file` endpoints are available as asynchronous variants.
The advantage of the asynchronous endpoints is the possibility to interrupt the connection, check for progress updates and fetch the result later.
This approach is more resilient against network instabilities and allows the client application logic to easily interleave conversion with other tasks.
Launch an asynchronous conversion with:
- `POST /v1alpha/convert/source/async` when providing the input as sources.
- `POST /v1alpha/convert/file/async` when providing the input as multipart-form files.
- `POST /v1/convert/source/async` when providing the input as sources.
- `POST /v1/convert/file/async` when providing the input as multipart-form files.
The response format is a task detail:
@@ -383,7 +377,7 @@ The response format is a task detail:
To check the progress of the conversion task and wait for its completion, use the endpoint:
- `GET /v1alpha/status/poll/{task_id}`
- `GET /v1/status/poll/{task_id}`
<details>
<summary>Example waiting loop:</summary>
@@ -410,7 +404,7 @@ while task["task_status"] not in ("success", "failure"):
Using a websocket, the client application can be notified about updates of the conversion task.
To start the websocket connection, use the endpoint:
- `/v1alpha/status/ws/{task_id}`
- `/v1/status/ws/{task_id}`
Websocket messages are JSON objects with the following structure:
@@ -428,7 +422,7 @@ Websocket messages are JSON object with the following structure:
```python
from websockets.sync.client import connect
uri = f"ws://{base_url}/v1alpha/status/ws/{task['task_id']}"
uri = f"ws://{base_url}/v1/status/ws/{task['task_id']}"
with connect(uri) as websocket:
for message in websocket:
try:
@@ -447,4 +441,4 @@ with connect(uri) as websocket:
When the task is completed, the result can be fetched with the endpoint:
- `GET /v1alpha/result/{task_id}`
- `GET /v1/result/{task_id}`

80
docs/v1_migration.md Normal file

@@ -0,0 +1,80 @@
# Migration to the `v1` API
Docling Serve has moved from the initial prototype `v1alpha` API to the stable `v1` API.
This page provides simple instructions to upgrade your application to the new API.
## API changes
The breaking changes introduced in the `v1` release of Docling Serve are designed to provide a stable schema which
allows the project to add new capabilities, such as new types of input sources and targets, as well as the definition of callbacks for event-driven applications.
### Endpoint names
All endpoints are renamed from `/v1alpha/` to `/v1/`.
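If your client keeps endpoint URLs in configuration, the rename can be applied mechanically. The following is a minimal sketch, assuming each URL contains the `/v1alpha/` path segment at most once; the helper name `to_v1_url` is hypothetical and not part of Docling Serve.

```python
def to_v1_url(url: str) -> str:
    """Rewrite a v1alpha endpoint URL to its v1 equivalent (hypothetical helper)."""
    # Replace only the first occurrence, so the rest of the path stays untouched.
    return url.replace("/v1alpha/", "/v1/", 1)


print(to_v1_url("http://localhost:5001/v1alpha/convert/source"))
```

URLs that already point to `/v1/` are returned unchanged, so the helper can be applied repeatedly without side effects.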
### Sources
When using the `/v1/convert/source` endpoint, input documents have to be specified with the `sources: []` argument, which replaces `file_sources` and `http_sources`.
Old version:
```jsonc
{
"options": {}, // conversion options
"file_sources": [ // input documents provided as base64-encoded strings
{"base64_string": "abc123...", "filename": "file.pdf"}
],
"http_sources": [ // input documents provided as http urls
{"url": "https://..."}
]
}
```
New version:
```jsonc
{
"options": {}, // conversion options
"sources": [
// input document provided as base64-encoded string
{"kind": "file", "base64_string": "abc123...", "filename": "file.pdf"},
// input document provided as http urls
{"kind": "http", "url": "https://..."},
]
}
```
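An existing request body can be converted programmatically. The sketch below is one possible approach, not an official migration tool: the function name `migrate_sources` is hypothetical, and the `"file"` kind used for base64-encoded documents is an assumption, while `"http"` is taken from the example above.

```python
from typing import Any


def migrate_sources(payload: dict[str, Any]) -> dict[str, Any]:
    """Convert a v1alpha request body to the v1 `sources` layout (hypothetical helper)."""
    legacy_keys = ("file_sources", "http_sources")
    # Keep every field except the two legacy source lists.
    migrated = {key: value for key, value in payload.items() if key not in legacy_keys}

    sources = []
    for item in payload.get("file_sources", []):
        # Assumption: base64-encoded documents use the "file" kind.
        sources.append({"kind": "file", **item})
    for item in payload.get("http_sources", []):
        sources.append({"kind": "http", **item})

    migrated["sources"] = sources
    return migrated


old_payload = {
    "options": {},
    "http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}],
}
print(migrate_sources(old_payload))
```

The input dictionary is not modified, so the old payload stays available for comparison while testing against the new endpoint.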
### Targets
To switch between output formats, e.g. from the JSON in-body response to the zip archive response, specify the `target` argument, which replaces `options.return_as_file`.
Old version:
```jsonc
{
"options": {
"return_as_file": true // <-- to be removed
},
// ...
}
```
New version:
```jsonc
{
"options": {},
"target": {"kind": "zip"}, // <-- add this
// ...
}
```
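The same change can be sketched in code. The helper name `migrate_target` is hypothetical. When `return_as_file` was false or absent, this sketch adds no `target` at all, on the assumption that the server then defaults to the JSON in-body response.

```python
from typing import Any


def migrate_target(payload: dict[str, Any]) -> dict[str, Any]:
    """Replace `options.return_as_file` with the v1 `target` field (hypothetical helper)."""
    migrated = dict(payload)
    # Copy the options so the caller's dictionary is left untouched.
    options = dict(migrated.get("options", {}))
    return_as_file = options.pop("return_as_file", False)
    migrated["options"] = options

    if return_as_file:
        migrated["target"] = {"kind": "zip"}
    # Assumption: without an explicit target the response is returned in the body.
    return migrated


print(migrate_target({"options": {"return_as_file": True, "table_mode": "fast"}}))
```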
## Continue with the old API
If you are not able to apply the changes above to your application, please consider pinning the previous `v0.x` container images, e.g.
```sh
podman run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=1 quay.io/docling-project/docling-serve:v0.16.1
```
_Note that the old prototype API will not be supported in new `v1.x` versions._