feat!: v1 api with list of sources and target (#249)

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
This commit is contained in:
Michele Dolfi
2025-07-14 13:19:49 +02:00
committed by GitHub
parent daa924a77e
commit 56e328baf7
23 changed files with 556 additions and 367 deletions

View File

@@ -18,7 +18,6 @@ On top of the source of file (see below), both endpoints support the same parame
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`, `dlparse_v4`. Defaults to `dlparse_v4`.
- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
- `return_as_file` (boo): If enabled, return the output as a file. Defaults to false.
- `md_page_break_placeholder` (str): Add this placeholder between pages in the markdown output.
- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
- `do_code_enrichment` (bool): If enabled, perform OCR code enrichment. Defaults to false.
@@ -35,7 +34,7 @@ On top of the source of file (see below), both endpoints support the same parame
### Source endpoint
The endpoint is `/v1alpha/convert/source`, listening for POST requests of JSON payloads.
The endpoint is `/v1/convert/source`, listening for POST requests of JSON payloads.
On top of the above parameters, you must send the URL(s) of the document you want process with either the `http_sources` or `file_sources` fields.
The first is fetching URL(s) (optionally using with extra headers), the second allows to provide documents as base64-encoded strings.
@@ -66,7 +65,6 @@ Simple payload example:
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
@@ -80,7 +78,7 @@ Simple payload example:
```sh
curl -X 'POST' \
'http://localhost:5001/v1alpha/convert/source' \
'http://localhost:5001/v1/convert/source' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
@@ -109,7 +107,6 @@ curl -X 'POST' \
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": false,
"return_as_file": false,
"do_table_structure": true,
"include_images": true,
"images_scale": 2
@@ -127,7 +124,7 @@ curl -X 'POST' \
import httpx
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/source"
url = "http://localhost:5001/v1/convert/source"
payload = {
"options": {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
@@ -140,7 +137,6 @@ payload = {
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False,
},
"http_sources": [{"url": "https://arxiv.org/pdf/2206.01062"}]
}
@@ -179,7 +175,7 @@ cat <<EOF > /tmp/request_body.json
EOF
# 3. POST the request to the docling service
curl -X POST "localhost:5001/v1alpha/convert/source" \
curl -X POST "localhost:5001/v1/convert/source" \
-H "Content-Type: application/json" \
-d @/tmp/request_body.json
```
@@ -188,14 +184,14 @@ curl -X POST "localhost:5001/v1alpha/convert/source" \
### File endpoint
The endpoint is: `/v1alpha/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as multipart/form data). You can send one or multiple files.
The endpoint is: `/v1/convert/file`, listening for POST requests of Form payloads (necessary as the files are sent as multipart/form data). You can send one or multiple files.
<details>
<summary>CURL example:</summary>
```sh
curl -X 'POST' \
'http://127.0.0.1:5001/v1alpha/convert/file' \
'http://127.0.0.1:5001/v1/convert/file' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'ocr_engine=easyocr' \
@@ -211,7 +207,6 @@ curl -X 'POST' \
-F 'abort_on_error=false' \
-F 'to_formats=md' \
-F 'to_formats=text' \
-F 'return_as_file=false' \
-F 'do_ocr=true'
```
@@ -224,7 +219,7 @@ curl -X 'POST' \
import httpx
async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5001/v1alpha/convert/file"
url = "http://localhost:5001/v1/convert/file"
parameters = {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
@@ -236,7 +231,6 @@ parameters = {
"pdf_backend": "dlparse_v2",
"table_mode": "fast",
"abort_on_error": False,
"return_as_file": False
}
current_dir = os.path.dirname(__file__)
@@ -354,19 +348,19 @@ The response can be a JSON Document or a File.
`processing_time` is the Docling processing time in seconds, and `timings` (when enabled in the backend) provides the detailed
timing of all the internal Docling components.
- If you set the parameter `return_as_file` to True, the response will be a zip file.
- If multiple files are generated (multiple inputs, or one input but multiple outputs with `return_as_file` True), the response will be a zip file.
- If you set the parameter `target` to the zip mode, the response will be a zip file.
- If multiple files are generated (multiple inputs, or one input but multiple outputs with the zip target mode), the response will be a zip file.
## Asynchronous API
Both `/v1alpha/convert/source` and `/v1alpha/convert/file` endpoints are available as asynchronous variants.
Both `/v1/convert/source` and `/v1/convert/file` endpoints are available as asynchronous variants.
The advantage of the asynchronous endpoints is the possible to interrupt the connection, check for the progress update and fetch the result.
This approach is more resilient against network stabilities and allows the client application logic to easily interleave conversion with other tasks.
Launch an asynchronous conversion with:
- `POST /v1alpha/convert/source/async` when providing the input as sources.
- `POST /v1alpha/convert/file/async` when providing the input as multipart-form files.
- `POST /v1/convert/source/async` when providing the input as sources.
- `POST /v1/convert/file/async` when providing the input as multipart-form files.
The response format is a task detail:
@@ -383,7 +377,7 @@ The response format is a task detail:
For checking the progress of the conversion task and wait for its completion, use the endpoint:
- `GET /v1alpha/status/poll/{task_id}`
- `GET /v1/status/poll/{task_id}`
<details>
<summary>Example waiting loop:</summary>
@@ -410,7 +404,7 @@ while task["task_status"] not in ("success", "failure"):
Using websocket you can get the client application being notified about updates of the conversion task.
To start the websocker connection, use the endpoint:
- `/v1alpha/status/ws/{task_id}`
- `/v1/status/ws/{task_id}`
Websocket messages are JSON object with the following structure:
@@ -428,7 +422,7 @@ Websocket messages are JSON object with the following structure:
```python
from websockets.sync.client import connect
uri = f"ws://{base_url}/v1alpha/status/ws/{task['task_id']}"
uri = f"ws://{base_url}/v1/status/ws/{task['task_id']}"
with connect(uri) as websocket:
for message in websocket:
try:
@@ -447,4 +441,4 @@ with connect(uri) as websocket:
When the task is completed, the result can be fetched with the endpoint:
- `GET /v1alpha/result/{task_id}`
- `GET /v1/result/{task_id}`