mirror of
https://github.com/docling-project/docling-serve.git
synced 2025-11-29 16:43:24 +00:00
Signed-off-by: Tiago Santana <54704492+SantanaTiago@users.noreply.github.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Examples
Split processing
The provided split-processing example demonstrates how to split a PDF into chunks of pages and send each chunk for conversion. At the end, it concatenates the converted chunks into a single conversion JSON.
At the beginning of the file there are variables to be used (and modified as needed):
| Variable | Description |
|---|---|
| `path_to_pdf` | Path to the PDF file to be split |
| `pages_per_file` | Number of pages per chunk when splitting the PDF |
| `base_url` | Base URL of the docling-serve host |
| `out_dir` | Output folder for the conversion JSON of each chunk and for the final concatenated JSON |
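
As a rough illustration, the variables might be set as follows. Every value here is a placeholder chosen for this sketch (including the host and port in `base_url` and the use of `pathlib.Path`), not a default taken from the example script.

```python
from pathlib import Path

# Placeholder values for illustration only; adjust them to your setup.
path_to_pdf = Path("./my_document.pdf")  # PDF file to be split (hypothetical path)
pages_per_file = 4                       # number of pages per chunk
base_url = "http://localhost:5001"       # docling-serve host (assumed local instance)
out_dir = Path("./split_output")         # folder for per-chunk and concatenated JSON

# Make sure the output folder exists before any result is written to it.
out_dir.mkdir(parents=True, exist_ok=True)
```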
The example follows this logic:
- Get the number of pages of the `PDF`
- Based on the number of chunks of pages, send each chunk for conversion using the `page_range` parameter
- Wait for all conversions to finish
- Get all conversion results
- Save each conversion `JSON` result into a `JSON` file
- Concatenate all `JSONs` into a single `JSON` using the `docling` concatenate method
- Save the concatenated `JSON` into a `JSON` file
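
The chunking step of the logic above can be sketched as a small helper that turns a page count and `pages_per_file` into page ranges. The function name `page_ranges` is hypothetical, and the sketch assumes that `page_range` takes 1-based, inclusive `(start, end)` pairs; it does not call docling-serve or docling itself.

```python
def page_ranges(num_pages: int, pages_per_file: int) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (start, end) ranges that cover num_pages.

    Assumption: the conversion's page_range option is 1-based and inclusive.
    """
    if num_pages < 0:
        raise ValueError("num_pages must not be negative")
    if pages_per_file < 1:
        raise ValueError("pages_per_file must be at least 1")
    return [
        # The last chunk is clipped so it never runs past the final page.
        (start, min(start + pages_per_file - 1, num_pages))
        for start in range(1, num_pages + 1, pages_per_file)
    ]


# A 10-page PDF split into chunks of 4 pages yields three ranges.
for start, end in page_ranges(10, 4):
    print(f"pages {start}-{end}")
# prints:
# pages 1-4
# pages 5-8
# pages 9-10
```

Each resulting pair would then be sent as the `page_range` of one conversion request, and the per-chunk results concatenated in the same order.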