docs: Generate usage.md automatically (#340)

Signed-off-by: Tiago Santana <54704492+SantanaTiago@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2026-03-07 14:23:22 +00:00 · 2025-10-21 13:27:01 +01:00
parent 56e8535a7a
commit 9672f310b1
5 changed files with 212 additions and 29 deletions
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -7,12 +7,12 @@ repos:
      - id: ruff-format
        name: "Ruff formatter"
        args: [--config=pyproject.toml]
-        files: '^(docling_serve|tests|examples).*\.(py|ipynb)$'
+        files: '^(docling_serve|tests|examples|scripts).*\.(py|ipynb)$'
      # Run the Ruff linter.
      - id: ruff
        name: "Ruff linter"
        args: [--exit-non-zero-on-fix, --fix, --config=pyproject.toml]
-        files: '^(docling_serve|tests|examples).*\.(py|ipynb)$'
+        files: '^(docling_serve|tests|examples|scripts).*\.(py|ipynb)$'
  - repo: local
    hooks:
      - id: system
@@ -21,6 +21,15 @@ repos:
        pass_filenames: false
        language: system
        files: '\.py$'
+  - repo: local
+    hooks:
+      - id: update-docs-common-parameters
+        name: Update Documentation File
+        entry: uv run scripts/update_doc_usage.py
+        language: python
+        pass_filenames: false
+        # Run the hook in a single process instead of in parallel
+        require_serial: true
  - repo: https://github.com/errata-ai/vale
    rev: v3.12.0  # Use latest stable version
    hooks:
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -4,31 +4,42 @@ The API provides two endpoints: one for urls, one for files. This is necessary t

 ## Common parameters

-On top of the source of file (see below), both endpoints support the same parameters, which are almost the same as the Docling CLI.
+In addition to the file source (see below), both endpoints support the same parameters.

- `from_formats` (List[str]): Input format(s) to convert from. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`. Defaults to all formats.
- `to_formats` (List[str]): Output format(s) to convert to. Allowed values: `md`, `json`, `html`, `text`, `doctags`. Defaults to `md`.
- `pipeline` (str). The choice of which pipeline to use. Allowed values are `standard` and `vlm`. Defaults to `standard`.
- `page_range` (tuple). If specified, only convert a range of pages. The page number starts at 1.
- `do_ocr` (bool): If enabled, the bitmap content will be processed using OCR. Defaults to `True`.
- `image_export_mode`: Image export mode for the document (only in case of JSON, Markdown or HTML). Allowed values: embedded, placeholder, referenced. Optional, defaults to `embedded`.
- `force_ocr` (bool): If enabled, replace any existing text with OCR-generated text over the full content. Defaults to `False`.
- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesserocr`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`. To use the `tesserocr` engine, `tesserocr` must be installed where docling-serve is running: `pip install tesserocr`
- `ocr_lang` (List[str]): List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. Defaults to empty.
- `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`, `dlparse_v4`. Defaults to `dlparse_v4`.
- `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
- `abort_on_error` (bool): If enabled, abort on error. Defaults to false.
- `md_page_break_placeholder` (str): Add this placeholder between pages in the markdown output.
- `do_table_structure` (bool): If enabled, the table structure will be extracted. Defaults to true.
- `do_code_enrichment` (bool): If enabled, perform OCR code enrichment. Defaults to false.
- `do_formula_enrichment` (bool): If enabled, perform formula OCR, return LaTeX code. Defaults to false.
- `do_picture_classification` (bool): If enabled, classify pictures in documents. Defaults to false.
- `do_picture_description` (bool): If enabled, describe pictures in documents. Defaults to false.
- `picture_description_area_threshold` (float): Minimum percentage of the area for a picture to be processed with the models. Defaults to 0.05.
- `picture_description_local` (dict): Options for running a local vision-language model in the picture description. The parameters refer to a model hosted on Hugging Face. This parameter is mutually exclusive with `picture_description_api`.
- `picture_description_api` (dict): API details for using a vision-language model in the picture description. This parameter is mutually exclusive with `picture_description_local`.
- `include_images` (bool): If enabled, images will be extracted from the document. Defaults to false.
- `images_scale` (float): Scale factor for images. Defaults to 2.0.
+<!-- begin: parameters-docs -->
+
+| Field Name | Type | Description |
+|------------|------|-------------|
+| `from_formats` | List[InputFormat] | Input format(s) to convert from. String or list of strings. Allowed values: `docx`, `pptx`, `html`, `image`, `pdf`, `asciidoc`, `md`, `csv`, `xlsx`, `xml_uspto`, `xml_jats`, `mets_gbs`, `json_docling`, `audio`, `vtt`. Optional, defaults to all formats. |
+| `to_formats` | List[OutputFormat] | Output format(s) to convert to. String or list of strings. Allowed values: `md`, `json`, `html`, `html_split_page`, `text`, `doctags`. Optional, defaults to Markdown. |
+| `image_export_mode` | ImageRefMode | Image export mode for the document (in case of JSON, Markdown or HTML). Allowed values: `placeholder`, `embedded`, `referenced`. Optional, defaults to Embedded. |
+| `do_ocr` | bool | If enabled, the bitmap content will be processed using OCR. Boolean. Optional, defaults to true. |
+| `force_ocr` | bool | If enabled, replace existing text with OCR-generated text over content. Boolean. Optional, defaults to false. |
+| `ocr_engine` | `ocr_engines_enum` | The OCR engine to use. String. Allowed values: `auto`, `easyocr`, `ocrmac`, `rapidocr`, `tesserocr`, `tesseract`. Optional, defaults to `easyocr`. |
+| `ocr_lang` | List[str] or NoneType | List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. String or list of strings. Optional, defaults to empty. |
+| `pdf_backend` | PdfBackend | The PDF backend to use. String. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`, `dlparse_v4`. Optional, defaults to `dlparse_v4`. |
+| `table_mode` | TableFormerMode | Mode to use for table structure. String. Allowed values: `fast`, `accurate`. Optional, defaults to accurate. |
+| `table_cell_matching` | bool | If true, matches table cells predictions back to PDF cells. Can break table output if PDF cells are merged across table columns. If false, let table structure model define the text cells, ignore PDF cells. |
+| `pipeline` | ProcessingPipeline | Choose the pipeline to process PDF or image files. |
+| `page_range` | Tuple | Only convert a range of pages. The page number starts at 1. |
+| `document_timeout` | float | The timeout for processing each document, in seconds. |
+| `abort_on_error` | bool | Abort on error if enabled. Boolean. Optional, defaults to false. |
+| `do_table_structure` | bool | If enabled, the table structure will be extracted. Boolean. Optional, defaults to true. |
+| `include_images` | bool | If enabled, images will be extracted from the document. Boolean. Optional, defaults to true. |
+| `images_scale` | float | Scale factor for images. Float. Optional, defaults to 2.0. |
+| `md_page_break_placeholder` | str | Add this placeholder between pages in the markdown output. |
+| `do_code_enrichment` | bool | If enabled, perform OCR code enrichment. Boolean. Optional, defaults to false. |
+| `do_formula_enrichment` | bool | If enabled, perform formula OCR, return LaTeX code. Boolean. Optional, defaults to false. |
+| `do_picture_classification` | bool | If enabled, classify pictures in documents. Boolean. Optional, defaults to false. |
+| `do_picture_description` | bool | If enabled, describe pictures in documents. Boolean. Optional, defaults to false. |
+| `picture_description_area_threshold` | float | Minimum percentage of the area for a picture to be processed with the models. |
+| `picture_description_local` | PictureDescriptionLocal or NoneType | Options for running a local vision-language model in the picture description. The parameters refer to a model hosted on Hugging Face. This parameter is mutually exclusive with `picture_description_api`. |
+| `picture_description_api` | PictureDescriptionApi or NoneType | API details for using a vision-language model in the picture description. This parameter is mutually exclusive with `picture_description_local`. |
+| `vlm_pipeline_model` | VlmModelType or NoneType | Preset of local and API models for the `vlm` pipeline. This parameter is mutually exclusive with `vlm_pipeline_model_local` and `vlm_pipeline_model_api`. Use the other options for more parameters. |
+| `vlm_pipeline_model_local` | VlmModelLocal or NoneType | Options for running a local vision-language model for the `vlm` pipeline. The parameters refer to a model hosted on Hugging Face. This parameter is mutually exclusive with `vlm_pipeline_model_api` and `vlm_pipeline_model`. |
+| `vlm_pipeline_model_api` | VlmModelApi or NoneType | API details for using a vision-language model for the `vlm` pipeline. This parameter is mutually exclusive with `vlm_pipeline_model_local` and `vlm_pipeline_model`. |
+
+<!-- end: parameters-docs -->

 ### Authentication

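Note on the parameter table above: several rows are documented as mutually exclusive (`picture_description_local` vs. `picture_description_api`, and the three `vlm_pipeline_model*` options). A minimal client-side sketch of checking a request payload against those rules follows. The helper `find_conflicts`, the `EXCLUSIVE_GROUPS` constant, and the placeholder dict contents are hypothetical and are not part of docling-serve; the server's own validation may behave differently.

```python
from typing import Any

# Option groups that the generated table documents as mutually exclusive.
# This grouping is an assumption read off the table, not taken from the server code.
EXCLUSIVE_GROUPS: list[tuple[str, ...]] = [
    ("picture_description_local", "picture_description_api"),
    ("vlm_pipeline_model", "vlm_pipeline_model_local", "vlm_pipeline_model_api"),
]


def find_conflicts(options: dict[str, Any]) -> list[tuple[str, ...]]:
    """Return one tuple per group in which more than one option is set (not None)."""
    conflicts: list[tuple[str, ...]] = []
    for group in EXCLUSIVE_GROUPS:
        present = tuple(name for name in group if options.get(name) is not None)
        if len(present) > 1:
            conflicts.append(present)
    return conflicts


# Example payload; the nested dict contents are placeholders, not real fields.
options = {
    "to_formats": ["md", "json"],
    "do_ocr": True,
    "picture_description_local": {"example": "placeholder"},
    "picture_description_api": {"example": "placeholder"},
}

for conflict in find_conflicts(options):
    print("Mutually exclusive options set together:", ", ".join(conflict))
```

Running such a check before sending a request is only a convenience; it reports every conflicting group at once rather than stopping at the first one.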
--- a/scripts/__init__.py
+++ b/scripts/__init__.py
--- a/scripts/update_doc_usage.py
+++ b/scripts/update_doc_usage.py
@@ -0,0 +1,163 @@
+import re
+from typing import Annotated, Any, get_args, get_origin
+
+from pydantic import BaseModel
+
+from docling_serve.datamodel.convert import ConvertDocumentsRequestOptions
+
+DOCS_FILE = "docs/usage.md"
+
+VARIABLE_WORDS: list[str] = [
+    "picture_description_local",
+    "vlm_pipeline_model",
+    "vlm",
+    "vlm_pipeline_model_api",
+    "ocr_engines_enum",
+    "easyocr",
+    "dlparse_v4",
+    "fast",
+    "picture_description_api",
+    "vlm_pipeline_model_local",
+]
+
+
+def format_variable_names(text: str) -> str:
+    """Format specific words in description to be code-formatted."""
+    sorted_words = sorted(VARIABLE_WORDS, key=len, reverse=True)
+
+    # Escape each word only when building the pattern; the replacement text
+    # must stay unescaped so the original word is written back verbatim.
+    for word in sorted_words:
+        pattern = rf"(?<!`)\b{re.escape(word)}\b(?!`)"
+        text = re.sub(pattern, f"`{word}`", text)
+
+    return text
+
+
+def format_allowed_values_description(description: str) -> str:
+    """Format description to code-format allowed values."""
+    # Regex pattern to find text after "Allowed values:"
+    match = re.search(r"Allowed values:(.+?)(?:\.|$)", description, re.DOTALL)
+
+    if match:
+        # Extract the allowed values
+        values_str = match.group(1).strip()
+
+        # Split values, handling both comma and 'and' separators
+        values = re.split(r"\s*(?:,\s*|\s+and\s+)", values_str)
+
+        # Remove any remaining punctuation and whitespace
+        values = [value.strip("., ") for value in values]
+
+        # Create code-formatted values
+        formatted_values = ", ".join(f"`{value}`" for value in values)
+
+        # Replace the original allowed values with formatted version
+        formatted_description = re.sub(
+            r"(Allowed values:)(.+?)(?:\.|$)",
+            f"\\1 {formatted_values}.",
+            description,
+            flags=re.DOTALL,
+        )
+
+        return formatted_description
+
+    return description
+
+
+def _format_type(type_hint: Any) -> str:
+    """Format a type hint as a readable string, unwrapping Annotated and Union."""
+    if get_origin(type_hint) is Annotated:
+        base_type = get_args(type_hint)[0]
+        return _format_type(base_type)
+
+    if hasattr(type_hint, "__origin__"):
+        origin = type_hint.__origin__
+        args = get_args(type_hint)
+
+        if origin is list:
+            return f"List[{_format_type(args[0])}]"
+        elif origin is dict:
+            return f"Dict[{_format_type(args[0])}, {_format_type(args[1])}]"
+        elif "Union" in str(origin) or "Optional" in str(origin):
+            return " or ".join(_format_type(arg) for arg in args)
+        elif origin is None:
+            return "null"
+
+    if hasattr(type_hint, "__name__"):
+        return type_hint.__name__
+
+    return str(type_hint)
+
+
+def generate_model_doc(model: type[BaseModel]) -> str:
+    """Generate documentation for a Pydantic model."""
+    doc = "\n| Field Name | Type | Description |\n"
+    doc += "|------------|------|-------------|\n"
+
+    for base_model in model.__mro__:
+        # Check if this is a Pydantic model
+        if hasattr(base_model, "model_fields"):
+            # Iterate through fields of this model
+            for field_name, field in base_model.model_fields.items():
+                # Extract description from Annotated field if possible
+                description = field.description or "No description provided."
+                description = format_allowed_values_description(description)
+                description = format_variable_names(description)
+
+                # Handle Annotated types
+                original_type = field.annotation
+                if get_origin(original_type) is Annotated:
+                    # Extract base type and additional metadata
+                    type_args = get_args(original_type)
+                    base_type = type_args[0]
+                else:
+                    base_type = original_type
+
+                field_type = _format_type(base_type)
+                field_type = format_variable_names(field_type)
+
+                doc += f"| `{field_name}` | {field_type} | {description} |\n"
+
+            # stop iterating the base classes
+            break
+
+    doc += "\n"
+    return doc
+
+
+def update_documentation():
+    """Update the documentation file with model information."""
+    doc_request = generate_model_doc(ConvertDocumentsRequestOptions)
+
+    with open(DOCS_FILE) as f:
+        content = f.readlines()
+
+    # Prepare to update the content
+    new_content = []
+    in_cp_section = False
+
+    for line in content:
+        if line.startswith("<!-- begin: parameters-docs -->"):
+            in_cp_section = True
+            new_content.append(line)
+            new_content.append(doc_request)
+            continue
+
+        if in_cp_section and line.strip() == "<!-- end: parameters-docs -->":
+            in_cp_section = False
+
+        if not in_cp_section:
+            new_content.append(line)
+
+    # Only write to the file if new_content is different from content
+    if "".join(new_content) != "".join(content):
+        with open(DOCS_FILE, "w") as f:
+            f.writelines(new_content)
+        print(f"Documentation updated in {DOCS_FILE}")
+    else:
+        print("No changes detected. Documentation file remains unchanged.")
+
+
+if __name__ == "__main__":
+    update_documentation()
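Note on `update_documentation()` above: the marker-based replacement can be tried in isolation, without touching `docs/usage.md`. The sketch below mirrors the loop from the script on an in-memory list of lines. The function name `replace_marked_section` and the sample lines are made up for illustration; only the two marker strings come from the commit.

```python
BEGIN_MARKER = "<!-- begin: parameters-docs -->"
END_MARKER = "<!-- end: parameters-docs -->"


def replace_marked_section(lines: list[str], body: str) -> list[str]:
    """Return a copy of lines with everything between the markers replaced by body."""
    result: list[str] = []
    inside = False
    for line in lines:
        if line.startswith(BEGIN_MARKER):
            # Keep the begin marker, then emit the freshly generated body.
            inside = True
            result.append(line)
            result.append(body)
            continue
        if inside and line.strip() == END_MARKER:
            # The end marker closes the section and is itself kept below.
            inside = False
        if not inside:
            result.append(line)
    return result


sample = [
    "## Common parameters\n",
    "<!-- begin: parameters-docs -->\n",
    "stale table row\n",
    "<!-- end: parameters-docs -->\n",
    "### Authentication\n",
]
body = "\n| Field Name | Type | Description |\n|------------|------|-------------|\n\n"

updated = replace_marked_section(sample, body)
print("".join(updated))
```

Because the old section content is dropped and the new body is inserted right after the begin marker, applying the function a second time with the same body yields the same text. That is what lets the script skip the write when nothing changed.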
--- a/uv.lock
+++ b/uv.lock
@@ -1458,7 +1458,7 @@ wheels = [

 [[package]]
 name = "docling-jobkit"
-version = "1.6.0"
+version = "1.7.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "boto3", marker = "platform_machine != 'x86_64' or sys_platform != 'darwin' or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-cu126') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-cu128') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-cu128') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-cu128' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cu128' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-pypi' and extra == 'group-13-docling-serve-rocm')" },
@@ -1471,9 +1471,9 @@ dependencies = [
    { name = "pydantic-settings", marker = "platform_machine != 'x86_64' or sys_platform != 'darwin' or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-cu126') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-cu128') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-cu128') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-cu128' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cu128' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-pypi' and extra == 'group-13-docling-serve-rocm')" },
    { name = "typer", marker = "platform_machine != 'x86_64' or sys_platform != 'darwin' or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-cu126') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-cu128') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cpu' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-cu128') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cu126' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-cu128' and extra == 'group-13-docling-serve-pypi') or (extra == 'group-13-docling-serve-cu128' and extra == 'group-13-docling-serve-rocm') or (extra == 'group-13-docling-serve-pypi' and extra == 'group-13-docling-serve-rocm')" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/34/4a/ba2cba099265613ddcc3c5767fe190bb192674a68c4ef26829c192fb99a0/docling_jobkit-1.6.0.tar.gz", hash = "sha256:7a5e326587599899fbb6823e67176810c08532d4e83e381a494092c80bba73e5", size = 56046, upload-time = "2025-10-03T09:35:34.797Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/c8/ea/42aa09c49bf7ec12338dd65f4eb6b807fdb260e8e25b640896b40d04a56c/docling_jobkit-1.7.0.tar.gz", hash = "sha256:8afbd46d8594bfe669d29e616e7b4a5fa9cba532c48f0e0b3976d573488e5bd0", size = 56044, upload-time = "2025-10-21T10:04:07.559Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/84/1e/531ec87f9a5af4d39f7bd76ffc18c7b32d6247f3bfc1869de0b696fc8e37/docling_jobkit-1.6.0-py3-none-any.whl", hash = "sha256:eeee1b607314d4b93484c63b7190a7dd404dc1463410ca683d0d96bcd27d1f02", size = 78950, upload-time = "2025-10-03T09:35:33.026Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/24/78e69caed36d3f32c3a1321ae50af1dac83d1beb30ecdf50722ff91bef27/docling_jobkit-1.7.0-py3-none-any.whl", hash = "sha256:cd244860dfedbf6c2bb52ccfe3f280a719dc1bec3e034a5f7520af6127f1c486", size = 78949, upload-time = "2025-10-21T10:04:05.777Z" },
 ]

 [package.optional-dependencies]