docs: Expand automatic docs to nested objects. More complete usage docs. (#426)

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-11-29 00:23:36 +00:00 · 2025-10-31 15:02:20 +01:00
parent f3957aeb57
commit 35319b0da7
3 changed files with 110 additions and 26 deletions
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -7,6 +7,7 @@ The API provides two endpoints: one for urls, one for files. This is necessary t
 In addition to the source of the file (see below), both endpoints support the same parameters.

 <!-- begin: parameters-docs -->
+<h4>ConvertDocumentsRequestOptions</h4>

 | Field Name | Type | Description |
 |------------|------|-------------|
@@ -39,6 +40,52 @@ On top of the source of file (see below), both endpoints support the same parame
 | `vlm_pipeline_model_local` | VlmModelLocal or NoneType | Options for running a local vision-language model for the `vlm` pipeline. The parameters refer to a model hosted on Hugging Face. This parameter is mutually exclusive with `vlm_pipeline_model_api` and `vlm_pipeline_model`. |
 | `vlm_pipeline_model_api` | VlmModelApi or NoneType | API details for using a vision-language model for the `vlm` pipeline. This parameter is mutually exclusive with `vlm_pipeline_model_local` and `vlm_pipeline_model`. |

+<h4>VlmModelApi</h4>
+
+| Field Name | Type | Description |
+|------------|------|-------------|
+| `url` | AnyUrl | Endpoint which accepts OpenAI-compatible API requests. |
+| `headers` | Dict[str, str] | Headers used for calling the API endpoint. For example, it could include authentication headers. |
+| `params` | Dict[str, Any] | Model parameters. |
+| `timeout` | float | Timeout for the API request. |
+| `concurrency` | int | Maximum number of concurrent requests to the API. |
+| `prompt` | str | Prompt used when calling the vision-language model. |
+| `scale` | float | Scale factor of the images used. |
+| `response_format` | ResponseFormat | Type of response generated by the model. |
+| `temperature` | float | Sampling temperature. Lower values make the result more reproducible. |
+
+<h4>VlmModelLocal</h4>
+
+| Field Name | Type | Description |
+|------------|------|-------------|
+| `repo_id` | str | Repository id from the Hugging Face Hub. |
+| `prompt` | str | Prompt used when calling the vision-language model. |
+| `scale` | float | Scale factor of the images used. |
+| `response_format` | ResponseFormat | Type of response generated by the model. |
+| `inference_framework` | InferenceFramework | Inference framework to use. |
+| `transformers_model_type` | TransformersModelType | Type of transformers auto-model to use. |
+| `extra_generation_config` | Dict[str, Any] | Extra generation options, as defined by the transformers `GenerationConfig`: https://huggingface.co/docs/transformers/en/main_classes/text_generation#transformers.GenerationConfig |
+| `temperature` | float | Sampling temperature. Lower values make the result more reproducible. |
+
+<h4>PictureDescriptionApi</h4>
+
+| Field Name | Type | Description |
+|------------|------|-------------|
+| `url` | AnyUrl | Endpoint which accepts OpenAI-compatible API requests. |
+| `headers` | Dict[str, str] | Headers used for calling the API endpoint. For example, it could include authentication headers. |
+| `params` | Dict[str, Any] | Model parameters. |
+| `timeout` | float | Timeout for the API request. |
+| `concurrency` | int | Maximum number of concurrent requests to the API. |
+| `prompt` | str | Prompt used when calling the vision-language model. |
+
+<h4>PictureDescriptionLocal</h4>
+
+| Field Name | Type | Description |
+|------------|------|-------------|
+| `repo_id` | str | Repository id from the Hugging Face Hub. |
+| `prompt` | str | Prompt used when calling the vision-language model. |
+| `generation_config` | Dict[str, Any] | Generation options, as defined by the transformers `GenerationConfig`: https://huggingface.co/docs/transformers/en/main_classes/text_generation#transformers.GenerationConfig |
+
 <!-- end: parameters-docs -->

 ### Authentication
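
Reviewer note: the tables added in this hunk describe nested option objects, and the `vlm_pipeline_model*` rows state that the three model fields are mutually exclusive. The following is a minimal client-side sketch of how such an options payload might be assembled and checked before sending. It uses only the field names from the tables. The helper `check_vlm_options`, the endpoint URL, the model name, and all values are hypothetical, and which fields are required or how the server itself validates them is not stated in the diff.

```python
import json
from typing import Any, Dict

# The three ways of selecting a model for the `vlm` pipeline that the
# options table describes as mutually exclusive.
VLM_MODEL_FIELDS = (
    "vlm_pipeline_model",
    "vlm_pipeline_model_local",
    "vlm_pipeline_model_api",
)


def check_vlm_options(options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: raise ValueError if more than one VLM model field is set."""
    chosen = [name for name in VLM_MODEL_FIELDS if options.get(name) is not None]
    if len(chosen) > 1:
        raise ValueError(f"mutually exclusive options set together: {chosen}")
    return options


# Field names follow the VlmModelApi table; every value is an assumption.
options = check_vlm_options({
    "vlm_pipeline_model_api": {
        "url": "http://localhost:8000/v1/chat/completions",  # hypothetical endpoint
        "headers": {"Authorization": "Bearer <token>"},      # placeholder credential
        "params": {"model": "example-model"},                # hypothetical model name
        "timeout": 60.0,
        "concurrency": 2,
        "prompt": "Convert this page to markdown.",
        "scale": 2.0,
        "temperature": 0.0,
    },
})

# Serialize the options as they might appear in a request body.
print(json.dumps(options, indent=2))
```

Checking the exclusivity on the client is a convenience only; the service remains the authority on which combinations it accepts.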