From 196c5ce42a04d77234a4212c3d9b9772d2c2073e Mon Sep 17 00:00:00 2001
From: Ryan Fernandes
Date: Tue, 17 Jun 2025 10:53:51 -0400
Subject: [PATCH] fix: "tesserocr" instead of "tesseract_cli" in usage docs (#223)

Signed-off-by: Ryan Fernandes
---
 docs/usage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/usage.md b/docs/usage.md
index 3bc6c45..7a78653 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -13,7 +13,7 @@ On top of the source of file (see below), both endpoints support the same parame
 - `do_ocr` (bool): If enabled, the bitmap content will be processed using OCR. Defaults to `True`.
 - `image_export_mode`: Image export mode for the document (only in case of JSON, Markdown or HTML). Allowed values: embedded, placeholder, referenced. Optional, defaults to `embedded`.
 - `force_ocr` (bool): If enabled, replace any existing text with OCR-generated text over the full content. Defaults to `False`.
-- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesseract_cli`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`.
+- `ocr_engine` (str): OCR engine to use. Allowed values: `easyocr`, `tesserocr`, `tesseract`, `rapidocr`, `ocrmac`. Defaults to `easyocr`. To use the `tesserocr` engine, `tesserocr` must be installed where docling-serve is running: `pip install tesserocr`
 - `ocr_lang` (List[str]): List of languages used by the OCR engine. Note that each OCR engine has different values for the language names. Defaults to empty.
 - `pdf_backend` (str): PDF backend to use. Allowed values: `pypdfium2`, `dlparse_v1`, `dlparse_v2`, `dlparse_v4`. Defaults to `dlparse_v4`.
 - `table_mode` (str): Table mode to use. Allowed values: `fast`, `accurate`. Defaults to `fast`.
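
For reviewers who want to sanity-check the corrected value list, here is a minimal sketch of how a client might validate `ocr_engine` before sending a request. The helper names `build_ocr_options` and `tesserocr_available` are hypothetical and not part of docling-serve; the allowed values and defaults are copied from the updated docs line in this patch, and the shape of the returned dict is an assumption for illustration only.

```python
import importlib.util

# Allowed values as listed in the updated docs line of this patch.
ALLOWED_OCR_ENGINES = ("easyocr", "tesserocr", "tesseract", "rapidocr", "ocrmac")


def build_ocr_options(ocr_engine="easyocr", do_ocr=True, force_ocr=False, ocr_lang=None):
    """Hypothetical helper: validate and collect the OCR-related parameters.

    Defaults mirror the documented ones (easyocr, True, False, empty list).
    """
    if ocr_engine not in ALLOWED_OCR_ENGINES:
        allowed = ", ".join(ALLOWED_OCR_ENGINES)
        raise ValueError(f"unknown ocr_engine {ocr_engine!r}; allowed values: {allowed}")
    return {
        "do_ocr": do_ocr,
        "force_ocr": force_ocr,
        "ocr_engine": ocr_engine,
        "ocr_lang": list(ocr_lang or []),
    }


def tesserocr_available():
    """Return True if the tesserocr package is importable in this environment.

    Only meaningful when run on the host where docling-serve itself runs,
    since that is where the docs say tesserocr must be installed.
    """
    return importlib.util.find_spec("tesserocr") is not None
```

With this sketch, `build_ocr_options(ocr_engine="tesseract_cli")` raises `ValueError`, which is the mistake the old docs line would have led users into, while `build_ocr_options(ocr_engine="tesserocr")` passes validation.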