diff --git a/docling_serve/gradio_ui.py b/docling_serve/gradio_ui.py
index 0414de0..8bdd526 100644
--- a/docling_serve/gradio_ui.py
+++ b/docling_serve/gradio_ui.py
@@ -480,7 +480,7 @@ with gr.Blocks(
     css=css,
     theme=theme,
     title="Docling Serve",
-    delete_cache=(3600, 3600),  # Delete all files older than 1 hour every hour
+    delete_cache=(3600, 36000),  # Every hour, delete all files older than 10 hours
 ) as ui:
     # Constants stored in states to be able to pass them as inputs to functions
     processing_text = gr.State("Processing your document(s), please wait...")
diff --git a/docs/configuration.md b/docs/configuration.md
index 7f1d6f6..61cd162 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -80,3 +80,10 @@ The following table describes the options to configure the Docling Serve KFP eng
 | `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_ENDPOINT` | | If set, it enables internal callbacks providing status update of the KFP job. Usually something like `https://NAME.NAMESPACE.svc.cluster.local:5001/v1/callback/task/progress`. |
 | `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_TOKEN_PATH` | | The token used for authenticating the progress callback. For cluster-internal workloads, use `/run/secrets/kubernetes.io/serviceaccount/token`. |
 | `DOCLING_SERVE_ENG_KFP_SELF_CALLBACK_CA_CERT_PATH` | | The CA certificate for the progress callback. For cluster-inetrnal workloads, use `/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt`. |
+
+#### Gradio UI
+
+When the Gradio UI is used with the option to output the conversion as a file, Gradio stores the files in its cache to prevent them from being overwritten ([more info here](https://www.gradio.app/guides/file-access#the-gradio-cache)). The cache is cleaned every hour, and files older than 10 hours are deleted. If files need to remain available for download from the UI for longer than 10 hours, there are two options:
+
+- Increase the maximum file age [here](https://github.com/docling-project/docling-serve/blob/main/docling_serve/gradio_ui.py#L483) to the desired value;
+- Or manage the cleanup manually by pointing Gradio's temporary directory at the same absolute path as `DOCLING_SERVE_SCRATCH_PATH`. To do this, set the environment variable `GRADIO_TEMP_DIR`, either on the command line with `export GRADIO_TEMP_DIR=""` or in a `Dockerfile` with `ENV GRADIO_TEMP_DIR=""`. Then set the cache cleanup (`delete_cache`) to `None` [here](https://github.com/docling-project/docling-serve/blob/main/docling_serve/gradio_ui.py#L483). The cleanup of `DOCLING_SERVE_SCRATCH_PATH` will then also clean the Gradio temporary directory. (If you use this option and later revert the changes, remember to remove the `GRADIO_TEMP_DIR` environment variable; otherwise, files may not be available for download.)
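
The two options described in the documentation change can be sketched in Python. This is a minimal illustration, not code from the PR: the helper names `delete_cache_policy` and `share_scratch_dir_with_gradio` are hypothetical, and the sketch assumes that `delete_cache` takes a `(frequency, age)` tuple in seconds, as the values in the diff suggest, and that `GRADIO_TEMP_DIR` has to be set before Gradio starts.

```python
import os
from typing import MutableMapping, Optional, Tuple

SECONDS_PER_HOUR = 3600


def delete_cache_policy(frequency_hours: float, max_age_hours: float) -> Tuple[int, int]:
    """Build a (frequency, age) tuple in seconds (hypothetical helper).

    Option 1: keep Gradio's own cleanup, but raise the maximum file age.
    """
    if frequency_hours <= 0 or max_age_hours <= 0:
        raise ValueError("frequency_hours and max_age_hours must be positive")
    return (
        int(frequency_hours * SECONDS_PER_HOUR),
        int(max_age_hours * SECONDS_PER_HOUR),
    )


def share_scratch_dir_with_gradio(env: MutableMapping[str, str]) -> Optional[str]:
    """Point GRADIO_TEMP_DIR at the scratch path (hypothetical helper).

    Option 2: let the cleanup of DOCLING_SERVE_SCRATCH_PATH also cover the
    Gradio temporary files. Returns the absolute path that was set, or None
    when DOCLING_SERVE_SCRATCH_PATH is not defined.
    """
    scratch = env.get("DOCLING_SERVE_SCRATCH_PATH")
    if not scratch:
        return None
    scratch = os.path.abspath(scratch)
    env["GRADIO_TEMP_DIR"] = scratch
    return scratch


# Option 1: clean every hour, keep files for 10 hours (the values in this diff).
policy = delete_cache_policy(frequency_hours=1, max_age_hours=10)

# Option 2: share the scratch directory; Gradio's own cleanup would then be
# disabled by passing delete_cache=None where the UI is built.
shared_dir = share_scratch_dir_with_gradio(os.environ)
```

With option 1, `policy` would be passed as the `delete_cache` argument of `gr.Blocks(...)`. With option 2, `delete_cache=None` would be passed instead, and the environment variable is assumed to be in place before the UI is created.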