Pre-loading models for docling
This document provides examples for pre-loading docling models to a persistent volume and re-using it for docling-serve deployments.
- First, create a persistent volume claim that will store the model weights:

  ```yaml
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: docling-model-cache-pvc
  spec:
    accessModes:
      - ReadWriteOnce
    volumeMode: Filesystem
    resources:
      requests:
        storage: 10Gi
  ```

  If you don't want to use the default storage class, set a custom storage class as follows:

  ```yaml
  spec:
    ...
    storageClassName: <Storage Class Name>
  ```

  Manifest example: docling-model-cache-pvc.yaml
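  If the docling-serve deployment is expected to run replicas on more than one node, `ReadWriteOnce` may be too restrictive, because it only allows the volume to be mounted read-write by a single node. A minimal sketch of a variant, assuming your storage class supports the `ReadWriteMany` access mode (not every provisioner does):

  ```yaml
  # Sketch only: same claim as above, with a different access mode.
  # Assumption: the cluster's storage class can provision ReadWriteMany volumes.
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: docling-model-cache-pvc
  spec:
    accessModes:
      - ReadWriteMany   # lets pods on several nodes mount the volume
    volumeMode: Filesystem
    resources:
      requests:
        storage: 10Gi
  ```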
- To load the model weights, we can use docling-tools to download them. Since this is a one-time operation, a Kubernetes Job is a good fit:

  ```yaml
  apiVersion: batch/v1
  kind: Job
  metadata:
    name: docling-model-cache-load
  spec:
    selector: {}
    template:
      metadata:
        name: docling-model-load
      spec:
        containers:
          - name: loader
            image: ghcr.io/docling-project/docling-serve-cpu:main
            command:
              - docling-tools
              - models
              - download
              - '--output-dir=/modelcache'
              - 'layout'
              - 'tableformer'
              - 'code_formula'
              - 'picture_classifier'
              - 'smolvlm'
              - 'granite_vision'
              - 'easyocr'
            volumeMounts:
              - name: docling-model-cache
                mountPath: /modelcache
        volumes:
          - name: docling-model-cache
            persistentVolumeClaim:
              claimName: docling-model-cache-pvc
        restartPolicy: Never
  ```

  The job mounts the previously created persistent volume and executes a command similar to the one we would use to load models locally:

  ```sh
  docling-tools models download --output-dir <MOUNT-PATH> [LIST_OF_MODELS]
  ```

  In the manifest, we specify the desired models individually. Alternatively, we can use the `--all` parameter to download all models.

  Manifest example: docling-model-cache-job.yaml
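  For illustration, the container section of the job could look roughly like this when `--all` replaces the explicit model list. This is a sketch of the fragment only; the rest of the Job manifest is assumed to stay unchanged, and downloading every model may need more than the 10Gi requested in the claim:

  ```yaml
  # Fragment of the Job's pod spec (spec.template.spec), not a full manifest.
  containers:
    - name: loader
      image: ghcr.io/docling-project/docling-serve-cpu:main
      command:
        - docling-tools
        - models
        - download
        - '--output-dir=/modelcache'
        - '--all'   # download all models instead of listing them one by one
      volumeMounts:
        - name: docling-model-cache
          mountPath: /modelcache
  ```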
- Now we can mount the volume in the docling-serve deployment and set the env variable `DOCLING_SERVE_ARTIFACTS_PATH` to point to it. The following additions to the deployment should be made:

  ```yaml
  spec:
    template:
      spec:
        containers:
          - name: api
            env:
              ...
              - name: DOCLING_SERVE_ARTIFACTS_PATH
                value: '/modelcache'
            volumeMounts:
              - name: docling-model-cache
                mountPath: /modelcache
        ...
        volumes:
          - name: docling-model-cache
            persistentVolumeClaim:
              claimName: docling-model-cache-pvc
  ```

  Make sure that the value of `DOCLING_SERVE_ARTIFACTS_PATH` is the same as the path where the models were downloaded and where the volume is mounted.

  Now, when docling-serve executes tasks, the underlying docling installation will load the model weights from the mounted volume.

  Manifest example: docling-model-cache-deployment.yaml
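  To make the path requirement concrete, here is a hedged sketch of the same deployment additions with a different mount point. The path `/opt/docling-models` is a hypothetical choice for illustration. It assumes the job wrote the models to the root of the volume, as in the job manifest above where `--output-dir` equals the job's `mountPath`:

  ```yaml
  # Sketch only: /opt/docling-models is a made-up path, not a docling default.
  spec:
    template:
      spec:
        containers:
          - name: api
            env:
              - name: DOCLING_SERVE_ARTIFACTS_PATH
                value: '/opt/docling-models'      # must match mountPath below
            volumeMounts:
              - name: docling-model-cache
                mountPath: /opt/docling-models    # volume root, where the job stored the models
        volumes:
          - name: docling-model-cache
            persistentVolumeClaim:
              claimName: docling-model-cache-pvc
  ```

  The mount path inside the deployment does not have to equal the one used by the job; what matters is that the environment variable resolves to the directory on the volume that contains the downloaded models.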