diff --git a/Dockerfile b/Dockerfile
index 534f99d..efd3b82 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -37,9 +37,10 @@ RUN pip3 install --upgrade pip setuptools wheel && \
 COPY . .
 
 # Install WhisperLiveKit directly, allowing for optional dependencies
+# Example: --build-arg EXTRAS="translation"
 RUN if [ -n "$EXTRAS" ]; then \
         echo "Installing with extras: [$EXTRAS]"; \
-        pip install --no-cache-dir whisperlivekit[$EXTRAS]; \
+        pip install --no-cache-dir "whisperlivekit[$EXTRAS]"; \
     else \
         echo "Installing base package only"; \
         pip install --no-cache-dir whisperlivekit; \
diff --git a/README.md b/README.md
index aa295c8..948f015 100644
--- a/README.md
+++ b/README.md
@@ -147,8 +147,8 @@ async def websocket_endpoint(websocket: WebSocket):
 |-----------|-------------|---------|
 | `--model` | Whisper model size. List and recommandations [here](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/docs/default_and_custom_models.md) | `small` |
 | `--model-path` | Local .pt file/directory **or** Hugging Face repo ID containing the Whisper model. Overrides `--model`. Recommandations [here](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/docs/default_and_custom_models.md) | `None` |
-| `--language` | List [here](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/whisperlivekit/whisper/tokenizer.py). If you use `auto`, the model attempts to detect the language automatically, but it tends to bias towards English. | `auto` |
-| `--target-language` | If sets, translates using [NLLW](https://github.com/QuentinFuxa/NoLanguageLeftWaiting). [200 languages available](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/docs/supported_languages.md). If you want to translate to english, you can also use `--direct-english-translation`. The STT model will try to directly output the translation. | `None` |
+| `--language` | List [here](docs/supported_languages.md). If you use `auto`, the model attempts to detect the language automatically, but it tends to bias towards English. | `auto` |
+| `--target-language` | If set, translates using [NLLW](https://github.com/QuentinFuxa/NoLanguageLeftWaiting). [200 languages available](docs/supported_languages.md). If you want to translate to English, you can also use `--direct-english-translation`. The STT model will then try to output the translation directly. | `None` |
 | `--diarization` | Enable speaker identification | `False` |
 | `--backend-policy` | Streaming strategy: `1`/`simulstreaming` uses AlignAtt SimulStreaming, `2`/`localagreement` uses the LocalAgreement policy | `simulstreaming` |
 | `--backend` | Whisper implementation selector. `auto` picks MLX on macOS (if installed), otherwise Faster-Whisper, otherwise vanilla Whisper. You can also force `mlx-whisper`, `faster-whisper`, `whisper`, or `openai-api` (LocalAgreement only) | `auto` |
@@ -267,7 +267,7 @@ docker run --gpus all -p 8000:8000 --name wlk wlk --model large-v3 --language fr
 
 #### Customization
 - `--build-arg` Options:
-  - `EXTRAS="whisper-timestamped"` - Add extras to the image's installation (no spaces). Remember to set necessary container options!
+  - `EXTRAS="translation"` - Add extras to the image's installation (no spaces). Remember to set necessary container options!
   - `HF_PRECACHE_DIR="./.cache/"` - Pre-load a model cache for faster first-time start
   - `HF_TKN_FILE="./token"` - Add your Hugging Face Hub access token to download gated models
 
diff --git a/pyproject.toml b/pyproject.toml
index 0ad75ad..f3ce5e4 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -35,6 +35,7 @@ dependencies = [
     "torchaudio>=2.0.0",
     "torch>=2.0.0",
     "huggingface-hub>=0.25.0",
+    "faster-whisper>=1.2.0",
     "tqdm",
     "tiktoken",
     'triton>=2.0.0; platform_machine == "x86_64" and (sys_platform == "linux" or sys_platform == "linux2")'
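
A note on the Dockerfile quoting change above: bracket expressions like `[$EXTRAS]` are valid shell glob patterns, so an unquoted `whisperlivekit[$EXTRAS]` is subject to pathname expansion (zsh, for one, aborts with "no matches found" when no file matches). Double quotes suppress globbing while still expanding `$EXTRAS`, so pip receives the extras spec verbatim. A minimal POSIX-shell sketch of the quoted form (the `spec` variable is illustrative, not part of the Dockerfile):

```shell
#!/bin/sh
# Double quotes block glob expansion but still substitute $EXTRAS,
# so the exact extras spec reaches pip unchanged.
EXTRAS="translation"
spec="whisperlivekit[$EXTRAS]"
echo "$spec"    # prints: whisperlivekit[translation]
```

Per the README's "no spaces" note, multiple extras would go comma-separated inside the same quoted value passed via `--build-arg EXTRAS=...`.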