mlx/fasterWhisper encoders are loaded once and shared in simulstreaming

Quentin Fuxa
2025-08-31 12:33:19 +02:00
parent d467716e26
commit d5008ed828
6 changed files with 132 additions and 53 deletions

@@ -10,6 +10,12 @@ On macOS Apple Silicon M4 :
| FASTER_WHISPER | 0.4s | 1.20s |
| MLX_WHISPER | 0.07s | 0.20s |
Memory saved by loading only the encoder for the optimized framework:
Sizes for tiny.en, MLX Whisper:
Decoder weights: 59110771 bytes
Encoder weights: 15268874 bytes
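
To make the saving concrete, the arithmetic behind the two figures above can be sketched as follows. This is only an illustration: it assumes the saving equals the decoder weights that are no longer loaded, and that the weight sizes in bytes approximate the in-memory footprint. The `summarize` helper is hypothetical and not part of the repository.

```python
# Byte counts copied from the figures above (tiny.en, MLX Whisper).
DECODER_BYTES = 59_110_771
ENCODER_BYTES = 15_268_874

MIB = 2**20  # bytes per mebibyte


def summarize(encoder_bytes: int, decoder_bytes: int) -> dict:
    """Hypothetical helper: compare loading the full model vs. the encoder only.

    Assumption: skipping the decoder saves exactly its weight size.
    """
    total = encoder_bytes + decoder_bytes
    return {
        "total_bytes": total,
        "loaded_bytes": encoder_bytes,
        "saved_bytes": decoder_bytes,
        "saved_fraction": decoder_bytes / total,
    }


summary = summarize(ENCODER_BYTES, DECODER_BYTES)
print(f"full model:   {summary['total_bytes'] / MIB:.2f} MiB")
print(f"encoder only: {summary['loaded_bytes'] / MIB:.2f} MiB")
print(f"saved:        {summary['saved_bytes'] / MIB:.2f} MiB "
      f"({summary['saved_fraction']:.1%} of the weights)")
```

Under these assumptions, loading only the encoder skips roughly four fifths of the tiny.en MLX weights.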