Quentin Fuxa
b102e12943
M5 benchmark figures: WER vs RTF scatter, 0.6B+1.7B MLX results
2026-03-15 15:00:00 +01:00
Quentin Fuxa
7aa3b764bd
MLX benchmark: 1.7B SimulStreaming on M5 (WER 4.07%, RTF 0.944)
...
LibriSpeech test-clean, 500 utterances.
1.7B is borderline real-time on M5 (RTF 0.944).
0.6B (3.30% WER, 0.263 RTF) is the practical choice for MacBook.
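
Note on reading the RTF figures above: real-time factor is conventionally processing time divided by audio duration, so values below 1.0 keep up with live audio. The sketch below is illustrative only; the function names are hypothetical and are not taken from benchmark_mlx_simul.py.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent transcribing / duration of the audio transcribed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def headroom(rtf: float) -> float:
    """Fraction of wall-clock time left idle; negative means falling behind."""
    return 1.0 - rtf


# At the reported RTFs, 10 s of audio would take roughly 9.44 s (1.7B)
# and 2.63 s (0.6B) to process.
rtf_1p7b = real_time_factor(9.44, 10.0)
rtf_0p6b = real_time_factor(2.63, 10.0)
```

At RTF 0.944 only about 5.6% of wall-clock time is spare, which is why the 1.7B model is called borderline, while RTF 0.263 leaves about 74% spare.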
2026-03-15 14:00:00 +01:00
Quentin Fuxa
a422e604ae
MLX benchmark: 0.6B SimulStreaming on M5 MacBook (WER 3.30%, RTF 0.263)
...
LibriSpeech test-clean, 500 utterances, per-utterance simul-streaming.
AlignAtt border detection with 20 alignment heads.
Platform: Apple M5 32GB (MLX fp16).
benchmark_mlx_simul.py: reusable benchmark script for MLX backends.
2026-03-15 13:00:00 +01:00
Quentin Fuxa
e14b913807
Merge branch 'benchmarks-h100'
2026-03-15 12:00:00 +01:00
Quentin Fuxa
47d4cbeecc
reorganize benchmarks: move H100 results to benchmarks/h100/
2026-03-15 23:59:00 +01:00
Quentin Fuxa
f75dfb386d
final benchmark: Voxtral vLLM realtime streaming
2026-03-15 23:59:00 +01:00
Quentin Fuxa
276ba84d02
update figures with Voxtral vLLM results
2026-03-15 23:55:00 +01:00
Quentin Fuxa
36b3885cf2
add Voxtral 4B to benchmark figures
2026-03-15 23:30:00 +01:00
Quentin Fuxa
a29e799ba5
update H100 benchmark figures with ACL6060 results
2026-03-15 22:30:00 +01:00
Quentin Fuxa
22325ba326
tune simul-kv: 2s inference interval, configurable min_new_seconds
2026-03-15 21:30:00 +01:00
Quentin Fuxa
a540a5fd10
fix simul-kv audio trim bug, add 1.7B v2 alignment heads
2026-03-15 20:45:00 +01:00
Quentin Fuxa
7b08ea74ab
add H100 benchmark figures
2026-03-15 19:15:00 +01:00
Quentin Fuxa
b69eaf82be
qwen3 simul+kv: optimized streaming with kv cache reuse
2026-03-15 18:30:00 +01:00
Quentin Fuxa
3b7a2fcc87
Add Qwen3-ASR MLX SimulStreaming backend
...
New backend 'qwen3-mlx-simul' for Apple Silicon: AlignAtt border
detection via monkey-patched cross-attention on MLX Qwen3-ASR.
Supports 0.6B (RTF 0.236 on M5) and 1.7B models.
- qwen3_mlx_simul.py: full streaming implementation with KV cache,
  alignment head attention extraction, border-distance policy
- core.py: register new backend in TranscriptionEngine + online_factory
- parse_args.py: add qwen3-mlx-simul to CLI choices
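
For readers unfamiliar with the border-distance policy, here is a minimal sketch of the AlignAtt idea as it is generally described: a token is emitted only if the audio frame it attends to most strongly is not among the last few frames received. This is an assumption-laden illustration, not the code in qwen3_mlx_simul.py; the function names, the plain-list attention format, and the border_frames parameter are all hypothetical.

```python
from typing import List, Sequence


def aligned_frame(head_rows: Sequence[Sequence[float]]) -> int:
    """Average attention over the alignment heads and return the argmax frame."""
    num_frames = len(head_rows[0])
    mean = [sum(row[i] for row in head_rows) / len(head_rows)
            for i in range(num_frames)]
    return max(range(num_frames), key=mean.__getitem__)


def tokens_to_emit(per_token_attn: List[Sequence[Sequence[float]]],
                   border_frames: int) -> int:
    """Count leading tokens whose aligned frame lies outside the border zone.

    per_token_attn[t][h][f] is the attention of token t, head h, on frame f.
    Emission stops at the first token aligned to the last `border_frames`
    frames, since that audio may still be incomplete.
    """
    emitted = 0
    for head_rows in per_token_attn:
        num_frames = len(head_rows[0])
        if aligned_frame(head_rows) >= num_frames - border_frames:
            break
        emitted += 1
    return emitted


# Two tokens, two heads, six frames: the first token attends to frame 1,
# the second to frame 5 (the newest audio).
example_attn = [
    [[0.1, 0.6, 0.1, 0.1, 0.05, 0.05], [0.2, 0.5, 0.1, 0.1, 0.05, 0.05]],
    [[0.05, 0.05, 0.1, 0.1, 0.1, 0.6], [0.05, 0.05, 0.1, 0.1, 0.2, 0.5]],
]
```

With a two-frame border the second token is held back until more audio arrives; with no border both tokens are emitted.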
2026-03-15 11:00:00 +01:00
Quentin Fuxa
ed503be140
qwen
2026-01-02 23:52:00 +01:00
Quentin Fuxa
a6a85431f6
update benchmark with qwen3 which reuses kv cache
2026-03-15 22:32:01 +01:00
Quentin Fuxa
dd48997674
qwen3: reuse encoder kv cache
2026-03-15 22:31:39 +01:00
Quentin Fuxa
f24481dc29
update archi
2026-03-15 11:36:45 +01:00
Quentin Fuxa
ed76f40ee5
Merge branch 'main' of https://github.com/QuentinFuxa/WhisperLiveKit
2026-03-15 11:16:38 +01:00
Quentin Fuxa
5330b3fac5
update benchmark part
2026-03-15 11:16:26 +01:00
Quentin Fuxa
0c73a73aa3
update benchmark results and procedure
2026-03-15 11:16:15 +01:00
Quentin Fuxa
2d6bc4f572
Add '*.c' to .dockerignore
2026-03-14 00:18:10 +01:00
Quentin Fuxa
dfd5bf417c
voxtral mlx: improved chunking
2026-03-14 00:13:29 +01:00
Quentin Fuxa
9d8db7ab38
add qwen3 simul in tests
2026-03-14 00:13:09 +01:00
Quentin Fuxa
fa15115163
qwen3 alignment heads
2026-03-14 00:12:50 +01:00
Quentin Fuxa
8dc7b77071
Bump version to 0.2.20
v0.2.20
2026-03-08 16:02:00 +01:00
Quentin Fuxa
10d85ff65f
Update docs, CI, and architecture diagram
2026-03-08 15:14:00 +01:00
Quentin Fuxa
e7e3441ca4
Add Qwen3 ASR backend
2026-03-07 11:48:00 +01:00
Quentin Fuxa
9abe26a996
Add CLI with serve, transcribe, listen, pull, diagnose
2026-03-01 13:37:00 +01:00
Quentin Fuxa
c8e7c216ed
Replace mock tests with real pipeline tests
2026-02-28 10:05:00 +01:00
Quentin Fuxa
586540ae36
Add test harness and test client
2026-02-22 16:19:00 +01:00
Quentin Fuxa
cd8df8e1aa
Update package setup and exports
2026-02-21 11:33:00 +01:00
Quentin Fuxa
e30f9a2573
Improve diarization backends
2026-02-15 14:55:00 +01:00
Quentin Fuxa
32de7b1276
Fix frontend buffer rendering for slow backends
2026-02-14 09:28:00 +01:00
Quentin Fuxa
9ac7c26a0b
Add OpenAI REST API and Deepgram WebSocket
2026-02-08 15:42:00 +01:00
Quentin Fuxa
c0e2600993
Add snapshot-then-diff WebSocket protocol
2026-02-07 10:17:00 +01:00
Quentin Fuxa
e0db3a98f9
Add per-session language proxy
2026-02-01 17:03:00 +01:00
Quentin Fuxa
2fe34427ef
Fix voxtral streaming drain and silence flush
2026-01-31 11:12:00 +01:00
Quentin Fuxa
d58365421f
Refactor audio processor async pipeline
2026-01-25 13:48:00 +01:00
Quentin Fuxa
a282cbe75f
Improve tokens alignment and silence handling
2026-01-24 10:55:00 +01:00
Quentin Fuxa
6e85c16614
Refactor TranscriptionEngine singleton
2026-01-18 15:27:00 +01:00
Quentin Fuxa
e1823dd99c
Improve online ASR processor
2026-01-17 09:35:00 +01:00
Quentin Fuxa
e144abbbc7
Refactor timed objects and data structures
2026-01-11 16:08:00 +01:00
Quentin Fuxa
83362c89c4
Clean up config and model paths
2026-01-10 11:42:00 +01:00
Quentin Fuxa
74c4dc791d
Lint scripts and tests
2026-01-04 14:15:00 +01:00
Quentin Fuxa
cf6c49f502
Ruff lint cleanup
2026-01-03 10:23:00 +01:00
Quentin Fuxa
451535d48f
Fix ctranslate2 encoder conversion (#345) and memory leak in TokensAlignment (#344)
...
- Add fallback chain for StorageView to numpy conversion
- Prune old tokens/segments after 5min to bound memory
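
The pruning idea can be sketched as an age-based cut on a time-ordered buffer. This is an illustration under assumptions, not the TokensAlignment code: TimedToken, prune_old, and the 300-second default (5 minutes) are hypothetical names chosen for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


@dataclass
class TimedToken:
    end: float  # end time of the token in seconds of audio
    text: str


def prune_old(tokens: Deque[TimedToken], now: float,
              max_age_seconds: float = 300.0) -> int:
    """Drop tokens that ended more than max_age_seconds before `now`.

    Assumes tokens are appended in chronological order, so only the left
    end of the deque needs to be checked. Returns the number removed.
    """
    removed = 0
    while tokens and now - tokens[0].end > max_age_seconds:
        tokens.popleft()
        removed += 1
    return removed


buffer: Deque[TimedToken] = deque([
    TimedToken(0.0, "a"),
    TimedToken(100.0, "b"),
    TimedToken(400.0, "c"),
])
removed_count = prune_old(buffer, now=410.0)
```

Because old entries are removed from the front as new ones arrive, the buffer holds at most about five minutes of tokens regardless of session length.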
2026-03-10 22:37:00 +01:00
Quentin Fuxa
8bc0937c46
Update README section on powered research
2026-03-06 18:46:07 +01:00
Quentin Fuxa
929cf7a26b
add link to AlignAtt interactive playground
2026-03-06 18:43:25 +01:00
Quentin Fuxa
abfaf06203
Merge branch 'main' of https://github.com/QuentinFuxa/WhisperLiveKit
2026-03-04 18:17:23 +01:00