jedzill4
c56a53fbf4
deps(mlx-groups): add optional dependencies for Apple Silicon MLX backends
2026-03-01 20:05:52 -03:00
jedzill4
4bb58dc7aa
deps(diart): improve diart dependency tree. rename gpu-cu129 dependency group to cu129
2026-02-25 20:27:26 -03:00
jedzill4
994ce21365
📌 chore(deps): pin dependencies to python 3.11 to 3.13 due to dependency resolution matrix
2026-02-25 14:21:19 -03:00
Quentin Fuxa
b1fc23807a
docs: add benchmark collaboration call, voxtral in powered-by section
2026-02-23 10:37:22 +01:00
Quentin Fuxa
10c4e5f730
docs: add speed vs accuracy scatter plot to benchmark and README
...
WER vs RTF scatter plot showing all backend/policy/model combos
on the 30s English file. Sweet spot zone highlights the best
tradeoffs. Added to both BENCHMARK.md and README.md.
2026-02-23 10:27:53 +01:00
Quentin Fuxa
4b2377c243
fix: correct false auto-detect claim, median bug, RTF inflation
...
- BENCHMARK.md: whisper also supports --language auto; voxtral is not
the only one. Fixed the mlx-whisper speed comparison (LA is actually
faster than SS for mlx-whisper, not comparable).
- metrics.py: median calculation was wrong for even-length lists
(took upper middle instead of averaging the two middle values).
- metrics_collector.py: RTF was inflated because log_summary() used
wall-clock elapsed time instead of sum of actual ASR call durations.
- README.md: clarified that whisper also supports auto language
detection, voxtral just does it better.
- Added 2 new median tests (even + odd length).
2026-02-22 23:38:04 +01:00
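The two numeric bugs above can be sketched in a few lines. This is a minimal illustration, not the repo's actual metrics.py/metrics_collector.py code; the function names and signatures here are assumptions for demonstration only.

```python
def median(values):
    """Correct median: averages the two middle values for even-length input.

    The buggy version effectively returned sorted(values)[len(values) // 2],
    i.e. the upper middle element, which is only correct for odd lengths.
    """
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2:                          # odd length: single middle element
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2   # even length: average the two middles


def rtf(asr_call_durations, audio_seconds):
    """Real-time factor from summed ASR call durations (hypothetical helper).

    Dividing wall-clock elapsed time by the audio duration counts idle time
    between calls and inflates RTF; only the time actually spent inside the
    ASR calls should be summed.
    """
    return sum(asr_call_durations) / audio_seconds


print(median([1, 2, 3, 4]))   # → 2.5 (buggy version would return 3)
print(median([1, 3, 5]))      # → 3
print(rtf([0.5, 0.5], 10.0))  # → 0.1
```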
Quentin Fuxa
9b2c3ee844
docs: update README with voxtral backend, benchmarks, testing sections
...
- Add Voxtral Backend section explaining voxtral-mlx and voxtral (HF).
- Add Testing & Benchmarks section with commands to run tests/benchmarks.
- Update --backend parameter docs to include voxtral-mlx and voxtral.
- Update optional dependencies table with Voxtral entry.
- Link to BENCHMARK.md for detailed performance comparisons.
2026-02-22 23:27:57 +01:00
Quentin Fuxa
3192553e20
fixes #307
2025-12-09 10:27:49 +01:00
Quentin Fuxa
cc5f819ce7
hf weights
2025-11-29 17:50:46 +01:00
Quentin Fuxa
82cd24bb75
LoRA path v0 - functional
2025-11-29 17:21:10 +01:00
Quentin Fuxa
34ddd2ac02
update doc
2025-11-25 23:20:00 +01:00
Quentin Fuxa
345d781e97
update doc
2025-11-25 23:20:00 +01:00
Quentin Fuxa
28cf831701
indicate context token limits for --max-context-tokens. bump to 0.2.16.dev0
2025-11-25 23:45:15 +01:00
Quentin Fuxa
60c62f8f84
troubleshooting #271 #276 #284 #286
2025-11-25 23:31:46 +01:00
Quentin Fuxa
7faa21f95f
alignatt: enable model sharing by removing hooks and centralizing session state. Solves #282
...
Co-authored-by: Emmanuel Schmidbauer <eschmidbauer@gmail.com>
2025-11-25 23:07:42 +01:00
Quentin Fuxa
fc9cb66813
disabling vac is not advised
2025-11-23 11:20:00 +01:00
Quentin Fuxa
6206fff118
0.2.15
2025-11-21 23:52:00 +01:00
Quentin Fuxa
bcffdbc6b3
bump to 0.2.14
2025-11-15 20:19:09 +01:00
Quentin Fuxa
80b77998f9
Refactor backend handling
2025-11-15 19:51:41 +01:00
Quentin Fuxa
d310f7e25f
hf compatibility
2025-11-15 18:34:19 +01:00
Quentin Fuxa
16461052ed
task to direct-english-translation
2025-11-10 13:20:26 +01:00
Quentin Fuxa
13401ffe24
whisper core at root of wlk
2025-11-10 12:17:18 +01:00
Quentin Fuxa
a732e0903e
Add a script to detect alignment heads, useful for distilled whisper
2025-11-09 18:12:09 +01:00
Quentin Fuxa
ece02db6a3
Use optional new separate NLLB package for translation
2025-10-30 19:36:28 +01:00
Quentin Fuxa
0c5365e7c6
fixes #258
2025-10-24 20:51:16 +02:00
Quentin Fuxa
818c9c37ca
README: path to doc for model file format
2025-10-23 20:34:36 +02:00
Quentin Fuxa
9c5bb5df19
README: dir to path
...
Co-authored-by: David Georg Reichelt <david.reichelt@uni-leipzig.de>
2025-10-23 20:31:12 +02:00
Quentin Fuxa
21bbb59e31
Merge pull request #250 from ladinu/patch-1
...
fix broken link
2025-10-15 08:59:02 +02:00
Ladinu Chandrasinghe
3467109668
fix broken link
2025-10-05 10:51:41 -07:00
Alvaro Ollero
5832d7433d
update documentation
2025-10-04 23:18:10 +02:00
Quentin Fuxa
8cbaeecc75
custom alignment heads parameter for custom models
2025-09-27 11:04:00 +02:00
Quentin Fuxa
9fc6654a4a
common frontend for web/ and chrome extension
2025-09-25 23:14:25 +02:00
Quentin Fuxa
777ec63a71
--pcm-input option information
2025-09-17 16:06:28 +02:00
Quentin Fuxa
0a6e5ae9c1
ffmpeg install instruction error indicates --pcm-input alternative
2025-09-17 16:04:17 +02:00
Quentin Fuxa
65025cc448
nllb backend can be transformers, and model size can be 1.3B
2025-09-17 10:20:31 +02:00
Quentin Fuxa
bbba1d9bb7
add nllb-backend and translation perf test in dev_notes
2025-09-16 20:45:01 +02:00
Quentin Fuxa
babe93b99a
to 0.2.9
2025-09-11 21:36:32 +02:00
Quentin Fuxa
a4e9f3cab7
support for raw PCM input option by @YeonjunNotFR
2025-09-11 21:32:11 +02:00
Quentin Fuxa
b06866877a
add --disable-punctuation-split option
2025-09-11 21:03:00 +02:00
Quentin Fuxa
2963e8a757
translate when at least 3 new tokens
2025-09-09 21:45:00 +02:00
notV3NOM
a178ed5c22
fix simulstreaming preload model count argument in cli
2025-09-06 18:18:09 +05:30
Quentin Fuxa
d1a9913c47
nllb v0
2025-09-05 18:02:42 +02:00
Quentin Fuxa
3bd2122eb4
0.2.8 : only the decoder of whisper is loaded in memory when a different encoder is used
2025-09-02 21:12:25 +02:00
Quentin Fuxa
4a71a391b8
rename get_web_interface_html to get_inline_ui_html for embedded web interface HTML
2025-08-30 13:44:06 +02:00
Quentin Fuxa
583a2ec2e4
highlight Sortformer optional installation
2025-08-27 21:02:25 +02:00
Quentin Fuxa
9895bc83bf
auto detection of language for warmup if not indicated
2025-08-27 20:37:48 +02:00
Quentin Fuxa
f9c9c4188a
optional dependencies removed; point users to direct alternative package installations
2025-08-27 18:15:32 +02:00
Quentin Fuxa
52a755a08c
indications on how to choose a model
2025-08-24 19:22:00 +02:00
Quentin Fuxa
5258305745
default diarization backend is now sortformer
2025-08-24 18:32:01 +02:00
Quentin Fuxa
58297daf6d
sortformer diar implementation v0.3
2025-08-24 18:32:01 +02:00