WhisperLiveKit

LLM/WhisperLiveKit

mirror of https://github.com/QuentinFuxa/WhisperLiveKit.git synced 2026-04-28 09:30:05 +00:00

Commit Graph

Author

SHA1

Message

Date

Quentin Fuxa

4b2377c243

fix: correct false auto-detect claim, median bug, RTF inflation

- BENCHMARK.md: whisper also supports --language auto, voxtral is not
  the only one. Fixed mlx-whisper speed comparison (LA is actually
  faster than SS for mlx-whisper, not comparable).
- metrics.py: median calculation was wrong for even-length lists
  (took upper middle instead of averaging the two middle values).
- metrics_collector.py: RTF was inflated because log_summary() used
  wall-clock elapsed time instead of sum of actual ASR call durations.
- README.md: clarified that whisper also supports auto language
  detection, voxtral just does it better.
- Added 2 new median tests (even + odd length).

2026-02-22 23:38:04 +01:00
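
To illustrate the two metric fixes described in this commit, here is a minimal sketch. It is not the project's code: the function names `median` and `real_time_factor` and their signatures are hypothetical, and using the processed audio duration as the RTF denominator is an assumption (the commit message only states that the numerator changed from wall-clock elapsed time to the sum of ASR call durations).

```python
def median(values):
    """Median that averages the two middle values for even-length input."""
    if not values:
        raise ValueError("median of an empty sequence is undefined")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd length: a single middle element exists.
        return ordered[mid]
    # Even length: average the two middle elements rather than
    # returning only the upper one (ordered[mid]).
    return (ordered[mid - 1] + ordered[mid]) / 2


def real_time_factor(asr_call_durations, audio_duration):
    """RTF from time spent inside ASR calls (assumed: divided by audio seconds)."""
    if audio_duration <= 0:
        raise ValueError("audio_duration must be positive")
    # Summing per-call durations excludes idle time between calls,
    # which wall-clock elapsed time would include.
    return sum(asr_call_durations) / audio_duration
```

With wall-clock time in the numerator, a session that spends most of its time waiting for audio would report a high RTF even when each ASR call is fast; summing the call durations avoids that.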

Quentin Fuxa

f5eee67b11

fix: silence double-counting bug, add metrics module and runtime instrumentation

- Fix _begin_silence pushing same object reference as _end_silence,
  causing the consumer to process two ended events and double the
  silence duration.
- Fix initial silence never cleared when VAC is disabled, causing
  the no-VAC path to enqueue zero audio.
- Add sample-precise silence boundaries (at_sample parameter).
- Add whisperlivekit/metrics.py with WER computation (word-level
  Levenshtein) and timestamp accuracy (greedy alignment). No
  external dependencies.
- Add whisperlivekit/metrics_collector.py with SessionMetrics
  dataclass for per-session runtime observability. Instrumented
  at 6 points in AudioProcessor: init, process_audio,
  transcription_processor, _end_silence, results_formatter, cleanup.
  Emits SESSION_METRICS structured log line on session end.

2026-02-22 23:27:12 +01:00
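
The WER computation mentioned in this commit (word-level Levenshtein, no external dependencies) can be sketched as follows. This is an illustrative version under stated assumptions, not the contents of whisperlivekit/metrics.py: the name `word_error_rate`, whitespace tokenization, and the absence of text normalization are all hypothetical choices.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the result can exceed 1.0 when the hypothesis is much longer than the reference.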

2 Commits