Quentin Fuxa
f5eee67b11
fix: silence double-counting bug, add metrics module and runtime instrumentation
- Fix _begin_silence pushing same object reference as _end_silence,
  causing the consumer to process two ended events and double the
  silence duration.
- Fix initial silence never cleared when VAC is disabled, causing
  the no-VAC path to enqueue zero audio.
- Add sample-precise silence boundaries (at_sample parameter).
- Add whisperlivekit/metrics.py with WER computation (word-level
  Levenshtein) and timestamp accuracy (greedy alignment). No
  external dependencies.
- Add whisperlivekit/metrics_collector.py with SessionMetrics
  dataclass for per-session runtime observability. Instrumented
  at 6 points in AudioProcessor: init, process_audio,
  transcription_processor, _end_silence, results_formatter, cleanup.
  Emits SESSION_METRICS structured log line on session end.
2026-02-22 23:27:12 +01:00
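Note on the WER metric mentioned in the commit above: word error rate via word-level Levenshtein distance can be sketched as follows. This is a minimal illustration only, not the contents of whisperlivekit/metrics.py. The function name `word_error_rate`, whitespace tokenization, and the choice to reject an empty reference are all assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper: tokenizes on whitespace, with no normalization
    of case or punctuation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        # WER is undefined for an empty reference; this sketch refuses it.
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" gives one substitution and one deletion over six reference words, so the result is 2/6.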
Quentin Fuxa
4c7706e2cf
fix: use vac_chunk_size for audio processing interval when VAC is enabled (#334)
2026-02-20 20:48:06 +01:00
Anton Jacobson
3db5d81a20
update diarization lag after stream analysed
2025-12-18 14:13:28 +01:00
Quentin Fuxa
2431a6bf91
isolated VAD states per user: .onnx: share a stateless model. .jit: require duplicating the model.
Co-authored-by: eschmidbauer <eschmidbauer@gmail.com>
2025-12-05 15:27:14 +01:00
Quentin Fuxa
c0965c6c31
Lines to Segments. Merging dataclasses
2025-11-27 21:54:58 +01:00
Quentin Fuxa
870141298c
isort
2025-11-23 11:20:00 +01:00
Quentin Fuxa
a175d1a327
fixes silence detected but never reported by silero
2025-11-23 11:20:00 +01:00
Quentin Fuxa
b5067249c0
stt/diar/nllw alignment: internal rework 5
2025-11-20 23:52:00 +01:00
Quentin Fuxa
f4f9831d39
stt/diar/nllw alignment: internal rework 5
2025-11-20 23:52:00 +01:00
Quentin Fuxa
254faaf64c
stt/diar/nllw alignment: internal rework 5
2025-11-20 23:52:00 +01:00
Quentin Fuxa
8e7aea4fcf
internal rework 4
2025-11-20 23:45:20 +01:00
Quentin Fuxa
270faf2069
internal rework 3
2025-11-20 22:28:30 +01:00
Quentin Fuxa
b7c1cc77cc
internal rework 2
2025-11-20 22:06:38 +01:00
Quentin Fuxa
9a45ec221c
internal rework 1
2025-11-20 12:58:38 +01:00
Quentin Fuxa
b7d20a0ff0
segment attribution in result formatter
2025-11-19 21:10:28 +01:00
Quentin Fuxa
11e9def0b2
diarization corrections
2025-11-19 19:06:03 +01:00
Quentin Fuxa
3104f40f6e
fixes #279 #278
2025-11-19 18:17:50 +01:00
Quentin Fuxa
bfd60b3921
Add audio partial silence in chunks handling. bump to 0.2.14.post2
2025-11-17 22:52:00 +01:00
Quentin Fuxa
1e67bf97f0
improve buffering when using heavy models
2027-04-25 23:52:00 +02:00
Quentin Fuxa
28985962a0
Silence handling: finish transcription even if not validated at the BEGINNING of the silence
2025-11-16 22:29:08 +01:00
Quentin Fuxa
8d9be88fe6
translation buffer is now displayed in frontend
2025-11-10 15:22:26 +01:00
Quentin Fuxa
ffe5284764
_processing_tasks_done checks task completion
2025-11-05 23:34:00 +01:00
Quentin Fuxa
61edb70fff
audioProcessor state variables are now uniquely in State dataclass
2025-10-26 18:54:47 +01:00
Quentin Fuxa
9434390ad3
simplify task stopping condition
2025-10-26 17:26:43 +01:00
Quentin Fuxa
1f684cdd97
fixes #251
2025-10-06 19:53:27 +02:00
Quentin Fuxa
374618e050
token speakers are only reattributed for token coming after last_validated_token
2025-10-04 09:52:00 +02:00
Quentin Fuxa
a7db39d999
solves incorrect spacing in buffer diarization
2025-10-02 23:04:00 +02:00
Quentin Fuxa
a153e11fe0
update when self.diarization_before_transcription
2025-09-28 11:04:00 +02:00
Quentin Fuxa
8cbaeecc75
custom alignment heads parameter for custom models
2025-09-27 11:04:00 +02:00
Quentin Fuxa
545ea15c9a
ensure buffer size to be a multiple of the element size
2025-09-27 13:58:32 +02:00
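Note on the buffer-size fix above: decoding raw PCM fails when the byte buffer is not a whole number of samples. One way to guard against that is sketched below. It assumes 16-bit samples (2 bytes each); the helper names `split_aligned` and `decode_pcm16` are hypothetical and are not taken from the repository.

```python
from array import array

SAMPLE_WIDTH = 2  # bytes per sample, assuming 16-bit PCM


def split_aligned(buffer: bytes, item_size: int = SAMPLE_WIDTH) -> tuple[bytes, bytes]:
    """Split a buffer into the largest aligned prefix and the leftover bytes."""
    usable = len(buffer) - (len(buffer) % item_size)
    return buffer[:usable], buffer[usable:]


def decode_pcm16(buffer: bytes) -> tuple[array, bytes]:
    """Decode whole 16-bit samples and return any trailing partial sample.

    array.frombytes raises ValueError when the byte count is not a
    multiple of the item size, which is why the buffer is trimmed first.
    """
    aligned, leftover = split_aligned(buffer)
    samples = array("h")  # signed 16-bit integers, native byte order
    samples.frombytes(aligned)
    return samples, leftover
```

The leftover bytes would typically be prepended to the next incoming chunk so that no audio is dropped.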
Quentin Fuxa
6caf3e0485
correct silence handling in translation
2025-09-27 11:58:00 +02:00
Quentin Fuxa
d55490cd27
typo and simpler conditions
2025-09-26 20:38:26 +02:00
Quentin Fuxa
4dd5d8bf8a
translation compatible with auto and detected language
2025-09-22 11:20:00 +02:00
Quentin Fuxa
93f002cafb
language detection after few seconds working
2025-09-20 11:08:00 +02:00
Quentin Fuxa
674b20d3af
in buffer while language not detected
2025-09-21 11:05:00 +02:00
Quentin Fuxa
a5503308c5
O(n) to O(1) for simulstreaming timestamp determination
2025-09-21 11:04:00 +02:00
Quentin Fuxa
b03a212fbf
fixes #227, auto language detection v0.1 - simulstreaming only - when diarization and auto
2025-09-19 19:15:28 +02:00
Quentin Fuxa
1833e7c921
0.2.10
2025-09-16 23:45:00 +02:00
Quentin Fuxa
ee448a37e9
when pcm-input is set, the frontend uses AudioWorklet
2025-09-17 14:55:57 +02:00
Quentin Fuxa
9c051052b0
Merge branch 'main' into ScriptProcessorNode-to-AudioWorklet
2025-09-17 11:28:36 +02:00
Quentin Fuxa
99dc96c644
fixes #224
2025-09-16 18:34:35 +02:00
GeorgeCaoJ
2a27d2030a
feat: support web audio 16kHz PCM input and remove ffmpeg dependency
2025-09-15 23:22:25 +08:00
Quentin Fuxa
cd160caaa1
asyncio.to_thread for transcription and translation
2025-09-15 15:23:22 +02:00
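Note on the asyncio.to_thread change above: `asyncio.to_thread` runs a blocking call in a worker thread so the event loop can keep serving other coroutines. The sketch below shows the general pattern only. `blocking_transcribe` is a hypothetical stand-in for a model call, with `time.sleep` simulating inference time; it is not the project's transcription code.

```python
import asyncio
import time


def blocking_transcribe(audio_chunk: bytes) -> str:
    """Hypothetical stand-in for a blocking, CPU- or GPU-bound model call."""
    time.sleep(0.05)  # simulate inference time
    return f"{len(audio_chunk)} bytes transcribed"


async def transcribe(audio_chunk: bytes) -> str:
    # Offload the blocking call to a worker thread; the event loop stays
    # free to handle other work while it runs.
    return await asyncio.to_thread(blocking_transcribe, audio_chunk)


async def main() -> list[str]:
    # Both calls run in separate threads; gather returns the results
    # in the order the awaitables were passed.
    return await asyncio.gather(
        transcribe(b"\x00" * 320),
        transcribe(b"\x00" * 640),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`asyncio.to_thread` requires Python 3.9 or later.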
Quentin Fuxa
a4e9f3cab7
support for raw PCM input option by @YeonjunNotFR
2025-09-11 21:32:11 +02:00
Quentin Fuxa
b06866877a
add --disable-punctuation-split option
2025-09-11 21:03:00 +02:00
Quentin Fuxa
967cdfebc8
fix Translation imports
2025-09-11 21:03:00 +02:00
Quentin Fuxa
3c11c60126
fix by @treeaaa
2025-09-11 21:03:00 +02:00
Quentin Fuxa
cb2d4ea88a
audio processor lines now use Lines objects instead of dicts
2025-09-09 21:45:00 +02:00
Quentin Fuxa
add7ea07ee
translator takes all the tokens from the queue
2025-09-09 19:55:39 +02:00
Quentin Fuxa
f661f21675
translation asyncio task
2025-09-08 18:34:31 +02:00