Commit Graph

51 Commits

Author SHA1 Message Date
Quentin Fuxa
674b20d3af in buffer while language not detected » 2025-09-21 11:05:00 +02:00
Quentin Fuxa
a5503308c5 O(n) to O(1) for simulstreaming timestamp determination 2025-09-21 11:04:00 +02:00
Quentin Fuxa
b03a212fbf fixes #227 , auto language dectection v0.1 - simulstreaming only - when diarization and auto 2025-09-19 19:15:28 +02:00
Quentin Fuxa
1833e7c921 0.2.10 2025-09-16 23:45:00 +02:00
Quentin Fuxa
ee448a37e9 when pcm-input is set, the frontend uses AudioWorklet 2025-09-17 14:55:57 +02:00
Quentin Fuxa
9c051052b0 Merge branch 'main' into ScriptProcessorNode-to-AudioWorklet 2025-09-17 11:28:36 +02:00
Quentin Fuxa
99dc96c644 fixes #224 2025-09-16 18:34:35 +02:00
GeorgeCaoJ
2a27d2030a feat: support web audio 16kHz PCM input and remove ffmpeg dependency 2025-09-15 23:22:25 +08:00
Quentin Fuxa
cd160caaa1 asyncio.to_thread for transcription and translation 2025-09-15 15:23:22 +02:00
Quentin Fuxa
a4e9f3cab7 support for raw PCM input option by @YeonjunNotFR 2025-09-11 21:32:11 +02:00
Quentin Fuxa
b06866877a add --disable-punctuation-split option 2025-09-11 21:03:00 +02:00
Quentin Fuxa
967cdfebc8 fix Translation imports 2025-09-11 21:03:00 +02:00
Quentin Fuxa
3c11c60126 fix by @treeaaa 2025-09-11 21:03:00 +02:00
Quentin Fuxa
cb2d4ea88a audio processor lines use now Lines objects instead of dict 2025-09-09 21:45:00 +02:00
Quentin Fuxa
add7ea07ee translator takes all the tokens from the queue 2025-09-09 19:55:39 +02:00
Quentin Fuxa
f661f21675 translation asyncio task 2025-09-08 18:34:31 +02:00
Quentin Fuxa
4a5d5e1f3b raise Exception when language == auto and task == translation 2025-08-29 17:44:46 +02:00
Quentin Fuxa
ab98c31f16 trim will happen before audio processor 2025-08-27 18:17:11 +02:00
Quentin Fuxa
b101ce06bd several users share the same sortformer model instance 2024-08-24 19:18:00 +02:00
Quentin Fuxa
ce781831ee punctuation is checked in audio-processor's result formatter 2025-08-24 18:32:01 +02:00
Quentin Fuxa
58297daf6d sortformer diar implementation v0.3 2025-08-24 18:32:01 +02:00
Quentin Fuxa
909ac9dd41 speaker -1 are no more sent in websocket - no buffer when their is a silence 2025-08-21 14:09:02 +02:00
Quentin Fuxa
d94a07d417 default model is now base. default backend simulstreaming 2025-08-21 11:55:36 +02:00
Quentin Fuxa
b32dd8bfc4 Align backend and frontend time handling 2025-08-21 10:33:15 +02:00
Quentin Fuxa
253a080df5 diart diarization handles pauses/silences thanks to offset 2025-08-19 21:12:55 +02:00
Quentin Fuxa
e14bbde77d sortformer diar implementation v0 2025-08-19 17:02:55 +02:00
Quentin Fuxa
e2184d5e06 better handle silences when VAC + correct offset issue with whisperstreaming backend 2025-08-17 01:27:07 +02:00
Quentin Fuxa
7fe0353260 vac model is loaded in TranscriptionEngine, and by default 2025-08-17 00:34:25 +02:00
Quentin Fuxa
28bdc52e1d VAC before doing transcription and diarization. V0 2025-08-16 23:04:21 +02:00
Quentin Fuxa
38b4ebe8ba Handle 3 types of silences: Indicated by whisper, between tokens, and at the end of the input. Display them in the frontend 2025-08-11 17:56:57 +02:00
Quentin Fuxa
2bbdc70187 lags are now updated every 0.1s 2025-08-09 23:11:05 +02:00
Quentin Fuxa
197293e25e refactor(simulstreaming): extract backend + online module into separate files from whisper streaming 2025-08-08 18:07:51 +02:00
Quentin Fuxa
12a544164f Merge branch 'main' of https://github.com/QuentinFuxa/whisper_streaming_web 2025-07-16 12:05:01 +02:00
Quentin Fuxa
3ad3683ca7 Refactor speaker assignment in DiartDiarization for clarity and punctuation awareness 2025-07-15 14:38:53 +02:00
choomegan
64e44fb24f fix: logic of adding of pcm_array to diarization_queue 2025-07-15 15:33:41 +08:00
Quentin Fuxa
f25de6d8a4 ffmpeg-python is not used anymore - ffmpeg is directly called through create_subprocess_exec 2025-07-01 18:53:35 +02:00
Quentin Fuxa
2c1a603e38 ffmpeg is managed in a thread in FFmpegManager to prevent the all from crashing when an error occurs 2025-07-01 11:19:10 +02:00
Quentin Fuxa
774cee036b increase timeout from 2 to 20s for ffmpeg stdin flush and writing 2025-06-30 18:28:50 +02:00
Quentin Fuxa
f668570292 Trim buffer when no new ASR tokens are issued 2025-06-30 11:55:07 +02:00
Quentin Fuxa
bfec335a5f restore a functionnal buffer_diarization 2025-06-25 23:38:23 +02:00
Quentin Fuxa
6867041254 1rst version of SimulStreaming backend. many improvements needed 2025-06-25 17:59:46 +02:00
Quentin Fuxa
0f79d442ee improve diarization speed + Use punctuation to better align speakers and diarization 2025-06-19 13:03:29 +02:00
Quentin Fuxa
993a83546a core refactoring 2025-06-16 16:13:57 +02:00
Quentin Fuxa
f7644268c1 Message when launching transcription and no audio is detected 2025-05-28 13:25:49 +02:00
Quentin Fuxa
34e8fe260e lag information in real time even when no audio is detected 2025-05-28 12:25:47 +02:00
Quentin Fuxa
6797b88176 Error handling for missing FFmpeg in start_ffmpeg_decoder 2025-05-28 11:43:30 +02:00
Quentin Fuxa
fea3c3553c logging in ASR proc. includes internal buffer duration and transcription lag 2025-05-07 11:45:00 +02:00
Quentin Fuxa
083d5b2f44 uses sentinel object when end of transcription, to properly terminate tasks 2025-05-07 10:55:44 +02:00
Quentin Fuxa
b56fcffde1 Solves stdin flushes blocking IOhttps://github.com/QuentinFuxa/WhisperLiveKit/issues/110
https://github.com/QuentinFuxa/WhisperLiveKit/issues/106
https://github.com/QuentinFuxa/WhisperLiveKit/issues/90
https://github.com/QuentinFuxa/WhisperLiveKit/issues/87
https://github.com/QuentinFuxa/WhisperLiveKit/issues/81
https://github.com/QuentinFuxa/WhisperLiveKit/issues/2
2025-04-12 15:25:46 +02:00
Quentin Fuxa
704170ccf3 Logs for https://github.com/QuentinFuxa/WhisperLiveKit/issues/110 https://github.com/QuentinFuxa/WhisperLiveKit/issues/106
https://github.com/QuentinFuxa/WhisperLiveKit/issues/90
https://github.com/QuentinFuxa/WhisperLiveKit/issues/87
https://github.com/QuentinFuxa/WhisperLiveKit/issues/81
https://github.com/QuentinFuxa/WhisperLiveKit/issues/2
2025-04-09 11:34:27 +02:00