Quentin Fuxa
|
8cbaeecc75
|
custom alignment heads parameter for custom models
|
2025-09-27 11:04:00 +02:00 |
|
Quentin Fuxa
|
545ea15c9a
|
ensure buffer size to be a multiple of the element size
|
2025-09-27 13:58:32 +02:00 |
|
Quentin Fuxa
|
6caf3e0485
|
correct silence handling in translation
|
2025-09-27 11:58:00 +02:00 |
|
Quentin Fuxa
|
d55490cd27
|
typo and simpler conditions
|
2025-09-26 20:38:26 +02:00 |
|
Quentin Fuxa
|
4dd5d8bf8a
|
translation compatible with auto and detected language
|
2025-09-22 11:20:00 +02:00 |
|
Quentin Fuxa
|
93f002cafb
|
language detection after few seconds working
|
2025-09-20 11:08:00 +02:00 |
|
Quentin Fuxa
|
674b20d3af
|
in buffer while language not detected
|
2025-09-21 11:05:00 +02:00 |
|
Quentin Fuxa
|
a5503308c5
|
O(n) to O(1) for simulstreaming timestamp determination
|
2025-09-21 11:04:00 +02:00 |
|
Quentin Fuxa
|
b03a212fbf
|
fixes #227, auto language detection v0.1 - simulstreaming only - when diarization and auto
|
2025-09-19 19:15:28 +02:00 |
|
Quentin Fuxa
|
1833e7c921
|
0.2.10
|
2025-09-16 23:45:00 +02:00 |
|
Quentin Fuxa
|
ee448a37e9
|
when pcm-input is set, the frontend uses AudioWorklet
|
2025-09-17 14:55:57 +02:00 |
|
Quentin Fuxa
|
9c051052b0
|
Merge branch 'main' into ScriptProcessorNode-to-AudioWorklet
|
2025-09-17 11:28:36 +02:00 |
|
Quentin Fuxa
|
99dc96c644
|
fixes #224
|
2025-09-16 18:34:35 +02:00 |
|
GeorgeCaoJ
|
2a27d2030a
|
feat: support web audio 16kHz PCM input and remove ffmpeg dependency
|
2025-09-15 23:22:25 +08:00 |
|
Quentin Fuxa
|
cd160caaa1
|
asyncio.to_thread for transcription and translation
|
2025-09-15 15:23:22 +02:00 |
|
Quentin Fuxa
|
a4e9f3cab7
|
support for raw PCM input option by @YeonjunNotFR
|
2025-09-11 21:32:11 +02:00 |
|
Quentin Fuxa
|
b06866877a
|
add --disable-punctuation-split option
|
2025-09-11 21:03:00 +02:00 |
|
Quentin Fuxa
|
967cdfebc8
|
fix Translation imports
|
2025-09-11 21:03:00 +02:00 |
|
Quentin Fuxa
|
3c11c60126
|
fix by @treeaaa
|
2025-09-11 21:03:00 +02:00 |
|
Quentin Fuxa
|
cb2d4ea88a
|
audio processor lines now use Lines objects instead of dict
|
2025-09-09 21:45:00 +02:00 |
|
Quentin Fuxa
|
add7ea07ee
|
translator takes all the tokens from the queue
|
2025-09-09 19:55:39 +02:00 |
|
Quentin Fuxa
|
f661f21675
|
translation asyncio task
|
2025-09-08 18:34:31 +02:00 |
|
Quentin Fuxa
|
4a5d5e1f3b
|
raise Exception when language == auto and task == translation
|
2025-08-29 17:44:46 +02:00 |
|
Quentin Fuxa
|
ab98c31f16
|
trim will happen before audio processor
|
2025-08-27 18:17:11 +02:00 |
|
Quentin Fuxa
|
b101ce06bd
|
several users share the same sortformer model instance
|
2024-08-24 19:18:00 +02:00 |
|
Quentin Fuxa
|
ce781831ee
|
punctuation is checked in audio-processor's result formatter
|
2025-08-24 18:32:01 +02:00 |
|
Quentin Fuxa
|
58297daf6d
|
sortformer diar implementation v0.3
|
2025-08-24 18:32:01 +02:00 |
|
Quentin Fuxa
|
909ac9dd41
|
speaker -1 is no longer sent over the websocket - no buffer when there is a silence
|
2025-08-21 14:09:02 +02:00 |
|
Quentin Fuxa
|
d94a07d417
|
default model is now base. default backend simulstreaming
|
2025-08-21 11:55:36 +02:00 |
|
Quentin Fuxa
|
b32dd8bfc4
|
Align backend and frontend time handling
|
2025-08-21 10:33:15 +02:00 |
|
Quentin Fuxa
|
253a080df5
|
diart diarization handles pauses/silences thanks to offset
|
2025-08-19 21:12:55 +02:00 |
|
Quentin Fuxa
|
e14bbde77d
|
sortformer diar implementation v0
|
2025-08-19 17:02:55 +02:00 |
|
Quentin Fuxa
|
e2184d5e06
|
better handle silences when VAC + correct offset issue with whisperstreaming backend
|
2025-08-17 01:27:07 +02:00 |
|
Quentin Fuxa
|
7fe0353260
|
vac model is loaded in TranscriptionEngine, and by default
|
2025-08-17 00:34:25 +02:00 |
|
Quentin Fuxa
|
28bdc52e1d
|
VAC before doing transcription and diarization. V0
|
2025-08-16 23:04:21 +02:00 |
|
Quentin Fuxa
|
38b4ebe8ba
|
Handle 3 types of silences: Indicated by whisper, between tokens, and at the end of the input. Display them in the frontend
|
2025-08-11 17:56:57 +02:00 |
|
Quentin Fuxa
|
2bbdc70187
|
lags are now updated every 0.1s
|
2025-08-09 23:11:05 +02:00 |
|
Quentin Fuxa
|
197293e25e
|
refactor(simulstreaming): extract backend + online module into separate files from whisper streaming
|
2025-08-08 18:07:51 +02:00 |
|
Quentin Fuxa
|
12a544164f
|
Merge branch 'main' of https://github.com/QuentinFuxa/whisper_streaming_web
|
2025-07-16 12:05:01 +02:00 |
|
Quentin Fuxa
|
3ad3683ca7
|
Refactor speaker assignment in DiartDiarization for clarity and punctuation awareness
|
2025-07-15 14:38:53 +02:00 |
|
choomegan
|
64e44fb24f
|
fix: logic of adding of pcm_array to diarization_queue
|
2025-07-15 15:33:41 +08:00 |
|
Quentin Fuxa
|
f25de6d8a4
|
ffmpeg-python is not used anymore - ffmpeg is directly called through create_subprocess_exec
|
2025-07-01 18:53:35 +02:00 |
|
Quentin Fuxa
|
2c1a603e38
|
ffmpeg is managed in a thread in FFmpegManager to prevent everything from crashing when an error occurs
|
2025-07-01 11:19:10 +02:00 |
|
Quentin Fuxa
|
774cee036b
|
increase timeout from 2 to 20s for ffmpeg stdin flush and writing
|
2025-06-30 18:28:50 +02:00 |
|
Quentin Fuxa
|
f668570292
|
Trim buffer when no new ASR tokens are issued
|
2025-06-30 11:55:07 +02:00 |
|
Quentin Fuxa
|
bfec335a5f
|
restore a functional buffer_diarization
|
2025-06-25 23:38:23 +02:00 |
|
Quentin Fuxa
|
6867041254
|
1st version of SimulStreaming backend. many improvements needed
|
2025-06-25 17:59:46 +02:00 |
|
Quentin Fuxa
|
0f79d442ee
|
improve diarization speed + Use punctuation to better align speakers and diarization
|
2025-06-19 13:03:29 +02:00 |
|
Quentin Fuxa
|
993a83546a
|
core refactoring
|
2025-06-16 16:13:57 +02:00 |
|
Quentin Fuxa
|
f7644268c1
|
Message when launching transcription and no audio is detected
|
2025-05-28 13:25:49 +02:00 |
|