Quentin Fuxa
0d874fb515
cuda or cpu auto detection
2025-02-07 10:16:03 +01:00
Quentin Fuxa
4d1aa4421a
Merge pull request #30 from SilasK/tsw
Time stamped text classes
2025-01-31 22:54:58 +01:00
Quentin Fuxa
f4d98e2c8c
Merge pull request #27 from SilasK/fix-sentencesegmenter
Fix sentence segmenter
2025-01-31 22:54:33 +01:00
Silas Kieser
15205f31d1
add doctest
2025-01-28 23:17:21 +01:00
Silas Kieser
b1f7034577
my version of timestamped text
2025-01-28 23:13:15 +01:00
Silas Kieser
23dee02d56
sentence overflow works
2025-01-28 22:38:55 +01:00
Silas Kieser
efd80095a7
segment also works
2025-01-28 22:11:28 +01:00
Silas Kieser
f4d3df3d87
change log format
2025-01-28 21:25:17 +01:00
Silas Kieser
9c7d429e15
add logging config to server
2025-01-28 17:38:13 +01:00
Silas Kieser
611d33cba5
keep a test script in base directory
2025-01-28 17:13:03 +01:00
Silas Kieser
ab7c22d3e3
whisper_online works with the new sentence segment
2025-01-28 17:02:21 +01:00
Silas Kieser
870a779666
sentence work again!
2025-01-28 16:55:07 +01:00
Quentin Fuxa
c3d72cae7c
Merge pull request #26 from SilasK/fix-sentencesegmenter
Improve logging; still trying to fix sentence segmenter
2025-01-28 15:53:26 +01:00
Quentin Fuxa
4622fe7aff
Merge branch 'main' into fix-sentencesegmenter
2025-01-28 15:53:10 +01:00
Silas Kieser
8ee1488c08
rename to_flush to concatenate_tsw
2025-01-27 16:49:22 +01:00
Silas Kieser
77d43885a3
chunk at sentence now takes an argument (=self.comited)
2025-01-27 16:29:06 +01:00
Silas Kieser
04170153e0
improve logging
2025-01-27 16:12:30 +01:00
Silas Kieser
baddf0284b
buffer length in sentence segmentation is now also max, as in segment.
2025-01-27 15:36:19 +01:00
Quentin Fuxa
6e0f1dda25
Merge remote-tracking branch 'contrib/fix-sentencesegmenter'
2025-01-26 15:34:41 +01:00
Quentin Fuxa
c66794e1f5
Merge pull request #20 from SilasK/clean-main
In my limited experience with french "" should also be the sep for mlx-whisper
2025-01-26 14:57:52 +01:00
Silas Kieser
f0eaffacd3
improve logging in whisper_online.py
2025-01-21 14:59:36 +01:00
Silas Kieser
69a2ed6bfb
add logger for online asr
2025-01-21 14:45:45 +01:00
Silas Kieser
25eb276794
ignore wav and scripts
2025-01-21 14:08:41 +01:00
Silas Kieser
9f262813ec
sep for mlx is also ""
2025-01-21 12:16:46 +01:00
Silas Kieser
4293580581
use moses sentence segmenter instead of tokenizer
2025-01-21 12:12:41 +01:00
Silas Kieser
42d2784c20
clearer log messages for sentence segmentation
2025-01-21 12:11:54 +01:00
Silas Kieser
7fad0a3ee2
sep for mlx is also ""
2025-01-21 10:42:07 +01:00
Quentin Fuxa
27d2db77f7
Update README.md
2025-01-20 03:08:01 +01:00
Quentin Fuxa
fba37eba0a
move to src
2025-01-19 21:17:55 +01:00
Quentin Fuxa
5523b51fd7
first speaker is "0", no longer None
2025-01-19 19:40:09 +01:00
Quentin Fuxa
9bdb92e923
update demo.png
2025-01-19 19:36:10 +01:00
Quentin Fuxa
b51c8427f4
diart link added
2025-01-19 17:12:55 +01:00
Quentin Fuxa
977436622a
add diarization (beta). Disabled by default
2025-01-19 17:12:40 +01:00
Quentin Fuxa
ce56264241
split whisper_online.py into smaller files
2025-01-14 20:52:53 +01:00
Quentin Fuxa
9cbac96c44
del online once webstreaming is finished
2025-01-14 20:20:22 +01:00
Quentin Fuxa
3f30d3de6e
Merge branch 'main' of https://github.com/QuentinFuxa/whisper_streaming_web
2025-01-14 20:14:22 +01:00
Quentin Fuxa
f884d1162d
warning when transcribe_kargs are used with MLX Whisper
2025-01-14 20:14:16 +01:00
Quentin Fuxa
6ee91c3c93
Merge pull request #15 from in-c0/patch-1
Specify encoding to ensure Python reads file as UTF-8
2025-01-13 20:30:51 +01:00
Ava
f52a5ae3c2
specify encoding to ensure Python reads file as UTF-8
Executing `python whisper_fastapi_online_server.py --host 0.0.0.0 --port 8000` resulted in an error on my setup:
```
whisper_streaming_web\whisper_fastapi_online_server.py, line 47, in <module>
html = f.read()
^^^^^^^^
File "C:\Python312\Lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 1818: character maps to <undefined>
```
On Windows, Python often defaults to the `cp1252` encoding when opening text files, which may not match the encoding of the file being read.
Files saved as UTF-8 that contain non-ASCII characters can trigger this error when read without specifying the encoding.
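A minimal sketch of the fix described above, assuming the file is UTF-8. The helper name `read_html` and the file name `page.html` are hypothetical and only serve the illustration; the actual change in the server script may look different. The sample text contains U+23CF, whose UTF-8 bytes include `0x8f`, the byte reported in the traceback.

```python
import tempfile
from pathlib import Path


def read_html(path):
    # Pass the encoding explicitly instead of relying on the platform
    # default (which is often cp1252 on Windows).
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# U+23CF is encoded as the bytes E2 8F 8F in UTF-8.
sample = "<p>caf\u00e9 \u23cf</p>"

with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "page.html"  # hypothetical file name
    page.write_bytes(sample.encode("utf-8"))

    # Reading with an explicit UTF-8 encoding round-trips the text.
    html = read_html(page)

    # Decoding the same bytes as cp1252 fails, because 0x8f has no
    # mapping in that code page.
    try:
        page.read_bytes().decode("cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True
```

Passing `encoding="utf-8"` makes the read independent of the machine's locale settings, so the same script behaves the same on Windows, Linux, and macOS.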
2025-01-13 23:12:38 +11:00
Quentin Fuxa
0ff6067f37
Update README.md
2025-01-04 00:55:12 +01:00
Quentin Fuxa
da6c8d25e4
Update README.md
2025-01-03 14:54:29 +01:00
Quentin Fuxa
aa0ba598f0
no online conflict when multiple users
2025-01-03 14:48:45 +01:00
Quentin Fuxa
b7a2d23a18
if websocket connection fails, frontend does not allow recording
2024-12-31 11:17:41 +01:00
Quentin Fuxa
58e48bb717
Merge pull request #10 from SilasK/main
More flexibility by using custom tokenize_method + black
2024-12-31 10:33:47 +01:00
silask
6a04ddbed2
only print translated text not timestamps
2024-12-30 21:53:33 +01:00
silask
aa4d2599cc
fix #7
2024-12-30 21:53:33 +01:00
silask
5fdb08edae
black formatting
2024-12-30 21:53:33 +01:00
Quentin Fuxa
4cb3660666
Update README.md
2024-12-30 20:46:36 +01:00
Quentin Fuxa
122368bff3
Append full transcription in websocket processing
2024-12-30 15:21:00 +01:00
Quentin Fuxa
0d833eaea2
Merge branch 'main' of https://github.com/QuentinFuxa/whisper_streaming_web
2024-12-28 18:32:36 +01:00