Commit Graph

73 Commits

Author SHA1 Message Date
Dominik Macháček
bb93952fd2 Merge branch 'main' into online-from-factory 2024-04-17 15:07:00 +02:00
Dominik Macháček
ce215e621b Merge pull request #82 from ufal/regularfry-ayo-warmup-file
warmup file -- PR #81
2024-04-17 14:54:48 +02:00
Dominik Macháček
e0f5d42b13 better documentation, help message and logging prints 2024-04-17 14:51:49 +02:00
Alex Young
8883397b44 Merge branch 'main' into ayo-warmup-file 2024-04-14 20:03:50 +01:00
Alex Young
fc4b3cd518 Check whether we are passed a warmup file before trying to see if it exists 2024-04-14 19:38:41 +01:00
Alex Young
70bc57180c Add a --warmup-file option to pass in a path 2024-04-14 19:29:46 +01:00
Dominik Macháček
d497503b5c Contributions at README.md
+ nicer formatting
+ #77
2024-04-10 18:13:07 +02:00
Dominik Macháček
6b1c2c5606 Merge pull request #75 from gaardhus/patch-1
Update README.md: Add Python syntax highlighting to code chunk
2024-04-09 14:22:55 +02:00
Tobias Gårdhus
8223afee78 Update README.md
Add Python syntax highlighting to code chunk
2024-03-29 19:35:30 +01:00
Dominik Macháček
b3647da087 Update README.md PDF link 2024-03-29 09:19:59 +01:00
Dominik Macháček
3af93975cc Merge pull request #70 from tijszwinkels/fix-imports
Fix imports
2024-03-21 00:11:52 +01:00
Tijs Zwinkels
bccbb15177 Move creation of OnlineASRProcessor inside the factory method
Preventing more code duplication between whisper_online.py and whisper_online_server.py
2024-03-20 16:29:01 +01:00
Tijs Zwinkels
006de3e7b0 Fix imports
Now, the ASR implementations do their own imports. No need to import in the factory
2024-03-20 16:02:24 +01:00
Dominik Macháček
50937bb872 Merge pull request #69 from tijszwinkels/fix-server-openai-crash
Fix crash when using openai-api with whisper_online_server
2024-03-20 15:39:38 +01:00
Tijs Zwinkels
8896389ea3 Fix crash when using openai-api with whisper_online_server
+ refactored creation of the ASR into a factory method
2024-03-20 15:29:10 +01:00
Dominik Macháček
5929a82896 Update README.md
bibtex update
2024-03-11 12:38:44 +01:00
koiking213
4405c451ce specify dtype for librosa.load, instead of cast 2024-02-20 23:29:25 +09:00
koiking213
24926c98e0 specify audio dtype 2024-02-20 22:46:04 +09:00
Dominik Macháček
db8b7d2883 removed unused variable 2024-02-20 14:37:18 +01:00
Aleksei Scripnic
80eb0baf5d Removed duplicate variable self.last_chunked_at
I tried to find the difference between self.last_chunked_at and self.buffer_time_offset, and it took me a while to understand that they are exactly the same. I think it's better to get rid of one of the duplicates to make the code more readable.
2024-02-20 14:37:18 +01:00
Dominik Macháček
949304ab05 Merge branch 'opeanai-api2' into opeanai-api 2024-02-19 13:51:26 +01:00
Tijs Zwinkels
9fcd403439 Use automatic language detection by default (instead of English) 2024-02-15 22:24:43 +01:00
Tijs Zwinkels
922ad18ebc Make OpenAI backend work with language autodetect 2024-02-14 17:29:45 +01:00
Tijs Zwinkels
f0a24cd5e1 Make --vad work with --backend openai-api 2024-02-14 17:01:29 +01:00
Tijs Zwinkels
3696fef2b1 Use OpenAI api word-level timestamps 2024-02-14 17:01:29 +01:00
Tijs Zwinkels
531418ad07 Interpolate word timestamps based on word character length 2024-02-14 17:01:29 +01:00
Dominik Macháček
2270014219 fixes 2024-02-14 17:01:29 +01:00
Dominik Macháček
f8b2ae07b8 missing features in openai-api, PR #52 2024-02-14 17:01:29 +01:00
Tijs Zwinkels
6ec1f65fe2 Update documentation to include openai-api backend 2024-02-14 17:01:29 +01:00
Tijs Zwinkels
f412812082 OpenAI Whisper API backend 2024-02-14 17:01:29 +01:00
Dominik Macháček
b66c61cf7a README update auto language detection 2024-02-06 14:31:24 +01:00
Dominik Macháček
cd221a3198 auto language detection #56 2024-02-06 14:29:30 +01:00
Dominik Macháček
d65fd8a649 fixes 2024-01-25 17:53:07 +01:00
Dominik Macháček
50f1b94856 missing features in openai-api, PR #52 2024-01-25 16:50:02 +01:00
Tijs Zwinkels
ab27bfb361 Update documentation to include openai-api backend 2024-01-25 10:21:42 +01:00
Tijs Zwinkels
c30969fe27 OpenAI Whisper API backend 2024-01-25 10:21:33 +01:00
Dominik Macháček
1f2352fa1d README typo and one more simulation option is not shared 2024-01-03 12:52:44 +01:00
Dominik Macháček
bfbe83d792 Samples should be an integer, not seconds
- Merge pull request #49 from skripnik/patch-1
- tested performance: ESIC dev2, 27 docs, on En, De, Cs ASR, Nvidia A40, min chunk 1s, VAD => it has lower WER and latency with "segment" buffer trimming with various thresholds
2024-01-03 10:37:32 +01:00
Aleksei Scripnic
234ac8f5e8 Samples should be an integer, not seconds
I believe it's just a typo
2024-01-02 14:40:22 +00:00
Dominik Macháček
aa51e39de4 buffer trimming option, sent. segmenter not required anymore
- both for whisper_online + server
- removed argparse code repetition
- README updated
2024-01-02 14:56:30 +01:00
Dominik Macháček
ef08538697 buffer trimming options + most recommendable default
evaluated on ESIC dev2, 27 docs
2024-01-02 12:06:29 +01:00
Dominik Macháček
99aef35958 Merge pull request #36 from luweigen/bug-chunk_completed_sentence
fix bug of completed sentence chunking. tested on faster-whisper in e…
2023-12-19 13:39:37 +01:00
Dominik Macháček
ff794b4d32 Merge pull request #40 from lifefeel/main
Fix: Omitting the last chunk problem in comp_unaware mode
2023-12-07 13:31:47 +01:00
J.P Lee
2b98af7b19 Fix: Omitting the last chunk problem in comp_unaware mode 2023-12-07 17:00:38 +09:00
Dominik Macháček
64c445f073 proceedings link 2023-11-29 10:16:44 +01:00
Dominik Macháček
256ec31d21 bibtex and proceedings link 2023-11-29 10:14:30 +01:00
Wei Lu
a60c64c831 fix bug of completed sentence chunking. tested on faster-whisper in en language 2023-11-28 18:51:36 +02:00
Dominik Macháček
8f32dea5ca logfile reviewed, whisper_timestamped loading module and vad
PR #10, issues #9, #30
2023-11-28 12:16:20 +01:00
Dominik Macháček
bd0d848e7f Merge branch 'main' into TIAGo-WE-COBOT 2023-11-28 11:03:58 +01:00
Dominik Macháček
878f11cdb7 create_tokenizer in documentation
#25
2023-11-26 16:11:42 +01:00