Update README.md

Quentin Fuxa
2025-08-01 16:33:26 +02:00
committed by GitHub
parent 00424d7ca3
commit dbdb4ea66c


@@ -13,19 +13,8 @@
<a href="https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-MIT/Dual Licensed-dark_green"></a>
</p>
## Overview
This project is based on [WhisperStreaming](https://github.com/ufal/whisper_streaming) and [SimulStreaming](https://github.com/ufal/SimulStreaming), allowing you to transcribe audio directly from your browser. WhisperLiveKit provides a complete backend solution for real-time speech transcription with a functional, simple and customizable frontend. Everything runs locally on your machine ✨
### Architecture
WhisperLiveKit consists of three main components:
- **Frontend**: A basic HTML + JS interface that captures microphone audio and streams it to the backend via WebSockets. You can use and adapt the [provided template](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/whisperlivekit/web/live_transcription.html).
- **Backend (Web Server)**: A FastAPI-based WebSocket server that receives streamed audio data, processes it in real time, and returns transcriptions to the frontend. This is where the WebSocket logic and routing live.
- **Core Backend (Library Logic)**: A server-agnostic core that handles audio processing, ASR, and diarization. It exposes reusable components that take in audio bytes and return transcriptions.
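
The split between the web server and the server-agnostic core can be pictured with a minimal sketch. This is not WhisperLiveKit's actual API: `AudioProcessor`, `EchoProcessor`, `handle_stream`, and `fake_microphone` are hypothetical names chosen for illustration, and the stand-in core reports chunk sizes instead of running ASR.

```python
import asyncio
from typing import AsyncIterator, Protocol


class AudioProcessor(Protocol):
    """Hypothetical interface for a server-agnostic core: audio bytes in, text out."""

    async def process(self, chunk: bytes) -> str: ...


class EchoProcessor:
    """Stand-in for a real ASR core; it only reports how many bytes it received."""

    async def process(self, chunk: bytes) -> str:
        return f"<{len(chunk)} bytes received>"


async def handle_stream(chunks: AsyncIterator[bytes], core: AudioProcessor) -> list[str]:
    """Plays the role of the web-server layer: forwards each chunk to the core."""
    results = []
    async for chunk in chunks:
        results.append(await core.process(chunk))
    return results


async def fake_microphone() -> AsyncIterator[bytes]:
    """Simulates a frontend streaming three chunks of silent audio."""
    for size in (320, 640, 160):
        yield bytes(size)


def run_demo() -> list[str]:
    """Runs the simulated stream through the stand-in core."""
    return asyncio.run(handle_stream(fake_microphone(), EchoProcessor()))
```

Because `handle_stream` depends only on the `process` method, the same core could in principle sit behind a WebSocket endpoint, a CLI, or a test harness without changes.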
### Key Features
- **Real-time Transcription** - Locally (or on-prem) convert speech to text instantly as you speak
@@ -37,6 +26,11 @@ WhisperLiveKit consists of three main components:
- **Punctuation-Based Speaker Splitting [BETA]** - Align speaker changes with natural sentence boundaries for more readable transcripts
- **SimulStreaming Backend** - [Dual-licensed](https://github.com/ufal/SimulStreaming#-licence-and-contributions) - Ultra-low latency transcription using SOTA AlignAtt policy.
### Architecture
<img width="3144" height="875" alt="Picture 1" src="https://github.com/user-attachments/assets/7c726753-0139-47f4-9080-e4b95c651c23" />
## Quick Start
```bash