Update README.md

2026-03-07 22:33:36 +00:00 · 2025-08-01 16:33:26 +02:00
parent 00424d7ca3
commit dbdb4ea66c
1 changed file with 5 additions and 11 deletions
--- a/README.md
+++ b/README.md
@@ -13,19 +13,8 @@
 <a href="https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-MIT/Dual Licensed-dark_green"></a>
 </p>

-## Overview
-
 This project is based on [WhisperStreaming](https://github.com/ufal/whisper_streaming) and [SimulStreaming](https://github.com/ufal/SimulStreaming), allowing you to transcribe audio directly from your browser. WhisperLiveKit provides a complete backend solution for real-time speech transcription with a functional, simple and customizable frontend. Everything runs locally on your machine ✨

-### Architecture
-
-WhisperLiveKit consists of three main components:
-
- **Frontend**: A basic html + JS interface that captures microphone audio and streams it to the backend via WebSockets. You can use and adapt the [provided template](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/whisperlivekit/web/live_transcription.html).
- **Backend (Web Server)**: A FastAPI-based WebSocket server that receives streamed audio data, processes it in real time, and returns transcriptions to the frontend. This is where the WebSocket logic and routing live.
- **Core Backend (Library Logic)**: A server-agnostic core that handles audio processing, ASR, and diarization. It exposes reusable components that take in audio bytes and return transcriptions.
-
-
 ### Key Features

 - **Real-time Transcription** - Locally (or on-prem) convert speech to text instantly as you speak
@@ -37,6 +26,11 @@ WhisperLiveKit consists of three main components:
 - **Punctuation-Based Speaker Splitting [BETA]** - Align speaker changes with natural sentence boundaries for more readable transcripts
 - **SimulStreaming Backend** - [Dual-licensed](https://github.com/ufal/SimulStreaming#-licence-and-contributions) - Ultra-low latency transcription using SOTA AlignAtt policy. 

+### Architecture
+
+<img width="3144" height="875" alt="Picture 1" src="https://github.com/user-attachments/assets/7c726753-0139-47f4-9080-e4b95c651c23" />
+
+
 ## Quick Start

 ```bash