mirror of
https://github.com/datalab-to/chandra.git
synced 2026-04-21 00:31:50 +00:00
Update README
14
README.md
@@ -19,7 +19,7 @@ Chandra OCR 2 is a highly accurate OCR model that converts images and PDFs into
 ## News

-- 3/2026 - Chandra 2 is here, with significant improvements to math, tables, and multilingual OCR
+- 3/2026 - Chandra 2 is here, with significant improvements to math, tables, layout, and multilingual OCR
 - 10/2025 - Chandra 1 launched

 ## Features
@@ -93,10 +93,6 @@ See full scores [below](#benchmark-table).
 | Other | Charts | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/other/charts.png) |
 | Other | Chemistry | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/other/chemistry.png) |

-## Community
-
-[Discord](https://discord.gg//KuZwXNGnfH) is where we discuss future development.
-
 ## Installation

 ### Package
@@ -272,6 +268,14 @@ We also have a more comprehensive evaluation covering 90 languages, comparing Ch
 See the [full 90-language results](FULL_BENCHMARKS.md).

+## Throughput
+
+Benchmarked with vLLM on a single NVIDIA H100 80GB GPU using a diverse mix of documents (math, tables, scans, multi-column layouts) from the olmOCR benchmark set. This set is significantly slower than real-world usage - we estimate 2 pages/s in real-world usage.
+
+| Configuration | Pages/sec | Avg Latency | P95 Latency | Failure Rate |
+|---|:---:|:---:|:---:|:---:|
+| vLLM, 96 concurrent sequences | 1.44 | 60s | 156s | 0% |
+
 # Credits

 Thank you to the following open source projects: