mirror of https://github.com/0xSojalSec/airllm.git, synced 2026-03-07 14:24:44 +00:00
refine readme
@@ -1,9 +1,16 @@
[**Quickstart**](#quickstart) |
[**Configurations**](#configurations) |
[**MacOS**](#macos) |
[**Example notebooks**](#example-python-notebook) |
[**FAQ**](#faq)
**AirLLM** optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card. No quantization, distillation, pruning or other model compression techniques that would result in degraded model performance are needed.
AirLLM optimizes inference memory so that a single 4GB GPU can run inference for a 70B large language model, with no quantization, distillation, pruning, or other model compression that would degrade model quality.
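To make the claim concrete, here is a minimal end-to-end sketch of what inference on a single small GPU looks like. It assumes the `AutoModel` entry point and an example 70B checkpoint id along the lines of the project's quickstart; the class name, model id, and generation arguments may differ in the version you install.

```python
from airllm import AutoModel  # assumed 2.x entry point; older versions expose per-model classes

# AirLLM streams the model layer by layer, so peak VRAM stays within a few GB.
model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")  # assumed model id

input_text = ["What is the capital of United States?"]

# Tokenize with the bundled tokenizer (standard Hugging Face keyword arguments).
input_tokens = model.tokenizer(input_text, return_tensors="pt",
                               truncation=True, max_length=128, padding=False)

# Generate a short completion on the GPU.
generation_output = model.generate(input_tokens["input_ids"].cuda(),
                                   max_new_tokens=20,
                                   use_cache=True,
                                   return_dict_in_generate=True)

print(model.tokenizer.decode(generation_output.sequences[0]))
```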
## Updates
[2023/12/25] v2.8.2: Support running 70B large language models on MacOS.
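On MacOS the project runs on Apple's MLX backend rather than CUDA. The sketch below shows the expected differences, under the assumption that the same `AutoModel` interface is used and that inputs are handed to `generate()` as MLX arrays; exact argument names and conversion details may vary between releases.

```python
import mlx.core as mx  # Apple-Silicon backend; assumed install: pip install airllm mlx

from airllm import AutoModel

model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")  # assumed model id

# Tokenize to NumPy (no PyTorch CUDA tensors on MacOS), then convert to an MLX array.
input_tokens = model.tokenizer(["What is the capital of United States?"],
                               return_tensors="np", truncation=True,
                               max_length=128, padding=False)

# Assumed: the MacOS build accepts MLX arrays directly in generate().
generation_output = model.generate(mx.array(input_tokens["input_ids"]),
                                   max_new_tokens=20,
                                   use_cache=True,
                                   return_dict_in_generate=True)

print(model.tokenizer.decode(generation_output.sequences[0]))
```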
@@ -350,6 +357,23 @@ input_tokens = model.tokenizer(input_text,
)
```
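The hunk context above shows only the tail of the quickstart's tokenizer call. For readability, here is a hedged reconstruction of the full call, reusing the `model` and `input_text` from the earlier quickstart steps and standard Hugging Face tokenizer keyword arguments; the exact arguments and `MAX_LENGTH` value in the README may differ.

```python
MAX_LENGTH = 128  # assumed value

input_tokens = model.tokenizer(input_text,
                               return_tensors="pt",          # PyTorch tensors for the CUDA path
                               return_attention_mask=False,
                               truncation=True,
                               max_length=MAX_LENGTH,
                               padding=False)
```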
## Citing AirLLM
If you find AirLLM useful in your research and wish to cite it, please use the following BibTeX entry:
```
@software{airllm2023,
  author = {Gavin Li},
  title = {AirLLM: scaling large language models on low-end commodity computers},
  url = {https://github.com/lyogavin/Anima/tree/main/air_llm},
  version = {0.0},
  year = {2023},
}
```
## Contribution