Yu Li
2023-12-01 22:36:20 -06:00
parent 9e6557aeee
commit 82201f9191
3 changed files with 1 addition and 1 deletion


@@ -1,4 +1,4 @@
-AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card. No quantization, distillation, pruning, or other model compression techniques that would result in degraded model performance are needed.
+![airllm_logo](https://github.com/lyogavin/Anima/blob/main/assets/airllm_logo_sm.png?v=2&raw=true)AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card. No quantization, distillation, pruning, or other model compression techniques that would result in degraded model performance are needed.
 AirLLM optimizes inference memory so that a single 4GB GPU card can run inference for 70B large language models. No quantization, distillation, pruning, or other model compression that would degrade model performance is needed.
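The README text in this hunk describes running a large model within a small memory budget without compressing the weights, which in practice means streaming the model layer by layer: only the layer currently executing is resident in memory. A minimal sketch of that idea, using plain Python with pickled per-layer shards (the names and the scalar "layers" here are purely illustrative, not AirLLM's actual API):

```python
import os
import pickle
import tempfile

def save_layers(layer_weights, dirpath):
    """Persist each layer's weights as its own file (one shard per layer)."""
    paths = []
    for i, w in enumerate(layer_weights):
        p = os.path.join(dirpath, f"layer_{i}.pkl")
        with open(p, "wb") as f:
            pickle.dump(w, f)
        paths.append(p)
    return paths

def run_layered_inference(x, shard_paths):
    """Stream layers from disk: load one shard, apply it, discard, repeat.

    Peak memory stays near the size of a single layer rather than the
    whole model, which is the core trick behind fitting a large model
    into a small GPU memory budget.
    """
    for p in shard_paths:
        with open(p, "rb") as f:
            weight = pickle.load(f)      # only this layer is resident now
        x = [weight * v for v in x]      # stand-in for the layer's forward pass
        del weight                       # freed before the next shard loads
    return x

with tempfile.TemporaryDirectory() as d:
    shards = save_layers([2.0, 3.0], d)
    print(run_layered_inference([1.0, 1.0], shards))  # [6.0, 6.0]
```

The trade-off is latency: every forward pass re-reads each shard from storage, so the approach exchanges memory footprint for disk I/O time.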

BIN  assets/airllm_logo.png  (new binary file, 32 KiB)

BIN  assets/airllm_logo_sm.png  (new binary file, 21 KiB)