Mirror of https://github.com/0xSojalSec/airllm.git, synced 2026-04-28 17:40:01 +00:00
airllm/rlhf at commit 6b1fcbfe8c6a280c15fcfaa21b3a3598ad8c988f
Latest commit: Yu Li, 6b1fcbfe8c, "wandb eval loss chart", 2023-07-01 17:50:54 -05:00
File                        Last commit message      Date
qlora_dpo.py                init dpo based rlhf      2023-06-29 16:08:59 -05:00
README.md                   readme                   2023-07-01 16:42:16 -05:00
RLHF.png                    RLHF graph               2023-06-29 17:49:06 -05:00
run_dpo_training.sh         init dpo based rlhf      2023-06-29 16:08:59 -05:00
wandb_eval_loss_chart.png   wandb eval loss chart    2023-07-01 17:50:54 -05:00
README.md
Anima: RLHF Based on QLoRA + DPO
Read this in English.
Contributing
Contributions to this project are welcome 🙏
If you like our project, please give it a ⭐!