LLM/airllm
Mirror of https://github.com/0xSojalSec/airllm.git (synced 2026-03-07 22:33:47 +00:00)
airllm/rlhf at commit 834d5e7aaf719808645b204dab8e5f93f20b583d
Latest commit: Yu Li, 834d5e7aaf, "cs", 2023-07-01 17:57:41 -05:00
| File | Last commit message | Date |
| --- | --- | --- |
| pre_post_dpo_model_output_belle_eval_1k.csv | cs | 2023-07-01 17:57:41 -05:00 |
| qlora_dpo.py | init dpo based rlhf | 2023-06-29 16:08:59 -05:00 |
| README.md | readme | 2023-07-01 16:42:16 -05:00 |
| RLHF.png | RLHF graph | 2023-06-29 17:49:06 -05:00 |
| run_dpo_training.sh | init dpo based rlhf | 2023-06-29 16:08:59 -05:00 |
| wandb_eval_loss_chart.png | wandb eval loss chart | 2023-07-01 17:50:54 -05:00 |
README.md
Anima: RLHF Based on QLoRA + DPO
Read this in English.
Contributing
Contributions to this project are welcome 🙏
If you like our project, please give it a ⭐!