Question about continued training from the post-SFT BiLLa weights #39

Open
Jiangchenglin521 opened this issue Jul 4, 2023 · 1 comment

Comments

@Jiangchenglin521

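1. (Off-topic) The restrictions around LLaMA seem to have relaxed, and Vicuna has already put its full weights on HF. Could the HF model skip the weight-masking step and simply be released as a directly usable version? That way users would not have to run the first-step convert themselves.
2. The training instructions only say that, to continue training from the SFT model, the weights must first be restored to the original BiLLa-LLaMA model file format, but no script is provided for this. Could you send me the post-SFT model files in a form that can be used for continued training? Or could you provide a tutorial for the reverse conversion? I would like to fine-tune on top of the stage-3 model.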

@YuxuanLei2000

Same question here. I would also like to continue fine-tuning the model.
