Resource requirements for full-parameter fine-tuning #37

Open
FloatingIsland2 opened this issue Jun 9, 2023 · 4 comments

Comments

@FloatingIsland2

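We are running the stage-2 full-parameter fine-tuning on 8×P100 and get this error: No pre-built kernel is found, build and load the cpu_adam kernel during runtime now. Does this mean we are short on GPU memory, or that the cards lack the compute?

A question for the author: does full-parameter fine-tuning only run on 8×V100 (32 GB), or would 8×V100 (16 GB) work as well? If it would, we can still swap cards. If not, do we need to follow DeepSpeed and switch to model parallelism?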

@Neutralzz
Owner

That means colossalai was not installed properly.
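For anyone checking their environment before reinstalling, here is a minimal sketch that reports which versions are currently installed. It assumes the distributions are named `colossalai` and `torch`. The helper `installed_version` is a hypothetical name for illustration. It only reads package metadata and does not verify that any kernels were built.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names are assumptions; adjust them to match your environment.
    for name in ("colossalai", "torch"):
        print(name, installed_version(name) or "not installed")
```

If either package shows up as not installed, or the two versions were installed at different times, that is a reasonable place to start when reinstalling.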

@Neutralzz
Owner

A 7B model loaded onto the GPU in fp16 already takes 14 GB of memory, so on a 16 GB V100 the batch_size can probably only be 1 or 2...
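The 14 GB figure follows from 2 bytes per fp16 parameter. Here is a rough sketch of that arithmetic, assuming exactly 7e9 parameters and decimal gigabytes (1e9 bytes). The function names are hypothetical, and it counts weights only.

```python
def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory taken by the model weights alone, in decimal GB (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def headroom_gb(gpu_gb: float, n_params: float, bytes_per_param: int = 2) -> float:
    """GPU memory left after the weights are loaded (weights only, nothing else counted)."""
    return gpu_gb - weights_gb(n_params, bytes_per_param)


if __name__ == "__main__":
    n_params = 7e9  # assumption: a "7B" model has exactly 7e9 parameters
    print(f"fp16 weights: {weights_gb(n_params):.1f} GB")
    for gpu_gb in (16, 32):
        print(f"{gpu_gb} GB card: {headroom_gb(gpu_gb, n_params):.1f} GB left")
```

Gradients, optimizer state, and activations all come on top of the weights, which is why so little headroom on a 16 GB card forces a very small batch_size.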

@FloatingIsland2
Author

> That means colossalai was not installed properly.

I see. I'll try reinstalling with a different version. Thanks~

@FloatingIsland2
Author

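> A 7B model loaded onto the GPU in fp16 already takes 14 GB of memory, so on a 16 GB V100 the batch_size can probably only be 1 or 2...

Got it. One more question: the stage-3 readme mentions that two A100s were used. Why does this stage need fewer cards than the others, and can it run on a single A100 40 GB?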
