GPU seems not used #39

Open
tppqt opened this issue Feb 12, 2025 · 2 comments
Comments

tppqt commented Feb 12, 2025

I'm fine-tuning Qwen2.5-VL-3B, but it is taking too long to run: about 4 hours per epoch. The GPU is clearly underutilized while the CPU is at almost 100% utilization. Is this normal? I'm using a single 4090 GPU for fine-tuning.

[Three images attached]

2U1 (Owner) commented Feb 12, 2025

It looks like you are using CPU offloading, which slows training down considerably.
You could adjust the settings in the DeepSpeed config file to move some of the offloaded layers back onto the GPU.
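
As a rough illustration of that adjustment, here is a minimal sketch that switches off selected ZeRO offload targets in a DeepSpeed config. The `example` dictionary and the helper name `reduce_cpu_offload` are hypothetical and are not taken from this repository's config files. The `offload_optimizer` and `offload_param` keys and the `"none"` device value follow DeepSpeed's documented ZeRO options, but check them against the DeepSpeed version you have installed.

```python
import copy
import json


def reduce_cpu_offload(ds_config, offload_optimizer=False, offload_param=False):
    """Return a copy of a DeepSpeed config with selected ZeRO offload targets disabled.

    Hypothetical helper: a target whose flag is False is switched to
    device "none", which keeps that state on the GPU.
    """
    cfg = copy.deepcopy(ds_config)  # leave the caller's config untouched
    zero = cfg.setdefault("zero_optimization", {})
    targets = (
        ("offload_optimizer", offload_optimizer),
        ("offload_param", offload_param),
    )
    for key, keep_on_cpu in targets:
        if key in zero and not keep_on_cpu:
            zero[key] = {"device": "none"}
    return cfg


# Assumed starting point: a ZeRO stage 3 config that offloads everything to CPU.
example = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    }
}

# Keep the optimizer state on the CPU, but bring the parameters back to the GPU.
gpu_params = reduce_cpu_offload(example, offload_optimizer=True)
print(json.dumps(gpu_params, indent=2))
```

Moving one target at a time makes it easier to see how much extra VRAM each change costs before an out-of-memory error appears.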

2U1 (Owner) commented Feb 12, 2025

The settings in the config files are not optimal for everyone. The less VRAM you use, the longer training takes, so you need to find a balance between the two.
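
To make the trade-off concrete, here is a back-of-the-envelope sketch of the memory needed for model states alone. It rests on several assumptions that are not stated in this thread: full fine-tuning (no LoRA), mixed-precision Adam at roughly 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states), and a round figure of 3 billion parameters. Activations and framework overhead are ignored, so real usage would be higher.

```python
def model_state_gib(num_params, bytes_per_param):
    """Memory for model states in GiB, given a per-parameter byte cost."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 3e9    # assumption: round figure for a "3B" model
GPU_VRAM_GIB = 24   # a single RTX 4090

# Assumed mixed-precision Adam breakdown (bytes per parameter).
weights_and_grads = model_state_gib(NUM_PARAMS, 2 + 2)
optimizer_states = model_state_gib(NUM_PARAMS, 12)
total = weights_and_grads + optimizer_states

print(f"fp16 weights + grads: {weights_and_grads:.1f} GiB")
print(f"fp32 optimizer states: {optimizer_states:.1f} GiB")
print(f"total model states:    {total:.1f} GiB (GPU has {GPU_VRAM_GIB} GiB)")
print("fits without offload:", total < GPU_VRAM_GIB)
```

Under these assumptions the optimizer states are the largest share, which is why offloading them saves the most VRAM and also why moving anything back to the GPU has to be done selectively.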
