GPU seems not used #39

tppqt · 2025-02-12T09:12:18Z

I'm fine-tuning Qwen2.5-VL-3B, but it is taking too long to run: about 4 hours per epoch. The GPU is clearly underutilized while the CPU is at almost 100% utilization. Is this normal? I'm using a single 4090 GPU for fine-tuning.

2U1 · 2025-02-12T09:19:58Z

It seems like you are using CPU offloading, which slows training down considerably.
You could adjust the settings in the DeepSpeed config file to put some layers back on the GPU.
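
As a rough sketch of what that adjustment could look like: DeepSpeed ZeRO configs control offloading through the `offload_optimizer` and `offload_param` entries under `zero_optimization`, each with a `device` of `"cpu"`, `"nvme"`, or `"none"`. The example config below is an assumption for illustration, not the actual file shipped with the repository, and `disable_offload` is a hypothetical helper name.

```python
import copy
import json

# Illustrative ZeRO config with CPU offloading enabled.
# The values here are assumptions, not the repository's actual config file.
example_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
}


def disable_offload(config, keep_optimizer_on_cpu=False):
    """Return a copy of `config` with ZeRO offloading switched off.

    With keep_optimizer_on_cpu=True only the parameters move back to the GPU,
    which is a middle ground when VRAM is tight.
    """
    new_config = copy.deepcopy(config)
    zero = new_config.get("zero_optimization", {})
    if "offload_param" in zero:
        zero["offload_param"]["device"] = "none"
    if "offload_optimizer" in zero and not keep_optimizer_on_cpu:
        zero["offload_optimizer"]["device"] = "none"
    return new_config


if __name__ == "__main__":
    # Print the adjusted config so it can be saved as a JSON config file.
    print(json.dumps(disable_offload(example_config), indent=2))
```

If training runs out of memory with everything on the GPU, trying `keep_optimizer_on_cpu=True` first is one option, since it keeps only the optimizer states on the CPU.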

2U1 · 2025-02-12T09:22:32Z

The settings in the config files are not optimal for everyone. The less VRAM you use, the longer training takes, so you need to balance the two.
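
To get a feel for that trade-off, here is a back-of-the-envelope estimate. It assumes full fine-tuning with Adam in mixed precision (which this thread does not confirm), uses the commonly cited 2 + 2 + 12 bytes per parameter for weights, gradients, and optimizer states, and takes 3 billion parameters from the model name rather than an exact count. Activations and framework overhead are ignored, so real usage would be higher; with LoRA or a similar method the numbers would be much smaller.

```python
# Approximate bytes per parameter for mixed-precision Adam training.
BYTES_PER_PARAM = {
    "fp16_weights": 2,
    "fp16_gradients": 2,
    "fp32_optimizer_states": 12,  # fp32 weight copy + Adam momentum + variance
}


def model_state_gb(num_params):
    """Return the estimated size in GB (1e9 bytes) of each model-state component."""
    return {name: num_params * size / 1e9 for name, size in BYTES_PER_PARAM.items()}


if __name__ == "__main__":
    num_params = 3e9   # assumption: rounded from the "3B" in the model name
    gpu_vram_gb = 24   # VRAM of a single RTX 4090

    states = model_state_gb(num_params)
    for name, gb in states.items():
        print(f"{name}: {gb:.1f} GB")

    total = sum(states.values())
    print(f"total model states: {total:.1f} GB vs {gpu_vram_gb} GB of VRAM")
```

Under these assumptions the model states alone exceed the VRAM of one 4090, which would explain why a default config offloads to the CPU. The optimizer states are the largest component, so they are usually the first candidate to leave on the CPU.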
