I'm fine-tuning Qwen2.5-VL-3B, but it's taking too long to run: about 4 hours per epoch. The GPU is clearly underutilized while the CPU is at almost 100% utilization. Is this normal? I'm using a single 4090 GPU for fine-tuning.
It looks like you are using CPU offloading, which slows training down considerably.
You can adjust the settings in the DeepSpeed config file to move some layers back onto the GPU.
The settings in the provided config files are not optimal for everyone. The less VRAM you use, the longer training takes, so you need to balance the two.
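As a rough illustration of that trade-off, here is a minimal Python sketch that takes a ZeRO config with CPU offload enabled and turns offload down. The sample config and the helper name `reduce_offload` are hypothetical and not taken from this repository's files. The keys `offload_optimizer`, `offload_param`, and the device value `"none"` are standard DeepSpeed ZeRO options. Whether the result fits in a 4090's 24 GB depends on batch size, sequence length, and whether you train full weights or LoRA, so treat it as a starting point to experiment with.

```python
import copy
import json

# Hypothetical ZeRO-3 config with full CPU offload.
# This is an assumed example, not the repository's actual config file.
ZERO3_OFFLOAD = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
}


def reduce_offload(config, keep_optimizer_offload=True):
    """Return a copy of `config` with parameter offload disabled.

    Optimizer-state offload is kept by default as a middle ground,
    since optimizer states usually take the most memory.
    """
    new_config = copy.deepcopy(config)  # leave the caller's dict untouched
    zero = new_config.get("zero_optimization", {})
    if "offload_param" in zero:
        zero["offload_param"] = {"device": "none"}  # keep parameters on the GPU
    if not keep_optimizer_offload and "offload_optimizer" in zero:
        zero["offload_optimizer"] = {"device": "none"}  # keep optimizer states on the GPU
    return new_config


if __name__ == "__main__":
    # Print the adjusted config so it can be saved as a new JSON file.
    print(json.dumps(reduce_offload(ZERO3_OFFLOAD), indent=2))
```

If training still runs out of memory with parameter offload disabled, going back to the offloaded config or lowering the batch size is the usual fallback. If it fits comfortably, try `keep_optimizer_offload=False` to remove the remaining CPU work.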