Hi, excellent work!
I want to conduct large-scale data experiments to evaluate the i2v effect. How can I implement distributed training across multiple machines?
Thanks
I guess if you wanted to do that with musubi, the best way would be to configure accelerate appropriately and then launch with a custom accelerate config. You can be guided through this setup with "accelerate config --config_file distributed.yaml", which will ask you questions about your cluster. Then, when launching musubi, change your command to "accelerate launch --config_file distributed.yaml hv_train_network.py etc...". Note, though, that under Features in the musubi readme.md it says "Multi-GPU support not implemented", so ymmv.
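For reference, here's a rough sketch of what the generated config and the per-node launch might look like for two machines with 8 GPUs each. The IP, port, and GPU counts below are placeholders, not values from musubi; the exact fields depend on the answers you give to "accelerate config".

```yaml
# distributed.yaml -- example multi-node config produced by `accelerate config`
# (all values are placeholders; adjust to your cluster)
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
machine_rank: 0                # 0 on the main node, 1 on the second node, etc.
main_process_ip: 192.168.1.10  # IP of the main node, reachable from every machine
main_process_port: 29500
num_machines: 2
num_processes: 16              # total GPUs across all machines (2 nodes x 8 GPUs)
mixed_precision: bf16
use_cpu: false
```

```bash
# Run the same command on every node; each node's copy of the config
# should differ only in machine_rank.
accelerate launch --config_file distributed.yaml hv_train_network.py ...
```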
Otherwise, take a look at https://github.com/tdrussell/diffusion-pipe, which is designed more around distributed training, whereas musubi is aimed more at overall memory efficiency on a single device.