The following line, which defines raw_model, runs only once, before the training loop begins:
raw_model = model.module if ddp else model  # always contains the "raw" unwrapped model
If I'm not mistaken, doesn't this create a small bug, because raw_model is never trained in the loop? Only model, which is either the plain model or a DDP()-wrapped model, is the instance that gets trained. Or is raw_model also updated when model is trained under DDP?
If this is a bug, then saving the model checkpoints is currently redundant, since only raw_model.state_dict() is saved each time, and that would be static.
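One way to probe this question is to check whether raw_model is a separate copy or just another name for the object that gets trained. The sketch below is a minimal illustration in plain Python, not real PyTorch: TinyModel and FakeDDP are hypothetical stand-ins, and the only assumption carried over is that the DDP wrapper keeps a reference to the wrapped model in its .module attribute. Since Python assignment binds a name to an existing object rather than copying it, an update made through model should also be visible through raw_model.

```python
class TinyModel:
    """Hypothetical stand-in for a model: holds a single weight."""

    def __init__(self):
        self.weight = 1.0

    def state_dict(self):
        return {"weight": self.weight}


class FakeDDP:
    """Hypothetical stand-in for a DDP wrapper (assumption: not real DDP)."""

    def __init__(self, module):
        self.module = module  # stores a reference to the model, not a copy


def train_step(model, ddp, lr=0.5, grad=1.0):
    # "Training" always goes through `model`, never through `raw_model`.
    target = model.module if ddp else model
    target.weight -= lr * grad


def check_alias(ddp):
    model = TinyModel()
    if ddp:
        model = FakeDDP(model)
    # Same expression as in the question, evaluated once before "training".
    raw_model = model.module if ddp else model
    before = raw_model.state_dict()["weight"]
    train_step(model, ddp)
    after = raw_model.state_dict()["weight"]
    same_object = raw_model is (model.module if ddp else model)
    return before, after, same_object


if __name__ == "__main__":
    for ddp in (False, True):
        print(ddp, check_alias(ddp))
```

In this sketch raw_model sees the updated weight in both cases, because it refers to the same object that the training step modifies. Whether the real training script behaves the same way depends on model not being rebound to a new object after raw_model is defined.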
As a suggestion, would it be more correct to use: