Quick request if possible. loss/epoch seems to be calculated as the total average loss since the start of training. Is it possible to update this to record the average step loss within each epoch instead? I think this would better guide everyone on how their latest model is performing, and it better matches what people expect this value to contain.
E.g., something like:
for epoch in range(epoch_to_start, num_train_epochs):
    epoch_losses = []  # Reset each epoch
    for step, batch in enumerate(train_dataloader):
        # ... training loop ...
        current_loss = loss.detach().item()
        epoch_losses.append(current_loss)
        # ... other code ...

    # After all steps in the epoch, calculate the epoch loss
    avg_epoch_loss = sum(epoch_losses) / len(epoch_losses) if len(epoch_losses) > 0 else 0.0
    if len(accelerator.trackers) > 0:
        logs = {"loss/epoch": avg_epoch_loss}
        accelerator.log(logs, step=epoch + 1)
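To illustrate why this matters, here is a minimal self-contained sketch contrasting the two metrics. The step losses are made-up numbers purely for illustration; once losses start dropping, the running average since the start of training lags well behind the average of the latest epoch:

```python
# Hypothetical per-step losses for two epochs (illustrative values only).
step_losses_by_epoch = [
    [4.0, 3.0],  # epoch 1
    [2.0, 1.0],  # epoch 2
]

all_losses = []       # every step loss since the start of training
running_avgs = []     # current behavior: average over all steps so far
epoch_avgs = []       # requested behavior: average over this epoch's steps

for losses in step_losses_by_epoch:
    all_losses.extend(losses)
    running_avgs.append(sum(all_losses) / len(all_losses))
    epoch_avgs.append(sum(losses) / len(losses))

for epoch, (r, e) in enumerate(zip(running_avgs, epoch_avgs), start=1):
    print(f"epoch {epoch}: running_avg={r}, epoch_avg={e}")
```

By epoch 2 the running average (2.5) still includes the high early losses, while the per-epoch average (1.5) reflects how the latest model is actually doing.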