How to do custom EarlyStopping? ❓ [QUESTION] #380
Comments
Hi @ThePauliPrinciple, thanks for your nice question and work with our code! Re
I've just added support for custom early stopping conditions on branch: https://github.com/mir-group/nequip/tree/feature-custom-early-stop with an example at https://github.com/mir-group/nequip-example-extension/tree/earlystop. Please give this a try and let me know if it works for you, and I'll merge it down. If this doesn't fully solve the issue (or even if it does), it might be a more complicated workflow than I'm anticipating, and maybe we should have a quick call to discuss; please feel free to send me an email at the address listed in my profile.
This looks good to me. Passing the trainer object to the stopper might be useful to some, although for my use case I am only interested in "external" information. I'm not exactly certain what the comment about restarting means; in particular, when is a stopper considered "stateful"? The original early stopper also returned values to immediately debug/print; maybe that's also nice to add.
Great! A stopper is "stateful" when it maintains a state like, say, how many epochs the validation loss hasn't improved (like the patience setting) or what the minimum observed value was (see https://github.com/mir-group/nequip/blob/feature-custom-early-stop/nequip/train/early_stopping.py#L120-L121). If it only depends on the current arguments to the object, and not any state stored in your custom object, then it's not stateful. (State of the trainer, if that was passed in, will be correctly preserved across restarts.)
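To make the distinction concrete, here is a minimal sketch of the two kinds of stopper. The class names, the call signatures, and the state_dict / load_state_dict methods are all hypothetical and chosen for illustration; they are not the interface of the feature branch.

```python
import os
from typing import Any, Dict


class FileExistsStopper:
    """Stateless (hypothetical): the answer depends only on what is true right now."""

    def __init__(self, path: str) -> None:
        self.path = path

    def __call__(self) -> bool:
        # Nothing is remembered between calls, so a restart changes nothing.
        return os.path.exists(self.path)


class PatienceStopper:
    """Stateful (hypothetical): remembers the minimum seen and epochs since it improved."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.minimum = float("inf")
        self.counter = 0

    def __call__(self, value: float) -> bool:
        if value < self.minimum:
            self.minimum = value
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience

    def state_dict(self) -> Dict[str, Any]:
        # This is what would have to be saved for a restart to behave identically.
        return {"minimum": self.minimum, "counter": self.counter}

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        self.minimum = state["minimum"]
        self.counter = state["counter"]
```

If the counter and minimum of the second class are not saved and restored, a restarted run starts its patience count from zero again, which is the restart concern in a nutshell.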
What do you mean, exactly?
Here debug_args is returned (nequip/nequip/train/early_stopping.py, line 98 in c56f48f), which is printed to the log (nequip/nequip/train/trainer.py, lines 874 to 882 in c56f48f).
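As a rough illustration of that pattern, a stopper can hand back a debug string on every call alongside its decision, and the training loop can log it. This is only a sketch: the three-element return value, the ThresholdStopper class, and the end_of_epoch helper are assumptions made for the example, not a copy of the referenced lines.

```python
import logging
from typing import Dict, Tuple

logger = logging.getLogger("early_stop_sketch")


class ThresholdStopper:
    """Hypothetical stopper: stop once a metric drops below a lower bound."""

    def __init__(self, key: str, lower_bound: float) -> None:
        self.key = key
        self.lower_bound = lower_bound

    def __call__(self, metrics: Dict[str, float]) -> Tuple[bool, str, str]:
        value = metrics[self.key]
        # Built on every call so the log shows how close we are to stopping.
        debug_args = f"{self.key} = {value:.4g} (lower bound {self.lower_bound:.4g})"
        if value < self.lower_bound:
            return True, f"Early stopping: {self.key} is below {self.lower_bound:.4g}", debug_args
        return False, "", debug_args


def end_of_epoch(stopper: ThresholdStopper, metrics: Dict[str, float]) -> bool:
    """Hypothetical loop hook: log the debug string, report the stop reason if any."""
    stop, stop_args, debug_args = stopper(metrics)
    logger.debug(debug_args)
    if stop:
        logger.info(stop_args)
    return stop
```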
I would like to do some custom early stopping (e.g. based on a file existing, or checking if I get close to a walltime on a compute cluster).
Is there some way to specify a custom early stopping class?
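For reference, the walltime case could look roughly like the sketch below. It is independent of nequip: WalltimeCondition, its parameters, and the injectable clock are all made up for this example, and the margin would have to be chosen to cover at least one epoch plus checkpointing.

```python
import time
from typing import Callable


class WalltimeCondition:
    """Hypothetical condition: true once the job is within `margin_seconds` of its limit."""

    def __init__(
        self,
        limit_seconds: float,
        margin_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # The clock is injectable so the condition can be tested without waiting.
        self.clock = clock
        self.deadline = clock() + limit_seconds - margin_seconds

    def __call__(self) -> bool:
        return self.clock() >= self.deadline


# Example: a 24 h job that should stop cleanly 15 minutes before the limit.
condition = WalltimeCondition(limit_seconds=24 * 3600, margin_seconds=15 * 60)
```

The condition would then be evaluated once per epoch by whatever hook the trainer offers for custom stopping.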
I tried using the early_stopping and early_stopping_conds arguments of the trainer (or in the config.yaml), but could not make anything happen.
I was able to accomplish what I wanted through an on-end-epoch callback.
But it seems rather hackish (you can't set trainer.stop_cond directly because it is a property without a setter).
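The setter problem is plain Python behaviour and can be reproduced without nequip. In the sketch below, ToyTrainer, its _stop_requested flag, and the request_stop method are hypothetical stand-ins; the real trainer computes its stop condition differently.

```python
class ToyTrainer:
    """Hypothetical stand-in for a trainer exposing a read-only stop_cond."""

    def __init__(self) -> None:
        self._stop_requested = False

    @property
    def stop_cond(self) -> bool:
        # Defined with @property only, so there is no setter.
        return self._stop_requested

    def request_stop(self) -> None:
        # One possible explicit way to let outside code ask for a stop.
        self._stop_requested = True


trainer = ToyTrainer()
try:
    trainer.stop_cond = True  # assignment to a property without a setter
except AttributeError as err:
    print(f"cannot assign: {err}")

trainer.request_stop()
print(trainer.stop_cond)
```

A callback that reaches into a private attribute to get the same effect works, but it depends on internals that may change, which is why it feels hackish.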