
[Feature Request] L1 L2 Regularization #4415

Open
well-zt opened this issue Nov 25, 2024 · 3 comments

Comments


well-zt commented Nov 25, 2024

Summary

Hi, I'm using DeePMD-kit, but I encountered an overfitting problem during training. How can I add L1/L2 regularization to DeePMD-kit?

Detailed Description

Same as the summary above.

Further Information, Files, and Links

No response


caic99 commented Dec 9, 2024

Hi @well-zt,
You can try setting the gradient_max_norm parameter in input.json. The related code is here:

if self.gradient_max_norm > 0.0:
    torch.nn.utils.clip_grad_norm_(
        self.wrapper.parameters(),
        self.gradient_max_norm,
        error_if_nonfinite=True,
    )

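A minimal sketch of where this could go in input.json; the placement under the training section is an assumption based on the clipping code above (check the DeePMD-kit training docs for the authoritative schema), and the value is only illustrative:

"training": {
    "gradient_max_norm": 5.0
}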

@QuantumMisaka

@caic99 L1/L2 regularization is common practice in ML training, but not in MLIP training. Should we document in the DeePMD-kit docs how (and why) to use regularization in the loss function?


caic99 commented Dec 25, 2024

> @caic99 L1/L2 regularization is common practice in ML training, but not in MLIP training. Should we document in the DeePMD-kit docs how (and why) to use regularization in the loss function?

@QuantumMisaka I would like to correct my reply above: what I meant was the L2 norm used for gradient clipping, which is not the same thing as what the original poster asked about.

> how (and why) to use regularization in the loss function

Unfortunately, I have no experience with using regularization for MLIPs.
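For readers looking for a starting point, here is a generic PyTorch sketch of adding L1/L2 penalties to a training loss. This is not DeePMD-kit's API; add_l1_l2_penalty, the toy model, and the lambda values are all illustrative placeholders:

import torch

def add_l1_l2_penalty(loss, model, l1_lambda=0.0, l2_lambda=0.0):
    # Add optional L1 and L2 penalties over all model parameters to an existing loss.
    l1 = sum(p.abs().sum() for p in model.parameters())
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    return loss + l1_lambda * l1 + l2_lambda * l2

# Toy usage with a linear model and MSE loss; the lambdas are illustrative only.
model = torch.nn.Linear(4, 1)
x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss = add_l1_l2_penalty(loss, model, l1_lambda=1e-5, l2_lambda=1e-4)
loss.backward()

For plain L2 regularization, the weight_decay argument of the standard PyTorch optimizers (e.g. torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)) is the more common route.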
