Atom virial and stress calculation for virial calculation parallelism of Allegro in Lammps #281
base: develop
Conversation
Parallelism of the virial calculation in LAMMPS.
```python
data = AtomicDataDict.with_edge_vectors(data, with_lengths=False)
data[AtomicDataDict.EDGE_VECTORS_KEY].requires_grad_(True)
```
👍
nequip/nn/_grad_output.py
Outdated
```python
edge_virial_2 = torch.einsum(
    "zi,zj->zji", vector_force, data[AtomicDataDict.EDGE_VECTORS_KEY]
)
```
Is there a reason not to just use `edge_virial.transpose(-1, -2)` and avoid recomputing the einsum?
Thanks for the suggestion; we can just use the transpose.
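The suggested equivalence is easy to verify. A quick NumPy sketch (NumPy standing in for `torch.einsum`/`transpose` here; array names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
vector_force = rng.standard_normal((5, 3))   # per-edge force-like vectors
edge_vectors = rng.standard_normal((5, 3))   # per-edge displacement vectors

# The plain outer product per edge
edge_virial_1 = np.einsum("zi,zj->zij", vector_force, edge_vectors)
# The same outer product with swapped output indices, as in the PR
edge_virial_2 = np.einsum("zi,zj->zji", vector_force, edge_vectors)

# The second einsum is just the transpose of the first over the last two axes
assert np.allclose(edge_virial_2, np.swapaxes(edge_virial_1, -1, -2))
```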
nequip/nn/_grad_output.py
Outdated
```python
edge_virial_2 = torch.einsum(
    "zi,zj->zji", vector_force, data[AtomicDataDict.EDGE_VECTORS_KEY]
)
edge_virial = (edge_virial_1 + edge_virial_2) / 2
```
This guarantees a symmetric virial?
That's right.
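Since `edge_virial_2` is just the transpose of `edge_virial_1`, averaging the two takes the symmetric part of the outer product. A NumPy sketch of that claim (NumPy standing in for torch; names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal((4, 3))   # per-edge force conjugates
r = rng.standard_normal((4, 3))   # per-edge vectors

edge_virial_1 = np.einsum("zi,zj->zij", f, r)
edge_virial_2 = np.einsum("zi,zj->zji", f, r)
edge_virial = (edge_virial_1 + edge_virial_2) / 2

# The average of a matrix and its transpose is symmetric by construction
assert np.allclose(edge_virial, np.swapaxes(edge_virial, -1, -2))
```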
nequip/nn/_grad_output.py
Outdated
```python
)
virial = scatter(atom_virial, batch, dim=0, reduce="sum")

# virial = grads[1]
```
```python
# virial = grads[1]
```
```python
if has_cell:
    orig_cell = data[AtomicDataDict.CELL_KEY]
    # Make the cell per-batch
    cell = orig_cell.view(-1, 3, 3).expand(num_batch, 3, 3)
    data[AtomicDataDict.CELL_KEY] = cell
else:
    # torchscript
    orig_cell = self._empty
    cell = self._empty
```
This business is all unnecessary now, right? Since there is no derivative w.r.t. the cell, the cell doesn't need to be made per-batch-entry.
Yes. The cell here is only used to compute the stress and for training with stress.
```python
    torch.cross(cell[:, 1, :], cell[:, 2, :], dim=1),
).unsqueeze(-1)
stress = virial / volume.view(-1, 1, 1)
data[AtomicDataDict.CELL_KEY] = orig_cell
```
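The volume here is the scalar triple product of the cell rows, a · (b × c), which equals the determinant of the cell matrix. A NumPy sketch of the same computation (NumPy standing in for the torch calls; values illustrative):

```python
import numpy as np

cell = np.array([[[2.0, 0.0, 0.0],
                  [0.0, 3.0, 0.0],
                  [0.0, 0.0, 4.0]]])   # batch of one orthorhombic cell
virial = np.ones((1, 3, 3))            # placeholder virial

# volume = a . (b x c): scalar triple product of the cell rows
volume = np.einsum(
    "zi,zi->z", cell[:, 0, :], np.cross(cell[:, 1, :], cell[:, 2, :])
)
# Equivalent to the determinant of each cell matrix
assert np.allclose(volume, np.linalg.det(cell))

stress = virial / volume.reshape(-1, 1, 1)
```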
```python
data[AtomicDataDict.CELL_KEY] = orig_cell
```
```python
# they say the standard convention is virial = -stress x volume
# looking above this means that we need to pick up another negative sign
# for the virial to fit this equation with the stress computed above
virial = torch.neg(virial)
```
Would need to confirm correctness of sign convention here...
Agreed. I didn't check it and just inherited it from StressForceOutput.
```python
if not did_pos_req_grad:
    # don't give later modules one that does
    pos.requires_grad_(False)
```
Should also reset `requires_grad` on the edge vectors.
```python
atoms.calc = SinglePointCalculator(
    energy=np.random.random(),
    forces=np.random.random((len(atoms), 3)),
    stress=np.random.random(6),
    magmoms=None,
    atoms=atoms,
)
```
This fake data is never used, right?
Right. This part was just inherited from the molecule tests. We can just delete it.
What is the status of this? If it is working, I do have an immediate use / test case for it and am very interested.
Hi @tgmaxson, I'm not sure about the status of this implementation; our implementation uses the existing approach. If you have further questions about that, please feel free to open a thread in the discussions.
@Linux-cpp-lisp Hi, as you can see, the tests all pass in the Python interface (consistent results with StressOutput with a cell) and in the LAMMPS interface (consistent results between Python and LAMMPS across different numbers of ranks). Based on these tests and my own experience and checks, I am quite sure this PR is reliable and suitable for Allegro NPT simulations in parallel, so I don't think this PR or the pair_allegro PR needs more changes. I have also tried StressOutput with Allegro in LAMMPS in parallel, but it gives inconsistent forces and stress caused by the pair_allegro interface. @tgmaxson You can give it a try by using nequip and pair_allegro with my modification and training with ParaStressForceOutput, which I have tested a lot and been using for a while; things go well on my side. The analysis of my implementation can be seen on page 11 of this preprint. Feel free to reach out if you meet any problems.
Hi @Hongyu-yu, ah great, thanks for the update!
Interesting, was this with the `stress` branch?
Hi @Linux-cpp-lisp, right, I tested with the stress branch and it failed. I tested it again just now (see here). Hope that this PR helps.
@Hongyu-yu I see, thanks for clarifying and the quick responses! I'm hoping to find enough time to look through all this carefully this week.
@Linux-cpp-lisp Thanks for your work :)
Thanks for sharing your thoughts. The implementation works fine for both Allegro and NequIP, and for Allegro, which is strictly local, the atomic virial could well be correct; I think it's worthwhile to give it a try.
Hello everyone, sorry for barging in on this PR with some PR 😄, but I thought I'd highlight some developments that may be of interest to this discussion! I talked to Simon and Anders a while ago about these topics as well. Regarding the heat flux, we wrote a small article a few weeks ago that confirms @brucefan1983's suspicion: message passing makes the heat flux a bit more complicated, especially with AD. Luckily there's a fairly straightforward way to implement it anyway with AD. We also just released another article that goes into a bit more implementation detail (example implementation in JAX) and also discusses computing the stress (and atomic stresses) with AD. It should be fairly straightforward to apply this heat flux to NequIP (or Allegro, where the "local" version is even faster). Let me know if I can help with that! Cheers!
Hi @sirmarcel , |
Hey, the equations in the paper do look fine, but for the heat flux you need to be rather careful about indexing, as @brucefan1983 pointed out above: the heat flux is typically computed so that the index of the atomic energy is not the same as the index on the velocity. I'm not sure what happens if you do the averaging you describe in the text after eq. 9. Is there any reason you do it this way?
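The equation originally embedded in this comment was not preserved by the page scrape; a hedged reconstruction of the indexing point (notation assumed, following the virial-based heat-flux convention of Fan et al.) might read:

```latex
% Potential part of the heat flux: the atomic-energy index j
% must differ from the velocity index i in the cross terms
\mathbf{J}_{\mathrm{pot}}
  = \sum_{i} \sum_{j \neq i} \mathbf{r}_{ij}
    \left( \frac{\partial U_{j}}{\partial \mathbf{r}_{ji}} \cdot \mathbf{v}_{i} \right)
```

The key feature is that the atomic energy carries index j while the velocity carries index i.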
Hi @sirmarcel , |
Sorry, I'm not so familiar with the internals here; I can guess what the

```python
scatter(
    edge_virial,
    data[AtomicDataDict.EDGE_INDEX_KEY][1],
    dim=0,
    reduce="sum",
    dim_size=len(pos),
)
```

in your code is doing, but I'm not sure, as I don't really know what the index arrays are, etc. In all cases you should get the same total stress tensor for the system.
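For readers unfamiliar with it: `scatter(..., reduce="sum")` sums each edge's contribution onto the receiving atom listed in `edge_index[1]`. A NumPy sketch of the same operation (`np.add.at` standing in for the torch_scatter call; all values illustrative):

```python
import numpy as np

num_atoms = 3
# edge_index[1]: receiving atom of each edge
receiver = np.array([0, 0, 1, 2, 2])
# one 3x3 virial contribution per edge (constant here for clarity)
edge_virial = np.ones((5, 3, 3))

# Equivalent of scatter(edge_virial, receiver, dim=0, reduce="sum",
#                      dim_size=num_atoms)
atom_virial = np.zeros((num_atoms, 3, 3))
np.add.at(atom_virial, receiver, edge_virial)

# atoms 0 and 2 each receive two edges, atom 1 receives one
assert np.allclose(atom_virial[0], 2.0)
assert np.allclose(atom_virial[1], 1.0)
assert np.allclose(atom_virial[2], 2.0)
```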
Hello, sorry for disturbing again. Thanks @sirmarcel for confirming that the heat current for message-passing constructions is a little more complicated than for "local" machine-learned potentials. I also read your preprints and they look very nice. I think the current parallel version of the virial by @Hongyu-yu should be correct, as he confirmed the consistency between the two approaches (system virial and atomic virial). I suggest merging this nice work into the master branch. This feature is crucial for efficient NPT simulations and the like in LAMMPS. I have seen this PR sit here for a long time; why not move on? The heat current is more involved and can be left for future work. One could put up a warning message, though.
Hi @sirmarcel , |
Hi @brucefan1983, |
Hello @Hongyu-yu, and thanks a lot for your implementation of the per-atom virial stress calculation. I am using your
I am training the potential using the following yaml with your new Allegro model
What I get as an output for the global
and the per-atom stresses are always zero:
Probably I haven't understood something correctly about the functionality in LAMMPS.
Hi @enikidis,
Hi @Hongyu-yu, but when I tried to compile your code I got this error message:
Can you let me know how to fix it? Thanks again!
Hi, @pyungjinpark |
Thank you for your quick reply @Hongyu-yu! I didn't change your source file; I will try it again. I also tried a 2000 K NPT simulation of silicon with pair_nequip based on your code (you can check it in my repository https://github.com/pyungjinpark/pair_nequip). It gives me the atomic pressure well, but when I sum it, it differs from the total pressure
where Press is the total pressure and v_press is the sum of the atomic pressures. When I tested the same thing in QUIP, it gave the same total press and v_press. Do you have any clue?
Just a quick thought: does the pressure in this test include the ideal-gas contribution from the velocities? This may cause deviations, so make sure you are calculating things consistently (which is the usual mistake I make when moving from LAMMPS to analysis).
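The point about the ideal-gas term can be sketched as follows (NumPy, reduced units, all names and values illustrative): summing only per-atom virials omits the kinetic contribution to the total pressure.

```python
import numpy as np

rng = np.random.default_rng(2)
N, V = 8, 100.0
m = np.ones(N)                         # masses (reduced units)
v = rng.standard_normal((N, 3))        # velocities
W = rng.standard_normal((N, 3, 3))     # per-atom virial tensors

# Virial-only pressure: average trace of the summed per-atom virials
p_virial = np.trace(W.sum(axis=0)) / (3.0 * V)
# Kinetic (ideal-gas) contribution: sum of m v.v over atoms, / (3 V)
p_kinetic = (m * (v * v).sum(axis=1)).sum() / (3.0 * V)

p_total = p_virial + p_kinetic
# Summing only the per-atom virials misses the (positive) kinetic term
assert p_kinetic > 0.0
assert not np.isclose(p_total, p_virial)
```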
Hi @pyungjinpark , |
It is solved! It shows some error (not exactly the same) but works well when I fix the sign.
@pyungjinpark Looks great! How about the results when only changing the sign of the atomic virial? The atomic virial is not required to be symmetric.
Hi @Hongyu-yu! The virial in LAMMPS is defined per atom, and in the case of the heat-flux vector the per-atom stress enters through a term of the form S_i · v_i. Maybe LAMMPS assumes it is symmetric. If I understand correctly, you used edge data for the virial, so it might give symmetric information, doesn't it?
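The concern can be sketched using the form of the LAMMPS compute heat/flux, J = (1/V)[Σᵢ eᵢvᵢ − Σᵢ Sᵢ·vᵢ] (symbols and values here are illustrative, not taken from this PR): if only the symmetric part of the per-atom virial is kept, the potential term of the heat flux generally changes.

```python
import numpy as np

rng = np.random.default_rng(3)
N, V = 5, 50.0
e = rng.standard_normal(N)              # per-atom energies
v = rng.standard_normal((N, 3))         # per-atom velocities
S = rng.standard_normal((N, 3, 3))      # full (possibly asymmetric) per-atom stress
S_sym = 0.5 * (S + np.swapaxes(S, -1, -2))

def heat_flux(e, v, S, V):
    # J = (1/V) [ sum_i e_i v_i - sum_i S_i . v_i ]
    return (np.einsum("i,ij->j", e, v) - np.einsum("ijk,ik->j", S, v)) / V

# Symmetrizing the per-atom stress changes the heat flux in general
assert not np.allclose(heat_flux(e, v, S, V), heat_flux(e, v, S_sym, V))
```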
Hi @pyungjinpark ,
|
Here are the exact lines of QUIP, @Hongyu-yu: the block starting with `if (vflag_atom) {`. You can check it in lammps/src/ML-QUIP/pair_quip.cpp. @brucefan1983 is right; I only considered a pair-atomic-like virial. I will change the code!
Hi, @pyungjinpark |
ParaStressForceOutput
Support for atom virial and virial parallelism in LAMMPS.
Description
Another implementation of the stress-calculation algorithm that can also compute per-atom virials; it should help with parallelism of Allegro in LAMMPS.
PBC systems are used for tests.
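The edge-based algorithm described above can be sketched end to end (NumPy standing in for torch and torch_scatter; all names and values illustrative): per-edge virials are symmetrized outer products of edge vectors and their force conjugates, scattered onto receiving atoms; summing the per-atom virials recovers the total.

```python
import numpy as np

rng = np.random.default_rng(4)
num_atoms, num_edges = 4, 10
edge_index = rng.integers(0, num_atoms, size=(2, num_edges))
vector_force = rng.standard_normal((num_edges, 3))   # dE/d(edge_vector)
edge_vectors = rng.standard_normal((num_edges, 3))

# Per-edge virial: symmetrized outer product of force conjugate and edge vector
ev = np.einsum("zi,zj->zij", vector_force, edge_vectors)
edge_virial = 0.5 * (ev + np.swapaxes(ev, -1, -2))

# Per-atom virial: sum each edge's contribution onto its receiving atom
atom_virial = np.zeros((num_atoms, 3, 3))
np.add.at(atom_virial, edge_index[1], edge_virial)

# The total virial from atoms equals the total from edges
assert np.allclose(atom_virial.sum(axis=0), edge_virial.sum(axis=0))
```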
Motivation and Context
Resolves: pair_allegro#9
Enable Allegro NPT simulations.
How Has This Been Tested?
Stress and virial results of ParaStressForceOutput and StressForceOutput are consistent. Tests are carried out on a PBC system with a cell.
Just replace StressForceOutput with ParaStressForceOutput :) See test_stress in nequip/tests/unit/model/test_nequip_model.py.
Types of changes
Checklist:
- My code has been formatted with `black`.
- The option documentation (`docs/options`) has been updated with new or changed options.
- `CHANGELOG.md` has been updated.