
LAMMPS inference is slow, but running the model from Python is fast #57

Open · WorldPeace123 opened this issue Jan 7, 2025 · 3 comments


@WorldPeace123

In pair_allegro_kokkos.cpp, line 296, the model takes the input and, after the compute, returns the output to LAMMPS. I timed `auto output = this->model.forward(input_vector).toGenericDict();`: on an RTX 3060, one step takes 0.4 s. I also wrote a Python script that loads the model and feeds it the input, and there one step takes only 0.05 s!
So what is the problem in LAMMPS, and how can the model inference be made faster?
Thanks!
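
For reference, a minimal standalone sketch (not the actual pair_allegro code) of timing the forward call from C++ with explicit CUDA synchronization, assuming libtorch; the model path, input key, and tensor shape below are placeholders, not the real Allegro interface:

```cpp
#include <chrono>
#include <iostream>
#include <vector>
#include <torch/script.h>
#include <torch/cuda.h>

int main() {
  torch::jit::Module model = torch::jit::load("deployed.pth");
  model.to(torch::kCUDA);
  model.eval();

  // Hypothetical input dict; the real pair style builds this from the
  // LAMMPS neighbor list.
  c10::Dict<std::string, torch::Tensor> dict;
  dict.insert("example_input", torch::rand({100, 3}, torch::kCUDA));
  std::vector<torch::jit::IValue> input_vector{dict};

  for (int step = 0; step < 10; ++step) {
    torch::cuda::synchronize();  // make sure no earlier GPU work is pending
    auto t0 = std::chrono::steady_clock::now();
    auto output = model.forward(input_vector).toGenericDict();
    torch::cuda::synchronize();  // wait until the forward pass has finished
    std::chrono::duration<double> dt = std::chrono::steady_clock::now() - t0;
    std::cout << "model.forward Time is : " << dt.count() << " s\n";
  }
  return 0;
}
```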

@WorldPeace123 (Author)

1. As we all know, running a scripted model from C++ should give good performance, but with this model it seems bad.
2. Netron can show a model's input shape for models such as ResNet-50, but deployed.pth cannot be opened in Netron. If I knew the true input shape I could use many tools to make the model run fast, but a TorchScript model is different. So maybe changing how the model is saved would help? (See the sketch after this list.)
3. There are now many AI compiler tools, but this model cannot be handed to an AI compiler to get better performance :-(
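
On point 2: a hedged sketch of how to inspect a TorchScript file's inputs without Netron, using standard `torch.jit` APIs; `deployed.pth` is the file name from this thread:

```python
import torch

model = torch.jit.load("deployed.pth", map_location="cpu")
print(model.graph)  # typed IR graph, including the forward() input signature
print(model.code)   # Python-like source of forward()
for name, p in model.named_parameters():
    print(name, tuple(p.shape))  # parameter shapes, if the inputs are unclear
```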

@Linux-cpp-lisp (Collaborator)

The TorchScript compiler can take a few iterations to warm up, which often manifests in very slow time steps initially.
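
A minimal sketch of that warm-up, assuming a TorchScript model loaded with `torch.jit.load` (path and input are placeholders): run a few untimed forward passes before measuring.

```python
import torch

model = torch.jit.load("deployed.pth", map_location="cuda")
inputs = {"example_input": torch.rand(100, 3, device="cuda")}  # placeholder

# Untimed warm-up passes: the TorchScript profiling executor specializes
# and fuses the graph during the first few calls.
for _ in range(5):
    model(inputs)
torch.cuda.synchronize()  # start timing real steps only after this point
```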

@WorldPeace123 (Author)

> The TorchScript compiler can take a few iterations to warm up, which often manifests in very slow time steps initially.

I know, my friend, but in pair_allegro.cpp the average time of `auto output = this->model.forward(input_vector).toGenericDict()` is still 0.4 s after the warm-up.

I have the logs; can you help me work out why running the model from C++ is not faster than running it from Python?
Thanks!!!
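
For comparison, here is one possible reconstruction of the Python timing loop behind the log below (the original script was not posted; the model path and input dict are placeholders). One common pitfall with such comparisons: CUDA kernel launches are asynchronous, so without `torch.cuda.synchronize()` a Python timer can measure only the launch overhead rather than the GPU work itself.

```python
import time
import torch

# Placeholders: the real deployed.pth expects the dict of tensors that
# pair_allegro builds from the neighbor list, not this dummy input.
model = torch.jit.load("deployed.pth", map_location="cuda")
inputs = {"example_input": torch.rand(100, 3, device="cuda")}

for step in range(10):
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    out = model(inputs)
    torch.cuda.synchronize()  # wait for the GPU, so the time covers real work
    print(f"Time :{time.perf_counter() - t0:.4f}s")
```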

This is the Python log (torch.jit.load the model and run it):
```
/home/worldpeace/anaconda3/envs/tvm/lib/python3.11/site-packages/nequip/__init__.py:20: UserWarning: !! PyTorch version 2.5.1 found. Upstream issues in PyTorch versions 1.13.* and 2.* have been seen to cause unusual performance degredations on some CUDA systems that become worse over time; see mir-group/nequip#311. The best tested PyTorch version to use with CUDA devices is 1.11; while using other versions if you observe this problem, an unexpected lack of this problem, or other strange behavior, please post in the linked GitHub issue. warnings.warn(
Time :0.6278s
Time :0.5497s
Time :1.6636s
Time :0.3279s
Time :0.0221s
Time :0.0236s
Time :0.0239s
Time :0.0243s
Time :0.0231s
Time :0.0288s
```

This is the pair_allegro log:
```
Per MPI rank memory allocation (min/avg/max) = 5.31 | 5.31 | 5.31 Mbytes
Step Time PotEng KinEng TotEng Temp Press Volume Density
0 0 -5115.3514 247.66244 -4867.6889 1000 124912.37 35001.599 9.8101743
model.forward Time is : 0.445529 s
toTensor().cpu Time is : 0.000252 s
Pair All Time is : 0.452624 s
model.forward Time is : 3.49675 s
toTensor().cpu Time is : 7.3e-05 s
Pair All Time is : 3.50579 s
model.forward Time is : 0.283893 s
toTensor().cpu Time is : 0.000103 s
Pair All Time is : 0.286683 s
model.forward Time is : 0.282643 s
toTensor().cpu Time is : 6e-05 s
Pair All Time is : 0.290819 s
model.forward Time is : 0.283949 s
toTensor().cpu Time is : 0.000106 s
Pair All Time is : 0.28688 s
model.forward Time is : 0.282495 s
toTensor().cpu Time is : 5.4e-05 s
Pair All Time is : 0.286974 s
model.forward Time is : 0.282829 s
toTensor().cpu Time is : 0.000113 s
```
