Add functionality to support the deep potential #827

Open. Kick-H wants to merge 70 commits into base: master

Kick-H (Contributor) commented Dec 15, 2024

Summary

I wrote a preliminary version and tested it on a water system. In the comparison, the forces and energies showed no problems. The detailed installation and testing process is given below.

The most complicated and time-consuming part of this version is probably building the neighbor list. I used the simplest method: mirror the central cell completely to obtain the atoms of all 27 boxes (the central box plus its 26 periodic images, boundary conditions p p p), and then solve for the atoms in the central box (the real atoms).
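
A minimal sketch of the mirroring step described above, assuming an orthogonal box given by its three edge lengths. The names `ImageAtoms` and `build_images` are hypothetical and are not taken from deepmd.cu; the code in this PR may organize the data differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: all atoms of the central cell and its 26 images.
struct ImageAtoms {
  std::vector<double> coord;       // x, y, z of every image atom, flattened
  std::vector<int> type;           // atom type of every image atom
  std::vector<std::size_t> owner;  // index of the real atom behind each image
};

// Replicate an orthogonal periodic cell into the surrounding 3 x 3 x 3 block.
ImageAtoms build_images(const std::vector<double>& coord,  // 3 * N values
                        const std::vector<int>& type,      // N values
                        const std::array<double, 3>& box)  // edge lengths
{
  assert(coord.size() == 3 * type.size());
  const std::size_t n = type.size();

  ImageAtoms out;
  out.coord.reserve(27 * 3 * n);
  out.type.reserve(27 * n);
  out.owner.reserve(27 * n);

  // The zero shift comes first, so entries 0..N-1 are the real atoms.
  const int shifts[3] = {0, -1, 1};
  for (int sx : shifts) {
    for (int sy : shifts) {
      for (int sz : shifts) {
        for (std::size_t i = 0; i < n; ++i) {
          out.coord.push_back(coord[3 * i + 0] + sx * box[0]);
          out.coord.push_back(coord[3 * i + 1] + sy * box[1]);
          out.coord.push_back(coord[3 * i + 2] + sz * box[2]);
          out.type.push_back(type[i]);
          out.owner.push_back(i);
        }
      }
    }
  }
  return out;
}
```

With this layout, one could pass all 27N atoms to the model and keep only the results for the first N entries. The 27-fold memory cost is the price of the simplicity mentioned above.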

In initial speed tests this version was several times slower than LAMMPS, but it supports larger simulation systems.

Modification

Added two files, force/deepmd.cu and force/deepmd.cuh, and an example in examples/14_DP_water_msd, which also contains the installation tutorial (examples/14_DP_water_msd/Readme.md).

Others

This work follows the DP-LAMMPS interface file pair_deepmd.cpp, which uses the dp.compute call to obtain forces and energies from the coordinates and atom types through DP's libraries.

@@ -548,8 +620,14 @@ void initialize_position(
}

std::vector<std::string> atom_symbols;
-auto filename_potential = get_filename_potential();
+auto filename_potential = get_out_potential();
Review comment (Owner):

This line should be written separately in the #ifdef and #else parts.

Reply (Contributor, Author):

> This line should be written separately in the #ifdef and #else parts.

OK, I've fixed it and it compiles.

brucefan1983 (Owner) commented Dec 15, 2024

Have you also tested virial stress and heat current?

Make output concise
@Dankomaister

Nice work @Kick-H!
I'm curious: why does deepmd need its own neighbour list implementation instead of just using the one that's already in GPUMD?

Kick-H (Contributor, Author) commented Dec 16, 2024

> Nice work @Kick-H! I'm curious: why does deepmd need its own neighbour list implementation instead of just using the one that's already in GPUMD?

This is related to the C++ interface of DP. If you are interested, you are welcome to check the source code.

Kick-H (Contributor, Author) commented Dec 16, 2024

> Have you also tested virial stress and heat current?

I have tested the stress. The heat flux has not been tested yet, but it is definitely worth doing.

Keep documents streamlined by removing unnecessary information
@brucefan1983 brucefan1983 changed the title Add functionality to support Deepkit potential Add functionality to support the deep potential Dec 16, 2024
@Dankomaister

It may be worth replacing the cuda*-specific calls with the macros we have (src/utilities/gpu_macro.cuh). Then this should work on AMD GPUs as well (assuming one has tensorflow and the deepmd lib).

Kick-H (Contributor, Author) commented Dec 17, 2024

> It may be worth replacing the cuda*-specific calls with the macros we have (src/utilities/gpu_macro.cuh). Then this should work on AMD GPUs as well (assuming one has tensorflow and the deepmd lib).

OK, thank you very much for your suggestion.

replae the cuda* specific calls with the macros
brucefan1983 (Owner) commented Dec 19, 2024

General comments:

  • Do not modify read_xyz.cu. Instead, make a driver input file for DeePMD that is specified in run.in:

    potential deepmd_driver_input_file.txt

    The file deepmd_driver_input_file.txt could read as follows:

    deepmd 2 H O
    your_deepmd.pb

  • Move the special neighbor list code to deepmd.cu. Do not put special code in public files.

  • Use a separate makefile for deepmd. Do not touch the existing makefiles.

  • In summary, keep the other parts untouched. See this PR for a very good model: A hybrid potential "NEP+ILP" for graphene and h-BN #835
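
The two-line driver file proposed in the first bullet could be read with a small parser like the one below. This is only a sketch: `DeepmdDriverInput` and `parse_deepmd_driver` are hypothetical names, and the format is assumed to be exactly the example shown (the keyword, the number of types, that many element symbols, then the model file name).

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the contents of the driver file.
struct DeepmdDriverInput {
  std::vector<std::string> types;  // element symbols, in the order listed
  std::string model_file;          // name of the frozen .pb model
};

// Reads whitespace-separated tokens, so line breaks are not significant here.
DeepmdDriverInput parse_deepmd_driver(std::istream& in)
{
  std::string keyword;
  int num_types = 0;
  if (!(in >> keyword >> num_types) || keyword != "deepmd" || num_types <= 0) {
    throw std::runtime_error("expected: deepmd <num_types> <symbols...>");
  }

  DeepmdDriverInput out;
  out.types.resize(num_types);
  for (std::string& symbol : out.types) {
    if (!(in >> symbol)) {
      throw std::runtime_error("fewer element symbols than declared");
    }
  }
  if (!(in >> out.model_file)) {
    throw std::runtime_error("missing model file name");
  }
  return out;
}
```

Keeping the parser inside the DeePMD source file, rather than in read_xyz.cu, fits the request to leave the shared code untouched.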

@QuantumMisaka

@Kick-H NICE WORK! But does this support all backends and descriptors of DP models, or only specific ones (like the TF backend)?

brucefan1983 (Owner)

> @Kick-H NICE WORK! But does this support all backends and descriptors of DP models, or only specific ones (like the TF backend)?

Currently only the tensorflow backend is supported, and the code is to be refactored (to make this part as isolated as possible) and optimized.

@brucefan1983 brucefan1983 marked this pull request as draft December 30, 2024 21:23
Delete the DUSE_tensorflow command of the Xu version
Kick-H (Contributor, Author) commented Jan 5, 2025

Hi, I have been working on this over the past few days, building on the code of @Kick-H, and there is now some progress. 👼

Modification

  • Add two dp files: dp.cu and dp.cuh
  • Check for dp in force.cu:
    #ifdef DP_BHK
      } else if (strcmp(potential_name, "dp") == 0) {
        if (num_param != 3) {
          PRINT_INPUT_ERROR("potential should contain DP potential file behind setting file.\n");
        }
        potential.reset(new DP(param[2], number_of_atoms));
    #endif
  • Add two copy functions for GPU_Vector in gpu_vector.cuh that copy GPU memory with an offset. I think they may be important, so I did not put them behind the macro definition.

Usage

I made a lot of mistakes when I first built the environment, so I describe the usage starting from the environment setup. Maybe this experience can help you. My GPU is an RTX 4090 and my system is Ubuntu 22.04. The environment manager is conda.

  1. Create a new conda environment with Python and activate it.
     conda create -n tf-gpu2  python=3.9
     conda activate tf-gpu2
  2. Install CMake, the CUDA toolkit and TensorFlow. Please make sure the versions of the CUDA toolkit and TensorFlow are compatible. My TensorFlow version is 2.18.0.
     pip install --upgrade cmake
     conda install -c conda-forge cudatoolkit=11.8
     pip install --upgrade tensorflow
  3. Download the DP source code and compile it following the DP docs. Here are my cmake commands:
     mkdir build
     cd build
     cmake -DENABLE_TENSORFLOW=TRUE -DUSE_CUDA_TOOLKIT=TRUE -DCMAKE_INSTALL_PREFIX=path_to_install -DUSE_TF_PYTHON_LIBS=TRUE ../
     make -j
     make install

     We only need the DP C++ interface, so we do not source the whole DP environment. The libraries will be installed in path_to_install.

  4. Configure the makefile of GPUMD. The DP code is enabled by the macro definition DP_BHK, so add it to CFLAGS:
     CFLAGS = -std=c++14 -O3 $(CUDA_ARCH) -DDP_BHK

     Then link the DP C++ libraries. Add these two lines to update the include and link paths, then compile GPUMD:

     INC += -Ipath_to_install/include/deepmd
     LDFLAGS += -Lpath_to_install/lib -ldeepmd_cc
  5. When running GPUMD, I got an error that the DP libraries could not be found, so I had to add them to my library path. I chose a temporary method. Here is the run command:
     LD_LIBRARY_PATH=path_to_install/lib:$LD_LIBRARY_PATH   gpumd
  6. This DP interface needs two files: a setting file and a DP potential file. The setting file is very simple and tells GPUMD the number of atom types and their symbols. For example:
     dp 2 O H

! ! ! IMPORTANT ! ! ! The type lists of the setting file and the potential file must be the same.

Test

The test files are in tests/dp/. I ran the tests on my computer; here are some results.

  • Energy conservation: NVE run at 1 K with a 1 fs timestep
    (figure: nve_dp)
  • Speed: 4.7e4 atom*step/second for 1k5 atoms running 1e4 steps
    (figure)
  • DEBUG flag test:
    (figure)
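
As a rough consistency check on the speed figure above, and reading "1k5 atoms" as 1500 atoms (an assumption), the reported throughput implies the following wall time for the run and per step:

```latex
t_{\text{total}}
  = \frac{N_{\text{atoms}} \times N_{\text{steps}}}{\text{speed}}
  = \frac{1.5\times10^{3} \times 10^{4}}{4.7\times10^{4}\ \text{atom step/s}}
  \approx 319\ \text{s},
\qquad
t_{\text{step}} = \frac{t_{\text{total}}}{N_{\text{steps}}} \approx 31.9\ \text{ms}
```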

Others

This interface needs more testing 🚶, for example of the virial. You are welcome to test it! ✨

TODO

  • Triclinic boxes are not supported.
  • Only the TensorFlow backend is supported.

Thank you very much. I will check it as thoroughly as possible.

Kick-H (Contributor, Author) commented Jan 5, 2025

I have changed #ifdef DP_BHK to #ifdef USE_TENSORFLOW. This version is from @BBBuZHIDAO and is faster. The old version has been backed up.

Kick-H (Contributor, Author) commented Jan 5, 2025

I will write a README and test it as soon as possible.

@Kick-H Kick-H marked this pull request as ready for review January 6, 2025 07:13
@Kick-H Kick-H marked this pull request as draft January 6, 2025 07:14
Kick-H (Contributor, Author) commented Jan 6, 2025

GPUMD supports DP potential project

Author: 徐克 (kickhsu[at]gmail.com)

Author: 卜河凯 (hekai_bu[at]whu.edu.cn)

WeChat Official Account: 微纳计算 (nanocomp)

0 Program Introduction

0.1 Necessary instructions

  • This is a test version.
  • Only DeePMD model files ending in .pb are supported, that is, TensorFlow-backend models frozen with dp --tf freeze.

0.2 Installation Dependencies

  • A recent version of DP must be installed and working, because this program depends on the DP libraries.
  • The installation environment requirements of GPUMD itself must be met.

1 Installation details

We used an AutoDL instance for testing.

If you need to test on AutoDL, please contact us.

We have created an AutoDL image that can run GPUMD-DP directly. It can be shared with any account whose user ID is provided to us. With that image, the following steps are not required and GPUMD-DP can be used directly.

2 GPUMD-DP installation (Offline version)

2.0 DP installation (Offline version)

Installation steps using the latest version of DP:

>> $ # Copy data and unzip files.
>> $ cd /root/autodl-tmp/
>> $ wget https://mirror.nju.edu.cn/github-release/deepmodeling/deepmd-kit/v3.0.0/deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.0 -O deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.0
>> $ wget https://mirror.nju.edu.cn/github-release/deepmodeling/deepmd-kit/v3.0.0/deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.1 -O deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.1
>> $ cat deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.0 deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.1 > deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh
>> $ # rm deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.0 deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh.1 # Please use with caution "rm"
>> $ sh deepmd-kit-3.0.0-cuda126-Linux-x86_64.sh -p /root/autodl-tmp/deepmd-kit -u # Just keep pressing Enter/yes.
>> $ source /root/autodl-tmp/deepmd-kit/bin/activate /root/autodl-tmp/deepmd-kit
>> $ dp -h

After completing the steps above, dp -h should print its help text without errors.

2.1 GPUMD-DP installation

The github link is Here.

>> $ wget https://codeload.github.com/Kick-H/GPUMD/zip/7af5267f4d8ba720830c154f11634a1942b66b08
>> $ cd ${GPUMD}/src-v0.1

Modify makefile as follows:

  • Line 19: change CUDA_ARCH=-arch=sm_60 to CUDA_ARCH=-arch=sm_89 (for RTX 4090). Adjust this to your graphics card model.

  • Line 25: change INC = -I./ to INC = -I./ -I/root/miniconda3/deepmd-kit/source/build/path_to_install/include/deepmd

  • Line 27: change LIBS = -lcublas -lcusolver to LIBS = -lcublas -lcusolver -L/root/miniconda3/deepmd-kit/source/build/path_to_install/lib -ldeepmd_cc

Then run the following installation command:

>> $ echo 'export LD_LIBRARY_PATH=/root/miniconda3/deepmd-kit/source/build/path_to_install/lib:$LD_LIBRARY_PATH' >> /root/.bashrc
>> $ source /root/.bashrc
>> $ make gpumd -j

2.2 Running Tests

>> $ cd /root/miniconda3/GPUMD-bu0/tests/dp
>> $ ../../src/gpumd

3 GPUMD-DP installation (Online version)

3.0 DP installation (Online version)

3.1 Conda environment

Create a new conda environment with python and activate it.

>> $ conda create -n tf-gpu2  python=3.9
>> $ conda activate tf-gpu2

3.2 Conda install some packages

Install CMake, the CUDA toolkit and TensorFlow. Please make sure the versions of the CUDA toolkit and TensorFlow are compatible. My TensorFlow version is 2.18.0.

>> $ pip install --upgrade cmake
>> $ conda install -c conda-forge cudatoolkit=11.8
>> $ pip install --upgrade tensorflow

3.3 Download and install deepmd-kit

Download the DP source code and compile it following the DP docs. Here are the cmake commands:

>> $ git clone https://github.com/deepmodeling/deepmd-kit.git
>> $ cd deepmd-kit/source
>> $ mkdir build
>> $ cd build
>> $ cmake -DENABLE_TENSORFLOW=TRUE -DUSE_CUDA_TOOLKIT=TRUE -DCMAKE_INSTALL_PREFIX=path_to_install -DUSE_TF_PYTHON_LIBS=TRUE ../
>> $ make -j
>> $ make install

We only need the DP C++ interface, so we do not source the whole DP environment. The libraries will be installed in path_to_install.

3.4 Configure the makefile of GPUMD

The github link is Here.

>> $ wget https://codeload.github.com/Kick-H/GPUMD/zip/7af5267f4d8ba720830c154f11634a1942b66b08
>> $ cd ${GPUMD}/src
>> $ vi makefile

Configure the makefile of GPUMD. The DP code is enabled by the macro definition USE_TENSORFLOW, so add it to CFLAGS:

CFLAGS = -std=c++14 -O3 $(CUDA_ARCH) -DUSE_TENSORFLOW

Then link the DP C++ libraries. Add these two lines to update the include and link paths, then compile GPUMD:

INC += -Ipath_to_install/include/deepmd

LDFLAGS += -Lpath_to_install/lib -ldeepmd_cc

>> $ make gpumd -j

3.5 Run GPUMD

When running GPUMD, I got an error that the DP libraries could not be found, so I had to add them to my library path. I chose a temporary method. Here is the run command:

LD_LIBRARY_PATH=path_to_install/lib:$LD_LIBRARY_PATH gpumd

Alternatively, add the setting to ~/.bashrc:

>> $ echo 'export LD_LIBRARY_PATH=/root/miniconda3/deepmd-kit/source/build/path_to_install/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
>> $ source ~/.bashrc

3.6 Run Test

This DP interface needs two files: a setting file and a DP potential file. The setting file is very simple and tells GPUMD the number of atom types and their symbols. For example:

dp 2 O H

Notice

The type lists of the setting file and the potential file must be the same.
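
The rule above could be enforced with a small check before a run starts. The sketch below compares the symbols from the setting file with the type list of the model. `first_type_mismatch` is a hypothetical helper, and how the model's type list is read from the .pb file is deliberately left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns -1 if both lists hold the same symbols in the same order,
// otherwise the index of the first position where they differ.
int first_type_mismatch(const std::vector<std::string>& setting_types,
                        const std::vector<std::string>& model_types)
{
  const std::size_t n = std::min(setting_types.size(), model_types.size());
  for (std::size_t i = 0; i < n; ++i) {
    if (setting_types[i] != model_types[i]) {
      return static_cast<int>(i);
    }
  }
  // Equal up to the shorter length: a match only if the lengths agree too.
  return setting_types.size() == model_types.size() ? -1 : static_cast<int>(n);
}
```

For the setting line `dp 2 O H`, a model whose types are listed as H, O would be reported as a mismatch at index 0, because the comparison is order-sensitive.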

Kick-H (Contributor, Author) left a comment

Tests conducted so far:

  • Comparison of basic quantities with LAMMPS, including force, energy, potential, etc.
  • Some dynamic processes.

The project is finished for the time being. Areas to improve in the future:

  • Calculation speed
  • Support for triclinic boxes

@Kick-H Kick-H marked this pull request as ready for review January 11, 2025 03:37
5 participants