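gromacs_deepmd: training a machine-learning force field for water
Contains an example of training a DeePMD machine-learning force field for water molecules and running it from GROMACS.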
Version: v1.0 (latest)

Enter the working directory

cd /workspace/test/

ls 00data 01train 02gmx

00data: training data

01train: training folder

02gmx: GROMACS demo folder

Training

Enter the folder

cd /workspace/test/01train/ && ls
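
Before starting a run, it can help to check the main settings in input.json. The following is a minimal Python sketch, not part of the original example. It assumes the file is plain JSON and uses the standard DeePMD-kit keys (`learning_rate.start_lr`, `learning_rate.decay_steps`, `learning_rate.stop_lr`, `training.numb_steps`, `training.disp_freq`, `training.save_freq`). The helper name `summarize_input` is hypothetical, and any key that is absent is reported as `None`.

```python
import json
from pathlib import Path


def summarize_input(path):
    """Return a few key training settings from a DeePMD-kit input file.

    Assumes the file is plain JSON; keys that are missing come back as None.
    """
    cfg = json.loads(Path(path).read_text())
    lr = cfg.get("learning_rate", {})
    training = cfg.get("training", {})
    return {
        "start_lr": lr.get("start_lr"),
        "decay_steps": lr.get("decay_steps"),
        "stop_lr": lr.get("stop_lr"),
        "numb_steps": training.get("numb_steps"),
        "disp_freq": training.get("disp_freq"),
        "save_freq": training.get("save_freq"),
    }


if __name__ == "__main__":
    input_file = Path("input.json")  # run from /workspace/test/01train/
    if input_file.exists():
        for key, value in summarize_input(input_file).items():
            print(f"{key:12s} {value}")
```

Comparing these values with the first lines of the training log (starting learning rate, decay step, display and checkpoint intervals) is a quick way to confirm that the run uses the intended configuration.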

Start training

dp train input.json
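
The output of this command reports a learning rate that stays at 1.00e-03 up to batch 4900 and drops to 5.92e-06 at batch 5000, where the total `rmse` also falls by roughly a factor of ten while `rmse_f` barely changes. The sketch below is one way to reproduce that arithmetic. The learning-rate values are taken from the log line "start training at lr 1.00e-03 ... decay_step 5000, decay_rate 0.005925". The force prefactors (1000 at the start, 1 at the limit) are an assumption, since input.json is not shown here; they match DeePMD-kit's water example and are consistent with the logged numbers. The function names are hypothetical.

```python
import math

START_LR = 1.0e-3      # from the log: "start training at lr 1.00e-03"
DECAY_STEPS = 5000     # from the log: "decay_step 5000"
DECAY_RATE = 0.005925  # from the log: "decay_rate 0.005925"


def learning_rate(step):
    """Staircase exponential decay: lr drops once every DECAY_STEPS batches."""
    return START_LR * DECAY_RATE ** (step // DECAY_STEPS)


def force_prefactor(step, start_pref_f=1000.0, limit_pref_f=1.0):
    """Force-loss prefactor, interpolated linearly in lr(step) / lr(0).

    The 1000 -> 1 values are an assumption; input.json is not shown.
    """
    ratio = learning_rate(step) / START_LR
    return limit_pref_f + (start_pref_f - limit_pref_f) * ratio


def approx_total_rmse(step, rmse_f):
    """Approximate total rmse when the force term dominates the loss."""
    return math.sqrt(force_prefactor(step)) * rmse_f


if __name__ == "__main__":
    for step, rmse_f in [(4900, 8.48e-2), (5000, 7.89e-2)]:
        print(f"batch {step}: lr = {learning_rate(step):.2e}, "
              f"rmse ~ {approx_total_rmse(step, rmse_f):.2e}")
```

Under these assumptions the drop in `rmse` at batch 5000 comes from the loss weighting changing with the learning rate, not from a sudden improvement in the model. For judging fit quality, `rmse_e` and `rmse_f` are the more useful columns. The reported final learning rate of 3.51e-08 equals two decay steps from 1.00e-03, which suggests a run of about 10000 batches.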

Example output:
2026-01-01 09:27:59.177759: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-01-01 09:28:02.383262: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-01-01 09:28:05.951113: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/keras/src/export/tf2onnx_lib.py:8: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
  if not hasattr(np, "object"):
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
Switch to serial execution due to lack of horovod module.
[2026-01-01 09:28:20,026] DEEPMD INFO    Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
[2026-01-01 09:28:21,633] DEEPMD INFO    If you encounter the error 'an illegal memory access was encountered', this may be due to a TensorFlow issue. To avoid this, set the environment variable DP_INFER_BATCH_SIZE to a smaller value than the last adjusted batch size. The environment variable DP_INFER_BATCH_SIZE controls the inference batch size (nframes * natoms). 
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767259701.888737     151 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767259701.897942     151 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
I0000 00:00:1767259702.090857     234 cuda_solvers.cc:175] Creating GpuSolver handles for stream 0x9085580
[2026-01-01 09:28:25,141] DEEPMD INFO    Adjust batch size from 1024 to 2048
[2026-01-01 09:28:25,283] DEEPMD INFO    Adjust batch size from 2048 to 4096
[2026-01-01 09:28:25,513] DEEPMD INFO    Adjust batch size from 4096 to 8192
[2026-01-01 09:28:25,850] DEEPMD INFO    Adjust batch size from 8192 to 16384
[2026-01-01 09:28:26,958] DEEPMD INFO    Neighbor statistics: training data with minimal neighbor distance: 0.885439
[2026-01-01 09:28:26,958] DEEPMD INFO    Neighbor statistics: training data with maximum neighbor size: [38 72] (cutoff radius: 6.000000)
[2026-01-01 09:28:26,998] DEEPMD INFO     _____               _____   __  __  _____           _     _  _   
[2026-01-01 09:28:26,998] DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |  
[2026-01-01 09:28:26,998] DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_ 
[2026-01-01 09:28:26,998] DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
[2026-01-01 09:28:26,998] DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_ 
[2026-01-01 09:28:26,998] DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
[2026-01-01 09:28:26,998] DEEPMD INFO    Please read and cite:
[2026-01-01 09:28:26,998] DEEPMD INFO    Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
[2026-01-01 09:28:26,998] DEEPMD INFO    Zeng et al, J. Chem. Phys., 159, 054801 (2023)
[2026-01-01 09:28:26,998] DEEPMD INFO    Zeng et al, J. Chem. Theory Comput., 21, 4375-4385 (2025)
[2026-01-01 09:28:26,998] DEEPMD INFO    See https://deepmd.rtfd.io/credits/ for details.
[2026-01-01 09:28:26,998] DEEPMD INFO    --------------------------------------------------------------------------------------------------------
[2026-01-01 09:28:26,998] DEEPMD INFO    installed to:          /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/deepmd
[2026-01-01 09:28:26,998] DEEPMD INFO    source:                v3.1.2-21-gb98f6c59-dirty
[2026-01-01 09:28:26,998] DEEPMD INFO    source branch:         devel
[2026-01-01 09:28:26,998] DEEPMD INFO    source commit:         b98f6c59
[2026-01-01 09:28:26,998] DEEPMD INFO    source commit at:      2025-12-23 08:15:14 +0000
[2026-01-01 09:28:26,998] DEEPMD INFO    use float prec:        double
[2026-01-01 09:28:26,998] DEEPMD INFO    build variant:         cuda
[2026-01-01 09:28:26,999] DEEPMD INFO    Backend:               TensorFlow
[2026-01-01 09:28:26,999] DEEPMD INFO    TF ver:                v2.20.0-rc0-4-g72fbba3d20f
[2026-01-01 09:28:26,999] DEEPMD INFO    build with TF ver:     2.20.0
[2026-01-01 09:28:26,999] DEEPMD INFO    build with TF inc:     /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:28:26,999] DEEPMD INFO                           /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:28:26,999] DEEPMD INFO    build with TF lib:     
[2026-01-01 09:28:26,999] DEEPMD INFO    running on:            908821a3377a
[2026-01-01 09:28:26,999] DEEPMD INFO    computing device:      gpu:0
[2026-01-01 09:28:26,999] DEEPMD INFO    CUDA_VISIBLE_DEVICES:  unset
[2026-01-01 09:28:26,999] DEEPMD INFO    Count of visible GPUs: 1
[2026-01-01 09:28:26,999] DEEPMD INFO    num_intra_threads:     0
[2026-01-01 09:28:26,999] DEEPMD INFO    num_inter_threads:     0
[2026-01-01 09:28:26,999] DEEPMD INFO    --------------------------------------------------------------------------------------------------------
I0000 00:00:1767259707.010023     151 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:28:27,022] DEEPMD INFO    ---Summary of DataSystem: training     -----------------------------------------------
[2026-01-01 09:28:27,022] DEEPMD INFO    found 3 system(s):
[2026-01-01 09:28:27,022] DEEPMD INFO                                        system  natoms  bch_sz   n_bch       prob  pbc
[2026-01-01 09:28:27,022] DEEPMD INFO                              ../00data/data_0     192       1      80  2.500e-01    T
[2026-01-01 09:28:27,022] DEEPMD INFO                              ../00data/data_1     192       1     160  5.000e-01    T
[2026-01-01 09:28:27,022] DEEPMD INFO                              ../00data/data_2     192       1      80  2.500e-01    T
[2026-01-01 09:28:27,022] DEEPMD INFO    --------------------------------------------------------------------------------------
[2026-01-01 09:28:27,029] DEEPMD INFO    ---Summary of DataSystem: validation   -----------------------------------------------
[2026-01-01 09:28:27,029] DEEPMD INFO    found 1 system(s):
[2026-01-01 09:28:27,029] DEEPMD INFO                                        system  natoms  bch_sz   n_bch       prob  pbc
[2026-01-01 09:28:27,029] DEEPMD INFO                              ../00data/data_3     192       1      80  1.000e+00    T
[2026-01-01 09:28:27,029] DEEPMD INFO    --------------------------------------------------------------------------------------
[2026-01-01 09:28:27,029] DEEPMD INFO    training without frame parameter
[2026-01-01 09:28:27,030] DEEPMD INFO    data stating... (this step may take long time)
[2026-01-01 09:28:27,224] DEEPMD INFO    built lr
[2026-01-01 09:28:27,922] DEEPMD INFO    built network
[2026-01-01 09:28:28,983] DEEPMD INFO    built training
[2026-01-01 09:28:28,984] DEEPMD WARNING To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
I0000 00:00:1767259708.990744     151 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:28:29,037] DEEPMD INFO    initialize model from scratch
[2026-01-01 09:28:29,922] DEEPMD INFO    start training at lr 1.00e-03 (== 1.00e-03), decay_step 5000, decay_rate 0.005925, final lr will be 3.51e-08
[2026-01-01 09:28:31,055] DEEPMD INFO    batch       0: trn: rmse = 2.55e+01, rmse_e = 6.63e-01, rmse_f = 8.05e-01, lr = 1.00e-03
[2026-01-01 09:28:31,055] DEEPMD INFO    batch       0: val: rmse = 2.53e+01, rmse_e = 6.64e-01, rmse_f = 7.98e-01
I0000 00:00:1767259711.849151     227 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:28:34,483] DEEPMD INFO    batch     100: trn: rmse = 1.11e+01, rmse_e = 9.31e-02, rmse_f = 3.52e-01, lr = 1.00e-03
[2026-01-01 09:28:34,483] DEEPMD INFO    batch     100: val: rmse = 1.14e+01, rmse_e = 9.72e-02, rmse_f = 3.59e-01
[2026-01-01 09:28:34,483] DEEPMD INFO    batch     100: total wall time = 4.56 s
[2026-01-01 09:28:36,909] DEEPMD INFO    batch     200: trn: rmse = 8.89e+00, rmse_e = 2.63e-02, rmse_f = 2.81e-01, lr = 1.00e-03
[2026-01-01 09:28:36,910] DEEPMD INFO    batch     200: val: rmse = 9.16e+00, rmse_e = 2.71e-02, rmse_f = 2.90e-01
[2026-01-01 09:28:36,910] DEEPMD INFO    batch     200: total wall time = 2.43 s
[2026-01-01 09:28:39,362] DEEPMD INFO    batch     300: trn: rmse = 7.73e+00, rmse_e = 8.21e-03, rmse_f = 2.44e-01, lr = 1.00e-03
[2026-01-01 09:28:39,362] DEEPMD INFO    batch     300: val: rmse = 7.24e+00, rmse_e = 7.06e-03, rmse_f = 2.29e-01
[2026-01-01 09:28:39,362] DEEPMD INFO    batch     300: total wall time = 2.45 s
[2026-01-01 09:28:41,776] DEEPMD INFO    batch     400: trn: rmse = 7.11e+00, rmse_e = 5.26e-02, rmse_f = 2.25e-01, lr = 1.00e-03
[2026-01-01 09:28:41,776] DEEPMD INFO    batch     400: val: rmse = 6.65e+00, rmse_e = 5.02e-02, rmse_f = 2.10e-01
[2026-01-01 09:28:41,776] DEEPMD INFO    batch     400: total wall time = 2.41 s
[2026-01-01 09:28:44,220] DEEPMD INFO    batch     500: trn: rmse = 6.05e+00, rmse_e = 3.17e-02, rmse_f = 1.91e-01, lr = 1.00e-03
[2026-01-01 09:28:44,221] DEEPMD INFO    batch     500: val: rmse = 6.01e+00, rmse_e = 3.18e-02, rmse_f = 1.90e-01
[2026-01-01 09:28:44,221] DEEPMD INFO    batch     500: total wall time = 2.44 s
[2026-01-01 09:28:46,604] DEEPMD INFO    batch     600: trn: rmse = 6.00e+00, rmse_e = 6.22e-03, rmse_f = 1.90e-01, lr = 1.00e-03
[2026-01-01 09:28:46,604] DEEPMD INFO    batch     600: val: rmse = 6.19e+00, rmse_e = 5.96e-03, rmse_f = 1.96e-01
[2026-01-01 09:28:46,605] DEEPMD INFO    batch     600: total wall time = 2.38 s
[2026-01-01 09:28:49,048] DEEPMD INFO    batch     700: trn: rmse = 4.95e+00, rmse_e = 1.40e-02, rmse_f = 1.56e-01, lr = 1.00e-03
[2026-01-01 09:28:49,048] DEEPMD INFO    batch     700: val: rmse = 5.05e+00, rmse_e = 9.82e-03, rmse_f = 1.60e-01
[2026-01-01 09:28:49,048] DEEPMD INFO    batch     700: total wall time = 2.44 s
[2026-01-01 09:28:51,456] DEEPMD INFO    batch     800: trn: rmse = 4.46e+00, rmse_e = 1.19e-02, rmse_f = 1.41e-01, lr = 1.00e-03
[2026-01-01 09:28:51,456] DEEPMD INFO    batch     800: val: rmse = 4.85e+00, rmse_e = 1.19e-02, rmse_f = 1.53e-01
[2026-01-01 09:28:51,456] DEEPMD INFO    batch     800: total wall time = 2.41 s
[2026-01-01 09:28:53,815] DEEPMD INFO    batch     900: trn: rmse = 4.46e+00, rmse_e = 1.61e-02, rmse_f = 1.41e-01, lr = 1.00e-03
[2026-01-01 09:28:53,815] DEEPMD INFO    batch     900: val: rmse = 4.73e+00, rmse_e = 1.73e-02, rmse_f = 1.50e-01
[2026-01-01 09:28:53,815] DEEPMD INFO    batch     900: total wall time = 2.36 s
[2026-01-01 09:28:56,236] DEEPMD INFO    batch    1000: trn: rmse = 4.28e+00, rmse_e = 6.60e-03, rmse_f = 1.35e-01, lr = 1.00e-03
[2026-01-01 09:28:56,236] DEEPMD INFO    batch    1000: val: rmse = 4.44e+00, rmse_e = 6.66e-03, rmse_f = 1.40e-01
[2026-01-01 09:28:56,236] DEEPMD INFO    batch    1000: total wall time = 2.42 s
[2026-01-01 09:28:56,449] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:28:58,886] DEEPMD INFO    batch    1100: trn: rmse = 4.15e+00, rmse_e = 2.25e-02, rmse_f = 1.31e-01, lr = 1.00e-03
[2026-01-01 09:28:58,886] DEEPMD INFO    batch    1100: val: rmse = 4.42e+00, rmse_e = 2.04e-02, rmse_f = 1.40e-01
[2026-01-01 09:28:58,886] DEEPMD INFO    batch    1100: total wall time = 2.65 s
[2026-01-01 09:29:01,263] DEEPMD INFO    batch    1200: trn: rmse = 4.15e+00, rmse_e = 2.93e-02, rmse_f = 1.31e-01, lr = 1.00e-03
[2026-01-01 09:29:01,263] DEEPMD INFO    batch    1200: val: rmse = 3.96e+00, rmse_e = 2.92e-02, rmse_f = 1.25e-01
[2026-01-01 09:29:01,263] DEEPMD INFO    batch    1200: total wall time = 2.38 s
[2026-01-01 09:29:03,640] DEEPMD INFO    batch    1300: trn: rmse = 3.90e+00, rmse_e = 2.56e-03, rmse_f = 1.23e-01, lr = 1.00e-03
[2026-01-01 09:29:03,640] DEEPMD INFO    batch    1300: val: rmse = 4.03e+00, rmse_e = 2.76e-03, rmse_f = 1.27e-01
[2026-01-01 09:29:03,640] DEEPMD INFO    batch    1300: total wall time = 2.38 s
[2026-01-01 09:29:06,034] DEEPMD INFO    batch    1400: trn: rmse = 3.75e+00, rmse_e = 2.24e-02, rmse_f = 1.19e-01, lr = 1.00e-03
[2026-01-01 09:29:06,035] DEEPMD INFO    batch    1400: val: rmse = 4.21e+00, rmse_e = 2.43e-02, rmse_f = 1.33e-01
[2026-01-01 09:29:06,035] DEEPMD INFO    batch    1400: total wall time = 2.39 s
[2026-01-01 09:29:08,461] DEEPMD INFO    batch    1500: trn: rmse = 3.76e+00, rmse_e = 9.10e-03, rmse_f = 1.19e-01, lr = 1.00e-03
[2026-01-01 09:29:08,462] DEEPMD INFO    batch    1500: val: rmse = 4.01e+00, rmse_e = 5.39e-03, rmse_f = 1.27e-01
[2026-01-01 09:29:08,462] DEEPMD INFO    batch    1500: total wall time = 2.43 s
[2026-01-01 09:29:10,864] DEEPMD INFO    batch    1600: trn: rmse = 4.11e+00, rmse_e = 3.65e-02, rmse_f = 1.30e-01, lr = 1.00e-03
[2026-01-01 09:29:10,864] DEEPMD INFO    batch    1600: val: rmse = 3.65e+00, rmse_e = 3.71e-02, rmse_f = 1.15e-01
[2026-01-01 09:29:10,864] DEEPMD INFO    batch    1600: total wall time = 2.40 s
[2026-01-01 09:29:13,290] DEEPMD INFO    batch    1700: trn: rmse = 3.35e+00, rmse_e = 1.42e-02, rmse_f = 1.06e-01, lr = 1.00e-03
[2026-01-01 09:29:13,290] DEEPMD INFO    batch    1700: val: rmse = 3.44e+00, rmse_e = 1.41e-02, rmse_f = 1.09e-01
[2026-01-01 09:29:13,290] DEEPMD INFO    batch    1700: total wall time = 2.43 s
[2026-01-01 09:29:15,716] DEEPMD INFO    batch    1800: trn: rmse = 4.97e+00, rmse_e = 5.37e-02, rmse_f = 1.57e-01, lr = 1.00e-03
[2026-01-01 09:29:15,716] DEEPMD INFO    batch    1800: val: rmse = 4.96e+00, rmse_e = 5.25e-02, rmse_f = 1.57e-01
[2026-01-01 09:29:15,716] DEEPMD INFO    batch    1800: total wall time = 2.43 s
[2026-01-01 09:29:18,091] DEEPMD INFO    batch    1900: trn: rmse = 3.49e+00, rmse_e = 1.32e-02, rmse_f = 1.10e-01, lr = 1.00e-03
[2026-01-01 09:29:18,091] DEEPMD INFO    batch    1900: val: rmse = 3.51e+00, rmse_e = 1.25e-02, rmse_f = 1.11e-01
[2026-01-01 09:29:18,091] DEEPMD INFO    batch    1900: total wall time = 2.37 s
[2026-01-01 09:29:20,444] DEEPMD INFO    batch    2000: trn: rmse = 3.53e+00, rmse_e = 8.19e-03, rmse_f = 1.11e-01, lr = 1.00e-03
[2026-01-01 09:29:20,444] DEEPMD INFO    batch    2000: val: rmse = 3.47e+00, rmse_e = 7.47e-03, rmse_f = 1.10e-01
[2026-01-01 09:29:20,444] DEEPMD INFO    batch    2000: total wall time = 2.35 s
[2026-01-01 09:29:20,547] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:29:22,907] DEEPMD INFO    batch    2100: trn: rmse = 3.32e+00, rmse_e = 3.29e-02, rmse_f = 1.05e-01, lr = 1.00e-03
[2026-01-01 09:29:22,907] DEEPMD INFO    batch    2100: val: rmse = 3.80e+00, rmse_e = 3.27e-02, rmse_f = 1.20e-01
[2026-01-01 09:29:22,907] DEEPMD INFO    batch    2100: total wall time = 2.46 s
[2026-01-01 09:29:25,300] DEEPMD INFO    batch    2200: trn: rmse = 3.13e+00, rmse_e = 8.70e-04, rmse_f = 9.91e-02, lr = 1.00e-03
[2026-01-01 09:29:25,300] DEEPMD INFO    batch    2200: val: rmse = 3.19e+00, rmse_e = 1.30e-03, rmse_f = 1.01e-01
[2026-01-01 09:29:25,301] DEEPMD INFO    batch    2200: total wall time = 2.39 s
[2026-01-01 09:29:27,679] DEEPMD INFO    batch    2300: trn: rmse = 3.22e+00, rmse_e = 6.98e-03, rmse_f = 1.02e-01, lr = 1.00e-03
[2026-01-01 09:29:27,679] DEEPMD INFO    batch    2300: val: rmse = 3.33e+00, rmse_e = 6.55e-03, rmse_f = 1.05e-01
[2026-01-01 09:29:27,679] DEEPMD INFO    batch    2300: total wall time = 2.38 s
[2026-01-01 09:29:30,055] DEEPMD INFO    batch    2400: trn: rmse = 3.43e+00, rmse_e = 9.74e-03, rmse_f = 1.08e-01, lr = 1.00e-03
[2026-01-01 09:29:30,055] DEEPMD INFO    batch    2400: val: rmse = 3.69e+00, rmse_e = 8.45e-03, rmse_f = 1.17e-01
[2026-01-01 09:29:30,055] DEEPMD INFO    batch    2400: total wall time = 2.38 s
[2026-01-01 09:29:32,390] DEEPMD INFO    batch    2500: trn: rmse = 3.20e+00, rmse_e = 2.88e-02, rmse_f = 1.01e-01, lr = 1.00e-03
[2026-01-01 09:29:32,390] DEEPMD INFO    batch    2500: val: rmse = 3.19e+00, rmse_e = 2.92e-02, rmse_f = 1.01e-01
[2026-01-01 09:29:32,390] DEEPMD INFO    batch    2500: total wall time = 2.34 s
[2026-01-01 09:29:34,730] DEEPMD INFO    batch    2600: trn: rmse = 4.28e+00, rmse_e = 2.26e-02, rmse_f = 1.35e-01, lr = 1.00e-03
[2026-01-01 09:29:34,730] DEEPMD INFO    batch    2600: val: rmse = 4.83e+00, rmse_e = 1.93e-02, rmse_f = 1.53e-01
[2026-01-01 09:29:34,730] DEEPMD INFO    batch    2600: total wall time = 2.34 s
[2026-01-01 09:29:37,117] DEEPMD INFO    batch    2700: trn: rmse = 3.13e+00, rmse_e = 7.97e-03, rmse_f = 9.89e-02, lr = 1.00e-03
[2026-01-01 09:29:37,117] DEEPMD INFO    batch    2700: val: rmse = 3.30e+00, rmse_e = 6.36e-03, rmse_f = 1.04e-01
[2026-01-01 09:29:37,117] DEEPMD INFO    batch    2700: total wall time = 2.39 s
[2026-01-01 09:29:39,497] DEEPMD INFO    batch    2800: trn: rmse = 3.03e+00, rmse_e = 3.23e-03, rmse_f = 9.59e-02, lr = 1.00e-03
[2026-01-01 09:29:39,498] DEEPMD INFO    batch    2800: val: rmse = 3.12e+00, rmse_e = 1.82e-03, rmse_f = 9.86e-02
[2026-01-01 09:29:39,498] DEEPMD INFO    batch    2800: total wall time = 2.38 s
[2026-01-01 09:29:41,920] DEEPMD INFO    batch    2900: trn: rmse = 3.49e+00, rmse_e = 2.05e-02, rmse_f = 1.10e-01, lr = 1.00e-03
[2026-01-01 09:29:41,920] DEEPMD INFO    batch    2900: val: rmse = 3.64e+00, rmse_e = 2.02e-02, rmse_f = 1.15e-01
[2026-01-01 09:29:41,921] DEEPMD INFO    batch    2900: total wall time = 2.42 s
[2026-01-01 09:29:44,273] DEEPMD INFO    batch    3000: trn: rmse = 3.25e+00, rmse_e = 1.43e-02, rmse_f = 1.03e-01, lr = 1.00e-03
[2026-01-01 09:29:44,273] DEEPMD INFO    batch    3000: val: rmse = 3.26e+00, rmse_e = 1.43e-02, rmse_f = 1.03e-01
[2026-01-01 09:29:44,274] DEEPMD INFO    batch    3000: total wall time = 2.35 s
[2026-01-01 09:29:44,379] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:29:46,781] DEEPMD INFO    batch    3100: trn: rmse = 3.06e+00, rmse_e = 5.73e-04, rmse_f = 9.68e-02, lr = 1.00e-03
[2026-01-01 09:29:46,781] DEEPMD INFO    batch    3100: val: rmse = 2.90e+00, rmse_e = 9.68e-04, rmse_f = 9.17e-02
[2026-01-01 09:29:46,781] DEEPMD INFO    batch    3100: total wall time = 2.51 s
[2026-01-01 09:29:49,187] DEEPMD INFO    batch    3200: trn: rmse = 2.84e+00, rmse_e = 1.40e-02, rmse_f = 8.97e-02, lr = 1.00e-03
[2026-01-01 09:29:49,187] DEEPMD INFO    batch    3200: val: rmse = 3.03e+00, rmse_e = 1.36e-02, rmse_f = 9.59e-02
[2026-01-01 09:29:49,187] DEEPMD INFO    batch    3200: total wall time = 2.41 s
[2026-01-01 09:29:51,521] DEEPMD INFO    batch    3300: trn: rmse = 2.75e+00, rmse_e = 1.52e-02, rmse_f = 8.69e-02, lr = 1.00e-03
[2026-01-01 09:29:51,522] DEEPMD INFO    batch    3300: val: rmse = 2.89e+00, rmse_e = 1.64e-02, rmse_f = 9.15e-02
[2026-01-01 09:29:51,522] DEEPMD INFO    batch    3300: total wall time = 2.33 s
[2026-01-01 09:29:53,852] DEEPMD INFO    batch    3400: trn: rmse = 3.18e+00, rmse_e = 1.68e-02, rmse_f = 1.01e-01, lr = 1.00e-03
[2026-01-01 09:29:53,852] DEEPMD INFO    batch    3400: val: rmse = 3.29e+00, rmse_e = 1.85e-02, rmse_f = 1.04e-01
[2026-01-01 09:29:53,852] DEEPMD INFO    batch    3400: total wall time = 2.33 s
[2026-01-01 09:29:56,210] DEEPMD INFO    batch    3500: trn: rmse = 3.15e+00, rmse_e = 7.60e-03, rmse_f = 9.97e-02, lr = 1.00e-03
[2026-01-01 09:29:56,210] DEEPMD INFO    batch    3500: val: rmse = 3.19e+00, rmse_e = 6.99e-03, rmse_f = 1.01e-01
[2026-01-01 09:29:56,210] DEEPMD INFO    batch    3500: total wall time = 2.36 s
[2026-01-01 09:29:58,606] DEEPMD INFO    batch    3600: trn: rmse = 2.67e+00, rmse_e = 1.93e-02, rmse_f = 8.43e-02, lr = 1.00e-03
[2026-01-01 09:29:58,606] DEEPMD INFO    batch    3600: val: rmse = 3.04e+00, rmse_e = 1.94e-02, rmse_f = 9.61e-02
[2026-01-01 09:29:58,606] DEEPMD INFO    batch    3600: total wall time = 2.40 s
[2026-01-01 09:30:00,963] DEEPMD INFO    batch    3700: trn: rmse = 2.46e+00, rmse_e = 1.03e-02, rmse_f = 7.79e-02, lr = 1.00e-03
[2026-01-01 09:30:00,963] DEEPMD INFO    batch    3700: val: rmse = 2.60e+00, rmse_e = 1.00e-02, rmse_f = 8.21e-02
[2026-01-01 09:30:00,963] DEEPMD INFO    batch    3700: total wall time = 2.36 s
[2026-01-01 09:30:03,273] DEEPMD INFO    batch    3800: trn: rmse = 2.37e+00, rmse_e = 2.76e-02, rmse_f = 7.48e-02, lr = 1.00e-03
[2026-01-01 09:30:03,273] DEEPMD INFO    batch    3800: val: rmse = 2.91e+00, rmse_e = 2.77e-02, rmse_f = 9.22e-02
[2026-01-01 09:30:03,273] DEEPMD INFO    batch    3800: total wall time = 2.31 s
[2026-01-01 09:30:05,682] DEEPMD INFO    batch    3900: trn: rmse = 3.00e+00, rmse_e = 1.46e-02, rmse_f = 9.50e-02, lr = 1.00e-03
[2026-01-01 09:30:05,683] DEEPMD INFO    batch    3900: val: rmse = 3.06e+00, rmse_e = 1.52e-02, rmse_f = 9.66e-02
[2026-01-01 09:30:05,683] DEEPMD INFO    batch    3900: total wall time = 2.41 s
[2026-01-01 09:30:08,032] DEEPMD INFO    batch    4000: trn: rmse = 2.66e+00, rmse_e = 2.16e-02, rmse_f = 8.43e-02, lr = 1.00e-03
[2026-01-01 09:30:08,032] DEEPMD INFO    batch    4000: val: rmse = 2.54e+00, rmse_e = 2.19e-02, rmse_f = 8.04e-02
[2026-01-01 09:30:08,032] DEEPMD INFO    batch    4000: total wall time = 2.35 s
[2026-01-01 09:30:08,132] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:30:10,496] DEEPMD INFO    batch    4100: trn: rmse = 3.04e+00, rmse_e = 3.27e-02, rmse_f = 9.61e-02, lr = 1.00e-03
[2026-01-01 09:30:10,496] DEEPMD INFO    batch    4100: val: rmse = 2.79e+00, rmse_e = 3.21e-02, rmse_f = 8.82e-02
[2026-01-01 09:30:10,497] DEEPMD INFO    batch    4100: total wall time = 2.46 s
[2026-01-01 09:30:12,884] DEEPMD INFO    batch    4200: trn: rmse = 3.64e+00, rmse_e = 2.19e-02, rmse_f = 1.15e-01, lr = 1.00e-03
[2026-01-01 09:30:12,884] DEEPMD INFO    batch    4200: val: rmse = 3.27e+00, rmse_e = 2.17e-02, rmse_f = 1.03e-01
[2026-01-01 09:30:12,884] DEEPMD INFO    batch    4200: total wall time = 2.39 s
[2026-01-01 09:30:15,295] DEEPMD INFO    batch    4300: trn: rmse = 2.52e+00, rmse_e = 9.23e-03, rmse_f = 7.98e-02, lr = 1.00e-03
[2026-01-01 09:30:15,295] DEEPMD INFO    batch    4300: val: rmse = 2.59e+00, rmse_e = 9.42e-03, rmse_f = 8.20e-02
[2026-01-01 09:30:15,295] DEEPMD INFO    batch    4300: total wall time = 2.41 s
[2026-01-01 09:30:17,678] DEEPMD INFO    batch    4400: trn: rmse = 2.88e+00, rmse_e = 1.02e-02, rmse_f = 9.11e-02, lr = 1.00e-03
[2026-01-01 09:30:17,678] DEEPMD INFO    batch    4400: val: rmse = 2.66e+00, rmse_e = 8.22e-03, rmse_f = 8.41e-02
[2026-01-01 09:30:17,678] DEEPMD INFO    batch    4400: total wall time = 2.38 s
[2026-01-01 09:30:20,029] DEEPMD INFO    batch    4500: trn: rmse = 2.62e+00, rmse_e = 1.16e-02, rmse_f = 8.28e-02, lr = 1.00e-03
[2026-01-01 09:30:20,029] DEEPMD INFO    batch    4500: val: rmse = 2.65e+00, rmse_e = 1.33e-02, rmse_f = 8.38e-02
[2026-01-01 09:30:20,029] DEEPMD INFO    batch    4500: total wall time = 2.35 s
[2026-01-01 09:30:22,415] DEEPMD INFO    batch    4600: trn: rmse = 3.01e+00, rmse_e = 2.31e-02, rmse_f = 9.51e-02, lr = 1.00e-03
[2026-01-01 09:30:22,415] DEEPMD INFO    batch    4600: val: rmse = 2.62e+00, rmse_e = 2.33e-02, rmse_f = 8.29e-02
[2026-01-01 09:30:22,415] DEEPMD INFO    batch    4600: total wall time = 2.39 s
[2026-01-01 09:30:24,785] DEEPMD INFO    batch    4700: trn: rmse = 2.33e+00, rmse_e = 1.50e-02, rmse_f = 7.37e-02, lr = 1.00e-03
[2026-01-01 09:30:24,785] DEEPMD INFO    batch    4700: val: rmse = 2.71e+00, rmse_e = 1.67e-02, rmse_f = 8.58e-02
[2026-01-01 09:30:24,785] DEEPMD INFO    batch    4700: total wall time = 2.37 s
[2026-01-01 09:30:27,173] DEEPMD INFO    batch    4800: trn: rmse = 2.78e+00, rmse_e = 5.42e-04, rmse_f = 8.79e-02, lr = 1.00e-03
[2026-01-01 09:30:27,173] DEEPMD INFO    batch    4800: val: rmse = 2.88e+00, rmse_e = 6.71e-04, rmse_f = 9.11e-02
[2026-01-01 09:30:27,173] DEEPMD INFO    batch    4800: total wall time = 2.39 s
[2026-01-01 09:30:29,543] DEEPMD INFO    batch    4900: trn: rmse = 2.68e+00, rmse_e = 1.28e-02, rmse_f = 8.48e-02, lr = 1.00e-03
[2026-01-01 09:30:29,544] DEEPMD INFO    batch    4900: val: rmse = 2.90e+00, rmse_e = 1.29e-02, rmse_f = 9.18e-02
[2026-01-01 09:30:29,544] DEEPMD INFO    batch    4900: total wall time = 2.37 s
[2026-01-01 09:30:31,935] DEEPMD INFO    batch    5000: trn: rmse = 2.08e-01, rmse_e = 1.02e-03, rmse_f = 7.89e-02, lr = 5.92e-06
[2026-01-01 09:30:31,936] DEEPMD INFO    batch    5000: val: rmse = 2.15e-01, rmse_e = 9.93e-04, rmse_f = 8.15e-02
[2026-01-01 09:30:31,936] DEEPMD INFO    batch    5000: total wall time = 2.39 s
[2026-01-01 09:30:32,039] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:30:34,424] DEEPMD INFO    batch    5100: trn: rmse = 2.16e-01, rmse_e = 5.07e-04, rmse_f = 8.21e-02, lr = 5.92e-06
[2026-01-01 09:30:34,425] DEEPMD INFO    batch    5100: val: rmse = 2.02e-01, rmse_e = 8.51e-04, rmse_f = 7.65e-02
[2026-01-01 09:30:34,425] DEEPMD INFO    batch    5100: total wall time = 2.49 s
[2026-01-01 09:30:36,844] DEEPMD INFO    batch    5200: trn: rmse = 2.20e-01, rmse_e = 1.37e-03, rmse_f = 8.33e-02, lr = 5.92e-06
[2026-01-01 09:30:36,844] DEEPMD INFO    batch    5200: val: rmse = 2.13e-01, rmse_e = 9.60e-04, rmse_f = 8.06e-02
[2026-01-01 09:30:36,844] DEEPMD INFO    batch    5200: total wall time = 2.42 s
[2026-01-01 09:30:39,231] DEEPMD INFO    batch    5300: trn: rmse = 1.88e-01, rmse_e = 3.04e-04, rmse_f = 7.14e-02, lr = 5.92e-06
[2026-01-01 09:30:39,231] DEEPMD INFO    batch    5300: val: rmse = 2.07e-01, rmse_e = 4.99e-04, rmse_f = 7.87e-02
[2026-01-01 09:30:39,231] DEEPMD INFO    batch    5300: total wall time = 2.39 s
[2026-01-01 09:30:41,571] DEEPMD INFO    batch    5400: trn: rmse = 2.09e-01, rmse_e = 6.61e-04, rmse_f = 7.93e-02, lr = 5.92e-06
[2026-01-01 09:30:41,571] DEEPMD INFO    batch    5400: val: rmse = 2.04e-01, rmse_e = 7.21e-04, rmse_f = 7.75e-02
[2026-01-01 09:30:41,571] DEEPMD INFO    batch    5400: total wall time = 2.34 s
[2026-01-01 09:30:43,980] DEEPMD INFO    batch    5500: trn: rmse = 1.94e-01, rmse_e = 3.24e-04, rmse_f = 7.38e-02, lr = 5.92e-06
[2026-01-01 09:30:43,981] DEEPMD INFO    batch    5500: val: rmse = 2.21e-01, rmse_e = 5.90e-04, rmse_f = 8.41e-02
[2026-01-01 09:30:43,981] DEEPMD INFO    batch    5500: total wall time = 2.41 s
[2026-01-01 09:30:46,400] DEEPMD INFO    batch    5600: trn: rmse = 1.76e-01, rmse_e = 2.88e-04, rmse_f = 6.71e-02, lr = 5.92e-06
[2026-01-01 09:30:46,401] DEEPMD INFO    batch    5600: val: rmse = 2.16e-01, rmse_e = 9.89e-04, rmse_f = 8.18e-02
[2026-01-01 09:30:46,401] DEEPMD INFO    batch    5600: total wall time = 2.42 s
[2026-01-01 09:30:48,768] DEEPMD INFO    batch    5700: trn: rmse = 1.99e-01, rmse_e = 3.38e-05, rmse_f = 7.58e-02, lr = 5.92e-06
[2026-01-01 09:30:48,768] DEEPMD INFO    batch    5700: val: rmse = 2.16e-01, rmse_e = 8.04e-04, rmse_f = 8.18e-02
[2026-01-01 09:30:48,768] DEEPMD INFO    batch    5700: total wall time = 2.37 s
[2026-01-01 09:30:51,192] DEEPMD INFO    batch    5800: trn: rmse = 2.53e-01, rmse_e = 1.86e-03, rmse_f = 9.57e-02, lr = 5.92e-06
[2026-01-01 09:30:51,192] DEEPMD INFO    batch    5800: val: rmse = 2.24e-01, rmse_e = 6.48e-04, rmse_f = 8.49e-02
[2026-01-01 09:30:51,192] DEEPMD INFO    batch    5800: total wall time = 2.42 s
[2026-01-01 09:30:53,594] DEEPMD INFO    batch    5900: trn: rmse = 1.99e-01, rmse_e = 7.57e-04, rmse_f = 7.54e-02, lr = 5.92e-06
[2026-01-01 09:30:53,594] DEEPMD INFO    batch    5900: val: rmse = 2.07e-01, rmse_e = 7.25e-04, rmse_f = 7.84e-02
[2026-01-01 09:30:53,594] DEEPMD INFO    batch    5900: total wall time = 2.40 s
[2026-01-01 09:30:55,964] DEEPMD INFO    batch    6000: trn: rmse = 2.04e-01, rmse_e = 1.51e-03, rmse_f = 7.73e-02, lr = 5.92e-06
[2026-01-01 09:30:55,964] DEEPMD INFO    batch    6000: val: rmse = 2.13e-01, rmse_e = 5.76e-04, rmse_f = 8.09e-02
[2026-01-01 09:30:55,964] DEEPMD INFO    batch    6000: total wall time = 2.37 s
[2026-01-01 09:30:56,095] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:30:58,477] DEEPMD INFO    batch    6100: trn: rmse = 2.05e-01, rmse_e = 2.19e-04, rmse_f = 7.78e-02, lr = 5.92e-06
[2026-01-01 09:30:58,477] DEEPMD INFO    batch    6100: val: rmse = 2.12e-01, rmse_e = 1.06e-03, rmse_f = 8.02e-02
[2026-01-01 09:30:58,477] DEEPMD INFO    batch    6100: total wall time = 2.51 s
[2026-01-01 09:31:00,853] DEEPMD INFO    batch    6200: trn: rmse = 2.16e-01, rmse_e = 2.39e-03, rmse_f = 8.10e-02, lr = 5.92e-06
[2026-01-01 09:31:00,853] DEEPMD INFO    batch    6200: val: rmse = 2.00e-01, rmse_e = 7.14e-04, rmse_f = 7.60e-02
[2026-01-01 09:31:00,854] DEEPMD INFO    batch    6200: total wall time = 2.38 s
[2026-01-01 09:31:03,239] DEEPMD INFO    batch    6300: trn: rmse = 2.07e-01, rmse_e = 1.68e-03, rmse_f = 7.82e-02, lr = 5.92e-06
[2026-01-01 09:31:03,239] DEEPMD INFO    batch    6300: val: rmse = 1.99e-01, rmse_e = 5.56e-04, rmse_f = 7.56e-02
[2026-01-01 09:31:03,239] DEEPMD INFO    batch    6300: total wall time = 2.39 s
[2026-01-01 09:31:05,608] DEEPMD INFO    batch    6400: trn: rmse = 2.02e-01, rmse_e = 3.39e-04, rmse_f = 7.70e-02, lr = 5.92e-06
[2026-01-01 09:31:05,608] DEEPMD INFO    batch    6400: val: rmse = 2.10e-01, rmse_e = 1.29e-03, rmse_f = 7.94e-02
[2026-01-01 09:31:05,608] DEEPMD INFO    batch    6400: total wall time = 2.37 s
[2026-01-01 09:31:07,965] DEEPMD INFO    batch    6500: trn: rmse = 2.16e-01, rmse_e = 2.96e-03, rmse_f = 8.06e-02, lr = 5.92e-06
[2026-01-01 09:31:07,965] DEEPMD INFO    batch    6500: val: rmse = 2.08e-01, rmse_e = 7.62e-04, rmse_f = 7.89e-02
[2026-01-01 09:31:07,965] DEEPMD INFO    batch    6500: total wall time = 2.36 s
[2026-01-01 09:31:10,246] DEEPMD INFO    batch    6600: trn: rmse = 2.33e-01, rmse_e = 5.48e-05, rmse_f = 8.87e-02, lr = 5.92e-06
[2026-01-01 09:31:10,247] DEEPMD INFO    batch    6600: val: rmse = 1.95e-01, rmse_e = 7.60e-04, rmse_f = 7.41e-02
[2026-01-01 09:31:10,247] DEEPMD INFO    batch    6600: total wall time = 2.28 s
[2026-01-01 09:31:12,546] DEEPMD INFO    batch    6700: trn: rmse = 1.94e-01, rmse_e = 4.74e-05, rmse_f = 7.39e-02, lr = 5.92e-06
[2026-01-01 09:31:12,546] DEEPMD INFO    batch    6700: val: rmse = 2.06e-01, rmse_e = 6.75e-04, rmse_f = 7.82e-02
[2026-01-01 09:31:12,547] DEEPMD INFO    batch    6700: total wall time = 2.30 s
[2026-01-01 09:31:14,868] DEEPMD INFO    batch    6800: trn: rmse = 1.85e-01, rmse_e = 6.07e-04, rmse_f = 7.01e-02, lr = 5.92e-06
[2026-01-01 09:31:14,868] DEEPMD INFO    batch    6800: val: rmse = 2.07e-01, rmse_e = 2.38e-04, rmse_f = 7.88e-02
[2026-01-01 09:31:14,868] DEEPMD INFO    batch    6800: total wall time = 2.32 s
[2026-01-01 09:31:17,168] DEEPMD INFO    batch    6900: trn: rmse = 2.33e-01, rmse_e = 5.41e-04, rmse_f = 8.84e-02, lr = 5.92e-06
[2026-01-01 09:31:17,168] DEEPMD INFO    batch    6900: val: rmse = 1.91e-01, rmse_e = 2.73e-04, rmse_f = 7.27e-02
[2026-01-01 09:31:17,168] DEEPMD INFO    batch    6900: total wall time = 2.30 s
[2026-01-01 09:31:19,451] DEEPMD INFO    batch    7000: trn: rmse = 2.09e-01, rmse_e = 3.88e-04, rmse_f = 7.95e-02, lr = 5.92e-06
[2026-01-01 09:31:19,451] DEEPMD INFO    batch    7000: val: rmse = 2.03e-01, rmse_e = 6.59e-04, rmse_f = 7.70e-02
[2026-01-01 09:31:19,451] DEEPMD INFO    batch    7000: total wall time = 2.28 s
[2026-01-01 09:31:19,559] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:31:21,861] DEEPMD INFO    batch    7100: trn: rmse = 2.12e-01, rmse_e = 2.99e-04, rmse_f = 8.05e-02, lr = 5.92e-06
[2026-01-01 09:31:21,861] DEEPMD INFO    batch    7100: val: rmse = 2.04e-01, rmse_e = 8.82e-04, rmse_f = 7.75e-02
[2026-01-01 09:31:21,861] DEEPMD INFO    batch    7100: total wall time = 2.41 s
[2026-01-01 09:31:24,165] DEEPMD INFO    batch    7200: trn: rmse = 2.04e-01, rmse_e = 9.94e-04, rmse_f = 7.72e-02, lr = 5.92e-06
[2026-01-01 09:31:24,165] DEEPMD INFO    batch    7200: val: rmse = 2.06e-01, rmse_e = 6.69e-04, rmse_f = 7.83e-02
[2026-01-01 09:31:24,165] DEEPMD INFO    batch    7200: total wall time = 2.30 s
[2026-01-01 09:31:26,487] DEEPMD INFO    batch    7300: trn: rmse = 2.06e-01, rmse_e = 1.61e-04, rmse_f = 7.83e-02, lr = 5.92e-06
[2026-01-01 09:31:26,487] DEEPMD INFO    batch    7300: val: rmse = 2.03e-01, rmse_e = 2.96e-04, rmse_f = 7.71e-02
[2026-01-01 09:31:26,488] DEEPMD INFO    batch    7300: total wall time = 2.32 s
[2026-01-01 09:31:28,815] DEEPMD INFO    batch    7400: trn: rmse = 1.71e-01, rmse_e = 1.48e-03, rmse_f = 6.45e-02, lr = 5.92e-06
[2026-01-01 09:31:28,816] DEEPMD INFO    batch    7400: val: rmse = 2.03e-01, rmse_e = 4.59e-04, rmse_f = 7.73e-02
[2026-01-01 09:31:28,816] DEEPMD INFO    batch    7400: total wall time = 2.33 s
[2026-01-01 09:31:31,186] DEEPMD INFO    batch    7500: trn: rmse = 1.87e-01, rmse_e = 5.35e-05, rmse_f = 7.10e-02, lr = 5.92e-06
[2026-01-01 09:31:31,187] DEEPMD INFO    batch    7500: val: rmse = 1.94e-01, rmse_e = 8.93e-04, rmse_f = 7.35e-02
[2026-01-01 09:31:31,187] DEEPMD INFO    batch    7500: total wall time = 2.37 s
[2026-01-01 09:31:33,533] DEEPMD INFO    batch    7600: trn: rmse = 2.03e-01, rmse_e = 5.59e-04, rmse_f = 7.73e-02, lr = 5.92e-06
[2026-01-01 09:31:33,533] DEEPMD INFO    batch    7600: val: rmse = 2.07e-01, rmse_e = 4.89e-04, rmse_f = 7.87e-02
[2026-01-01 09:31:33,533] DEEPMD INFO    batch    7600: total wall time = 2.35 s
[2026-01-01 09:31:35,890] DEEPMD INFO    batch    7700: trn: rmse = 2.19e-01, rmse_e = 3.28e-04, rmse_f = 8.34e-02, lr = 5.92e-06
[2026-01-01 09:31:35,891] DEEPMD INFO    batch    7700: val: rmse = 2.14e-01, rmse_e = 5.73e-04, rmse_f = 8.14e-02
[2026-01-01 09:31:35,891] DEEPMD INFO    batch    7700: total wall time = 2.36 s
[2026-01-01 09:31:38,247] DEEPMD INFO    batch    7800: trn: rmse = 1.76e-01, rmse_e = 3.36e-04, rmse_f = 6.70e-02, lr = 5.92e-06
[2026-01-01 09:31:38,247] DEEPMD INFO    batch    7800: val: rmse = 1.90e-01, rmse_e = 4.81e-04, rmse_f = 7.21e-02
[2026-01-01 09:31:38,247] DEEPMD INFO    batch    7800: total wall time = 2.36 s
[2026-01-01 09:31:40,632] DEEPMD INFO    batch    7900: trn: rmse = 1.94e-01, rmse_e = 1.34e-03, rmse_f = 7.34e-02, lr = 5.92e-06
[2026-01-01 09:31:40,633] DEEPMD INFO    batch    7900: val: rmse = 2.02e-01, rmse_e = 9.35e-04, rmse_f = 7.66e-02
[2026-01-01 09:31:40,633] DEEPMD INFO    batch    7900: total wall time = 2.39 s
[2026-01-01 09:31:43,033] DEEPMD INFO    batch    8000: trn: rmse = 2.15e-01, rmse_e = 2.62e-03, rmse_f = 8.06e-02, lr = 5.92e-06
[2026-01-01 09:31:43,034] DEEPMD INFO    batch    8000: val: rmse = 2.18e-01, rmse_e = 5.23e-04, rmse_f = 8.28e-02
[2026-01-01 09:31:43,034] DEEPMD INFO    batch    8000: total wall time = 2.40 s
[2026-01-01 09:31:43,147] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:31:45,564] DEEPMD INFO    batch    8100: trn: rmse = 2.11e-01, rmse_e = 1.24e-03, rmse_f = 7.98e-02, lr = 5.92e-06
[2026-01-01 09:31:45,564] DEEPMD INFO    batch    8100: val: rmse = 2.06e-01, rmse_e = 6.11e-04, rmse_f = 7.82e-02
[2026-01-01 09:31:45,564] DEEPMD INFO    batch    8100: total wall time = 2.53 s
[2026-01-01 09:31:47,945] DEEPMD INFO    batch    8200: trn: rmse = 1.88e-01, rmse_e = 9.40e-05, rmse_f = 7.14e-02, lr = 5.92e-06
[2026-01-01 09:31:47,945] DEEPMD INFO    batch    8200: val: rmse = 1.98e-01, rmse_e = 9.48e-04, rmse_f = 7.51e-02
[2026-01-01 09:31:47,946] DEEPMD INFO    batch    8200: total wall time = 2.38 s
[2026-01-01 09:31:50,309] DEEPMD INFO    batch    8300: trn: rmse = 2.03e-01, rmse_e = 4.20e-04, rmse_f = 7.71e-02, lr = 5.92e-06
[2026-01-01 09:31:50,309] DEEPMD INFO    batch    8300: val: rmse = 2.10e-01, rmse_e = 1.01e-03, rmse_f = 7.97e-02
[2026-01-01 09:31:50,309] DEEPMD INFO    batch    8300: total wall time = 2.36 s
[2026-01-01 09:31:52,675] DEEPMD INFO    batch    8400: trn: rmse = 2.21e-01, rmse_e = 1.72e-03, rmse_f = 8.35e-02, lr = 5.92e-06
[2026-01-01 09:31:52,676] DEEPMD INFO    batch    8400: val: rmse = 1.97e-01, rmse_e = 5.57e-04, rmse_f = 7.48e-02
[2026-01-01 09:31:52,676] DEEPMD INFO    batch    8400: total wall time = 2.37 s
[2026-01-01 09:31:55,045] DEEPMD INFO    batch    8500: trn: rmse = 2.22e-01, rmse_e = 8.82e-04, rmse_f = 8.42e-02, lr = 5.92e-06
[2026-01-01 09:31:55,045] DEEPMD INFO    batch    8500: val: rmse = 2.06e-01, rmse_e = 5.61e-04, rmse_f = 7.82e-02
[2026-01-01 09:31:55,045] DEEPMD INFO    batch    8500: total wall time = 2.37 s
[2026-01-01 09:31:57,451] DEEPMD INFO    batch    8600: trn: rmse = 1.91e-01, rmse_e = 2.88e-04, rmse_f = 7.25e-02, lr = 5.92e-06
[2026-01-01 09:31:57,452] DEEPMD INFO    batch    8600: val: rmse = 1.98e-01, rmse_e = 1.01e-03, rmse_f = 7.51e-02
[2026-01-01 09:31:57,452] DEEPMD INFO    batch    8600: total wall time = 2.41 s
[2026-01-01 09:31:59,826] DEEPMD INFO    batch    8700: trn: rmse = 1.91e-01, rmse_e = 4.27e-04, rmse_f = 7.26e-02, lr = 5.92e-06
[2026-01-01 09:31:59,826] DEEPMD INFO    batch    8700: val: rmse = 1.94e-01, rmse_e = 1.28e-03, rmse_f = 7.34e-02
[2026-01-01 09:31:59,826] DEEPMD INFO    batch    8700: total wall time = 2.37 s
[2026-01-01 09:32:02,233] DEEPMD INFO    batch    8800: trn: rmse = 1.83e-01, rmse_e = 1.69e-03, rmse_f = 6.89e-02, lr = 5.92e-06
[2026-01-01 09:32:02,234] DEEPMD INFO    batch    8800: val: rmse = 2.07e-01, rmse_e = 1.31e-03, rmse_f = 7.82e-02
[2026-01-01 09:32:02,234] DEEPMD INFO    batch    8800: total wall time = 2.41 s
[2026-01-01 09:32:04,611] DEEPMD INFO    batch    8900: trn: rmse = 1.88e-01, rmse_e = 5.71e-04, rmse_f = 7.14e-02, lr = 5.92e-06
[2026-01-01 09:32:04,611] DEEPMD INFO    batch    8900: val: rmse = 1.93e-01, rmse_e = 7.86e-04, rmse_f = 7.33e-02
[2026-01-01 09:32:04,611] DEEPMD INFO    batch    8900: total wall time = 2.38 s
[2026-01-01 09:32:07,012] DEEPMD INFO    batch    9000: trn: rmse = 1.82e-01, rmse_e = 1.14e-03, rmse_f = 6.89e-02, lr = 5.92e-06
[2026-01-01 09:32:07,013] DEEPMD INFO    batch    9000: val: rmse = 1.97e-01, rmse_e = 4.63e-04, rmse_f = 7.49e-02
[2026-01-01 09:32:07,013] DEEPMD INFO    batch    9000: total wall time = 2.40 s
[2026-01-01 09:32:07,141] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:32:09,553] DEEPMD INFO    batch    9100: trn: rmse = 1.86e-01, rmse_e = 5.13e-04, rmse_f = 7.07e-02, lr = 5.92e-06
[2026-01-01 09:32:09,553] DEEPMD INFO    batch    9100: val: rmse = 2.00e-01, rmse_e = 3.93e-04, rmse_f = 7.61e-02
[2026-01-01 09:32:09,554] DEEPMD INFO    batch    9100: total wall time = 2.54 s
[2026-01-01 09:32:11,942] DEEPMD INFO    batch    9200: trn: rmse = 1.91e-01, rmse_e = 6.75e-04, rmse_f = 7.26e-02, lr = 5.92e-06
[2026-01-01 09:32:11,942] DEEPMD INFO    batch    9200: val: rmse = 2.04e-01, rmse_e = 6.81e-04, rmse_f = 7.76e-02
[2026-01-01 09:32:11,942] DEEPMD INFO    batch    9200: total wall time = 2.39 s
[2026-01-01 09:32:14,240] DEEPMD INFO    batch    9300: trn: rmse = 2.09e-01, rmse_e = 7.54e-04, rmse_f = 7.94e-02, lr = 5.92e-06
[2026-01-01 09:32:14,240] DEEPMD INFO    batch    9300: val: rmse = 2.05e-01, rmse_e = 4.76e-04, rmse_f = 7.80e-02
[2026-01-01 09:32:14,240] DEEPMD INFO    batch    9300: total wall time = 2.30 s
[2026-01-01 09:32:16,591] DEEPMD INFO    batch    9400: trn: rmse = 1.83e-01, rmse_e = 3.04e-04, rmse_f = 6.96e-02, lr = 5.92e-06
[2026-01-01 09:32:16,592] DEEPMD INFO    batch    9400: val: rmse = 2.19e-01, rmse_e = 4.24e-04, rmse_f = 8.32e-02
[2026-01-01 09:32:16,592] DEEPMD INFO    batch    9400: total wall time = 2.35 s
[2026-01-01 09:32:18,975] DEEPMD INFO    batch    9500: trn: rmse = 2.11e-01, rmse_e = 2.25e-04, rmse_f = 8.02e-02, lr = 5.92e-06
[2026-01-01 09:32:18,975] DEEPMD INFO    batch    9500: val: rmse = 2.04e-01, rmse_e = 7.60e-04, rmse_f = 7.74e-02
[2026-01-01 09:32:18,975] DEEPMD INFO    batch    9500: total wall time = 2.38 s
[2026-01-01 09:32:21,368] DEEPMD INFO    batch    9600: trn: rmse = 2.05e-01, rmse_e = 2.92e-04, rmse_f = 7.80e-02, lr = 5.92e-06
[2026-01-01 09:32:21,368] DEEPMD INFO    batch    9600: val: rmse = 1.97e-01, rmse_e = 9.62e-04, rmse_f = 7.46e-02
[2026-01-01 09:32:21,368] DEEPMD INFO    batch    9600: total wall time = 2.39 s
[2026-01-01 09:32:23,781] DEEPMD INFO    batch    9700: trn: rmse = 1.80e-01, rmse_e = 6.18e-04, rmse_f = 6.83e-02, lr = 5.92e-06
[2026-01-01 09:32:23,781] DEEPMD INFO    batch    9700: val: rmse = 1.99e-01, rmse_e = 7.30e-04, rmse_f = 7.55e-02
[2026-01-01 09:32:23,781] DEEPMD INFO    batch    9700: total wall time = 2.41 s
[2026-01-01 09:32:26,167] DEEPMD INFO    batch    9800: trn: rmse = 1.84e-01, rmse_e = 1.82e-03, rmse_f = 6.93e-02, lr = 5.92e-06
[2026-01-01 09:32:26,167] DEEPMD INFO    batch    9800: val: rmse = 1.94e-01, rmse_e = 1.21e-03, rmse_f = 7.34e-02
[2026-01-01 09:32:26,167] DEEPMD INFO    batch    9800: total wall time = 2.39 s
[2026-01-01 09:32:28,546] DEEPMD INFO    batch    9900: trn: rmse = 1.89e-01, rmse_e = 1.07e-03, rmse_f = 7.16e-02, lr = 5.92e-06
[2026-01-01 09:32:28,546] DEEPMD INFO    batch    9900: val: rmse = 1.99e-01, rmse_e = 8.63e-04, rmse_f = 7.54e-02
[2026-01-01 09:32:28,546] DEEPMD INFO    batch    9900: total wall time = 2.38 s
[2026-01-01 09:32:30,876] DEEPMD INFO    batch   10000: trn: rmse = 8.22e-02, rmse_e = 1.08e-04, rmse_f = 8.08e-02, lr = 3.51e-08
[2026-01-01 09:32:30,877] DEEPMD INFO    batch   10000: val: rmse = 7.92e-02, rmse_e = 8.88e-04, rmse_f = 7.67e-02
[2026-01-01 09:32:30,877] DEEPMD INFO    batch   10000: total wall time = 2.33 s
[2026-01-01 09:32:30,997] DEEPMD INFO    saved checkpoint model.ckpt
[2026-01-01 09:32:30,997] DEEPMD INFO    average training time: 0.0230 s/batch (exclude first 100 batches)
[2026-01-01 09:32:30,998] DEEPMD INFO    finished training
[2026-01-01 09:32:30,998] DEEPMD INFO    wall time: 242.014 s
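
The training output above interleaves training, validation and timing lines, which makes the trend hard to read. As a rough sketch, assuming the output was saved to a file (for example with `dp train input.json 2>&1 | tee train.log`, where `train.log` is a hypothetical name), the validation force RMSE can be summarised with awk. The helper name `summarize_val_rmse_f` is made up for illustration and is not part of DeePMD-kit; it relies on the line layout shown in this log.

```shell
# Sketch: summarise the validation force RMSE from a saved `dp train` log.
# Assumes each validation line has the layout shown in the log above, e.g.
#   [2026-01-01 09:31:19,451] DEEPMD INFO    batch    7000: val: rmse = ..., rmse_f = 7.70e-02
# summarize_val_rmse_f is a hypothetical helper name, not part of DeePMD-kit.
summarize_val_rmse_f() {
    awk '
        $5 == "batch" && $7 == "val:" {
            batch = $6
            sub(/:$/, "", batch)      # "7000:" -> "7000"
            value = $NF + 0           # last field on a val line is the rmse_f value
            if (count == 0 || value < best) { best = value; best_batch = batch }
            total += value
            count++
        }
        END {
            if (count == 0) { print "no validation lines found"; exit 1 }
            printf "%d validation points, lowest rmse_f %.2e at batch %s, mean %.2e\n",
                   count, best, best_batch, total / count
        }
    ' "$1"
}
```

Calling `summarize_val_rmse_f train.log` prints one summary line. Training (`trn:`) and `total wall time` lines are skipped because their seventh field is not `val:`.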

Export the force field

Freeze the model

dp freeze -o graph.pb

dp freeze -o graph.pb 
2026-01-01 09:39:54.029971: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-01-01 09:39:54.117534: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-01-01 09:39:56.230971: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/keras/src/export/tf2onnx_lib.py:8: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
  if not hasattr(np, "object"):
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
2026-01-01 09:40:00.245490: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767260400.246672     509 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260400.288810     509 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
[2026-01-01 09:40:00,519] DEEPMD INFO    The following nodes will be frozen: ['o_atom_energy', 'o_energy', 'descrpt_attr/ntypes', 'model_attr/model_version', 'train_attr/training_script', 'o_virial', 'descrpt_attr/rcut', 'train_attr/min_nbor_dist', 'fitting_attr/dfparam', 'o_atom_virial', 't_mesh', 'model_type', 'fitting_attr/daparam', 'model_attr/tmap', 'o_force', 'model_attr/model_type']
[2026-01-01 09:40:00,890] DEEPMD INFO    862 ops in the final graph.
(py312) root@908821a3377a:/workspace/test/01train# 

Compress the model for faster inference

dp --tf compress -i graph.pb -o graph-compress.pb

dp --tf compress -i graph.pb -o graph-compress.pb 
2026-01-01 09:45:32.427819: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-01-01 09:45:32.489055: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-01-01 09:45:34.511560: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/keras/src/export/tf2onnx_lib.py:8: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
  if not hasattr(np, "object"):
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
2026-01-01 09:45:38.396558: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767260738.397889     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260738.429904     585 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
I0000 00:00:1767260738.461692     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260738.509365     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:38,598] DEEPMD INFO    


[2026-01-01 09:45:38,599] DEEPMD INFO    stage 1: compress the model
[2026-01-01 09:45:38,601] DEEPMD WARNING Switch to serial execution due to lack of horovod module.
[2026-01-01 09:45:43,805] DEEPMD INFO     _____               _____   __  __  _____           _     _  _   
[2026-01-01 09:45:43,805] DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |  
[2026-01-01 09:45:43,805] DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_ 
[2026-01-01 09:45:43,805] DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
[2026-01-01 09:45:43,805] DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_ 
[2026-01-01 09:45:43,805] DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
[2026-01-01 09:45:43,806] DEEPMD INFO    Please read and cite:
[2026-01-01 09:45:43,806] DEEPMD INFO    Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
[2026-01-01 09:45:43,806] DEEPMD INFO    Zeng et al, J. Chem. Phys., 159, 054801 (2023)
[2026-01-01 09:45:43,806] DEEPMD INFO    Zeng et al, J. Chem. Theory Comput., 21, 4375-4385 (2025)
[2026-01-01 09:45:43,806] DEEPMD INFO    See https://deepmd.rtfd.io/credits/ for details.
[2026-01-01 09:45:43,806] DEEPMD INFO    --------------------------------------------------------------------------------------------------------
[2026-01-01 09:45:43,806] DEEPMD INFO    installed to:          /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/deepmd
[2026-01-01 09:45:43,806] DEEPMD INFO    source:                v3.1.2-21-gb98f6c59-dirty
[2026-01-01 09:45:43,806] DEEPMD INFO    source branch:         devel
[2026-01-01 09:45:43,806] DEEPMD INFO    source commit:         b98f6c59
[2026-01-01 09:45:43,806] DEEPMD INFO    source commit at:      2025-12-23 08:15:14 +0000
[2026-01-01 09:45:43,806] DEEPMD INFO    use float prec:        double
[2026-01-01 09:45:43,806] DEEPMD INFO    build variant:         cuda
[2026-01-01 09:45:43,806] DEEPMD INFO    Backend:               TensorFlow
[2026-01-01 09:45:43,806] DEEPMD INFO    TF ver:                v2.20.0-rc0-4-g72fbba3d20f
[2026-01-01 09:45:43,806] DEEPMD INFO    build with TF ver:     2.20.0
[2026-01-01 09:45:43,806] DEEPMD INFO    build with TF inc:     /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:45:43,806] DEEPMD INFO                           /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:45:43,806] DEEPMD INFO    build with TF lib:     
[2026-01-01 09:45:43,806] DEEPMD INFO    running on:            908821a3377a
[2026-01-01 09:45:43,806] DEEPMD INFO    computing device:      gpu:0
[2026-01-01 09:45:43,806] DEEPMD INFO    CUDA_VISIBLE_DEVICES:  unset
[2026-01-01 09:45:43,806] DEEPMD INFO    Count of visible GPUs: 1
[2026-01-01 09:45:43,806] DEEPMD INFO    num_intra_threads:     0
[2026-01-01 09:45:43,806] DEEPMD INFO    num_inter_threads:     0
[2026-01-01 09:45:43,806] DEEPMD INFO    --------------------------------------------------------------------------------------------------------
I0000 00:00:1767260743.821566     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:43,822] DEEPMD INFO    training without frame parameter
I0000 00:00:1767260743.931699     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260743.941946     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260743.994315     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:44,032] DEEPMD INFO    training data with lower boundary: [-0.3598215  -0.38828511]
[2026-01-01 09:45:44,032] DEEPMD INFO    training data with upper boundary: [7.69377098 8.70452294]
I0000 00:00:1767260744.353808     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260744.397141     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260744.449946     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:44,502] DEEPMD INFO    built lr
[2026-01-01 09:45:45,192] DEEPMD INFO    built network
[2026-01-01 09:45:45,694] DEEPMD INFO    built training
[2026-01-01 09:45:45,694] DEEPMD WARNING To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
I0000 00:00:1767260745.701164     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:45,733] DEEPMD INFO    initialize model from scratch
[2026-01-01 09:45:46,386] DEEPMD INFO    finished compressing
[2026-01-01 09:45:46,393] DEEPMD INFO    


[2026-01-01 09:45:46,393] DEEPMD INFO    stage 2: freeze the model
I0000 00:00:1767260746.672053     585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:46,879] DEEPMD INFO    The following nodes will be frozen: ['o_virial', 'fitting_attr/dfparam', 'o_energy', 'fitting_attr/daparam', 'o_atom_virial', 'train_attr/training_script', 'o_force', 'model_attr/model_version', 't_mesh', 'descrpt_attr/ntypes', 'descrpt_attr/rcut', 'model_attr/tmap', 'model_attr/model_type', 'o_atom_energy', 'model_type', 'train_attr/min_nbor_dist']
[2026-01-01 09:45:47,105] DEEPMD INFO    685 ops in the final graph.
(py312) root@908821a3377a:/workspace/test/01train# 

Run GROMACS

Enter the folder

cd /workspace/test/02gmx/

Copy the compressed force field model into this folder

cp ../01train/graph-compress.pb .
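
Before launching the run, it may help to confirm that the inputs are in place and to set the thread variables that DeePMD-kit warns about in this tutorial's output. The sketch below is an assumption-laden example: the file names are taken from the grompp and mdrun output of this tutorial, `check_md_inputs` is a hypothetical helper, and the thread counts are placeholder values to tune for your machine.

```shell
# Sketch of a pre-run check for the 02gmx folder. File names are taken from the
# grompp/mdrun output of this tutorial; check_md_inputs is a hypothetical helper.
check_md_inputs() {
    dir=${1:-.}
    missing=0
    for f in md.mdp water.gro water.top input.json graph-compress.pb; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Thread settings named in the DeePMD-kit warnings. The values are assumptions:
# 12 matches the OpenMP thread count GROMACS reports in this run; adjust as needed.
export OMP_NUM_THREADS=12
export DP_INTRA_OP_PARALLELISM_THREADS=12
export DP_INTER_OP_PARALLELISM_THREADS=1
```

Running `check_md_inputs . && ./md.sh` starts the simulation only when every listed file exists and is non-empty.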

Then simply run the script

./md.sh

./md.sh 
gmx_mpi: Relink `/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_framework.so.2' with `/usr/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
DeePMD-kit: Successfully load libcudart.so.12
2026-01-01 09:50:43.052941: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
                 :-) GROMACS - gmx grompp, 2020.2-MODIFIED (-:

                            GROMACS is written by:
     Emile Apol      Rossen Apostolov      Paul Bauer     Herman J.C. Berendsen
    Par Bjelkmar      Christian Blau   Viacheslav Bolnykh     Kevin Boyd    
 Aldert van Buuren   Rudi van Drunen     Anton Feenstra       Alan Gray     
  Gerrit Groenhof     Anca Hamuraru    Vincent Hindriksen  M. Eric Irrgang  
  Aleksei Iupinov   Christoph Junghans     Joe Jordan     Dimitrios Karkoulis
    Peter Kasson        Jiri Kraus      Carsten Kutzner      Per Larsson    
  Justin A. Lemkul    Viveca Lindahl    Magnus Lundborg     Erik Marklund   
    Pascal Merz     Pieter Meulenhoff    Teemu Murtola       Szilard Pall   
    Sander Pronk      Roland Schulz      Michael Shirts    Alexey Shvetsov  
   Alfons Sijbers     Peter Tieleman      Jon Vincent      Teemu Virolainen 
 Christian Wennberg    Maarten Wolf      Artem Zhmurov   
                           and the project leaders:
        Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2019, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

GROMACS:      gmx grompp, version 2020.2-MODIFIED
Executable:   /root/opt/gromacs/install/bin/gmx_mpi
Data prefix:  /root/opt/gromacs/install
Working dir:  /workspace/test/02gmx
Command line:
  gmx_mpi grompp -f md.mdp -c water.gro -p water.top -o md.tpr -maxwarn 3

Ignoring obsolete mdp entry 'ns_type'

NOTE 1 [file md.mdp]:
  leapfrog does not yet support Nose-Hoover chains, nhchainlength reset to 1

Setting the LD random seed to 1646275867
Generated 3 of the 3 non-bonded parameter combinations
Generating 1-4 interactions: fudge = 0.5
Generated 3 of the 3 1-4 parameter combinations
Excluding 2 bonded neighbours molecule type 'SOL'
Setting gen_seed to 1388784567
Velocities were taken from a Maxwell distribution at 300 K

NOTE 2 [file water.top, line 43]:
  In moleculetype 'SOL' 3 atoms are not bound by a potential or constraint
  to any other atom in the same moleculetype. Although technically this
  might not cause issues in a simulation, this often means that the user
  forgot to add a bond/potential/constraint or put multiple molecules in
  the same moleculetype definition by mistake. Run with -v to get
  information for each atom.

Analysing residue names:
There are:   256      Water residues
Number of degrees of freedom in T-Coupling group System is 2301.00
Determining Verlet buffer for a tolerance of 0.005 kJ/mol/ps at 298 K
Calculated rlist for 1x1 atom pair-list as 0.800 nm, buffer size 0.000 nm
Set rlist, assuming 4x4 atom pair-list, to 0.800 nm, buffer size 0.000 nm
Note that mdrun will redetermine rlist based on the actual pair-list setup
This run will generate roughly 1 Mb of data

There were 2 notes

Back Off! I just backed up md.tpr to ./#md.tpr.1#

GROMACS reminds you: "Don't pay any attention to what they write about you. Just measure it in inches." (Andy Warhol)

gmx_mpi: Relink `/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_framework.so.2' with `/usr/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
DeePMD-kit: Successfully load libcudart.so.12
2026-01-01 09:50:44.345835: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
                  :-) GROMACS - gmx mdrun, 2020.2-MODIFIED (-:

GROMACS:      gmx mdrun, version 2020.2-MODIFIED
Executable:   /root/opt/gromacs/install/bin/gmx_mpi
Data prefix:  /root/opt/gromacs/install
Working dir:  /workspace/test/02gmx
Command line:
  gmx_mpi mdrun -deffnm md -gpu_id 0


Back Off! I just backed up md.log to ./#md.log.1#
Compiled SIMD: AVX2_256, but for this host/run AVX_512 might be better (see
log).
Reading file md.tpr, VERSION 2020.2-MODIFIED (single precision)
Changing nstlist from 10 to 100, rlist from 0.8 to 0.8

1 GPU selected for this run.
Mapping of GPU IDs to the 1 GPU task in the 1 rank on this node:
  PP:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
Using 1 MPI process
Using 12 OpenMP threads 


WARNING: There are no atom pairs for dispersion correction
Init deepmd plugin from: input.json
Setting lambda: 1
Setting pbc: 1
Number of atoms: 768
Begin Init Model: graph-compress.pb
DeePMD-kit WARNING: Environmental variable DP_INTRA_OP_PARALLELISM_THREADS is not set. Tune DP_INTRA_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable DP_INTER_OP_PARALLELISM_THREADS is not set. Tune DP_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
2026-01-01 09:50:46.847147: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767261046.943202     760 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10729 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767261047.008275     760 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
Successfully load model!
DeePMD-kit WARNING: Environmental variable DP_INTRA_OP_PARALLELISM_THREADS is not set. Tune DP_INTRA_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable DP_INTER_OP_PARALLELISM_THREADS is not set. Tune DP_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
installed to:       /root/opt/deepmd/deepmd-kit/source/dp_gromacs
source:             v3.1.2-21-gb98f6c59-dirty
source branch:      devel
source commit:      b98f6c59
source commit at:   2025-12-23 08:15:14 +0000
support model ver.: 1.1 
build variant:      cuda
build with tf inc:  /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include;/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include
build with tf lib:  /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_cc.so.2
set tf intra_op_parallelism_threads: 0
set tf inter_op_parallelism_threads: 0
Summary: 

Atom map: O H
Successfully init plugin!
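
The three DeePMD-kit warnings above name the environment variables that control threading: OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS and DP_INTER_OP_PARALLELISM_THREADS. Because none of them is set, the log reports intra_op and inter_op thread counts of 0. The following is a minimal sketch of how an environment with these variables could be prepared before launching the run. The helper names and the thread counts (6, 6, 2) are illustrative assumptions, not values taken from this image; suitable numbers depend on the machine.

```python
import os

# Variable names exactly as printed in the DeePMD-kit warnings above.
THREAD_VARS = (
    "OMP_NUM_THREADS",
    "DP_INTRA_OP_PARALLELISM_THREADS",
    "DP_INTER_OP_PARALLELISM_THREADS",
)


def with_thread_settings(omp, intra_op, inter_op, base_env=None):
    """Return a copy of the environment with the three thread variables set.

    Hypothetical helper: the current process environment is copied unless
    base_env is given, so os.environ itself is never modified.
    """
    env = dict(os.environ) if base_env is None else dict(base_env)
    for name, value in zip(THREAD_VARS, (omp, intra_op, inter_op)):
        env[name] = str(value)  # environment values must be strings
    return env


def unset_thread_vars(env):
    """List the thread variables that are missing from env."""
    return [name for name in THREAD_VARS if name not in env]


# Placeholder thread counts, chosen only for illustration.
example_env = with_thread_settings(omp=6, intra_op=6, inter_op=2, base_env={})
print(example_env)
print(unset_thread_vars(os.environ))  # which warnings to expect right now
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting the same mdrun command, which keeps the settings local to that one run.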

Back Off! I just backed up md.trr to ./#md.trr.1#

Back Off! I just backed up md.edr to ./#md.edr.1#
starting mdrun 'lw_256.pdb'
10000 steps,      5.0 ps.

Writing final coordinates.

Back Off! I just backed up md.gro to ./#md.gro.1#

               Core t (s)   Wall t (s)        (%)
       Time:     1221.531      101.794     1200.0
                 (ns/day)    (hour/ns)
Performance:        4.244        5.655

GROMACS reminds you: "Philosophy of science is about as useful to scientists as ornithology is to birds." (Richard Feynman)
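
The timing summary above can be checked by hand from the numbers in the log: 10000 steps covering 5.0 ps, a wall time of 101.794 s and a core time of 1221.531 s. The sketch below redoes that arithmetic. It is only a plausibility check using a hypothetical helper, not the way GROMACS computes these values internally, so agreement is expected only up to rounding.

```python
SECONDS_PER_DAY = 86400.0


def md_performance(sim_time_ps, wall_time_s):
    """Convert simulated time and wall time into (ns/day, hour/ns)."""
    sim_time_ns = sim_time_ps / 1000.0
    ns_per_day = sim_time_ns * SECONDS_PER_DAY / wall_time_s
    hours_per_ns = 24.0 / ns_per_day
    return ns_per_day, hours_per_ns


# Values copied from the mdrun log above.
n_steps = 10000
sim_time_ps = 5.0
wall_time_s = 101.794
core_time_s = 1221.531

timestep_fs = sim_time_ps / n_steps * 1000.0      # length of one MD step in fs
ns_per_day, hours_per_ns = md_performance(sim_time_ps, wall_time_s)
cpu_percent = core_time_s / wall_time_s * 100.0   # core time relative to wall time

print(f"time step:   {timestep_fs:.2f} fs")
print(f"performance: {ns_per_day:.3f} ns/day, {hours_per_ns:.3f} hour/ns")
print(f"core/wall:   {cpu_percent:.1f} %")
```

The ratio of core time to wall time is about 12, which matches the 1200.0 % in the log and suggests the run kept roughly twelve CPU threads busy.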

gmx_mpi: Relink `/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_framework.so.2' with `/usr/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
DeePMD-kit: Successfully load libcudart.so.12
2026-01-01 09:52:29.837566: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
                   :-) GROMACS - gmx rdf, 2020.2-MODIFIED (-:

                            GROMACS is written by:
     Emile Apol      Rossen Apostolov      Paul Bauer     Herman J.C. Berendsen
    Par Bjelkmar      Christian Blau   Viacheslav Bolnykh     Kevin Boyd    
 Aldert van Buuren   Rudi van Drunen     Anton Feenstra       Alan Gray     
  Gerrit Groenhof     Anca Hamuraru    Vincent Hindriksen  M. Eric Irrgang  
  Aleksei Iupinov   Christoph Junghans     Joe Jordan     Dimitrios Karkoulis
    Peter Kasson        Jiri Kraus      Carsten Kutzner      Per Larsson    
  Justin A. Lemkul    Viveca Lindahl    Magnus Lundborg     Erik Marklund   
    Pascal Merz     Pieter Meulenhoff    Teemu Murtola       Szilard Pall   
    Sander Pronk      Roland Schulz      Michael Shirts    Alexey Shvetsov  
   Alfons Sijbers     Peter Tieleman      Jon Vincent      Teemu Virolainen 
 Christian Wennberg    Maarten Wolf      Artem Zhmurov   
                           and the project leaders:
        Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2019, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

GROMACS:      gmx rdf, version 2020.2-MODIFIED
Executable:   /root/opt/gromacs/install/bin/gmx_mpi
Data prefix:  /root/opt/gromacs/install
Working dir:  /workspace/test/02gmx
Command line:
  gmx_mpi rdf -f md.trr -s md.tpr -o md_rdf.xvg -ref 'name OW' -sel 'name OW'

Reading file md.tpr, VERSION 2020.2-MODIFIED (single precision)
Reading file md.tpr, VERSION 2020.2-MODIFIED (single precision)
trr version: GMX_trn_file (single precision)
Last frame        100 time    5.000   
Analyzed 101 frames, last time 5.000

Back Off! I just backed up md_rdf.xvg to ./#md_rdf.xvg.2#

GROMACS reminds you: "C is not a high-level language." (Brian Kernighan, C author)
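
The command above writes the O-O radial distribution function to md_rdf.xvg. An XVG file is plain text: lines starting with `#` are comments, lines starting with `@` are plot directives, and the remaining lines hold whitespace-separated numeric columns. The following is a minimal sketch of reading such a file and locating the highest peak. The function names are hypothetical, and the inline sample is made-up placeholder data used only to exercise the parser; it is not output from this run.

```python
def parse_xvg(lines):
    """Return (x, y) lists from XVG text, skipping '#' and '@' lines."""
    xs, ys = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped[0] in "#@":
            continue  # blank line, comment, or plot directive
        fields = stripped.split()
        xs.append(float(fields[0]))
        ys.append(float(fields[1]))
    return xs, ys


def highest_peak(xs, ys):
    """Return the (x, y) pair at which y is largest."""
    index = max(range(len(ys)), key=ys.__getitem__)
    return xs[index], ys[index]


# Synthetic placeholder data, NOT taken from md_rdf.xvg.
SAMPLE = """\
# comment line
@ title "placeholder"
0.10 0.0
0.20 1.5
0.30 0.8
0.40 1.1
"""

r, g = parse_xvg(SAMPLE.splitlines())
peak_r, peak_g = highest_peak(r, g)
print(f"highest peak: g = {peak_g} at r = {peak_r}")
```

To use it on the real result, open the file and pass the handle directly, for example `with open("md_rdf.xvg") as fh: r, g = parse_xvg(fh)`.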
Image information
Image size: 30 GB
Last updated: 2026-01-01
Supported GPUs: RTX 40 series (+1)
Framework version: TensorFlow-gromacs4090_3090_3080ti
CUDA version: 12.8
Application: JupyterLab: 8888