cd /workspace/test/
ls 00data 01train 02gmx
00data: training data
01train: training folder
02gmx: GROMACS demo folder
Enter the training folder
cd /workspace/test/01train/ && ls
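Optionally, before starting the run, you can set the thread-related environment variables that the DeePMD-kit startup messages recommend tuning (OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS and DP_INTER_OP_PARALLELISM_THREADS). The following is a minimal sketch only: the values 4, 4 and 2 are placeholders chosen for illustration, not the settings used in this run, and should be adapted to the CPU cores available on your machine.

```shell
# Placeholder values (assumptions, not taken from this run);
# adapt them to the number of CPU cores you actually have.
export OMP_NUM_THREADS=4
export DP_INTRA_OP_PARALLELISM_THREADS=4
export DP_INTER_OP_PARALLELISM_THREADS=2

# Show the values that a subsequently started process will inherit.
env | grep -E '^(OMP_NUM_THREADS|DP_(INTRA|INTER)_OP_PARALLELISM_THREADS)='
```

If these variables are left unset, the log reports num_intra_threads and num_inter_threads as 0 and prints the performance hint.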
Start training
dp train input.json
2026-01-01 09:27:59.177759: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-01-01 09:28:02.383262: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-01-01 09:28:05.951113: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/keras/src/export/tf2onnx_lib.py:8: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
if not hasattr(np, "object"):
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
Switch to serial execution due to lack of horovod module.
[2026-01-01 09:28:20,026] DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
[2026-01-01 09:28:21,633] DEEPMD INFO If you encounter the error 'an illegal memory access was encountered', this may be due to a TensorFlow issue. To avoid this, set the environment variable DP_INFER_BATCH_SIZE to a smaller value than the last adjusted batch size. The environment variable DP_INFER_BATCH_SIZE controls the inference batch size (nframes * natoms).
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767259701.888737 151 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767259701.897942 151 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
I0000 00:00:1767259702.090857 234 cuda_solvers.cc:175] Creating GpuSolver handles for stream 0x9085580
[2026-01-01 09:28:25,141] DEEPMD INFO Adjust batch size from 1024 to 2048
[2026-01-01 09:28:25,283] DEEPMD INFO Adjust batch size from 2048 to 4096
[2026-01-01 09:28:25,513] DEEPMD INFO Adjust batch size from 4096 to 8192
[2026-01-01 09:28:25,850] DEEPMD INFO Adjust batch size from 8192 to 16384
[2026-01-01 09:28:26,958] DEEPMD INFO Neighbor statistics: training data with minimal neighbor distance: 0.885439
[2026-01-01 09:28:26,958] DEEPMD INFO Neighbor statistics: training data with maximum neighbor size: [38 72] (cutoff radius: 6.000000)
[2026-01-01 09:28:26,998] DEEPMD INFO _____ _____ __ __ _____ _ _ _
[2026-01-01 09:28:26,998] DEEPMD INFO | __ \ | __ \ | \/ || __ \ | | (_)| |
[2026-01-01 09:28:26,998] DEEPMD INFO | | | | ___ ___ | |__) || \ / || | | | ______ | | __ _ | |_
[2026-01-01 09:28:26,998] DEEPMD INFO | | | | / _ \ / _ \| ___/ | |\/| || | | ||______|| |/ /| || __|
[2026-01-01 09:28:26,998] DEEPMD INFO | |__| || __/| __/| | | | | || |__| | | < | || |_
[2026-01-01 09:28:26,998] DEEPMD INFO |_____/ \___| \___||_| |_| |_||_____/ |_|\_\|_| \__|
[2026-01-01 09:28:26,998] DEEPMD INFO Please read and cite:
[2026-01-01 09:28:26,998] DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
[2026-01-01 09:28:26,998] DEEPMD INFO Zeng et al, J. Chem. Phys., 159, 054801 (2023)
[2026-01-01 09:28:26,998] DEEPMD INFO Zeng et al, J. Chem. Theory Comput., 21, 4375-4385 (2025)
[2026-01-01 09:28:26,998] DEEPMD INFO See https://deepmd.rtfd.io/credits/ for details.
[2026-01-01 09:28:26,998] DEEPMD INFO --------------------------------------------------------------------------------------------------------
[2026-01-01 09:28:26,998] DEEPMD INFO installed to: /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/deepmd
[2026-01-01 09:28:26,998] DEEPMD INFO source: v3.1.2-21-gb98f6c59-dirty
[2026-01-01 09:28:26,998] DEEPMD INFO source branch: devel
[2026-01-01 09:28:26,998] DEEPMD INFO source commit: b98f6c59
[2026-01-01 09:28:26,998] DEEPMD INFO source commit at: 2025-12-23 08:15:14 +0000
[2026-01-01 09:28:26,998] DEEPMD INFO use float prec: double
[2026-01-01 09:28:26,998] DEEPMD INFO build variant: cuda
[2026-01-01 09:28:26,999] DEEPMD INFO Backend: TensorFlow
[2026-01-01 09:28:26,999] DEEPMD INFO TF ver: v2.20.0-rc0-4-g72fbba3d20f
[2026-01-01 09:28:26,999] DEEPMD INFO build with TF ver: 2.20.0
[2026-01-01 09:28:26,999] DEEPMD INFO build with TF inc: /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:28:26,999] DEEPMD INFO /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:28:26,999] DEEPMD INFO build with TF lib:
[2026-01-01 09:28:26,999] DEEPMD INFO running on: 908821a3377a
[2026-01-01 09:28:26,999] DEEPMD INFO computing device: gpu:0
[2026-01-01 09:28:26,999] DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
[2026-01-01 09:28:26,999] DEEPMD INFO Count of visible GPUs: 1
[2026-01-01 09:28:26,999] DEEPMD INFO num_intra_threads: 0
[2026-01-01 09:28:26,999] DEEPMD INFO num_inter_threads: 0
[2026-01-01 09:28:26,999] DEEPMD INFO --------------------------------------------------------------------------------------------------------
I0000 00:00:1767259707.010023 151 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:28:27,022] DEEPMD INFO ---Summary of DataSystem: training -----------------------------------------------
[2026-01-01 09:28:27,022] DEEPMD INFO found 3 system(s):
[2026-01-01 09:28:27,022] DEEPMD INFO system natoms bch_sz n_bch prob pbc
[2026-01-01 09:28:27,022] DEEPMD INFO ../00data/data_0 192 1 80 2.500e-01 T
[2026-01-01 09:28:27,022] DEEPMD INFO ../00data/data_1 192 1 160 5.000e-01 T
[2026-01-01 09:28:27,022] DEEPMD INFO ../00data/data_2 192 1 80 2.500e-01 T
[2026-01-01 09:28:27,022] DEEPMD INFO --------------------------------------------------------------------------------------
[2026-01-01 09:28:27,029] DEEPMD INFO ---Summary of DataSystem: validation -----------------------------------------------
[2026-01-01 09:28:27,029] DEEPMD INFO found 1 system(s):
[2026-01-01 09:28:27,029] DEEPMD INFO system natoms bch_sz n_bch prob pbc
[2026-01-01 09:28:27,029] DEEPMD INFO ../00data/data_3 192 1 80 1.000e+00 T
[2026-01-01 09:28:27,029] DEEPMD INFO --------------------------------------------------------------------------------------
[2026-01-01 09:28:27,029] DEEPMD INFO training without frame parameter
[2026-01-01 09:28:27,030] DEEPMD INFO data stating... (this step may take long time)
[2026-01-01 09:28:27,224] DEEPMD INFO built lr
[2026-01-01 09:28:27,922] DEEPMD INFO built network
[2026-01-01 09:28:28,983] DEEPMD INFO built training
[2026-01-01 09:28:28,984] DEEPMD WARNING To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
I0000 00:00:1767259708.990744 151 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:28:29,037] DEEPMD INFO initialize model from scratch
[2026-01-01 09:28:29,922] DEEPMD INFO start training at lr 1.00e-03 (== 1.00e-03), decay_step 5000, decay_rate 0.005925, final lr will be 3.51e-08
[2026-01-01 09:28:31,055] DEEPMD INFO batch 0: trn: rmse = 2.55e+01, rmse_e = 6.63e-01, rmse_f = 8.05e-01, lr = 1.00e-03
[2026-01-01 09:28:31,055] DEEPMD INFO batch 0: val: rmse = 2.53e+01, rmse_e = 6.64e-01, rmse_f = 7.98e-01
I0000 00:00:1767259711.849151 227 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:28:34,483] DEEPMD INFO batch 100: trn: rmse = 1.11e+01, rmse_e = 9.31e-02, rmse_f = 3.52e-01, lr = 1.00e-03
[2026-01-01 09:28:34,483] DEEPMD INFO batch 100: val: rmse = 1.14e+01, rmse_e = 9.72e-02, rmse_f = 3.59e-01
[2026-01-01 09:28:34,483] DEEPMD INFO batch 100: total wall time = 4.56 s
[2026-01-01 09:28:36,909] DEEPMD INFO batch 200: trn: rmse = 8.89e+00, rmse_e = 2.63e-02, rmse_f = 2.81e-01, lr = 1.00e-03
[2026-01-01 09:28:36,910] DEEPMD INFO batch 200: val: rmse = 9.16e+00, rmse_e = 2.71e-02, rmse_f = 2.90e-01
[2026-01-01 09:28:36,910] DEEPMD INFO batch 200: total wall time = 2.43 s
[2026-01-01 09:28:39,362] DEEPMD INFO batch 300: trn: rmse = 7.73e+00, rmse_e = 8.21e-03, rmse_f = 2.44e-01, lr = 1.00e-03
[2026-01-01 09:28:39,362] DEEPMD INFO batch 300: val: rmse = 7.24e+00, rmse_e = 7.06e-03, rmse_f = 2.29e-01
[2026-01-01 09:28:39,362] DEEPMD INFO batch 300: total wall time = 2.45 s
[2026-01-01 09:28:41,776] DEEPMD INFO batch 400: trn: rmse = 7.11e+00, rmse_e = 5.26e-02, rmse_f = 2.25e-01, lr = 1.00e-03
[2026-01-01 09:28:41,776] DEEPMD INFO batch 400: val: rmse = 6.65e+00, rmse_e = 5.02e-02, rmse_f = 2.10e-01
[2026-01-01 09:28:41,776] DEEPMD INFO batch 400: total wall time = 2.41 s
[2026-01-01 09:28:44,220] DEEPMD INFO batch 500: trn: rmse = 6.05e+00, rmse_e = 3.17e-02, rmse_f = 1.91e-01, lr = 1.00e-03
[2026-01-01 09:28:44,221] DEEPMD INFO batch 500: val: rmse = 6.01e+00, rmse_e = 3.18e-02, rmse_f = 1.90e-01
[2026-01-01 09:28:44,221] DEEPMD INFO batch 500: total wall time = 2.44 s
[2026-01-01 09:28:46,604] DEEPMD INFO batch 600: trn: rmse = 6.00e+00, rmse_e = 6.22e-03, rmse_f = 1.90e-01, lr = 1.00e-03
[2026-01-01 09:28:46,604] DEEPMD INFO batch 600: val: rmse = 6.19e+00, rmse_e = 5.96e-03, rmse_f = 1.96e-01
[2026-01-01 09:28:46,605] DEEPMD INFO batch 600: total wall time = 2.38 s
[2026-01-01 09:28:49,048] DEEPMD INFO batch 700: trn: rmse = 4.95e+00, rmse_e = 1.40e-02, rmse_f = 1.56e-01, lr = 1.00e-03
[2026-01-01 09:28:49,048] DEEPMD INFO batch 700: val: rmse = 5.05e+00, rmse_e = 9.82e-03, rmse_f = 1.60e-01
[2026-01-01 09:28:49,048] DEEPMD INFO batch 700: total wall time = 2.44 s
[2026-01-01 09:28:51,456] DEEPMD INFO batch 800: trn: rmse = 4.46e+00, rmse_e = 1.19e-02, rmse_f = 1.41e-01, lr = 1.00e-03
[2026-01-01 09:28:51,456] DEEPMD INFO batch 800: val: rmse = 4.85e+00, rmse_e = 1.19e-02, rmse_f = 1.53e-01
[2026-01-01 09:28:51,456] DEEPMD INFO batch 800: total wall time = 2.41 s
[2026-01-01 09:28:53,815] DEEPMD INFO batch 900: trn: rmse = 4.46e+00, rmse_e = 1.61e-02, rmse_f = 1.41e-01, lr = 1.00e-03
[2026-01-01 09:28:53,815] DEEPMD INFO batch 900: val: rmse = 4.73e+00, rmse_e = 1.73e-02, rmse_f = 1.50e-01
[2026-01-01 09:28:53,815] DEEPMD INFO batch 900: total wall time = 2.36 s
[2026-01-01 09:28:56,236] DEEPMD INFO batch 1000: trn: rmse = 4.28e+00, rmse_e = 6.60e-03, rmse_f = 1.35e-01, lr = 1.00e-03
[2026-01-01 09:28:56,236] DEEPMD INFO batch 1000: val: rmse = 4.44e+00, rmse_e = 6.66e-03, rmse_f = 1.40e-01
[2026-01-01 09:28:56,236] DEEPMD INFO batch 1000: total wall time = 2.42 s
[2026-01-01 09:28:56,449] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:28:58,886] DEEPMD INFO batch 1100: trn: rmse = 4.15e+00, rmse_e = 2.25e-02, rmse_f = 1.31e-01, lr = 1.00e-03
[2026-01-01 09:28:58,886] DEEPMD INFO batch 1100: val: rmse = 4.42e+00, rmse_e = 2.04e-02, rmse_f = 1.40e-01
[2026-01-01 09:28:58,886] DEEPMD INFO batch 1100: total wall time = 2.65 s
[2026-01-01 09:29:01,263] DEEPMD INFO batch 1200: trn: rmse = 4.15e+00, rmse_e = 2.93e-02, rmse_f = 1.31e-01, lr = 1.00e-03
[2026-01-01 09:29:01,263] DEEPMD INFO batch 1200: val: rmse = 3.96e+00, rmse_e = 2.92e-02, rmse_f = 1.25e-01
[2026-01-01 09:29:01,263] DEEPMD INFO batch 1200: total wall time = 2.38 s
[2026-01-01 09:29:03,640] DEEPMD INFO batch 1300: trn: rmse = 3.90e+00, rmse_e = 2.56e-03, rmse_f = 1.23e-01, lr = 1.00e-03
[2026-01-01 09:29:03,640] DEEPMD INFO batch 1300: val: rmse = 4.03e+00, rmse_e = 2.76e-03, rmse_f = 1.27e-01
[2026-01-01 09:29:03,640] DEEPMD INFO batch 1300: total wall time = 2.38 s
[2026-01-01 09:29:06,034] DEEPMD INFO batch 1400: trn: rmse = 3.75e+00, rmse_e = 2.24e-02, rmse_f = 1.19e-01, lr = 1.00e-03
[2026-01-01 09:29:06,035] DEEPMD INFO batch 1400: val: rmse = 4.21e+00, rmse_e = 2.43e-02, rmse_f = 1.33e-01
[2026-01-01 09:29:06,035] DEEPMD INFO batch 1400: total wall time = 2.39 s
[2026-01-01 09:29:08,461] DEEPMD INFO batch 1500: trn: rmse = 3.76e+00, rmse_e = 9.10e-03, rmse_f = 1.19e-01, lr = 1.00e-03
[2026-01-01 09:29:08,462] DEEPMD INFO batch 1500: val: rmse = 4.01e+00, rmse_e = 5.39e-03, rmse_f = 1.27e-01
[2026-01-01 09:29:08,462] DEEPMD INFO batch 1500: total wall time = 2.43 s
[2026-01-01 09:29:10,864] DEEPMD INFO batch 1600: trn: rmse = 4.11e+00, rmse_e = 3.65e-02, rmse_f = 1.30e-01, lr = 1.00e-03
[2026-01-01 09:29:10,864] DEEPMD INFO batch 1600: val: rmse = 3.65e+00, rmse_e = 3.71e-02, rmse_f = 1.15e-01
[2026-01-01 09:29:10,864] DEEPMD INFO batch 1600: total wall time = 2.40 s
[2026-01-01 09:29:13,290] DEEPMD INFO batch 1700: trn: rmse = 3.35e+00, rmse_e = 1.42e-02, rmse_f = 1.06e-01, lr = 1.00e-03
[2026-01-01 09:29:13,290] DEEPMD INFO batch 1700: val: rmse = 3.44e+00, rmse_e = 1.41e-02, rmse_f = 1.09e-01
[2026-01-01 09:29:13,290] DEEPMD INFO batch 1700: total wall time = 2.43 s
[2026-01-01 09:29:15,716] DEEPMD INFO batch 1800: trn: rmse = 4.97e+00, rmse_e = 5.37e-02, rmse_f = 1.57e-01, lr = 1.00e-03
[2026-01-01 09:29:15,716] DEEPMD INFO batch 1800: val: rmse = 4.96e+00, rmse_e = 5.25e-02, rmse_f = 1.57e-01
[2026-01-01 09:29:15,716] DEEPMD INFO batch 1800: total wall time = 2.43 s
[2026-01-01 09:29:18,091] DEEPMD INFO batch 1900: trn: rmse = 3.49e+00, rmse_e = 1.32e-02, rmse_f = 1.10e-01, lr = 1.00e-03
[2026-01-01 09:29:18,091] DEEPMD INFO batch 1900: val: rmse = 3.51e+00, rmse_e = 1.25e-02, rmse_f = 1.11e-01
[2026-01-01 09:29:18,091] DEEPMD INFO batch 1900: total wall time = 2.37 s
[2026-01-01 09:29:20,444] DEEPMD INFO batch 2000: trn: rmse = 3.53e+00, rmse_e = 8.19e-03, rmse_f = 1.11e-01, lr = 1.00e-03
[2026-01-01 09:29:20,444] DEEPMD INFO batch 2000: val: rmse = 3.47e+00, rmse_e = 7.47e-03, rmse_f = 1.10e-01
[2026-01-01 09:29:20,444] DEEPMD INFO batch 2000: total wall time = 2.35 s
[2026-01-01 09:29:20,547] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:29:22,907] DEEPMD INFO batch 2100: trn: rmse = 3.32e+00, rmse_e = 3.29e-02, rmse_f = 1.05e-01, lr = 1.00e-03
[2026-01-01 09:29:22,907] DEEPMD INFO batch 2100: val: rmse = 3.80e+00, rmse_e = 3.27e-02, rmse_f = 1.20e-01
[2026-01-01 09:29:22,907] DEEPMD INFO batch 2100: total wall time = 2.46 s
[2026-01-01 09:29:25,300] DEEPMD INFO batch 2200: trn: rmse = 3.13e+00, rmse_e = 8.70e-04, rmse_f = 9.91e-02, lr = 1.00e-03
[2026-01-01 09:29:25,300] DEEPMD INFO batch 2200: val: rmse = 3.19e+00, rmse_e = 1.30e-03, rmse_f = 1.01e-01
[2026-01-01 09:29:25,301] DEEPMD INFO batch 2200: total wall time = 2.39 s
[2026-01-01 09:29:27,679] DEEPMD INFO batch 2300: trn: rmse = 3.22e+00, rmse_e = 6.98e-03, rmse_f = 1.02e-01, lr = 1.00e-03
[2026-01-01 09:29:27,679] DEEPMD INFO batch 2300: val: rmse = 3.33e+00, rmse_e = 6.55e-03, rmse_f = 1.05e-01
[2026-01-01 09:29:27,679] DEEPMD INFO batch 2300: total wall time = 2.38 s
[2026-01-01 09:29:30,055] DEEPMD INFO batch 2400: trn: rmse = 3.43e+00, rmse_e = 9.74e-03, rmse_f = 1.08e-01, lr = 1.00e-03
[2026-01-01 09:29:30,055] DEEPMD INFO batch 2400: val: rmse = 3.69e+00, rmse_e = 8.45e-03, rmse_f = 1.17e-01
[2026-01-01 09:29:30,055] DEEPMD INFO batch 2400: total wall time = 2.38 s
[2026-01-01 09:29:32,390] DEEPMD INFO batch 2500: trn: rmse = 3.20e+00, rmse_e = 2.88e-02, rmse_f = 1.01e-01, lr = 1.00e-03
[2026-01-01 09:29:32,390] DEEPMD INFO batch 2500: val: rmse = 3.19e+00, rmse_e = 2.92e-02, rmse_f = 1.01e-01
[2026-01-01 09:29:32,390] DEEPMD INFO batch 2500: total wall time = 2.34 s
[2026-01-01 09:29:34,730] DEEPMD INFO batch 2600: trn: rmse = 4.28e+00, rmse_e = 2.26e-02, rmse_f = 1.35e-01, lr = 1.00e-03
[2026-01-01 09:29:34,730] DEEPMD INFO batch 2600: val: rmse = 4.83e+00, rmse_e = 1.93e-02, rmse_f = 1.53e-01
[2026-01-01 09:29:34,730] DEEPMD INFO batch 2600: total wall time = 2.34 s
[2026-01-01 09:29:37,117] DEEPMD INFO batch 2700: trn: rmse = 3.13e+00, rmse_e = 7.97e-03, rmse_f = 9.89e-02, lr = 1.00e-03
[2026-01-01 09:29:37,117] DEEPMD INFO batch 2700: val: rmse = 3.30e+00, rmse_e = 6.36e-03, rmse_f = 1.04e-01
[2026-01-01 09:29:37,117] DEEPMD INFO batch 2700: total wall time = 2.39 s
[2026-01-01 09:29:39,497] DEEPMD INFO batch 2800: trn: rmse = 3.03e+00, rmse_e = 3.23e-03, rmse_f = 9.59e-02, lr = 1.00e-03
[2026-01-01 09:29:39,498] DEEPMD INFO batch 2800: val: rmse = 3.12e+00, rmse_e = 1.82e-03, rmse_f = 9.86e-02
[2026-01-01 09:29:39,498] DEEPMD INFO batch 2800: total wall time = 2.38 s
[2026-01-01 09:29:41,920] DEEPMD INFO batch 2900: trn: rmse = 3.49e+00, rmse_e = 2.05e-02, rmse_f = 1.10e-01, lr = 1.00e-03
[2026-01-01 09:29:41,920] DEEPMD INFO batch 2900: val: rmse = 3.64e+00, rmse_e = 2.02e-02, rmse_f = 1.15e-01
[2026-01-01 09:29:41,921] DEEPMD INFO batch 2900: total wall time = 2.42 s
[2026-01-01 09:29:44,273] DEEPMD INFO batch 3000: trn: rmse = 3.25e+00, rmse_e = 1.43e-02, rmse_f = 1.03e-01, lr = 1.00e-03
[2026-01-01 09:29:44,273] DEEPMD INFO batch 3000: val: rmse = 3.26e+00, rmse_e = 1.43e-02, rmse_f = 1.03e-01
[2026-01-01 09:29:44,274] DEEPMD INFO batch 3000: total wall time = 2.35 s
[2026-01-01 09:29:44,379] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:29:46,781] DEEPMD INFO batch 3100: trn: rmse = 3.06e+00, rmse_e = 5.73e-04, rmse_f = 9.68e-02, lr = 1.00e-03
[2026-01-01 09:29:46,781] DEEPMD INFO batch 3100: val: rmse = 2.90e+00, rmse_e = 9.68e-04, rmse_f = 9.17e-02
[2026-01-01 09:29:46,781] DEEPMD INFO batch 3100: total wall time = 2.51 s
[2026-01-01 09:29:49,187] DEEPMD INFO batch 3200: trn: rmse = 2.84e+00, rmse_e = 1.40e-02, rmse_f = 8.97e-02, lr = 1.00e-03
[2026-01-01 09:29:49,187] DEEPMD INFO batch 3200: val: rmse = 3.03e+00, rmse_e = 1.36e-02, rmse_f = 9.59e-02
[2026-01-01 09:29:49,187] DEEPMD INFO batch 3200: total wall time = 2.41 s
[2026-01-01 09:29:51,521] DEEPMD INFO batch 3300: trn: rmse = 2.75e+00, rmse_e = 1.52e-02, rmse_f = 8.69e-02, lr = 1.00e-03
[2026-01-01 09:29:51,522] DEEPMD INFO batch 3300: val: rmse = 2.89e+00, rmse_e = 1.64e-02, rmse_f = 9.15e-02
[2026-01-01 09:29:51,522] DEEPMD INFO batch 3300: total wall time = 2.33 s
[2026-01-01 09:29:53,852] DEEPMD INFO batch 3400: trn: rmse = 3.18e+00, rmse_e = 1.68e-02, rmse_f = 1.01e-01, lr = 1.00e-03
[2026-01-01 09:29:53,852] DEEPMD INFO batch 3400: val: rmse = 3.29e+00, rmse_e = 1.85e-02, rmse_f = 1.04e-01
[2026-01-01 09:29:53,852] DEEPMD INFO batch 3400: total wall time = 2.33 s
[2026-01-01 09:29:56,210] DEEPMD INFO batch 3500: trn: rmse = 3.15e+00, rmse_e = 7.60e-03, rmse_f = 9.97e-02, lr = 1.00e-03
[2026-01-01 09:29:56,210] DEEPMD INFO batch 3500: val: rmse = 3.19e+00, rmse_e = 6.99e-03, rmse_f = 1.01e-01
[2026-01-01 09:29:56,210] DEEPMD INFO batch 3500: total wall time = 2.36 s
[2026-01-01 09:29:58,606] DEEPMD INFO batch 3600: trn: rmse = 2.67e+00, rmse_e = 1.93e-02, rmse_f = 8.43e-02, lr = 1.00e-03
[2026-01-01 09:29:58,606] DEEPMD INFO batch 3600: val: rmse = 3.04e+00, rmse_e = 1.94e-02, rmse_f = 9.61e-02
[2026-01-01 09:29:58,606] DEEPMD INFO batch 3600: total wall time = 2.40 s
[2026-01-01 09:30:00,963] DEEPMD INFO batch 3700: trn: rmse = 2.46e+00, rmse_e = 1.03e-02, rmse_f = 7.79e-02, lr = 1.00e-03
[2026-01-01 09:30:00,963] DEEPMD INFO batch 3700: val: rmse = 2.60e+00, rmse_e = 1.00e-02, rmse_f = 8.21e-02
[2026-01-01 09:30:00,963] DEEPMD INFO batch 3700: total wall time = 2.36 s
[2026-01-01 09:30:03,273] DEEPMD INFO batch 3800: trn: rmse = 2.37e+00, rmse_e = 2.76e-02, rmse_f = 7.48e-02, lr = 1.00e-03
[2026-01-01 09:30:03,273] DEEPMD INFO batch 3800: val: rmse = 2.91e+00, rmse_e = 2.77e-02, rmse_f = 9.22e-02
[2026-01-01 09:30:03,273] DEEPMD INFO batch 3800: total wall time = 2.31 s
[2026-01-01 09:30:05,682] DEEPMD INFO batch 3900: trn: rmse = 3.00e+00, rmse_e = 1.46e-02, rmse_f = 9.50e-02, lr = 1.00e-03
[2026-01-01 09:30:05,683] DEEPMD INFO batch 3900: val: rmse = 3.06e+00, rmse_e = 1.52e-02, rmse_f = 9.66e-02
[2026-01-01 09:30:05,683] DEEPMD INFO batch 3900: total wall time = 2.41 s
[2026-01-01 09:30:08,032] DEEPMD INFO batch 4000: trn: rmse = 2.66e+00, rmse_e = 2.16e-02, rmse_f = 8.43e-02, lr = 1.00e-03
[2026-01-01 09:30:08,032] DEEPMD INFO batch 4000: val: rmse = 2.54e+00, rmse_e = 2.19e-02, rmse_f = 8.04e-02
[2026-01-01 09:30:08,032] DEEPMD INFO batch 4000: total wall time = 2.35 s
[2026-01-01 09:30:08,132] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:30:10,496] DEEPMD INFO batch 4100: trn: rmse = 3.04e+00, rmse_e = 3.27e-02, rmse_f = 9.61e-02, lr = 1.00e-03
[2026-01-01 09:30:10,496] DEEPMD INFO batch 4100: val: rmse = 2.79e+00, rmse_e = 3.21e-02, rmse_f = 8.82e-02
[2026-01-01 09:30:10,497] DEEPMD INFO batch 4100: total wall time = 2.46 s
[2026-01-01 09:30:12,884] DEEPMD INFO batch 4200: trn: rmse = 3.64e+00, rmse_e = 2.19e-02, rmse_f = 1.15e-01, lr = 1.00e-03
[2026-01-01 09:30:12,884] DEEPMD INFO batch 4200: val: rmse = 3.27e+00, rmse_e = 2.17e-02, rmse_f = 1.03e-01
[2026-01-01 09:30:12,884] DEEPMD INFO batch 4200: total wall time = 2.39 s
[2026-01-01 09:30:15,295] DEEPMD INFO batch 4300: trn: rmse = 2.52e+00, rmse_e = 9.23e-03, rmse_f = 7.98e-02, lr = 1.00e-03
[2026-01-01 09:30:15,295] DEEPMD INFO batch 4300: val: rmse = 2.59e+00, rmse_e = 9.42e-03, rmse_f = 8.20e-02
[2026-01-01 09:30:15,295] DEEPMD INFO batch 4300: total wall time = 2.41 s
[2026-01-01 09:30:17,678] DEEPMD INFO batch 4400: trn: rmse = 2.88e+00, rmse_e = 1.02e-02, rmse_f = 9.11e-02, lr = 1.00e-03
[2026-01-01 09:30:17,678] DEEPMD INFO batch 4400: val: rmse = 2.66e+00, rmse_e = 8.22e-03, rmse_f = 8.41e-02
[2026-01-01 09:30:17,678] DEEPMD INFO batch 4400: total wall time = 2.38 s
[2026-01-01 09:30:20,029] DEEPMD INFO batch 4500: trn: rmse = 2.62e+00, rmse_e = 1.16e-02, rmse_f = 8.28e-02, lr = 1.00e-03
[2026-01-01 09:30:20,029] DEEPMD INFO batch 4500: val: rmse = 2.65e+00, rmse_e = 1.33e-02, rmse_f = 8.38e-02
[2026-01-01 09:30:20,029] DEEPMD INFO batch 4500: total wall time = 2.35 s
[2026-01-01 09:30:22,415] DEEPMD INFO batch 4600: trn: rmse = 3.01e+00, rmse_e = 2.31e-02, rmse_f = 9.51e-02, lr = 1.00e-03
[2026-01-01 09:30:22,415] DEEPMD INFO batch 4600: val: rmse = 2.62e+00, rmse_e = 2.33e-02, rmse_f = 8.29e-02
[2026-01-01 09:30:22,415] DEEPMD INFO batch 4600: total wall time = 2.39 s
[2026-01-01 09:30:24,785] DEEPMD INFO batch 4700: trn: rmse = 2.33e+00, rmse_e = 1.50e-02, rmse_f = 7.37e-02, lr = 1.00e-03
[2026-01-01 09:30:24,785] DEEPMD INFO batch 4700: val: rmse = 2.71e+00, rmse_e = 1.67e-02, rmse_f = 8.58e-02
[2026-01-01 09:30:24,785] DEEPMD INFO batch 4700: total wall time = 2.37 s
[2026-01-01 09:30:27,173] DEEPMD INFO batch 4800: trn: rmse = 2.78e+00, rmse_e = 5.42e-04, rmse_f = 8.79e-02, lr = 1.00e-03
[2026-01-01 09:30:27,173] DEEPMD INFO batch 4800: val: rmse = 2.88e+00, rmse_e = 6.71e-04, rmse_f = 9.11e-02
[2026-01-01 09:30:27,173] DEEPMD INFO batch 4800: total wall time = 2.39 s
[2026-01-01 09:30:29,543] DEEPMD INFO batch 4900: trn: rmse = 2.68e+00, rmse_e = 1.28e-02, rmse_f = 8.48e-02, lr = 1.00e-03
[2026-01-01 09:30:29,544] DEEPMD INFO batch 4900: val: rmse = 2.90e+00, rmse_e = 1.29e-02, rmse_f = 9.18e-02
[2026-01-01 09:30:29,544] DEEPMD INFO batch 4900: total wall time = 2.37 s
[2026-01-01 09:30:31,935] DEEPMD INFO batch 5000: trn: rmse = 2.08e-01, rmse_e = 1.02e-03, rmse_f = 7.89e-02, lr = 5.92e-06
[2026-01-01 09:30:31,936] DEEPMD INFO batch 5000: val: rmse = 2.15e-01, rmse_e = 9.93e-04, rmse_f = 8.15e-02
[2026-01-01 09:30:31,936] DEEPMD INFO batch 5000: total wall time = 2.39 s
[2026-01-01 09:30:32,039] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:30:34,424] DEEPMD INFO batch 5100: trn: rmse = 2.16e-01, rmse_e = 5.07e-04, rmse_f = 8.21e-02, lr = 5.92e-06
[2026-01-01 09:30:34,425] DEEPMD INFO batch 5100: val: rmse = 2.02e-01, rmse_e = 8.51e-04, rmse_f = 7.65e-02
[2026-01-01 09:30:34,425] DEEPMD INFO batch 5100: total wall time = 2.49 s
[2026-01-01 09:30:36,844] DEEPMD INFO batch 5200: trn: rmse = 2.20e-01, rmse_e = 1.37e-03, rmse_f = 8.33e-02, lr = 5.92e-06
[2026-01-01 09:30:36,844] DEEPMD INFO batch 5200: val: rmse = 2.13e-01, rmse_e = 9.60e-04, rmse_f = 8.06e-02
[2026-01-01 09:30:36,844] DEEPMD INFO batch 5200: total wall time = 2.42 s
[2026-01-01 09:30:39,231] DEEPMD INFO batch 5300: trn: rmse = 1.88e-01, rmse_e = 3.04e-04, rmse_f = 7.14e-02, lr = 5.92e-06
[2026-01-01 09:30:39,231] DEEPMD INFO batch 5300: val: rmse = 2.07e-01, rmse_e = 4.99e-04, rmse_f = 7.87e-02
[2026-01-01 09:30:39,231] DEEPMD INFO batch 5300: total wall time = 2.39 s
[2026-01-01 09:30:41,571] DEEPMD INFO batch 5400: trn: rmse = 2.09e-01, rmse_e = 6.61e-04, rmse_f = 7.93e-02, lr = 5.92e-06
[2026-01-01 09:30:41,571] DEEPMD INFO batch 5400: val: rmse = 2.04e-01, rmse_e = 7.21e-04, rmse_f = 7.75e-02
[2026-01-01 09:30:41,571] DEEPMD INFO batch 5400: total wall time = 2.34 s
[2026-01-01 09:30:43,980] DEEPMD INFO batch 5500: trn: rmse = 1.94e-01, rmse_e = 3.24e-04, rmse_f = 7.38e-02, lr = 5.92e-06
[2026-01-01 09:30:43,981] DEEPMD INFO batch 5500: val: rmse = 2.21e-01, rmse_e = 5.90e-04, rmse_f = 8.41e-02
[2026-01-01 09:30:43,981] DEEPMD INFO batch 5500: total wall time = 2.41 s
[2026-01-01 09:30:46,400] DEEPMD INFO batch 5600: trn: rmse = 1.76e-01, rmse_e = 2.88e-04, rmse_f = 6.71e-02, lr = 5.92e-06
[2026-01-01 09:30:46,401] DEEPMD INFO batch 5600: val: rmse = 2.16e-01, rmse_e = 9.89e-04, rmse_f = 8.18e-02
[2026-01-01 09:30:46,401] DEEPMD INFO batch 5600: total wall time = 2.42 s
[2026-01-01 09:30:48,768] DEEPMD INFO batch 5700: trn: rmse = 1.99e-01, rmse_e = 3.38e-05, rmse_f = 7.58e-02, lr = 5.92e-06
[2026-01-01 09:30:48,768] DEEPMD INFO batch 5700: val: rmse = 2.16e-01, rmse_e = 8.04e-04, rmse_f = 8.18e-02
[2026-01-01 09:30:48,768] DEEPMD INFO batch 5700: total wall time = 2.37 s
[2026-01-01 09:30:51,192] DEEPMD INFO batch 5800: trn: rmse = 2.53e-01, rmse_e = 1.86e-03, rmse_f = 9.57e-02, lr = 5.92e-06
[2026-01-01 09:30:51,192] DEEPMD INFO batch 5800: val: rmse = 2.24e-01, rmse_e = 6.48e-04, rmse_f = 8.49e-02
[2026-01-01 09:30:51,192] DEEPMD INFO batch 5800: total wall time = 2.42 s
[2026-01-01 09:30:53,594] DEEPMD INFO batch 5900: trn: rmse = 1.99e-01, rmse_e = 7.57e-04, rmse_f = 7.54e-02, lr = 5.92e-06
[2026-01-01 09:30:53,594] DEEPMD INFO batch 5900: val: rmse = 2.07e-01, rmse_e = 7.25e-04, rmse_f = 7.84e-02
[2026-01-01 09:30:53,594] DEEPMD INFO batch 5900: total wall time = 2.40 s
[2026-01-01 09:30:55,964] DEEPMD INFO batch 6000: trn: rmse = 2.04e-01, rmse_e = 1.51e-03, rmse_f = 7.73e-02, lr = 5.92e-06
[2026-01-01 09:30:55,964] DEEPMD INFO batch 6000: val: rmse = 2.13e-01, rmse_e = 5.76e-04, rmse_f = 8.09e-02
[2026-01-01 09:30:55,964] DEEPMD INFO batch 6000: total wall time = 2.37 s
[2026-01-01 09:30:56,095] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:30:58,477] DEEPMD INFO batch 6100: trn: rmse = 2.05e-01, rmse_e = 2.19e-04, rmse_f = 7.78e-02, lr = 5.92e-06
[2026-01-01 09:30:58,477] DEEPMD INFO batch 6100: val: rmse = 2.12e-01, rmse_e = 1.06e-03, rmse_f = 8.02e-02
[2026-01-01 09:30:58,477] DEEPMD INFO batch 6100: total wall time = 2.51 s
[2026-01-01 09:31:00,853] DEEPMD INFO batch 6200: trn: rmse = 2.16e-01, rmse_e = 2.39e-03, rmse_f = 8.10e-02, lr = 5.92e-06
[2026-01-01 09:31:00,853] DEEPMD INFO batch 6200: val: rmse = 2.00e-01, rmse_e = 7.14e-04, rmse_f = 7.60e-02
[2026-01-01 09:31:00,854] DEEPMD INFO batch 6200: total wall time = 2.38 s
[2026-01-01 09:31:03,239] DEEPMD INFO batch 6300: trn: rmse = 2.07e-01, rmse_e = 1.68e-03, rmse_f = 7.82e-02, lr = 5.92e-06
[2026-01-01 09:31:03,239] DEEPMD INFO batch 6300: val: rmse = 1.99e-01, rmse_e = 5.56e-04, rmse_f = 7.56e-02
[2026-01-01 09:31:03,239] DEEPMD INFO batch 6300: total wall time = 2.39 s
[2026-01-01 09:31:05,608] DEEPMD INFO batch 6400: trn: rmse = 2.02e-01, rmse_e = 3.39e-04, rmse_f = 7.70e-02, lr = 5.92e-06
[2026-01-01 09:31:05,608] DEEPMD INFO batch 6400: val: rmse = 2.10e-01, rmse_e = 1.29e-03, rmse_f = 7.94e-02
[2026-01-01 09:31:05,608] DEEPMD INFO batch 6400: total wall time = 2.37 s
[2026-01-01 09:31:07,965] DEEPMD INFO batch 6500: trn: rmse = 2.16e-01, rmse_e = 2.96e-03, rmse_f = 8.06e-02, lr = 5.92e-06
[2026-01-01 09:31:07,965] DEEPMD INFO batch 6500: val: rmse = 2.08e-01, rmse_e = 7.62e-04, rmse_f = 7.89e-02
[2026-01-01 09:31:07,965] DEEPMD INFO batch 6500: total wall time = 2.36 s
[2026-01-01 09:31:10,246] DEEPMD INFO batch 6600: trn: rmse = 2.33e-01, rmse_e = 5.48e-05, rmse_f = 8.87e-02, lr = 5.92e-06
[2026-01-01 09:31:10,247] DEEPMD INFO batch 6600: val: rmse = 1.95e-01, rmse_e = 7.60e-04, rmse_f = 7.41e-02
[2026-01-01 09:31:10,247] DEEPMD INFO batch 6600: total wall time = 2.28 s
[2026-01-01 09:31:12,546] DEEPMD INFO batch 6700: trn: rmse = 1.94e-01, rmse_e = 4.74e-05, rmse_f = 7.39e-02, lr = 5.92e-06
[2026-01-01 09:31:12,546] DEEPMD INFO batch 6700: val: rmse = 2.06e-01, rmse_e = 6.75e-04, rmse_f = 7.82e-02
[2026-01-01 09:31:12,547] DEEPMD INFO batch 6700: total wall time = 2.30 s
[2026-01-01 09:31:14,868] DEEPMD INFO batch 6800: trn: rmse = 1.85e-01, rmse_e = 6.07e-04, rmse_f = 7.01e-02, lr = 5.92e-06
[2026-01-01 09:31:14,868] DEEPMD INFO batch 6800: val: rmse = 2.07e-01, rmse_e = 2.38e-04, rmse_f = 7.88e-02
[2026-01-01 09:31:14,868] DEEPMD INFO batch 6800: total wall time = 2.32 s
[2026-01-01 09:31:17,168] DEEPMD INFO batch 6900: trn: rmse = 2.33e-01, rmse_e = 5.41e-04, rmse_f = 8.84e-02, lr = 5.92e-06
[2026-01-01 09:31:17,168] DEEPMD INFO batch 6900: val: rmse = 1.91e-01, rmse_e = 2.73e-04, rmse_f = 7.27e-02
[2026-01-01 09:31:17,168] DEEPMD INFO batch 6900: total wall time = 2.30 s
[2026-01-01 09:31:19,451] DEEPMD INFO batch 7000: trn: rmse = 2.09e-01, rmse_e = 3.88e-04, rmse_f = 7.95e-02, lr = 5.92e-06
[2026-01-01 09:31:19,451] DEEPMD INFO batch 7000: val: rmse = 2.03e-01, rmse_e = 6.59e-04, rmse_f = 7.70e-02
[2026-01-01 09:31:19,451] DEEPMD INFO batch 7000: total wall time = 2.28 s
[2026-01-01 09:31:19,559] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:31:21,861] DEEPMD INFO batch 7100: trn: rmse = 2.12e-01, rmse_e = 2.99e-04, rmse_f = 8.05e-02, lr = 5.92e-06
[2026-01-01 09:31:21,861] DEEPMD INFO batch 7100: val: rmse = 2.04e-01, rmse_e = 8.82e-04, rmse_f = 7.75e-02
[2026-01-01 09:31:21,861] DEEPMD INFO batch 7100: total wall time = 2.41 s
[2026-01-01 09:31:24,165] DEEPMD INFO batch 7200: trn: rmse = 2.04e-01, rmse_e = 9.94e-04, rmse_f = 7.72e-02, lr = 5.92e-06
[2026-01-01 09:31:24,165] DEEPMD INFO batch 7200: val: rmse = 2.06e-01, rmse_e = 6.69e-04, rmse_f = 7.83e-02
[2026-01-01 09:31:24,165] DEEPMD INFO batch 7200: total wall time = 2.30 s
[2026-01-01 09:31:26,487] DEEPMD INFO batch 7300: trn: rmse = 2.06e-01, rmse_e = 1.61e-04, rmse_f = 7.83e-02, lr = 5.92e-06
[2026-01-01 09:31:26,487] DEEPMD INFO batch 7300: val: rmse = 2.03e-01, rmse_e = 2.96e-04, rmse_f = 7.71e-02
[2026-01-01 09:31:26,488] DEEPMD INFO batch 7300: total wall time = 2.32 s
[2026-01-01 09:31:28,815] DEEPMD INFO batch 7400: trn: rmse = 1.71e-01, rmse_e = 1.48e-03, rmse_f = 6.45e-02, lr = 5.92e-06
[2026-01-01 09:31:28,816] DEEPMD INFO batch 7400: val: rmse = 2.03e-01, rmse_e = 4.59e-04, rmse_f = 7.73e-02
[2026-01-01 09:31:28,816] DEEPMD INFO batch 7400: total wall time = 2.33 s
[2026-01-01 09:31:31,186] DEEPMD INFO batch 7500: trn: rmse = 1.87e-01, rmse_e = 5.35e-05, rmse_f = 7.10e-02, lr = 5.92e-06
[2026-01-01 09:31:31,187] DEEPMD INFO batch 7500: val: rmse = 1.94e-01, rmse_e = 8.93e-04, rmse_f = 7.35e-02
[2026-01-01 09:31:31,187] DEEPMD INFO batch 7500: total wall time = 2.37 s
[2026-01-01 09:31:33,533] DEEPMD INFO batch 7600: trn: rmse = 2.03e-01, rmse_e = 5.59e-04, rmse_f = 7.73e-02, lr = 5.92e-06
[2026-01-01 09:31:33,533] DEEPMD INFO batch 7600: val: rmse = 2.07e-01, rmse_e = 4.89e-04, rmse_f = 7.87e-02
[2026-01-01 09:31:33,533] DEEPMD INFO batch 7600: total wall time = 2.35 s
[2026-01-01 09:31:35,890] DEEPMD INFO batch 7700: trn: rmse = 2.19e-01, rmse_e = 3.28e-04, rmse_f = 8.34e-02, lr = 5.92e-06
[2026-01-01 09:31:35,891] DEEPMD INFO batch 7700: val: rmse = 2.14e-01, rmse_e = 5.73e-04, rmse_f = 8.14e-02
[2026-01-01 09:31:35,891] DEEPMD INFO batch 7700: total wall time = 2.36 s
[2026-01-01 09:31:38,247] DEEPMD INFO batch 7800: trn: rmse = 1.76e-01, rmse_e = 3.36e-04, rmse_f = 6.70e-02, lr = 5.92e-06
[2026-01-01 09:31:38,247] DEEPMD INFO batch 7800: val: rmse = 1.90e-01, rmse_e = 4.81e-04, rmse_f = 7.21e-02
[2026-01-01 09:31:38,247] DEEPMD INFO batch 7800: total wall time = 2.36 s
[2026-01-01 09:31:40,632] DEEPMD INFO batch 7900: trn: rmse = 1.94e-01, rmse_e = 1.34e-03, rmse_f = 7.34e-02, lr = 5.92e-06
[2026-01-01 09:31:40,633] DEEPMD INFO batch 7900: val: rmse = 2.02e-01, rmse_e = 9.35e-04, rmse_f = 7.66e-02
[2026-01-01 09:31:40,633] DEEPMD INFO batch 7900: total wall time = 2.39 s
[2026-01-01 09:31:43,033] DEEPMD INFO batch 8000: trn: rmse = 2.15e-01, rmse_e = 2.62e-03, rmse_f = 8.06e-02, lr = 5.92e-06
[2026-01-01 09:31:43,034] DEEPMD INFO batch 8000: val: rmse = 2.18e-01, rmse_e = 5.23e-04, rmse_f = 8.28e-02
[2026-01-01 09:31:43,034] DEEPMD INFO batch 8000: total wall time = 2.40 s
[2026-01-01 09:31:43,147] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:31:45,564] DEEPMD INFO batch 8100: trn: rmse = 2.11e-01, rmse_e = 1.24e-03, rmse_f = 7.98e-02, lr = 5.92e-06
[2026-01-01 09:31:45,564] DEEPMD INFO batch 8100: val: rmse = 2.06e-01, rmse_e = 6.11e-04, rmse_f = 7.82e-02
[2026-01-01 09:31:45,564] DEEPMD INFO batch 8100: total wall time = 2.53 s
[2026-01-01 09:31:47,945] DEEPMD INFO batch 8200: trn: rmse = 1.88e-01, rmse_e = 9.40e-05, rmse_f = 7.14e-02, lr = 5.92e-06
[2026-01-01 09:31:47,945] DEEPMD INFO batch 8200: val: rmse = 1.98e-01, rmse_e = 9.48e-04, rmse_f = 7.51e-02
[2026-01-01 09:31:47,946] DEEPMD INFO batch 8200: total wall time = 2.38 s
[2026-01-01 09:31:50,309] DEEPMD INFO batch 8300: trn: rmse = 2.03e-01, rmse_e = 4.20e-04, rmse_f = 7.71e-02, lr = 5.92e-06
[2026-01-01 09:31:50,309] DEEPMD INFO batch 8300: val: rmse = 2.10e-01, rmse_e = 1.01e-03, rmse_f = 7.97e-02
[2026-01-01 09:31:50,309] DEEPMD INFO batch 8300: total wall time = 2.36 s
[2026-01-01 09:31:52,675] DEEPMD INFO batch 8400: trn: rmse = 2.21e-01, rmse_e = 1.72e-03, rmse_f = 8.35e-02, lr = 5.92e-06
[2026-01-01 09:31:52,676] DEEPMD INFO batch 8400: val: rmse = 1.97e-01, rmse_e = 5.57e-04, rmse_f = 7.48e-02
[2026-01-01 09:31:52,676] DEEPMD INFO batch 8400: total wall time = 2.37 s
[2026-01-01 09:31:55,045] DEEPMD INFO batch 8500: trn: rmse = 2.22e-01, rmse_e = 8.82e-04, rmse_f = 8.42e-02, lr = 5.92e-06
[2026-01-01 09:31:55,045] DEEPMD INFO batch 8500: val: rmse = 2.06e-01, rmse_e = 5.61e-04, rmse_f = 7.82e-02
[2026-01-01 09:31:55,045] DEEPMD INFO batch 8500: total wall time = 2.37 s
[2026-01-01 09:31:57,451] DEEPMD INFO batch 8600: trn: rmse = 1.91e-01, rmse_e = 2.88e-04, rmse_f = 7.25e-02, lr = 5.92e-06
[2026-01-01 09:31:57,452] DEEPMD INFO batch 8600: val: rmse = 1.98e-01, rmse_e = 1.01e-03, rmse_f = 7.51e-02
[2026-01-01 09:31:57,452] DEEPMD INFO batch 8600: total wall time = 2.41 s
[2026-01-01 09:31:59,826] DEEPMD INFO batch 8700: trn: rmse = 1.91e-01, rmse_e = 4.27e-04, rmse_f = 7.26e-02, lr = 5.92e-06
[2026-01-01 09:31:59,826] DEEPMD INFO batch 8700: val: rmse = 1.94e-01, rmse_e = 1.28e-03, rmse_f = 7.34e-02
[2026-01-01 09:31:59,826] DEEPMD INFO batch 8700: total wall time = 2.37 s
[2026-01-01 09:32:02,233] DEEPMD INFO batch 8800: trn: rmse = 1.83e-01, rmse_e = 1.69e-03, rmse_f = 6.89e-02, lr = 5.92e-06
[2026-01-01 09:32:02,234] DEEPMD INFO batch 8800: val: rmse = 2.07e-01, rmse_e = 1.31e-03, rmse_f = 7.82e-02
[2026-01-01 09:32:02,234] DEEPMD INFO batch 8800: total wall time = 2.41 s
[2026-01-01 09:32:04,611] DEEPMD INFO batch 8900: trn: rmse = 1.88e-01, rmse_e = 5.71e-04, rmse_f = 7.14e-02, lr = 5.92e-06
[2026-01-01 09:32:04,611] DEEPMD INFO batch 8900: val: rmse = 1.93e-01, rmse_e = 7.86e-04, rmse_f = 7.33e-02
[2026-01-01 09:32:04,611] DEEPMD INFO batch 8900: total wall time = 2.38 s
[2026-01-01 09:32:07,012] DEEPMD INFO batch 9000: trn: rmse = 1.82e-01, rmse_e = 1.14e-03, rmse_f = 6.89e-02, lr = 5.92e-06
[2026-01-01 09:32:07,013] DEEPMD INFO batch 9000: val: rmse = 1.97e-01, rmse_e = 4.63e-04, rmse_f = 7.49e-02
[2026-01-01 09:32:07,013] DEEPMD INFO batch 9000: total wall time = 2.40 s
[2026-01-01 09:32:07,141] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:32:09,553] DEEPMD INFO batch 9100: trn: rmse = 1.86e-01, rmse_e = 5.13e-04, rmse_f = 7.07e-02, lr = 5.92e-06
[2026-01-01 09:32:09,553] DEEPMD INFO batch 9100: val: rmse = 2.00e-01, rmse_e = 3.93e-04, rmse_f = 7.61e-02
[2026-01-01 09:32:09,554] DEEPMD INFO batch 9100: total wall time = 2.54 s
[2026-01-01 09:32:11,942] DEEPMD INFO batch 9200: trn: rmse = 1.91e-01, rmse_e = 6.75e-04, rmse_f = 7.26e-02, lr = 5.92e-06
[2026-01-01 09:32:11,942] DEEPMD INFO batch 9200: val: rmse = 2.04e-01, rmse_e = 6.81e-04, rmse_f = 7.76e-02
[2026-01-01 09:32:11,942] DEEPMD INFO batch 9200: total wall time = 2.39 s
[2026-01-01 09:32:14,240] DEEPMD INFO batch 9300: trn: rmse = 2.09e-01, rmse_e = 7.54e-04, rmse_f = 7.94e-02, lr = 5.92e-06
[2026-01-01 09:32:14,240] DEEPMD INFO batch 9300: val: rmse = 2.05e-01, rmse_e = 4.76e-04, rmse_f = 7.80e-02
[2026-01-01 09:32:14,240] DEEPMD INFO batch 9300: total wall time = 2.30 s
[2026-01-01 09:32:16,591] DEEPMD INFO batch 9400: trn: rmse = 1.83e-01, rmse_e = 3.04e-04, rmse_f = 6.96e-02, lr = 5.92e-06
[2026-01-01 09:32:16,592] DEEPMD INFO batch 9400: val: rmse = 2.19e-01, rmse_e = 4.24e-04, rmse_f = 8.32e-02
[2026-01-01 09:32:16,592] DEEPMD INFO batch 9400: total wall time = 2.35 s
[2026-01-01 09:32:18,975] DEEPMD INFO batch 9500: trn: rmse = 2.11e-01, rmse_e = 2.25e-04, rmse_f = 8.02e-02, lr = 5.92e-06
[2026-01-01 09:32:18,975] DEEPMD INFO batch 9500: val: rmse = 2.04e-01, rmse_e = 7.60e-04, rmse_f = 7.74e-02
[2026-01-01 09:32:18,975] DEEPMD INFO batch 9500: total wall time = 2.38 s
[2026-01-01 09:32:21,368] DEEPMD INFO batch 9600: trn: rmse = 2.05e-01, rmse_e = 2.92e-04, rmse_f = 7.80e-02, lr = 5.92e-06
[2026-01-01 09:32:21,368] DEEPMD INFO batch 9600: val: rmse = 1.97e-01, rmse_e = 9.62e-04, rmse_f = 7.46e-02
[2026-01-01 09:32:21,368] DEEPMD INFO batch 9600: total wall time = 2.39 s
[2026-01-01 09:32:23,781] DEEPMD INFO batch 9700: trn: rmse = 1.80e-01, rmse_e = 6.18e-04, rmse_f = 6.83e-02, lr = 5.92e-06
[2026-01-01 09:32:23,781] DEEPMD INFO batch 9700: val: rmse = 1.99e-01, rmse_e = 7.30e-04, rmse_f = 7.55e-02
[2026-01-01 09:32:23,781] DEEPMD INFO batch 9700: total wall time = 2.41 s
[2026-01-01 09:32:26,167] DEEPMD INFO batch 9800: trn: rmse = 1.84e-01, rmse_e = 1.82e-03, rmse_f = 6.93e-02, lr = 5.92e-06
[2026-01-01 09:32:26,167] DEEPMD INFO batch 9800: val: rmse = 1.94e-01, rmse_e = 1.21e-03, rmse_f = 7.34e-02
[2026-01-01 09:32:26,167] DEEPMD INFO batch 9800: total wall time = 2.39 s
[2026-01-01 09:32:28,546] DEEPMD INFO batch 9900: trn: rmse = 1.89e-01, rmse_e = 1.07e-03, rmse_f = 7.16e-02, lr = 5.92e-06
[2026-01-01 09:32:28,546] DEEPMD INFO batch 9900: val: rmse = 1.99e-01, rmse_e = 8.63e-04, rmse_f = 7.54e-02
[2026-01-01 09:32:28,546] DEEPMD INFO batch 9900: total wall time = 2.38 s
[2026-01-01 09:32:30,876] DEEPMD INFO batch 10000: trn: rmse = 8.22e-02, rmse_e = 1.08e-04, rmse_f = 8.08e-02, lr = 3.51e-08
[2026-01-01 09:32:30,877] DEEPMD INFO batch 10000: val: rmse = 7.92e-02, rmse_e = 8.88e-04, rmse_f = 7.67e-02
[2026-01-01 09:32:30,877] DEEPMD INFO batch 10000: total wall time = 2.33 s
[2026-01-01 09:32:30,997] DEEPMD INFO saved checkpoint model.ckpt
[2026-01-01 09:32:30,997] DEEPMD INFO average training time: 0.0230 s/batch (exclude first 100 batches)
[2026-01-01 09:32:30,998] DEEPMD INFO finished training
[2026-01-01 09:32:30,998] DEEPMD INFO wall time: 242.014 s
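Note on the last log entry: at batch 10000 the reported rmse drops from about 0.2 to 0.08 while rmse_f stays near 0.08. This looks like the effect of a staircase learning-rate decay combined with a force prefactor tied to the learning rate, not a sudden gain in accuracy. The sketch below reproduces the logged numbers under assumed hyperparameters (start_lr = 1e-3, stop_lr = 3.51e-8, decay_steps = 5000, force prefactor going from 1000 to 1). None of these values appear in this part of the log, so check them against input.json. The energy term of the loss is ignored, so the rmse estimate is only approximate.

```shell
#!/bin/sh
# Assumed hyperparameters (not shown in this log; chosen because they match the logged lr).
start_lr=1e-3
stop_lr=3.51e-8
decay_steps=5000
stop_batch=10000
start_pref_f=1000   # assumed force prefactor at the start of training
limit_pref_f=1      # assumed force prefactor at the end of training

# schedule BATCH RMSE_F prints "lr approx_rmse" where
#   lr(t)     = start_lr * decay_rate^floor(t / decay_steps)
#   pref_f(t) = limit_pref_f + (start_pref_f - limit_pref_f) * lr(t) / start_lr
#   rmse      ~ sqrt(pref_f) * rmse_f   (energy term neglected)
schedule() {
    awk -v t="$1" -v rmse_f="$2" -v lr0="$start_lr" -v lr1="$stop_lr" \
        -v dsteps="$decay_steps" -v nsteps="$stop_batch" \
        -v pf0="$start_pref_f" -v pf1="$limit_pref_f" 'BEGIN {
        rate   = exp(log(lr1 / lr0) * dsteps / nsteps)
        lr     = lr0 * rate ^ int(t / dsteps)
        pref_f = pf1 + (pf0 - pf1) * lr / lr0
        printf "%.2e %.2e\n", lr, sqrt(pref_f) * rmse_f
    }'
}

schedule 7000 7.95e-02    # trn rmse_f logged at batch 7000
schedule 10000 8.08e-02   # trn rmse_f logged at batch 10000
```

Under these assumptions the learning rate is constant between batches 5000 and 9999 and steps down once more at batch 10000, which is why lr reads 5.92e-06 throughout the lines above and 3.51e-08 on the final one. To compare model quality across batches, rmse_e and rmse_f are the more meaningful columns.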
Export the force field (freeze the model)
dp freeze -o graph.pb
dp freeze -o graph.pb
2026-01-01 09:39:54.029971: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-01-01 09:39:54.117534: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-01-01 09:39:56.230971: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/keras/src/export/tf2onnx_lib.py:8: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
if not hasattr(np, "object"):
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
2026-01-01 09:40:00.245490: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767260400.246672 509 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260400.288810 509 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
[2026-01-01 09:40:00,519] DEEPMD INFO The following nodes will be frozen: ['o_atom_energy', 'o_energy', 'descrpt_attr/ntypes', 'model_attr/model_version', 'train_attr/training_script', 'o_virial', 'descrpt_attr/rcut', 'train_attr/min_nbor_dist', 'fitting_attr/dfparam', 'o_atom_virial', 't_mesh', 'model_type', 'fitting_attr/daparam', 'model_attr/tmap', 'o_force', 'model_attr/model_type']
[2026-01-01 09:40:00,890] DEEPMD INFO 862 ops in the final graph.
(py312) root@908821a3377a:/workspace/test/01train#
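The log repeatedly recommends setting OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS and DP_INTER_OP_PARALLELISM_THREADS. A minimal sketch of doing so before calling dp is given below. The thread counts are illustrative assumptions, not tuned values; see https://deepmd.rtfd.io/parallelism/ for guidance.

```shell
#!/bin/sh
# Illustrative values only (assumption); tune them for your CPU.
export OMP_NUM_THREADS=6
export DP_INTRA_OP_PARALLELISM_THREADS=6
export DP_INTER_OP_PARALLELISM_THREADS=2

# Optional: the TensorFlow message in the log names this switch for
# turning off oneDNN custom operations.
export TF_ENABLE_ONEDNN_OPTS=0

echo "OMP=$OMP_NUM_THREADS intra=$DP_INTRA_OP_PARALLELISM_THREADS inter=$DP_INTER_OP_PARALLELISM_THREADS"
```

Because OMP_NUM_THREADS is also read by GROMACS, exporting it in the same shell may change the OpenMP thread count of the later mdrun step (12 threads in the run recorded here).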
Compress the model for faster inference
dp --tf compress -i graph.pb -o graph-compress.pb
dp --tf compress -i graph.pb -o graph-compress.pb
2026-01-01 09:45:32.427819: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-01-01 09:45:32.489055: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-01-01 09:45:34.511560: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/keras/src/export/tf2onnx_lib.py:8: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
if not hasattr(np, "object"):
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
2026-01-01 09:45:38.396558: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767260738.397889 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260738.429904 585 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
I0000 00:00:1767260738.461692 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260738.509365 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:38,598] DEEPMD INFO
[2026-01-01 09:45:38,599] DEEPMD INFO stage 1: compress the model
[2026-01-01 09:45:38,601] DEEPMD WARNING Switch to serial execution due to lack of horovod module.
[2026-01-01 09:45:43,805] DEEPMD INFO _____ _____ __ __ _____ _ _ _
[2026-01-01 09:45:43,805] DEEPMD INFO | __ \ | __ \ | \/ || __ \ | | (_)| |
[2026-01-01 09:45:43,805] DEEPMD INFO | | | | ___ ___ | |__) || \ / || | | | ______ | | __ _ | |_
[2026-01-01 09:45:43,805] DEEPMD INFO | | | | / _ \ / _ \| ___/ | |\/| || | | ||______|| |/ /| || __|
[2026-01-01 09:45:43,805] DEEPMD INFO | |__| || __/| __/| | | | | || |__| | | < | || |_
[2026-01-01 09:45:43,805] DEEPMD INFO |_____/ \___| \___||_| |_| |_||_____/ |_|\_\|_| \__|
[2026-01-01 09:45:43,806] DEEPMD INFO Please read and cite:
[2026-01-01 09:45:43,806] DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
[2026-01-01 09:45:43,806] DEEPMD INFO Zeng et al, J. Chem. Phys., 159, 054801 (2023)
[2026-01-01 09:45:43,806] DEEPMD INFO Zeng et al, J. Chem. Theory Comput., 21, 4375-4385 (2025)
[2026-01-01 09:45:43,806] DEEPMD INFO See https://deepmd.rtfd.io/credits/ for details.
[2026-01-01 09:45:43,806] DEEPMD INFO --------------------------------------------------------------------------------------------------------
[2026-01-01 09:45:43,806] DEEPMD INFO installed to: /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/deepmd
[2026-01-01 09:45:43,806] DEEPMD INFO source: v3.1.2-21-gb98f6c59-dirty
[2026-01-01 09:45:43,806] DEEPMD INFO source branch: devel
[2026-01-01 09:45:43,806] DEEPMD INFO source commit: b98f6c59
[2026-01-01 09:45:43,806] DEEPMD INFO source commit at: 2025-12-23 08:15:14 +0000
[2026-01-01 09:45:43,806] DEEPMD INFO use float prec: double
[2026-01-01 09:45:43,806] DEEPMD INFO build variant: cuda
[2026-01-01 09:45:43,806] DEEPMD INFO Backend: TensorFlow
[2026-01-01 09:45:43,806] DEEPMD INFO TF ver: v2.20.0-rc0-4-g72fbba3d20f
[2026-01-01 09:45:43,806] DEEPMD INFO build with TF ver: 2.20.0
[2026-01-01 09:45:43,806] DEEPMD INFO build with TF inc: /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:45:43,806] DEEPMD INFO /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include/
[2026-01-01 09:45:43,806] DEEPMD INFO build with TF lib:
[2026-01-01 09:45:43,806] DEEPMD INFO running on: 908821a3377a
[2026-01-01 09:45:43,806] DEEPMD INFO computing device: gpu:0
[2026-01-01 09:45:43,806] DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
[2026-01-01 09:45:43,806] DEEPMD INFO Count of visible GPUs: 1
[2026-01-01 09:45:43,806] DEEPMD INFO num_intra_threads: 0
[2026-01-01 09:45:43,806] DEEPMD INFO num_inter_threads: 0
[2026-01-01 09:45:43,806] DEEPMD INFO --------------------------------------------------------------------------------------------------------
I0000 00:00:1767260743.821566 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:43,822] DEEPMD INFO training without frame parameter
I0000 00:00:1767260743.931699 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260743.941946 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260743.994315 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:44,032] DEEPMD INFO training data with lower boundary: [-0.3598215 -0.38828511]
[2026-01-01 09:45:44,032] DEEPMD INFO training data with upper boundary: [7.69377098 8.70452294]
I0000 00:00:1767260744.353808 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260744.397141 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767260744.449946 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:44,502] DEEPMD INFO built lr
[2026-01-01 09:45:45,192] DEEPMD INFO built network
[2026-01-01 09:45:45,694] DEEPMD INFO built training
[2026-01-01 09:45:45,694] DEEPMD WARNING To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
I0000 00:00:1767260745.701164 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:45,733] DEEPMD INFO initialize model from scratch
[2026-01-01 09:45:46,386] DEEPMD INFO finished compressing
[2026-01-01 09:45:46,393] DEEPMD INFO
[2026-01-01 09:45:46,393] DEEPMD INFO stage 2: freeze the model
I0000 00:00:1767260746.672053 585 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10129 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
[2026-01-01 09:45:46,879] DEEPMD INFO The following nodes will be frozen: ['o_virial', 'fitting_attr/dfparam', 'o_energy', 'fitting_attr/daparam', 'o_atom_virial', 'train_attr/training_script', 'o_force', 'model_attr/model_version', 't_mesh', 'descrpt_attr/ntypes', 'descrpt_attr/rcut', 'model_attr/tmap', 'model_attr/model_type', 'o_atom_energy', 'model_type', 'train_attr/min_nbor_dist']
[2026-01-01 09:45:47,105] DEEPMD INFO 685 ops in the final graph.
(py312) root@908821a3377a:/workspace/test/01train#
Enter the folder
cd /workspace/test/02gmx/
Copy the force field over
cp ../01train/graph-compress.pb .
Then just run the script
./md.sh
./md.sh
gmx_mpi: Relink `/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_framework.so.2' with `/usr/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
DeePMD-kit: Successfully load libcudart.so.12
2026-01-01 09:50:43.052941: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
:-) GROMACS - gmx grompp, 2020.2-MODIFIED (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen
Par Bjelkmar Christian Blau Viacheslav Bolnykh Kevin Boyd
Aldert van Buuren Rudi van Drunen Anton Feenstra Alan Gray
Gerrit Groenhof Anca Hamuraru Vincent Hindriksen M. Eric Irrgang
Aleksei Iupinov Christoph Junghans Joe Jordan Dimitrios Karkoulis
Peter Kasson Jiri Kraus Carsten Kutzner Per Larsson
Justin A. Lemkul Viveca Lindahl Magnus Lundborg Erik Marklund
Pascal Merz Pieter Meulenhoff Teemu Murtola Szilard Pall
Sander Pronk Roland Schulz Michael Shirts Alexey Shvetsov
Alfons Sijbers Peter Tieleman Jon Vincent Teemu Virolainen
Christian Wennberg Maarten Wolf Artem Zhmurov
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2019, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
GROMACS: gmx grompp, version 2020.2-MODIFIED
Executable: /root/opt/gromacs/install/bin/gmx_mpi
Data prefix: /root/opt/gromacs/install
Working dir: /workspace/test/02gmx
Command line:
gmx_mpi grompp -f md.mdp -c water.gro -p water.top -o md.tpr -maxwarn 3
Ignoring obsolete mdp entry 'ns_type'
NOTE 1 [file md.mdp]:
leapfrog does not yet support Nose-Hoover chains, nhchainlength reset to 1
Setting the LD random seed to 1646275867
Generated 3 of the 3 non-bonded parameter combinations
Generating 1-4 interactions: fudge = 0.5
Generated 3 of the 3 1-4 parameter combinations
Excluding 2 bonded neighbours molecule type 'SOL'
Setting gen_seed to 1388784567
Velocities were taken from a Maxwell distribution at 300 K
NOTE 2 [file water.top, line 43]:
In moleculetype 'SOL' 3 atoms are not bound by a potential or constraint
to any other atom in the same moleculetype. Although technically this
might not cause issues in a simulation, this often means that the user
forgot to add a bond/potential/constraint or put multiple molecules in
the same moleculetype definition by mistake. Run with -v to get
information for each atom.
Analysing residue names:
There are: 256 Water residues
Number of degrees of freedom in T-Coupling group System is 2301.00
Determining Verlet buffer for a tolerance of 0.005 kJ/mol/ps at 298 K
Calculated rlist for 1x1 atom pair-list as 0.800 nm, buffer size 0.000 nm
Set rlist, assuming 4x4 atom pair-list, to 0.800 nm, buffer size 0.000 nm
Note that mdrun will redetermine rlist based on the actual pair-list setup
This run will generate roughly 1 Mb of data
There were 2 notes
Back Off! I just backed up md.tpr to ./#md.tpr.1#
GROMACS reminds you: "Don't pay any attention to what they write about you. Just measure it in inches." (Andy Warhol)
gmx_mpi: Relink `/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_framework.so.2' with `/usr/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
DeePMD-kit: Successfully load libcudart.so.12
2026-01-01 09:50:44.345835: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
:-) GROMACS - gmx mdrun, 2020.2-MODIFIED (-:
GROMACS: gmx mdrun, version 2020.2-MODIFIED
Executable: /root/opt/gromacs/install/bin/gmx_mpi
Data prefix: /root/opt/gromacs/install
Working dir: /workspace/test/02gmx
Command line:
gmx_mpi mdrun -deffnm md -gpu_id 0
Back Off! I just backed up md.log to ./#md.log.1#
Compiled SIMD: AVX2_256, but for this host/run AVX_512 might be better (see
log).
Reading file md.tpr, VERSION 2020.2-MODIFIED (single precision)
Changing nstlist from 10 to 100, rlist from 0.8 to 0.8
1 GPU selected for this run.
Mapping of GPU IDs to the 1 GPU task in the 1 rank on this node:
PP:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
Using 1 MPI process
Using 12 OpenMP threads
WARNING: There are no atom pairs for dispersion correction
Init deepmd plugin from: input.json
Setting lambda: 1
Setting pbc: 1
Number of atoms: 768
Begin Init Model: graph-compress.pb
DeePMD-kit WARNING: Environmental variable DP_INTRA_OP_PARALLELISM_THREADS is not set. Tune DP_INTRA_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable DP_INTER_OP_PARALLELISM_THREADS is not set. Tune DP_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
2026-01-01 09:50:46.847147: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1767261046.943202 760 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10729 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:00:03.0, compute capability: 8.6
I0000 00:00:1767261047.008275 760 mlir_graph_optimization_pass.cc:437] MLIR V1 optimization pass is not enabled
Successfully load model!
DeePMD-kit WARNING: Environmental variable DP_INTRA_OP_PARALLELISM_THREADS is not set. Tune DP_INTRA_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable DP_INTER_OP_PARALLELISM_THREADS is not set. Tune DP_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
installed to: /root/opt/deepmd/deepmd-kit/source/dp_gromacs
source: v3.1.2-21-gb98f6c59-dirty
source branch: devel
source commit: b98f6c59
source commit at: 2025-12-23 08:15:14 +0000
support model ver.: 1.1
build variant: cuda
build with tf inc: /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include;/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/include
build with tf lib: /usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_cc.so.2
set tf intra_op_parallelism_threads: 0
set tf inter_op_parallelism_threads: 0
Summary:
Atom map: O H
Successfully init plugin!
Back Off! I just backed up md.trr to ./#md.trr.1#
Back Off! I just backed up md.edr to ./#md.edr.1#
starting mdrun 'lw_256.pdb'
10000 steps, 5.0 ps.
Writing final coordinates.
Back Off! I just backed up md.gro to ./#md.gro.1#
Core t (s) Wall t (s) (%)
Time: 1221.531 101.794 1200.0
(ns/day) (hour/ns)
Performance: 4.244 5.655
GROMACS reminds you: "Philosophy of science is about as useful to scientists as ornithology is to birds." (Richard Feynman)
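The performance figures in the mdrun summary can be cross-checked by hand. The sketch below recomputes them from the step count, simulated time and timings printed above. It is a rough check only; GROMACS may count steps slightly differently internally, so small differences in the last digit are possible.

```shell
#!/bin/sh
# Numbers copied from the mdrun summary above.
steps=10000       # "10000 steps, 5.0 ps."
sim_ps=5.0
wall_s=101.794    # "Wall t (s)"
core_s=1221.531   # "Core t (s)"

perf() {
    awk -v steps="$steps" -v ps="$sim_ps" -v wall="$wall_s" -v core="$core_s" 'BEGIN {
        dt_fs   = ps / steps * 1000            # time step in fs
        ns_day  = (ps / 1000) / wall * 86400   # simulated ns per day of wall time
        hour_ns = 24 / ns_day                  # wall hours per simulated ns
        printf "dt=%.1f fs ns/day=%.3f hour/ns=%.3f core/wall=%.1f\n", dt_fs, ns_day, hour_ns, core / wall
    }'
}

perf
```

The time step works out to 0.5 fs, and the core/wall ratio of 12 matches the 12 OpenMP threads reported at startup (the 1200.0 % in the summary).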
gmx_mpi: Relink `/usr/local/miniconda3/envs/py312/lib/python3.12/site-packages/tensorflow/libtensorflow_framework.so.2' with `/usr/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
DeePMD-kit: Successfully load libcudart.so.12
2026-01-01 09:52:29.837566: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
:-) GROMACS - gmx rdf, 2020.2-MODIFIED (-:
GROMACS: gmx rdf, version 2020.2-MODIFIED
Executable: /root/opt/gromacs/install/bin/gmx_mpi
Data prefix: /root/opt/gromacs/install
Working dir: /workspace/test/02gmx
Command line:
gmx_mpi rdf -f md.trr -s md.tpr -o md_rdf.xvg -ref 'name OW' -sel 'name OW'
Reading file md.tpr, VERSION 2020.2-MODIFIED (single precision)
Reading file md.tpr, VERSION 2020.2-MODIFIED (single precision)
trr version: GMX_trn_file (single precision)
Last frame 100 time 5.000
Analyzed 101 frames, last time 5.000
Back Off! I just backed up md_rdf.xvg to ./#md_rdf.xvg.2#
GROMACS reminds you: "C is not a high-level language." (Brian Kernighan, C author)
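The contents of md.sh are not shown in this transcript. Judging from the three "Command line:" entries in the output above, it presumably runs grompp, mdrun and rdf in sequence. The following is a hypothetical reconstruction under that assumption; the run helper, the DRY_RUN switch and the md_pipeline function are illustrative names, and the GMX_DEEPMD_INPUT_JSON export is an assumption based on the log line "Init deepmd plugin from: input.json".

```shell
#!/bin/sh
# Hypothetical reconstruction of md.sh; the real script may differ.

# Assumed: how the DeePMD-kit GROMACS plugin is told where its settings file is.
export GMX_DEEPMD_INPUT_JSON=input.json

# With DRY_RUN=1 the command is printed instead of executed
# (quoting of arguments is not preserved in the printed form).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The three commands are copied from the "Command line:" entries in the log.
# Chaining with && stops the pipeline at the first failing step.
md_pipeline() {
    run gmx_mpi grompp -f md.mdp -c water.gro -p water.top -o md.tpr -maxwarn 3 &&
    run gmx_mpi mdrun -deffnm md -gpu_id 0 &&
    run gmx_mpi rdf -f md.trr -s md.tpr -o md_rdf.xvg -ref 'name OW' -sel 'name OW'
}

DRY_RUN=1
md_pipeline
```

Setting DRY_RUN=0 (or removing the assignment) runs the commands for real, which requires gmx_mpi built with the DeePMD-kit patch plus the md.mdp, water.gro, water.top and graph-compress.pb files in the working directory.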
