Hello everyone, I have a question about the training strategy for a model containing water and a slab (about 12 Angstrom of water and a 30 Angstrom thick slab). So far I have come up with two training schemes:
1. First I train the water part (delete the slab and confine the water in a box matching its thickness). Then I continue training on the slab part, starting from the water-trained ML_ABN. Finally I train the whole model starting from the latest ML_ABN. I would consider the result accurate once the final force RMSE is around 0.05 eV/Angstrom.
2. First I train the water part (delete the slab and confine the water in a box matching its thickness). Then I train the slab part from scratch, without the water-trained ML_ABN. After the slab training, I merge the ML_ABN files of water and slab. Finally I train the whole model starting from this merged ML_ABN. Again, I would consider the result accurate once the final force RMSE is around 0.05 eV/Angstrom.
So which scheme is recommended, and which one is faster? In addition, I read the article Jinnouchi, R., Lahnsteiner, J., Karsai, F., Kresse, G. & Bokdam, M. Phase Transitions of Hybrid Perovskites Simulated by Machine-Learning Force Fields Trained on the Fly with Bayesian Inference. Phys. Rev. Lett. 122, 225701 (2019). In this article, the hybrid perovskite MAPbI3 was trained in the NVT ensemble. Does this mean that for other hybrid perovskites such as FAPbI3 it is also better to train in the NVT ensemble rather than the NPT ensemble?
How to select training mode?
-
- Newbie
- Posts: 25
- Joined: Wed Jul 20, 2022 7:18 am
Re: How to select training mode?
In fact, the above question comes from this passage in https://www.vasp.at/wiki/index.php/Best ... rce_fields:
"When the system contains different components, train them separately first. For instance, when the system has a surface of a crystal and a molecule binding on that surface. First, train the bulk crystal, then the surface, next the isolated molecule, and finally the entire system. This way a significant amount of ab-initio calculation can be saved in the computationally most expensive combined system."
What I want to know is: should the training of the isolated molecule continue from the last ML_ABN obtained from training the surface?
-
- Global Moderator
- Posts: 215
- Joined: Fri Jul 01, 2022 2:17 pm
Re: How to select training mode?
Dear jun_yin2,
In principle, you want to train a combined system consisting of water molecules and a slab, which I guess is some bulk material.
The recommended way of doing this is to first train the slab without the water molecules.
Then train the water molecules without the slab.
Then combine the two ML_AB files and train the entire system. This is also what the VASP wiki means with:
"When the system contains different components, train them separately first. For instance, when the system has a surface of a crystal and a molecule binding on that surface. First, train the bulk crystal, then the surface, potentially the isolated molecule, and finally the entire system (if you have no need to describe the isolated molecule you may skip training on the molecule). This way a significant amount of ab-initio calculation can be saved in the computationally most expensive combined system."
As you can see, the VASP wiki also mentions that if you do not need to describe the isolated water molecules, you can skip the training on them.
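The hand-over between stages could be scripted roughly as follows. This is only a minimal sketch, assuming each stage runs in its own directory and that a stage is continued by copying the ML_ABN it wrote to ML_AB in the next stage's directory, so that the data set grows from stage to stage. The directory names and the helper `carry_over` are hypothetical, and launching VASP itself is left out.

```python
import shutil
from pathlib import Path


def carry_over(prev_stage, next_stage):
    """Copy ML_ABN of a finished stage to ML_AB of the next stage.

    Both arguments are directories (hypothetical layout: one directory
    per training stage). Returns the path of the new ML_AB file.
    """
    prev_stage, next_stage = Path(prev_stage), Path(next_stage)
    source = prev_stage / "ML_ABN"
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found; did the stage finish?")
    next_stage.mkdir(parents=True, exist_ok=True)
    target = next_stage / "ML_AB"
    shutil.copyfile(source, target)
    return target


# Intended order (directory names are placeholders):
#   run VASP in 01_slab, then carry_over("01_slab", "02_water"),
#   run VASP in 02_water, then carry_over("02_water", "03_slab_plus_water"),
#   run VASP in 03_slab_plus_water.
```

Running the stages one after another like this is one way to end up with a single data set that covers both subsystems before the combined system is trained.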
After you have finished this procedure and obtained an ML_AB file, you should switch to ML_MODE = REFIT (see wiki/index.php/ML_MODE).
It is recommended to repeat the refit several times with different hyperparameters. This way you can optimize the hyperparameters and obtain the most suitable fit to your data. The quality of the fit can be read from the lines starting with ERR in the ML_LOGFILE.
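To compare refits, the ERR lines can be pulled out of the ML_LOGFILE with a few lines of Python. This is a sketch under an assumption about the layout: each data line is taken to be `ERR nstep rmse_energy rmse_force rmse_stress`, with header lines starting with `# ERR`. Please check the header in your own ML_LOGFILE before relying on the column order. The function name `read_err_lines` is hypothetical, and the numbers in the usage example are made up.

```python
def read_err_lines(text):
    """Return a list of (nstep, rmse_energy, rmse_force, rmse_stress).

    Assumed layout (verify against the '# ERR' header of your file):
    ERR <nstep> <rmse_energy> <rmse_force> <rmse_stress>
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data lines start with "ERR"; header lines start with "# ERR"
        # and are skipped because their first field is "#".
        if len(fields) != 5 or fields[0] != "ERR":
            continue
        nstep = int(fields[1])
        rmse_energy, rmse_force, rmse_stress = (float(x) for x in fields[2:])
        rows.append((nstep, rmse_energy, rmse_force, rmse_stress))
    return rows


# Usage with made-up numbers (not real VASP output):
sample = """# ERR nstep rmse_energy rmse_force rmse_stress
ERR 10 1.0E-03 5.0E-02 2.0E+00
"""
rows = read_err_lines(sample)
last_force_rmse = rows[-1][2]
```

For a real run, one would pass `open("ML_LOGFILE").read()` instead of `sample` and compare the last force RMSE between refits.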
The recommended hyper-parameters to optimize are
ML_RCUT2 and ML_RCUT1
ML_MRB1 and ML_MRB2
ML_LMAX2
ML_EPS_LOW
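A small scan over some of these tags could be prepared as follows. This is a minimal sketch: it only generates the ML-related part of an INCAR for each combination, the trial values are placeholders and not recommendations, and the helper name `refit_incar_fragments` is hypothetical. ML_MRB1, ML_MRB2 and ML_EPS_LOW can be added to the grid in the same way.

```python
from itertools import product

# Placeholder trial values, NOT recommendations; choose your own.
GRID = {
    "ML_RCUT1": [6.0, 8.0],
    "ML_RCUT2": [4.0, 5.0],
    "ML_LMAX2": [2, 3],
}


def refit_incar_fragments(grid):
    """Return one INCAR fragment (as a string) per tag combination."""
    tags = list(grid)
    fragments = []
    for values in product(*(grid[tag] for tag in tags)):
        lines = ["ML_LMLFF = .TRUE.", "ML_MODE = REFIT"]
        lines += [f"{tag} = {value}" for tag, value in zip(tags, values)]
        fragments.append("\n".join(lines))
    return fragments


fragments = refit_incar_fragments(GRID)
```

Each fragment would be appended to the rest of the INCAR in a separate directory containing the same ML_AB, and the ERR lines of the resulting ML_LOGFILE files compared.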
I hope this helps and answers your question.
Otherwise, don't hesitate to contact us again.
All the best, Jonathan