2-1. The Adaptive Force-Matching Method
The Adaptive Force-Matching Method (AFMM) is an automated iterative process of six steps aiming to render a force field parameter set derived from ab initio level calculations. Every six steps is grouped as a
“generation”, and every generation observes its own associated force field while providing the next generation a new, ideally improved, force field.
With this work, our goal is to model water with a flexible single point charge model via AFMM.
2-1-1. Step 1: Force Field Validation
A force field model and parameter set is provided and checked, resulting in a new generation and possibly halting the circular process with completion declared.
At the very beginning, an initial guess force field is manually provided and no automated validation is needed. Said initial guesses can likely be acquired from previous literature, though an educated guess should also suffice.
If the parameters provided came from the last generation, then its convergence with the parameters of previous generations will be checked.
10 Upon convergence, AFMM is finished and the latest force field, the final generation, is declared as the result; otherwise the provided force field will be used in the now-current generation.
With this work, we have employed Wang Feng’s initial-guess 𝐶 are respectively the Lennard-Jones repulsion and attraction constant for oxygen, 𝑞𝑚𝑞𝑛
𝑟𝑚𝑛 is the columbic energy term, and 𝑟𝑖, 𝜃𝑗, 𝑟𝑘, and 𝑟𝑚𝑛 are respectively the bond length, bond angle, oxygen-oxygen intermolecular distance, and partial charge distance arguments.
In this work, the halting is manually initiated and executed, and is based on the preliminary observation of convergence in the radial distribution function across each generation.
2-1-2. Step 2: Molecular Dynamic Simulation
A molecular dynamics simulation of the system in question is run with the current generation force field. In doing so a thermodynamic population of possible configuration states of the system is generated.
This work uses the TINKER molecular modeling software for its
11 molecular dynamic calculations, [8] The MD model is a 18.6445 Å periodic water cube with 216 water molecules, all inter molecular forces is cut-off at 9.0Å . Lennard-Jones potential and Ewald Summation [9] are both utilized.
The epsilon combination rule for Lennard Jones potential is geometric, meaning Oxygen-Hydrogen interactions will not include van der Waals forces. All simulation is done under canonical ensemble with Nosé-Hoover thermostat [10] (tau=0.2) set at 300K. Each simulation lasts 40 simulated picoseconds with a time-step of 0.5 femtosecond, structures are saved every femtosecond.
2-1-3. Step 3: Snapshot Sampling.
With the molecular dynamic trajectory generated from the previous step, one can provide statistically significant samples of snapshots.
Additionally, each time frame can provide multiple samples by focusing on different molecules. As step 4 involves QM/MM calculation, molecules “in focus” will be calculated ab initio, while those “out of focus”
but not “out of frame” will be handled by molecular dynamic calculation using the force field of the current generation; which is why the former is dubbed the QM region, and the latter MM region. Further details on the QM/MM model will be covered in the next section
In this work, we picked the very last structural frame of the MD trajectory created from the previous step as the basis for all of the samples
12 of that generation, with 96 different QM region chosen from it. Our QM region has 7 water molecules, which was deemed sufficient by Wang Feng after special weighing (covered in step 5), and contains 6 closest water molecule to the central water molecule, or the focal point, of that QE region.
In every generation the 96 water molecule closest to an arbitrary origin is chosen to be the focal points of their corresponding QM Region. All water molecules not included in the QM region is included in the MM region.
2-1-4. Step 4: QM/MM Force Calculations.
The provided samples of snapshots will be calculated and the forces applied to the QM region atoms is recorded. Because the calculation of each sample is not dependent on each other, this step can be perfectly parallelized.
In this work, Gaussian 09 is used for QM/MM force calculations, with the QM part calculated with MP2 [11]/AUG-cc-pVDZ [12] [13], and the MM part with the force field of current generation along with electronic embedding.
2-1-5. Step 5: Data Weighting
The parallelized computational results of the previous step is assembled into a collection of ab initio forces linked to relevant configurational snapshot. Atoms deemed to be more or less representative
13 of the system will be weighted accordingly.
In this work we utilize Wang Feng’s weighing rule, [4] where a water molecule’s weight is dependent on how many of its first solvation shell neighbors are included in the QM region. If there is three or four such neighbor included in the QM region, then the relevant water molecule will be weighted as standard. With only two neighbors, the data is half-discounted in weighting. With one or less the molecule is completely ignored.
2-1-6. Step 6: Force Matching
Finally, the force data from step 4 will be used as the target for the new force fields’ re-parameterization via fitting algorithms, using the relevant configurational snapshot (as chosen in step 3 and employed in step 4) as context, and with considerations from the weighting of step 5. Once the fitting is done, a new set of force field is created, step 1 is revisited, and in the process starting a new generation.
In this work we employ a non-linear fitting process, L-BFGS-B, provided by the Python module SciPy and NumPy.
14 The objective function has three terms, each covers atomic forces, inter-molecular forces, and inter-molecular torque.
= √∑ 𝑤
𝛼|
𝑓𝛼𝑎𝑖−𝑓𝛼𝑓𝑖𝑡𝑓𝛼𝑎𝑖
|
2/ ∑ 𝑤
𝛼𝑛
𝛼+ √∑ 𝑤
𝑖|
𝑓𝑖𝑎𝑖−𝑓𝑖𝑓𝑖𝑡𝑓𝑖𝑎𝑖
|
2/ ∑ 𝑤
𝑖𝑛
𝑖+
√∑ 𝑢
𝑖|
𝜏𝑖𝑎𝑖−𝜏𝑖𝑓𝑖𝑡𝜏𝑖𝑎𝑖
|
2/ ∑ 𝑢
𝑖𝑛
𝑖𝑤𝛼, 𝑤𝑖, 𝑢𝑖 is the weight assigned from Step 5, 𝑛𝛼 and 𝑛𝑖 are the amount of atoms and molecules considered.
The input data is collect from the previous 4 generations of configurations and ab initio calculations, this is done to give the algorithm some inertia, and to prevent extreme fluctuation in parameter values across generations.
2-2. Radial Distribution Function
A Radial Distribution Function (RDF) of a configuration describes the probability of a specific type of atom being found around another type of atom. RDF is frequently used to represent the structural property of fluids, and thus we employ it to validate the force field of this work. An RDF can be easily extracted from molecular dynamic simulations by integrating data over the duration of a simulation of a fluid after equilibrium is reached.
RDF can also be interpreted from physical experiments such as X-ray or
15 neutron scattering, though the operating principles of these methods is beyond this work.
On the subject of RDFs of water, it is a topic of study for over a couple of decades, [14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32]
and during this period of time, the pinnacle of physical experiments was conducted and released back in 2000 by A. K. Soper. [33] A more recent revision with updated methodology, however, has prompt A.K. Soper to publish a rather different RDF results in 2013, and has accordingly and aptly asked the following question in the subtitle of the paper: “Is There Anything We Can Say for Sure?” [34]
It is one of this work’s aim to contribute on this topic by providing additional certainty in the structure of water, via an updated ab initio method, and if not immediately and directly, eventually and indirectly as a stepping stone through some of its innovation.
2-3. Potential of Mean Force
This work employs one final in silico experiment in the form of potential of mean force (PMF) experiments. This type of experiment is designed to find the free energy required for a selected group of atoms to go through a specific path, spatial or chemical. This is done through the summation or integration of the mean forces the subject is influenced by, throughout different points of its trajectory. As integration of force over
16 distance is work, assuming the absence of pressure-volume and other forms of work, the work we’ve calculated is the Gibbs/Helmholtz free energy.
(Note that since pressure-volume work is the sole difference among Gibbs and Helmholtz free energy, they are indistinguishable here.)
In this work, the force field acquired through AFMM is also tested by PMF experiments. Among the bulk of water of 729 molecules is a sole water molecule affixed to a specific point among a leaving trajectory. This 730th water molecule is referred to as the probing molecule, and for every single ångström that it moves on its trajectory, each of the configuration underwent its own MD simulation, which provides the mean force data needed to conduct PMF. All of the simulations was down with 2fs-timesteps over 400ns under canonical ensemble, with 730 water molecules in periodic boundary condition of (27.96675 Å x83.90025 Å x27.96675 Å), at a temperature of 300K.
To physically constrain a water molecule through a pre-planned path and make measurements is beyond the current capabilities of contemporary science, so this experiment can only be conducted in silico. Nonetheless high level ab initio calculations of PMF concerning water do already exist, and certain physical experiments, such as solvation experiments, can be used to cross-reference the calculated results.
17