3.4. Force Field Training Targets

A number of different targets can be used for training force fields in the ACT. Below, we mention the most important ones including, where needed, the technical details necessary to appreciate the methods. For information on obtaining or generating training data we refer to Section Training Data.

Computing interaction energies and components of interaction energies is a crucial part of force field development. In the ACT we have hitherto used data from symmetry-adapted perturbation theory (SAPT) calculations [107, 108]. It is not entirely trivial to match the energy components from SAPT to force field terms, however.

3.4.1. Algorithm to compute energy components in ACT

The component of the interaction energy of a dimer can be computed from the difference between dimer and monomers \(A\) and \(B\) [109]:

(3.54)\[E_x^{inter}(AB) = E_x^{total}(AB) - \left(E_x(A) + E_x(B)\right)\]

where \(x\) is exchange or dispersion and \(total\) indicates that the energy includes both the intra- and intermolecular interactions. To compute the electrostatics or induction energy of a dimer in the gas phase, the ACT first computes the relaxed energy of the two monomers \(A\) and \(B\), that is, the energy of the shell particles is minimized with respect to their positions, yielding \(E_x(A)\) and \(E_x(B)\). The rationale for this is that SAPT computes electrostatic energies between unperturbed monomers based on the response/relaxation of monomer Hartree-Fock (HF) orbitals in the electric field of the interacting partner. Then, the energies of the dimer \(AB\) are computed in three steps:

  1. electrostatics is computed with shells located in the relaxed monomer positions, yielding \(E_{elec}^{total}(AB)\),

  2. the shells of compound \(A\) are allowed to relax (further) in the electric field from compound \(B\), while shells of compound \(B\) remain at their monomer positions, and vice versa, yielding the second order relaxation \(E_{induc}^{inter(2)}\) (see ref. [61] for details),

  3. the shells are allowed to relax completely, yielding the total \(E_{induc}\) from which the higher order terms, named \(E_{induc}^{inter(3)}\) here for convenience, can be derived by subtracting \(E_{induc}^{inter(2)}\). According to ref. [61], parameters of models corresponding to the higher other terms, including, potentially, charge transfer, can be trained to the \(\delta_{HF}\) contribution of the SAPT induction energy. Here, we have added the exponential term proposed by McDaniel and Schmidt for this purpose (section Induction correction).

The terms below can be compared directly to SAPT:

(3.55)\[E_{elec} ^{inter}(AB) = E_{elec}^{total}(AB) - \left(E_{ei}(A) + E_{ei}(B)\right)\]
(3.56)\[E_{induc}^{inter(2)}(AB) = \left(E^{total}_{ei}\|_A + E^{total}_{ei}\|_B\right)-2E_{elec} ^{total}(AB)\]
(3.57)\[E_{induc}^{inter(3)}(AB) = E_{ei}^{total}(AB) - E_{induc}^{(2)}(AB) - E_{elec}^{total}(AB)\]

where \(ei\) is short for \(elec+induc\) and the notation \(\|_X\) indicates that the shells of compound \(X\) are kept fixed in the relaxed monomer conformation. If we sum Eqns. (3.55)- (3.57) we recover Eqn. (3.54) where \(x\) equals \(ei\). Eqn. (3.55) corresponds to the electrostatics in SAPT.

To summarize, ACT computes the induction term in two parts: a second order term \(E_{induc}^{inter(2)}(AB)\) (Eqn. (3.56)) and a third-and-higher-order term \(E_{induc}^{inter(3)}(AB)\) (Eqn. (3.57)), the sum of which corresponds to the total induction from SAPT.

McDaniel and Schmidt [61] proposed that Eqn. (3.57) should be equal to the \(\delta_{HF}\) + \(\delta_{MP2}\) terms from SAPT while Eqn. (3.56) would correspond to the polarization energy. They then continue to suggest how this can be implemented in a force field:

\[E_{induc} ~=~ E_{shell} + E_{pol} + E_{\delta HF}\]

where \(E_{shell}\) is Eqn. (3.8) and

\[E_{pol} ~=~ A^{ind}_{ij} {\rm exp}(-b_{ij} r_{ij})\]

where \(r_{ij}\) is the interatomic distance and \(b_{ij}\) a constant, and

\[E_{\delta HF} ~=~ A^{\delta HF}_{ij} {\rm exp}(-b_{ij} r_{ij})\]

where both \(A^{ind}_{ij}\) and \(A^{\delta_{HF}}_{ij}\) are determined from a negative geometric combination rule

\[A_{ij} ~=~ -\sqrt{A_i A_j}\]

meaning these terms are always attractive. These potentials use the \(b_{ij}\) that is used in the exchange energy (using a Buckingham potential, Eqn. (3.13)). Whether this is the most appropriate way of splitting terms and reproducing SAPT energies remains to be determined.

In the output the ACT training module alexandria train_ff there are two terms related to induction. The term INDUCTIONCORRECTION refers to Eqn. (3.57). If that is present, the term INDUCTION refers to Eqn. (3.56), if not it refers to the sum of the two.

3.4.2. Monomer Energies and Forces

Typically, a series of single-point quantum calculations are done at a user-defined level of theory. These calculations can then be converted to a molprop file. More information to come.

3.4.3. Other Properties

In principle, all the molecular properties mentioned in Section Predicting Molecular Properties can be used for training, but it is highly recommended to leave some properties for validation.

3.5. Parameters for Force Field Training

Each potential available in the ACT has its own set of parameters. Those are listed in Table 3.4 for non-bonded parameters, Table 3.5 for bonded parameters. Parameters for virtual sites are given in Table 3.6 and for charge generation algorithms in Table 3.7. In total, well over 80 parameter categories can be trained using the ACT.

Table 3.4 Force field parameters related to non-bonded potentials that can be trained using ACT. Note that the parameter names are case-sensitive and, in some cases, they correspond to a Greek symbol in the equation.

Potential

Equation

Parameters

COULOMB_GAUSSIAN

(3.2)

zeta

COULOMB_SLATER

(3.5)

zeta

POLARIZATION

(3.8)

alpha

MACDANIEL_SCHMIDT

(3.9)

a1dexp bdexp

LJ14_7

(3.10)

sigma epsilon gamma delta

GENERALIZED_BUCKINGHAM

(3.11)

rmin epsilon gamma delta

WANG_BUCKINGHAM

(3.12)

sigma epsilon gamma

BUCKINGHAM

(3.13)

A b C

TANG_TOENNIES

(3.15)

Att btt c6tt c8tt c10tt

TT2

(3.16)

Att2b bExchtt2b c6tt2b c8tt2b c10tt2b

SLATER_ISA

(3.17)

A b

LJ12_6

(3.18)

sigma epsilon

LJ12_6_4

(3.19)

sigma epsilon gamma

BORN_MAYER

(3.20)

A b

Generalized mean

(3.36)

exponent

Table 3.5 Force field parameters related to bonded potentials that can be trained using ACT. Note that the parameter names are case-sensitive and, in some cases, they correspond to a Greek symbol in the equation.

Potential

Equation

Parameters

HARMONIC_BONDS

(3.37)

kb bondlength bondenergy

MORSE_BONDS

(3.38)

beta De bondlength D0

HUA_BONDS

(3.39)

De bondlength b c

HARMONIC_ANGLES

(3.40)

kt angle

LINEAR_ANGLES

(3.41)

klin a

HARMONIC_DIHEDRALS

(3.44)

kimp

FOURIER_DIHEDRALS

(3.45)

c0 c1 c2 c3 c4 c5

PROPER_DIHEDRALS

(3.46)

kp mult phi0

Table 3.6 Parameters related to virtual sites that can be trained using ACT. Note that the parameter names are case-sensitive. Equations will be added later.

Virtual Site

Equation

Parameters

VSITE1

vs1a

VSITE2

vs2a

VSITE2FD

vs2fd_a

VSITE3

vs3a vs3b

VSITE3S

vs3a

VSITE3FD

vs3fd_a vs3fd_b

VSITE3FAD

vs3fad_a vs3fad_b

VSITE3OUT

vs3out_a vs3out_b vs3out_c

VSITE4

vs4a vs4b vs4c

VSITE4S

vs4sa vs4sb

VSITE4S3

vs4s3a

Table 3.7 Parameters related to charge algorithms that can be trained using ACT. Note that the parameter names are case-sensitive. Equations will be added later.

Algorithm

Equation

Parameters

EEM

(3.50)

eta chi

SQE

(3.51)

eta chi delta_eta delta_chi