Long-range corrected hybrid meta-generalized-gradient approximations with dispersion corrections


You-Sheng Lin,¹ Chen-Wei Tsai,¹ Guan-De Li,¹ and Jeng-Da Chai¹,²,a)

¹Department of Physics, National Taiwan University, Taipei 10617, Taiwan

²Center for Theoretical Sciences and Center for Quantum Science and Engineering, National Taiwan University, Taipei 10617, Taiwan

(Received 9 January 2012; accepted 2 April 2012; published online 20 April 2012)

We propose a long-range corrected hybrid meta-generalized-gradient approximation (GGA) functional, based on a global hybrid meta-GGA functional, M05 [Y. Zhao, N. E. Schultz, and D. G. Truhlar, J. Chem. Phys. 123, 161103 (2005)], and empirical atom-atom dispersion corrections. Our resulting functional, ωM05-D, is shown to be accurate for a very wide range of applications, such as thermochemistry, kinetics, noncovalent interactions, equilibrium geometries, frontier orbital energies, fundamental gaps, and excitation energies. In addition, we present three new databases, IP131 (131 ionization potentials), EA115 (115 electron affinities), and FG115 (115 fundamental gaps), consisting of experimental molecular geometries and accurate reference values, which will be useful in the assessment of the accuracy of density functional approximations. © 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.4704370]

I. INTRODUCTION

Because of its satisfactory accuracy and modest cost in many applications, Kohn-Sham density functional theory (KS-DFT) (Refs. 1 and 2) has become one of the most popular electronic structure methods for large ground-state systems.^3–6 Its extension for treating excited-state systems, time-dependent density functional theory (TDDFT),^7,8 has also been widely used.

The crucial ingredient of KS-DFT, the exact exchange-correlation (XC) energy functional E_xc[ρ], however, remains unknown and needs to be approximated. Functionals based on the local density approximation (LDA), which model the XC energy density locally with that of a uniform electron gas (UEG), have been quite successful for nearly-free-electron systems,^3,4 though they are still insufficiently accurate for most quantum chemical applications. Functionals based on the generalized gradient approximations (GGAs), which additionally incorporate the gradient of the local density into the LDA, have achieved reasonable accuracy in many applications. The meta-GGA (MGGA) arises quite naturally as an extension of the GGA to a still rather restricted set of density variables. Functionals depending directly on the Laplacian of the density have not been pursued intensively, because of the difficulty of numerical evaluation. MGGAs, which adopt the kinetic energy density as a substitute for the Laplacian, have shown evidence of superiority over GGAs.^9–11

However, the LDA, GGAs, and MGGAs (commonly denoted as DFAs, for density functional approximations) are based on localized model XC holes, while the exact XC hole should be fully nonlocal. Currently, perhaps the most successful approaches to taking into account the nonlocality of the XC hole are provided by hybrid DFT methods, which incorporate a fraction of the exact Hartree-Fock (HF) exchange into the DFAs. Hybrid density functionals have achieved remarkable accuracy and have expanded the usefulness of DFT for many applications. Notably, global hybrid MGGA functionals,^12–20 where the XC energy density depends on the local density, the gradient of the local density, a fraction of exact exchange, as well as the exact KS kinetic energy density (a function of the occupied KS orbitals),^21–24 have been shown to potentially perform better than global hybrid GGA functionals,^15–20,25,26 due to the additional ingredient of kinetic energy density in global hybrid MGGA functionals.

a)Author to whom correspondence should be addressed. Electronic mail: [email protected].

In global hybrid functionals, a small fraction of the exact HF exchange is added to a semilocal density functional. In certain situations, especially in the asymptotic regions of molecular systems, a large fraction (even 100%) of HF exchange is needed. Aiming to remedy this, long-range corrected (LC) hybrid DFT schemes have been actively developed.^27–37 LC hybrids retain the full HF exchange only for the long-range electron-electron interactions, and thereby resolve a significant part of the self-interaction problems associated with global hybrid functionals.

On the other hand, the development of accurate short-range (SR) exchange density functionals E_x^SR[ρ] plays an important role in the progress of LC-DFT. In the first LC scheme, an ansatz for the conversion of any E_x to E_x^SR was proposed by Iikura et al.,²⁷ and has become widely used. However, their resulting LC hybrid GGA functionals do not outperform the corresponding global hybrid GGA functionals for thermochemistry. In 2006, Vydrov et al. proposed a different LC scheme,³¹ based on integrating a GGA model exchange hole. Their resulting LC-ωPBE functional has shown improved performance for thermochemistry and barrier heights, and is comparable to global hybrid GGA functionals such as B3LYP.^38,39 However, further improvements following this direction require the development of more accurate model exchange holes, which is a quite challenging task.


Another approach to more accurate LC hybrid functionals was proposed by Chai and Head-Gordon.³⁵ First, augmenting the SR local spin density exchange energy density with a flexible enhancement factor (of Becke's 1997 form⁴⁰) and fully reoptimizing the LC functional on a diverse training set yields the ωB97 functional. Second, including an adjustable fraction of SR HF exchange in the ωB97 functional, with a similar reoptimization procedure, leads to the ωB97X functional. ωB97 and ωB97X have been shown to be accurate across a diverse set of test data, covering thermochemistry, kinetics, and noncovalent interactions.³⁵

However, problems associated with the lack of nonlocality of the DFA correlation hole, such as the absence of dispersion interactions (the missing van der Waals forces), are not resolved by the LC hybrid schemes. The correlation functionals in typical LC hybrids are treated semilocally, and therefore cannot capture the long-range (LR) correlation effects.^41,42 To remedy this, the DFT-D scheme⁴³ was applied to extend the ωB97X functional with damped atom-atom dispersion corrections, denoted as ωB97X-D.³⁶ Consequently, ωB97X-D can capture dispersion effects at essentially zero additional computational cost relative to ωB97X. As an alternative approach, ωB97X has also been combined with the double-hybrid methods,^44–48 which mix both the HF exchange and the nonlocal orbital correlation energy from the second-order perturbation energy expression in wave function theory. The resulting ωB97X-2 functional³⁷ has yielded very high accuracy for thermochemistry, kinetics, and noncovalent interactions, though its fifth-order scaling with respect to system size may limit its applicability to larger systems.

As the ωB97 series are LC hybrid GGAs, it seems a natural step to develop LC hybrid MGGAs and to assess their performance. In this work, we propose a new LC hybrid MGGA-D functional, denoted as ωM05-D, which is shown to be accurate for a wide range of applications when compared with two closely related functionals: a global hybrid MGGA functional (M05-2X) (Ref. 19) and a LC hybrid GGA-D functional (ωB97X-D).³⁶ The rest of this paper is organized as follows. In Sec. II, we briefly describe the relevant schemes developed in the LC hybrid approach. In Sec. III, we propose a new SR exchange functional, which serves as a suitable basis for systematically generating accurate LC hybrid MGGA functionals. The performance of the ωM05-D functional is compared with other functionals in Sec. IV (on the training set) and in Sec. V (on some test sets). In Sec. VI, we give our conclusions.

II. RATIONALES OF LC HYBRID SCHEMES

For the LC hybrid schemes, one first defines the long-range and short-range operators to partition the Coulomb operator. The most popular type of splitting operator used is the standard error function (erf),

$$\frac{1}{r_{12}} = \frac{\operatorname{erf}(\omega r_{12})}{r_{12}} + \frac{\operatorname{erfc}(\omega r_{12})}{r_{12}}, \tag{1}$$

where $r_{12} \equiv |\mathbf{r}_{12}| = |\mathbf{r}_1 - \mathbf{r}_2|$ (atomic units are used throughout this paper). On the right-hand side of Eq. (1), the first term is long-ranged, while the second term is short-ranged. The parameter ω defines the range of these operators.
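The partition in Eq. (1) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `coulomb_split` and the sample distances are illustrative assumptions, not part of the original work.

```python
import math


def coulomb_split(r12, omega):
    """Return the (long-range, short-range) pieces of 1/r12 from Eq. (1).

    r12   : interelectronic distance in bohr (must be > 0)
    omega : range-separation parameter in bohr^-1
    """
    long_range = math.erf(omega * r12) / r12
    short_range = math.erfc(omega * r12) / r12
    return long_range, short_range


if __name__ == "__main__":
    omega = 0.2  # bohr^-1, the value later reported for the optimized functional
    for r12 in (0.5, 2.0, 10.0, 50.0):  # illustrative distances
        lr, sr = coulomb_split(r12, omega)
        print(f"r12={r12:6.1f}  LR={lr:.6e}  SR={sr:.6e}  sum*r12={(lr + sr) * r12:.12f}")
```

The short-range piece dominates at small r12 and decays rapidly once ω r12 exceeds a few units, while the long-range piece stays finite at r12 → 0, where it tends to 2ω/√π.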

In this work, we employ the erf/erfc partition, and use the following expression (as suggested in recent LC hybrid schemes^35,36,49) for the LC hybrid functionals ($c_x$ is a fractional number to be determined):

$$E_{xc}^{\text{LC-DFA}} = E_x^{\text{LR-HF}} + c_x E_x^{\text{SR-HF}} + (1 - c_x) E_x^{\text{SR-DFA}} + E_c^{\text{DFA}}, \tag{2}$$

where $E_x^{\text{LR-HF}}$, the LR-HF exchange, is computed from the occupied KS orbitals $\psi_{i\sigma}(\mathbf{r})$ with the LR operator,

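$$E_x^{\text{LR-HF}} = -\frac{1}{2} \sum_{\sigma} \sum_{i,j}^{\text{occ.}} \iint \psi_{i\sigma}^{*}(\mathbf{r}_1) \psi_{j\sigma}^{*}(\mathbf{r}_2) \frac{\operatorname{erf}(\omega r_{12})}{r_{12}} \psi_{j\sigma}(\mathbf{r}_1) \psi_{i\sigma}(\mathbf{r}_2) \, d\mathbf{r}_1 \, d\mathbf{r}_2. \tag{3}$$

$E_x^{\text{SR-HF}}$, the SR-HF exchange, is computed similarly to the above but with the SR operator,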

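$$E_x^{\text{SR-HF}} = -\frac{1}{2} \sum_{\sigma} \sum_{i,j}^{\text{occ.}} \iint \psi_{i\sigma}^{*}(\mathbf{r}_1) \psi_{j\sigma}^{*}(\mathbf{r}_2) \frac{\operatorname{erfc}(\omega r_{12})}{r_{12}} \psi_{j\sigma}(\mathbf{r}_1) \psi_{i\sigma}(\mathbf{r}_2) \, d\mathbf{r}_1 \, d\mathbf{r}_2. \tag{4}$$

$E_x^{\text{SR-DFA}}$ is the SR exchange approximated by DFAs, and $E_c^{\text{DFA}}$ is the same correlation functional as that used for the full Coulomb interaction.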

In view of $E_{xc}^{\text{LC-DFA}}$ in Eq. (2), as $E_x^{\text{LR-HF}}$ and $E_x^{\text{SR-HF}}$ are well defined, and accurate approximations for $E_c^{\text{DFA}}$ are widely available, the accuracy of a LC hybrid functional is closely tied to the accuracy of $E_x^{\text{SR-DFA}}$.^35,36 The analytical form of the SR-LDA (the simplest SR-DFA) exchange functional $E_x^{\text{SR-LDA}}$ can be obtained by the integration of the square of the LDA density matrix with the SR operator,⁵⁰

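$$E_x^{\text{SR-LDA}} = \sum_{\sigma} \int e_{x\sigma}^{\text{SR-LDA}}(\rho_\sigma) \, d\mathbf{r}. \tag{5}$$

Here, $e_{x\sigma}^{\text{SR-LDA}}(\rho_\sigma)$ is the SR-LDA exchange energy density for σ-spin,

$$e_{x\sigma}^{\text{SR-LDA}}(\rho_\sigma) = e_{x\sigma}^{\text{LDA}}(\rho_\sigma) \, F(a_\sigma), \tag{6}$$

where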

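$$e_{x\sigma}^{\text{LDA}}(\rho_\sigma) = -\frac{3}{2} \left( \frac{3}{4\pi} \right)^{1/3} \rho_\sigma^{4/3}(\mathbf{r}) \tag{7}$$

is the LDA exchange energy density for σ-spin, $k_{F\sigma} \equiv (6\pi^2 \rho_\sigma(\mathbf{r}))^{1/3}$ is the local Fermi wave vector, and $a_\sigma \equiv \omega / (2 k_{F\sigma})$ is a dimensionless parameter controlling the value of the attenuation function $F(a_\sigma)$,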

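$$F(a_\sigma) = 1 - \frac{8}{3} a_\sigma \left[ \sqrt{\pi} \, \operatorname{erf}\!\left( \frac{1}{2 a_\sigma} \right) - 3 a_\sigma + 4 a_\sigma^3 + \left( 2 a_\sigma - 4 a_\sigma^3 \right) \exp\!\left( -\frac{1}{4 a_\sigma^2} \right) \right]. \tag{8}$$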
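To make the behavior of the attenuation function concrete, the sketch below evaluates the local Fermi wave vector and F(a) of Eq. (8) with the Python standard library. The function names are illustrative assumptions; the direct formula is used as written, so it loses precision through cancellation for large a (a series expansion would be needed there, which is not attempted here).

```python
import math


def fermi_wave_vector(rho_sigma):
    """Local Fermi wave vector k_F = (6 pi^2 rho_sigma)^(1/3)."""
    return (6.0 * math.pi ** 2 * rho_sigma) ** (1.0 / 3.0)


def attenuation(a):
    """Attenuation function F(a) of Eq. (8); F(0) = 1 recovers the full LDA exchange."""
    if a <= 0.0:
        return 1.0
    x = 0.5 / a  # argument of erf; x*x = 1/(4 a^2)
    bracket = (
        math.sqrt(math.pi) * math.erf(x)
        - 3.0 * a
        + 4.0 * a ** 3
        + (2.0 * a - 4.0 * a ** 3) * math.exp(-(x * x))
    )
    return 1.0 - (8.0 / 3.0) * a * bracket


if __name__ == "__main__":
    omega = 0.2  # bohr^-1
    for rho in (1e-4, 1e-2, 1.0):  # illustrative spin densities in atomic units
        a = omega / (2.0 * fermi_wave_vector(rho))
        print(f"rho={rho:8.1e}  a={a:.4f}  F(a)={attenuation(a):.6f}")
```

For a fixed ω, high-density regions have small a and F close to 1 (exchange is mostly short-ranged on the scale of 1/ω), whereas low-density regions have large a and F close to 0.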

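To develop a possible SR-DFA exchange functional $E_x^{\text{SR-DFA}}$ based on the knowledge of a DFA exchange functional $E_x^{\text{DFA}}$, there are three schemes, described below. Consider the general expression of a DFA exchange functional,

$$E_x^{\text{DFA}} = \sum_{\sigma} \int e_{x\sigma}^{\text{LDA}}(\rho_\sigma) \, F_{x\sigma}^{\text{DFA}} \, d\mathbf{r}, \tag{9}$$

where $F_{x\sigma}^{\text{DFA}}$ is the DFA enhancement factor for σ-spin. Depending on the type of DFA, $F_{x\sigma}^{\text{DFA}} = 1$ for a LDA, $F_{x\sigma}^{\text{DFA}} = F_{x\sigma}^{\text{GGA}}(\rho_\sigma, \nabla\rho_\sigma)$ for a GGA, and $F_{x\sigma}^{\text{DFA}} = F_{x\sigma}^{\text{MGGA}}(\rho_\sigma, \nabla\rho_\sigma, \tau_\sigma)$ for a meta-GGA, where $\rho_\sigma(\mathbf{r})$ is the spin density, $\nabla\rho_\sigma(\mathbf{r})$ is the spin density gradient, and

$$\tau_\sigma = \frac{1}{2} \sum_{i}^{\text{occ.}} |\nabla \psi_{i\sigma}|^2 \tag{10}$$

is the spin kinetic energy density.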

The first scheme was proposed by Iikura, Tsuneda, Yanai, and Hirao (ITYH),^27,28,33 where $E_x^{\text{SR-DFA}}$ can be obtained by substituting a modified Fermi wave vector,

$$k_\sigma = \frac{k_{F\sigma}}{\sqrt{F_{x\sigma}^{\text{DFA}}}} \tag{11}$$

into the SR exchange energy density of Eq. (6), which a priori produces $E_x^{\text{SR-DFA}}$ from any $E_x^{\text{DFA}}$, and reduces nicely to $E_x^{\text{SR-LDA}}$ for $E_x^{\text{LDA}}$. Although the ITYH scheme possesses an admirable simplicity, some of its deficiencies (which potentially limit its accuracy) have been found.⁵¹

The second scheme was proposed by Vydrov, Heyd, Krukau, and Scuseria (VHKS),^31,32 where for a given spherically averaged exchange hole $h_x^{\text{DFA}}(\mathbf{r}, r_{12})$, $E_x^{\text{SR-DFA}}$ is evaluated as

$$E_x^{\text{SR-DFA}} = 2\pi \int \rho(\mathbf{r}) \, d\mathbf{r} \int_0^{\infty} \operatorname{erfc}(\omega r_{12}) \, h_x^{\text{DFA}}(\mathbf{r}, r_{12}) \, r_{12} \, dr_{12}. \tag{12}$$

The pivot of this scheme is the engineering of the DFA exchange hole. The GGA model exchange hole of Ernzerhof and Perdew⁵² (EP) provides a framework for modeling any GGA exchange hole. It has been used extensively in real applications after parametrization to reproduce the Perdew, Burke, and Ernzerhof (PBE) GGA.⁵³ In 2008, Henderson, Janesko, and Scuseria⁵¹ (HJS) proposed another general model for the spherically averaged exchange hole corresponding to a GGA exchange functional, based on the work of EP. The HJS model improves upon the EP model by precisely reproducing the energy of the parent GGA, and by enabling fully analytic evaluation of range-separated hybrid density functionals. For meta-GGA, the TPSS exchange and correlation hole models have been "reverse-engineered".⁵⁴ However, the resulting LC-TPSS functional (a LC hybrid MGGA) shows no satisfactory long-range correction effect.³¹

The third scheme was proposed by Chai and Head-Gordon (CHG),^35,36 where $E_x^{\text{SR-DFA}}$ is evaluated as

$$E_x^{\text{SR-DFA}} = \sum_{\sigma} \int e_{x\sigma}^{\text{SR-LDA}}(\rho_\sigma) \, F_{x\sigma}^{\text{DFA}} \, d\mathbf{r}. \tag{13}$$

This simple scheme is expected to work well for a small ω.

For highly parametrized $E_x^{\text{DFA}}$, such as the B97,⁴⁰ M05,¹⁸ and M08 (Ref. 26) functionals, the CHG scheme is particularly attractive due to its simplicity. However, how large can ω become before the CHG scheme ceases to be suitable? In Secs. III–V, we will compare the performance of two new LC hybrid MGGA-D functionals, where one is developed by the CHG scheme, while the other is developed by a new scheme provided in this work, and our results help to answer the above question.

III. LC HYBRID MGGA-D FUNCTIONALS

In this section, we introduce our new LC hybrid MGGA-D functionals. Note that LC-TPSS has been developed by utilizing the TPSS exchange hole (based on the VHKS scheme),³¹ but it is found that LC-TPSS does not benefit much from the admixture of HF exchange. The M11 functional⁴⁹ has been developed based on the extension of a global hybrid MGGA functional, M08,²⁶ to LC-DFT, following the CHG scheme.³⁵ Parallel to the strategy of the ωB97 series,^35,36 we choose to modify the M05 functional, a global hybrid MGGA functional with a powerful form.^18,19 Its exchange part consists of the PBE exchange functional and a reasonable kinetic-energy-density enhancement factor. The PBE exchange is a theoretically sound starting point because it satisfies the correct UEG limit and also has reasonable behavior at large values of the reduced spin density gradient $s_\sigma$.

To satisfy the UEG limit of SR exchange, we replace the PBE exchange energy density $e_{x\sigma}^{\text{PBE}}(\rho_\sigma, \nabla\rho_\sigma)$ with the SR-PBE exchange energy density $e_{x\sigma}^{\text{SR-PBE}}(\rho_\sigma, \nabla\rho_\sigma)$ generated by the HJS model exchange hole (based on the VHKS scheme), whose virtues are indicated in Sec. II. To achieve a flexible functional form, we retain the kinetic-energy-density enhancement factor (similar to the CHG scheme). We denote this resulting functional as SR-M05 (short-range M05) exchange, as it reduces to the M05 exchange at ω = 0:

$$E_x^{\text{SR-M05}} = \sum_{\sigma} \int e_{x\sigma}^{\text{SR-PBE}}(\rho_\sigma, \nabla\rho_\sigma) \, f(w_\sigma) \, d\mathbf{r}, \tag{14}$$

where $f(w_\sigma)$ is the kinetic-energy-density enhancement factor,

$$f(w_\sigma) = \sum_{i=0}^{m} a_i w_\sigma^i. \tag{15}$$

$w_\sigma$ is a function of $t_\sigma$, and $t_\sigma$ is a function of the kinetic energy density $\tau_\sigma$ of electrons with spin σ, as designed by Becke,²³

$$w_\sigma = (t_\sigma - 1)/(t_\sigma + 1), \tag{16}$$

where

$$t_\sigma = \tau_\sigma^{\text{LDA}} / \tau_\sigma, \tag{17}$$

$$\tau_\sigma^{\text{LDA}} \equiv \frac{3}{10} (6\pi^2)^{2/3} \rho_\sigma^{5/3}. \tag{18}$$

In general, the enhancement factor should be ω-dependent. However, judging from the work on LC-TPSS (Ref. 31) and M11,⁴⁹ the optimal ω for a LC hybrid MGGA is expected to be small as well. For a sufficiently small ω value, our proposed functional form, inspired by the VHKS and CHG schemes, should be a good approximation.
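The mapping from the kinetic energy density to the bounded variable of Eqs. (16)–(18) can be sketched as follows. This is a minimal illustration with assumed function names; it uses the algebraically equivalent form w = (τ^LDA − τ)/(τ^LDA + τ), obtained by multiplying the numerator and denominator of Eq. (16) by τ, so that τ = 0 does not cause a division by zero.

```python
import math


def tau_lda(rho_sigma):
    """UEG kinetic energy density of Eq. (18) for one spin channel."""
    return 0.3 * (6.0 * math.pi ** 2) ** (2.0 / 3.0) * rho_sigma ** (5.0 / 3.0)


def w_variable(rho_sigma, tau_sigma):
    """Bounded variable w of Eq. (16), with t = tau_LDA / tau from Eq. (17).

    Written as (tau_LDA - tau) / (tau_LDA + tau), which equals (t - 1) / (t + 1).
    """
    t_lda = tau_lda(rho_sigma)
    return (t_lda - tau_sigma) / (t_lda + tau_sigma)


if __name__ == "__main__":
    rho = 0.1  # illustrative spin density in atomic units
    for scale in (0.1, 1.0, 10.0):  # tau relative to its UEG value
        print(f"tau/tau_LDA={scale:5.1f}  w={w_variable(rho, scale * tau_lda(rho)):+.6f}")
```

Because w lies in [−1, 1] and vanishes in the UEG limit τ = τ^LDA, the power series of Eq. (15) reduces there to its leading coefficient.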

We use the same form for the correlation functional as the M05 correlation functional, which can be decomposed into same-spin $E_{c\sigma\sigma}^{\text{M05}}$ and opposite-spin $E_{c\alpha\beta}^{\text{M05}}$ components,

$$E_c^{\text{M05}} = \sum_{\sigma} E_{c\sigma\sigma}^{\text{M05}} + E_{c\alpha\beta}^{\text{M05}}. \tag{19}$$

For the opposite-spin terms,

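$$E_{c\alpha\beta}^{\text{M05}} = \int e_{c\alpha\beta}^{\text{LDA}} \, g_{\alpha\beta}\!\left( s_{\text{av}}^2 \right) d\mathbf{r}, \tag{20}$$

$$g_{\alpha\beta}\!\left( s_{\text{av}}^2 \right) = \sum_{i=0}^{n} c_{\alpha\beta,i} \, u_{\alpha\beta}^i, \tag{21}$$

$$u_{\alpha\beta} = \frac{\gamma_{\alpha\beta} s_{\text{av}}^2}{1 + \gamma_{\alpha\beta} s_{\text{av}}^2}, \tag{22}$$

$$\gamma_{\alpha\beta} = 0.0062, \tag{23}$$

$$s_{\text{av}}^2 = \frac{1}{2} \left( s_\alpha^2 + s_\beta^2 \right), \tag{24}$$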

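and for the same-spin terms,

$$E_{c\sigma\sigma}^{\text{M05}} = \int e_{c\sigma\sigma}^{\text{LDA}} \, g_{\sigma\sigma}\!\left( s_\sigma^2 \right) \left( 1 - \frac{\tau_\sigma^{W}}{\tau_\sigma} \right) d\mathbf{r}, \tag{25}$$

$$g_{\sigma\sigma}\!\left( s_\sigma^2 \right) = \sum_{i=0}^{n} c_{\sigma\sigma,i} \, u_{\sigma\sigma}^i, \tag{26}$$

$$u_{\sigma\sigma} = \frac{\gamma_{\sigma\sigma} s_\sigma^2}{1 + \gamma_{\sigma\sigma} s_\sigma^2}, \tag{27}$$

$$\gamma_{\sigma\sigma} = 0.06. \tag{28}$$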

$1 - \tau_\sigma^{W}/\tau_\sigma$ is a self-interaction correction factor proposed by Becke,²² in which $\tau_\sigma^{W}$ is the von Weizsäcker kinetic energy density⁵⁵ given by

$$\tau_\sigma^{W} = \frac{|\nabla\rho_\sigma|^2}{8\rho_\sigma}. \tag{29}$$
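As a quick numerical illustration of this factor for a one-electron density, the sketch below evaluates τ (Eq. (10) with a single occupied orbital) and τ^W (Eq. (29)) for the hydrogen 1s orbital ψ(r) = e^{−r}/√π, using analytic radial derivatives. The choice of the hydrogen atom and the function names are assumptions made for illustration only.

```python
import math


def hydrogen_1s_quantities(r):
    """Return (rho, tau, tau_W) at radius r (bohr) for the hydrogen 1s orbital.

    psi(r) = exp(-r) / sqrt(pi); for a radial function |grad f| = |df/dr|.
    """
    psi = math.exp(-r) / math.sqrt(math.pi)
    dpsi_dr = -psi                      # d/dr of exp(-r)/sqrt(pi)
    rho = psi * psi                     # one electron, one spin channel
    drho_dr = 2.0 * psi * dpsi_dr
    tau = 0.5 * dpsi_dr ** 2            # Eq. (10) with a single occupied orbital
    tau_w = drho_dr ** 2 / (8.0 * rho)  # Eq. (29)
    return rho, tau, tau_w


def sic_factor(tau, tau_w):
    """Self-interaction correction factor 1 - tau_W / tau appearing in Eq. (25)."""
    return 1.0 - tau_w / tau


if __name__ == "__main__":
    for r in (0.1, 1.0, 5.0):
        rho, tau, tau_w = hydrogen_1s_quantities(r)
        print(f"r={r:4.1f}  tau={tau:.6e}  tau_W={tau_w:.6e}  factor={sic_factor(tau, tau_w):.2e}")
```

The factor is zero (to rounding) at every radius, so the same-spin correlation integrand of Eq. (25) is switched off pointwise for this one-electron system.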

In a one-electron case, $\tau_\sigma = \tau_\sigma^{W}$, so Eq. (25) vanishes in any one-electron system. The correlation energy densities $e_{c\alpha\beta}^{\text{LDA}}$ and $e_{c\sigma\sigma}^{\text{LDA}}$ are derived from the Perdew-Wang parametrization of the LDA correlation energy,⁵⁶ using the approach of Stoll et al.,⁵⁷

$$e_{c\alpha\beta}^{\text{LDA}}(\rho_\alpha, \rho_\beta) = e_c^{\text{LDA}}(\rho_\alpha, \rho_\beta) - e_c^{\text{LDA}}(\rho_\alpha, 0) - e_c^{\text{LDA}}(0, \rho_\beta), \tag{30}$$

$$e_{c\sigma\sigma}^{\text{LDA}} = e_c^{\text{LDA}}(\rho_\sigma, 0). \tag{31}$$

Based on the above functional expansions, we propose a new LC hybrid MGGA functional, ωM05-D. It contains a fraction of the SR-HF exchange,

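$$E_{xc}^{\omega\text{M05-D}} = E_x^{\text{LR-HF}} + c_x E_x^{\text{SR-HF}} + E_x^{\text{SR-M05}} + E_c^{\text{M05}}. \tag{32}$$

We enforce the exact UEG limit for the ωM05-D functional by imposing the following constraints:

$$c_{\sigma\sigma,0} = 1, \tag{33}$$

$$c_{\alpha\beta,0} = 1, \tag{34}$$

and

$$a_0 + c_x = 1. \tag{35}$$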

Following the general form of the DFT-D scheme,⁴³ our total energy

$$E_{\text{DFT-D}} = E_{\text{KS-DFT}} + E_{\text{disp}} \tag{36}$$

is computed as the sum of a KS-DFT part and an empirical atomic-pairwise dispersion correction. We choose to use the same form of unscaled dispersion correction as implemented in ωB97X-D,³⁶

$$E_{\text{disp}} = -\sum_{i=1}^{N_{\text{at}}-1} \sum_{j=i+1}^{N_{\text{at}}} \frac{C_6^{ij}}{R_{ij}^6} \, f_{\text{damp}}(R_{ij}), \tag{37}$$

where $N_{\text{at}}$ is the number of atoms in the system, $C_6^{ij}$ is the dispersion coefficient for atom pair ij, and $R_{ij}$ is an interatomic distance. The damping function,

$$f_{\text{damp}}(R_{ij}) = \frac{1}{1 + a (R_{ij}/R_r)^{-12}} \tag{38}$$

enforces the conditions of zero dispersion correction at short interatomic separations and correct asymptotic pairwise vdW potentials. Here, $R_r$ is the sum of vdW radii of the atomic pair ij, and the only non-linear parameter, a, controls the strength of dispersion corrections.
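The pairwise sum of Eq. (37) with the damping function of Eq. (38) can be sketched in a few lines. The function names, the data layout (a full C6 matrix and a list of vdW radii), and the numerical values in the usage example are hypothetical placeholders rather than the parameters used in this work; consistent units are assumed throughout.

```python
import math


def f_damp(r_ij, r_r, a):
    """Damping function of Eq. (38): 1 / (1 + a * (R_ij / R_r)^-12)."""
    return 1.0 / (1.0 + a * (r_ij / r_r) ** -12)


def e_disp(coords, c6, r_vdw, a):
    """Empirical pairwise dispersion energy of Eq. (37).

    coords : list of (x, y, z) atomic positions
    c6     : c6[i][j] is the dispersion coefficient for atom pair ij
    r_vdw  : r_vdw[i] is the vdW radius of atom i, so R_r = r_vdw[i] + r_vdw[j]
    a      : non-linear damping parameter
    """
    energy = 0.0
    n_at = len(coords)
    for i in range(n_at - 1):
        for j in range(i + 1, n_at):
            r_ij = math.dist(coords[i], coords[j])
            r_r = r_vdw[i] + r_vdw[j]
            energy -= c6[i][j] / r_ij ** 6 * f_damp(r_ij, r_r, a)
    return energy


if __name__ == "__main__":
    # Hypothetical two-atom example; the numbers are placeholders.
    coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 4.0)]
    c6 = [[0.0, 10.0], [10.0, 0.0]]
    r_vdw = [1.5, 1.5]
    print(e_disp(coords, c6, r_vdw, a=30.0))
```

At R_ij = R_r the damping function equals 1/(1 + a), so a larger a suppresses the correction out to larger separations.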

To achieve an optimized functional with well-balanced performance across typical applications, we use the same diverse training set described in Ref. 35, which contains 412 accurate experimental and theoretical results, including the 18 atomic energies from the H atom to the Ar atom,⁵⁸ the atomization energies of the G3/99 set (223 molecules),^59–61 the ionization potentials (IPs) of the G2-1 set⁶² (40 molecules, excluding the SH₂ (²A₁) and N₂ (²Π) cations due to the known convergence problems for semilocal density functionals⁶⁰), the electron affinities (EAs) of the G2-1 set (25 molecules), the proton affinities (PAs) of the G2-1 set (8 molecules), the 76 barrier heights of the NHTBH38/04 and HTBH38/04 sets,^15,63 and the 22 noncovalent interactions of the S22 set.⁶⁴ The S22 data are weighted ten times more than the others. All the parameters in ωM05-D are determined self-consistently by a least-squares fitting procedure described in Ref. 35. For the non-linear parameter optimization, we focus on a range of possible ω values (0.0, 0.1, 0.2, 0.3, and 0.4 bohr⁻¹), and optimize the corresponding a values in the steps described in Ref. 36.

M05 and M05-2X (Refs. 18 and 19) both used m = 11 in Eq. (15) and n = 4 in Eqs. (21) and (26). However, during the optimization procedure of ωM05-D, we found that the statistical errors are close for m = 10 and m = 11, while the fit with m = 11 has significantly larger parameters. A recent study by Wheeler and Houk has shown that large magnitudes of the parameters in Eq. (15) may result in large grid errors.⁶⁵ Moreover, the use of large parameters increases the possibility of convergence difficulties as well as over-fitting effects. Thus, we choose m = 10 instead of 11 in Eq. (15). The optimized parameters of the ωM05-D functional are given in Table I, in which the ω value is the same as that of ωB97X-D, while the fraction of SR-HF exchange, c_x, is larger than that of ωB97X-D (≈0.22). This helps to reduce the self-interaction error (SIE) of the functional, as can be seen in Sec. V.

TABLE I. Optimized parameters for ωM05-D. Here, the non-linear parameter a is defined in Eq. (38), and others are defined in Eq. (32).

| Parameter | Value |
|---|---|
| a | 30.0 |
| ω | 0.2 bohr⁻¹ |
| c_x | 0.369592 |

| i | a_i | c_αβ,i | c_σσ,i |
|---|---|---|---|
| 0 | 0.630408 | 1.00000 | 1.00000 |
| 1 | −0.219121 | −0.95491 | −5.26863 |
| 2 | −0.14411 | 12.138 | 17.9935 |
| 3 | 1.27732 | −35.1041 | −17.6408 |
| 4 | −1.59959 | 19.5804 | 0.625687 |
| 5 | −5.94702 | | |
| 6 | 13.5822 | | |
| 7 | 10.5048 | | |
| 8 | −28.7168 | | |
| 9 | −6.89761 | | |
| 10 | 19.0574 | | |
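The tabulated exchange coefficients can be checked against the UEG constraint of Eq. (35) with a short script. The sketch below copies the a_i and c_x values from Table I and evaluates the power series of Eq. (15) with Horner's rule; the names `A_COEFFS`, `C_X`, and `f_enhancement` are illustrative assumptions.

```python
# Values copied from Table I for the omega-M05-D functional.
C_X = 0.369592
A_COEFFS = [
    0.630408, -0.219121, -0.14411, 1.27732, -1.59959, -5.94702,
    13.5822, 10.5048, -28.7168, -6.89761, 19.0574,
]  # a_0 ... a_10, i.e., m = 10 in Eq. (15)


def f_enhancement(w, coeffs=A_COEFFS):
    """Kinetic-energy-density enhancement factor of Eq. (15), via Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * w + c
    return result


if __name__ == "__main__":
    print("a_0 + c_x =", A_COEFFS[0] + C_X)  # UEG constraint, Eq. (35)
    for w in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"w={w:+.1f}  f(w)={f_enhancement(w):.6f}")
```

In the UEG limit (w = 0) the series returns a_0, so the SR DFA exchange carries the weight 1 − c_x, consistent with the general form of Eq. (2).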

We also tried a simple model (based on the CHG scheme), where the SR-PBE exchange energy density $e_{x\sigma}^{\text{SR-PBE}}(\rho_\sigma, \nabla\rho_\sigma)$ used in Eq. (14) is substituted with $e_{x\sigma}^{\text{SR-LDA}}(\rho_\sigma) F_x^{\text{PBE}}(s_\sigma)$, that is, the SR-LDA exchange energy density in Eq. (6) multiplied by the PBE enhancement factor. We tried this because the mathematical form of the latter is significantly simpler than that of the former, and it is the model on which M11 is based. The parametrization is the same for this simple model, which we denote by ωM05s-D. Compared to ωM05-D, the optimal ω value is also 0.2 bohr⁻¹, but the corresponding optimal a value is found to be 100 and the linear parameters are also larger.
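The simple (CHG-type) short-range exchange energy density described above, i.e., Eq. (6) multiplied by the PBE enhancement factor, can be sketched as follows. The PBE constants κ = 0.804 and μ ≈ 0.21951 are the standard published values and are not quoted in the text; the reduced gradient s is treated as a plain input, and all function names are assumptions for illustration.

```python
import math

KAPPA = 0.804               # standard PBE constant (assumption: not quoted in the text)
MU = 0.2195149727645171     # standard PBE constant (assumption: not quoted in the text)


def f_pbe(s):
    """PBE exchange enhancement factor for a reduced density gradient s."""
    return 1.0 + KAPPA - KAPPA / (1.0 + MU * s * s / KAPPA)


def attenuation(a):
    """Attenuation function F(a) of Eq. (8)."""
    if a <= 0.0:
        return 1.0
    x = 0.5 / a
    bracket = (math.sqrt(math.pi) * math.erf(x) - 3.0 * a + 4.0 * a ** 3
               + (2.0 * a - 4.0 * a ** 3) * math.exp(-(x * x)))
    return 1.0 - (8.0 / 3.0) * a * bracket


def e_x_lda(rho_sigma):
    """LDA exchange energy density for one spin channel, Eq. (7)."""
    return -1.5 * (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0) * rho_sigma ** (4.0 / 3.0)


def e_x_sr_simple(rho_sigma, s, omega):
    """CHG-type SR exchange energy density: e^SR-LDA(rho) * F^PBE(s)."""
    k_f = (6.0 * math.pi ** 2 * rho_sigma) ** (1.0 / 3.0)
    a = omega / (2.0 * k_f)
    return e_x_lda(rho_sigma) * attenuation(a) * f_pbe(s)


if __name__ == "__main__":
    for s in (0.0, 1.0, 3.0):  # illustrative reduced gradients
        print(f"s={s:.1f}  e_x^SR={e_x_sr_simple(0.1, s, 0.2):.6f}")
```

In this form the range separation enters only through the LDA attenuation function, so the gradient correction is the same at every ω, which is why the scheme is expected to be most reliable for small ω.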

IV. RESULTS FOR THE TRAINING SET

All calculations are performed with a development version of Q-CHEM 3.2.⁶⁶ Spin-restricted theory is used for singlet states and spin-unrestricted theory for others, unless noted otherwise. For the binding energies of the weakly bound systems, the counterpoise correction⁶⁷ is employed to reduce basis set superposition error (BSSE).

Results for the training set are computed using the 6-311++G(3df,3pd) basis set with the fine grid, EML(75,302), consisting of 75 Euler-Maclaurin radial grid points⁶⁸ and 302 Lebedev angular grid points.⁶⁹ The error for each entry is defined as error = theoretical value − reference value. The notation used for characterizing statistical errors is as follows: mean signed errors (MSEs), mean absolute errors (MAEs), root-mean-square (rms) errors, maximum negative errors (Max(−)), and maximum positive errors (Max(+)).
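The statistical measures defined above can be computed with a few lines of standard-library Python. This is a sketch with an assumed function name and made-up sample values; Max(−) and Max(+) are taken here as the minimum and maximum signed errors, which matches the definitions only when errors of both signs are present.

```python
import math


def error_statistics(theory, reference):
    """Return MSE, MAE, rms, Max(-), and Max(+) for paired values.

    Each error is defined as theoretical value - reference value.
    """
    errors = [t - r for t, r in zip(theory, reference)]
    n = len(errors)
    return {
        "MSE": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "rms": math.sqrt(sum(e * e for e in errors) / n),
        "Max(-)": min(errors),  # most negative signed error
        "Max(+)": max(errors),  # most positive signed error
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration (e.g., in kcal/mol).
    theory = [1.0, 2.0, 4.0]
    reference = [2.0, 2.0, 2.0]
    for name, value in error_statistics(theory, reference).items():
        print(f"{name:7s} {value:+.4f}")
```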

First, we show the results of the first iteration of the fitting procedure, comparing the new LC scheme with the CHG scheme (the simple model) for ω = 0.1, 0.2, 0.3, and 0.4 bohr⁻¹. We optimize ωM05 and ωM05s using the corresponding ωPBE and ωPBEs orbitals, and denote these optimized functionals as ωM05* and ωM05s*. The statistical errors are believed to be quite close to those obtained self-consistently. As can be seen in Table II, the difference between the performance of ωM05* and ωM05s* is noticeable for ω = 0.2 bohr⁻¹, and becomes larger for a larger ω value. Therefore, a LC hybrid MGGA functional with a larger ω value (such as M11 with ω = 0.25 bohr⁻¹) may perform better with our new scheme than with the CHG scheme.

In subsequent iterations, we include the dispersion corrections, increase the training weight of the S22 set, and find that the optimal functionals are obtained with ω = 0.2 bohr⁻¹. To view the effect of the long-range correction and the dispersion corrections, we also consider the functional forms M05 and M05-D. The latter is the limiting case of ωM05-D where ω = 0, for which the corresponding optimal a value is found to be 2. We reoptimize the M05 and M05-D functionals on the same training set using the M05-2X orbitals, truncate their functional expansions at the same orders m = 10 and n = 4, and denote these two reoptimized functionals as M05* and M05-D*. Just as for the ωB97X functional without dispersion correction, all data in the training set are equally weighted in the least-squares fitting for M05*.

TABLE II. Comparisons between the ωM05* and ωM05s* functionals (defined in the text) for different ω values (in bohr⁻¹). Statistical errors are in kcal/mol.

| System | Error | ωM05* (ω = 0.1) | ωM05s* (ω = 0.1) | ωM05* (ω = 0.2) | ωM05s* (ω = 0.2) | ωM05* (ω = 0.3) | ωM05s* (ω = 0.3) | ωM05* (ω = 0.4) | ωM05s* (ω = 0.4) |
|---|---|---|---|---|---|---|---|---|---|
| Atoms (18) | MSE | −0.15 | 0.24 | 0.05 | 0.63 | 0.23 | 0.97 | 0.46 | 1.33 |
| | MAE | 2.02 | 2.09 | 1.81 | 2.35 | 2.00 | 3.36 | 3.22 | 5.05 |
| G3/99 (223) | MSE | 0.06 | 0.09 | 0.05 | 0.03 | −0.04 | −0.12 | −0.18 | −0.27 |
| | MAE | 1.77 | 1.79 | 1.66 | 1.76 | 1.78 | 2.02 | 2.10 | 2.35 |
| IP (40) | MSE | −0.58 | −1.48 | −0.84 | −1.36 | −0.38 | −0.32 | 0.30 | 0.73 |
| | MAE | 2.75 | 3.06 | 2.81 | 3.08 | 2.68 | 2.81 | 2.64 | 2.79 |
| EA (25) | MSE | −1.50 | −1.70 | −1.29 | −1.15 | −0.94 | −0.70 | −0.64 | −0.39 |
| | MAE | 2.50 | 2.56 | 2.33 | 2.22 | 2.07 | 1.97 | 1.98 | 1.91 |
| PA (8) | MSE | −1.65 | −2.68 | −1.49 | −2.71 | −1.07 | −2.11 | −0.78 | −1.54 |
| | MAE | 1.87 | 2.68 | 1.83 | 2.82 | 1.79 | 2.53 | 1.86 | 2.43 |
| NHTBH (38) | MSE | −1.26 | −1.09 | −0.68 | −0.39 | 0.08 | 0.40 | 0.85 | 1.17 |
| | MAE | 1.98 | 1.82 | 1.51 | 1.40 | 1.46 | 1.67 | 1.71 | 1.95 |
| HTBH (38) | MSE | −1.96 | −1.95 | −1.95 | −1.68 | −1.61 | −1.24 | −1.21 | −0.84 |
| | MAE | 2.19 | 2.08 | 2.12 | 1.84 | 1.95 | 1.56 | 1.86 | 1.47 |
| S22 (22) | MSE | 2.65 | 1.91 | 1.80 | 1.06 | 1.01 | 0.56 | 0.45 | 0.26 |
| | MAE | 2.65 | 1.91 | 1.80 | 1.06 | 1.02 | 0.67 | 0.71 | 0.63 |
| All (412) | MSE | −0.31 | −0.42 | −0.30 | −0.35 | −0.21 | −0.18 | −0.11 | −0.02 |
| | MAE | 2.03 | 2.03 | 1.86 | 1.90 | 1.84 | 2.01 | 2.06 | 2.28 |

TABLE III. Statistical errors (in kcal/mol) of the training set. The M05-D* and M05* functionals are defined in the text. M05-2X was not particularly parametrized using this training set.

| System | Error | ωM05-D | ωM05s-D | M05-D* | M05* | M05-2X | ωB97X-D |
|---|---|---|---|---|---|---|---|
| Atoms (18) | MSE | 0.37 | 0.83 | 0.18 | −0.48 | −3.01 | −0.05 |
| | MAE | 2.02 | 2.28 | 2.61 | 1.98 | 5.10 | 2.57 |
| G3/99 (223) | MSE | −0.03 | −0.03 | −0.10 | −0.05 | 2.01 | −0.24 |
| | MAE | 1.62 | 1.73 | 1.78 | 1.78 | 3.65 | 1.93 |
| IP (40) | MSE | −0.80 | −1.33 | 0.06 | 0.27 | 1.10 | 0.19 |
| | MAE | 2.86 | 3.04 | 2.84 | 2.51 | 3.35 | 2.74 |
| EA (25) | MSE | −1.02 | −0.98 | −0.54 | −0.84 | −0.23 | 0.07 |
| | MAE | 2.12 | 2.13 | 2.13 | 2.35 | 2.48 | 1.91 |
| PA (8) | MSE | −1.48 | −2.66 | −0.94 | −1.76 | −1.26 | 1.42 |
| | MAE | 2.10 | 3.07 | 1.31 | 2.17 | 1.51 | 1.50 |
| NHTBH (38) | MSE | −0.94 | −0.59 | −1.38 | −1.32 | 0.13 | −0.45 |
| | MAE | 1.57 | 1.53 | 2.04 | 2.08 | 1.75 | 1.51 |
| HTBH (38) | MSE | −2.82 | −2.33 | −2.95 | −1.77 | −0.65 | −2.57 |
| | MAE | 2.83 | 2.37 | 3.08 | 2.14 | 1.51 | 2.70 |
| S22 (22) | MSE | −0.01 | −0.01 | 0.04 | 3.46 | 0.73 | −0.08 |
| | MAE | 0.27 | 0.21 | 0.23 | 3.46 | 0.87 | 0.21 |
| All (412) | MSE | −0.51 | −0.49 | −0.49 | −0.21 | 1.02 | −0.36 |
| | MAE | 1.83 | 1.89 | 1.99 | 2.05 | 3.05 | 1.96 |

The overall performance of our new ωM05-D is compared with the trial simple model ωM05s-D, M05-D*, M05*, and M05-2X,¹⁹ as well as the existing ωB97X-D (a LC hybrid GGA-D).³⁶ Note that M05 (Ref. 18) and M05-2X share the same functional form, but the former is also intended to treat transition-metal compounds, so the latter is the more relevant comparison here. In the ωB97 series, ωB97X-D has the closest relationship to ωM05-D, while ωB97 and ωB97X, developed without dispersion corrections, are expected to perform poorly for noncovalent interactions.

In Table III, the first comparison (ωM05-D vs. ωM05s-D) partially determines the choice of our proposed functional. Although ωM05-D performs worse than ωM05s-D for HTBH, the overall performance of ωM05-D in the training set is the best.

A second comparison between ωM05-D and M05-D* indicates that the exact long-range exchange indeed leads to an overall improvement for the MGGA, although not as large as that for the GGA.^31,35,36 The third comparison is between M05-D* and M05*. The combination of the increased training weight and the empirical dispersion corrections leads to a significant improvement in the results for noncovalent interactions (the S22 data) and a modest overall change. Recently, updated reference values for the S22 set have become available.⁷⁰ We have also examined the performance of ωM05-D against the updated S22 reference values. As shown in the supplementary material,⁷¹ the overall performance of the functional against the updated reference values is similar to that against the original ones.

V. RESULTS FOR THE TEST SETS

To test the performance of ωM05-D outside its training set, we also evaluate its performance on various test sets involving 48 atomization energies in the G3/05 test set,⁷² 30 chemical reaction energies taken from the NHTBH38/04 and HTBH38/04 databases,^15,63 29 noncovalent interactions,^63,64 166 optimized geometry properties of covalent systems,⁷³ 12 intermolecular bond lengths,⁶⁴ and 4 dissociation curves of symmetric radical cations, as well as three new databases, consisting of 131 vertical IPs, 115 vertical EAs, and 115 fundamental gaps. For excitation energies, we perform TDDFT calculations for 19 valence excitation energies, 23 Rydberg excitation energies, and one long-range charge-transfer excitation curve of two well-separated molecules. Each EA can be evaluated in two different ways, and each fundamental gap can be evaluated in three different ways, so there are a total of 1038 pieces of data in the test sets, which are larger and more diverse than the training set. Details of the test sets, basis sets, and numerical grids that are not specified here are given in Ref. 35.
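The quoted total of 1038 data points follows from the individual counts once the EAs are counted twice and the fundamental gaps three times. The short tally below reproduces this arithmetic; the dictionary keys are informal labels chosen for illustration.

```python
# (number of entries, number of ways each entry is evaluated)
test_sets = {
    "G3/05 atomization energies": (48, 1),
    "reaction energies": (30, 1),
    "noncovalent interactions": (29, 1),
    "covalent geometry properties": (166, 1),
    "intermolecular bond lengths": (12, 1),
    "radical cation dissociation curves": (4, 1),
    "vertical IPs": (131, 1),
    "vertical EAs": (115, 2),
    "fundamental gaps": (115, 3),
    "valence excitations": (19, 1),
    "Rydberg excitations": (23, 1),
    "charge-transfer excitation curve": (1, 1),
}

total = sum(count * ways for count, ways in test_sets.values())

if __name__ == "__main__":
    print(total)
```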

A. Atomization energies, reaction energies, and noncovalent interactions

Table IV summarizes the general energetic results in the same way as in Ref. 36, for convenience of further comparisons. Since the 30 chemical reaction energies are taken from the NHTBH38/04 and HTBH38/04 databases calculated in Table III, the EML(75,302) grid is used. In Table IV, the comparison between ωM05-D and ωM05s-D shows a noticeable difference in atomization energies, which strongly influences the choice of our proposed functional.

TABLE IV. Statistical errors (in kcal/mol) of the test sets.

| System | Error | ωM05-D | ωM05s-D | M05-2X | ωB97X-D |
|---|---|---|---|---|---|
| G3/05 (48) | MSE | −0.85 | −1.67 | 0.00 | 0.25 |
| | MAE | 3.21 | 3.79 | 5.24 | 3.02 |
| RE (30) | MSE | −0.58 | −0.65 | −0.86 | −0.24 |
| | MAE | 1.49 | 1.32 | 1.65 | 1.63 |
| Non-covalent (29) | MSE | −0.11 | −0.05 | 0.50 | −0.15 |
| | MAE | 0.31 | 0.30 | 0.61 | 0.43 |
| All (107) | MSE | −0.58 | −0.95 | −0.11 | 0.01 |
| | MAE | 1.94 | 2.15 | 2.98 | 1.93 |

B. Equilibrium geometries

Satisfactory predictions of molecular geometries of covalent and non-covalent systems by density functionals are necessary for practical use. For covalent systems, we perform geometry optimizations for each functional on the equilibrium experimental test set (EXTS),⁷³ while for non-covalent systems, we compute the intermolecular bond lengths of 12 weakly bound complexes taken from the S22 set,⁶⁴ using the 6-311++G(3df,3pd) basis set with the EML(75,302) grid. As shown in Table V, the performance of all the hybrid functionals in predicting optimized geometries of EXTS is similar, while the performance of the simple model (ωM05s-D) is somewhat worse for the intermolecular bond lengths. On this basis, we settle on ωM05-D as our proposed model. For brevity, the performance of ωM05s-D will not be shown for subsequent calculations.

TABLE V. Statistical errors (in Å) of EXTS (Ref. 73) and bond lengths of 12 weakly bound complexes from the S22 set (Ref. 64). The results of ωB97X-D are taken from Ref. 36.

| System | Error | ωM05-D | ωM05s-D | M05-2X | ωB97X-D |
|---|---|---|---|---|---|
| EXTS (166) | MSE | 0.003 | 0.001 | −0.004 | −0.002 |
| | MAE | 0.010 | 0.009 | 0.009 | 0.009 |
| | rms | 0.019 | 0.014 | 0.014 | 0.013 |
| | Max(−) | −0.081 | −0.083 | −0.082 | −0.078 |
| | Max(+) | 0.177 | 0.067 | 0.054 | 0.055 |
| Weak (12) | MSE | −0.041 | −0.069 | −0.021 | −0.044 |
| | MAE | 0.061 | 0.078 | 0.062 | 0.064 |
| | rms | 0.083 | 0.102 | 0.080 | 0.085 |
| | Max(−) | −0.189 | −0.195 | −0.165 | −0.198 |
| | Max(+) | 0.043 | 0.029 | 0.140 | 0.056 |

C. Dissociation of symmetric radical cations

Common semilocal functionals are generally accurate for systems near equilibrium. However, due to considerable self-interaction errors in semilocal functionals, spurious fractional charge dissociation occurs.^32,74,75 This situation becomes amplified for symmetric charged radicals X₂⁺, such as H₂⁺, He₂⁺, Ne₂⁺, and Ar₂⁺. Gräfenstein and co-workers have obtained qualitatively correct results for these systems^76,77 using the self-interaction-corrected DFT proposed by Perdew and Zunger,⁷⁸ and confirmed that the errors of standard DFT methods should be dominated by the SIEs.

We perform unrestricted calculations with the aug-cc-pVQZ basis set and a high-quality EML(250,590) grid. The DFT results are compared with results from HF theory and from the very accurate CCSD(T) theory (coupled-cluster theory with iterative singles and doubles and perturbative treatment of triple substitutions).^79,80 The HF method is exact in Fig. 1, and gives qualitatively correct results in Figs. 2–4. Although ωM05-D has the same amount of LR-HF exchange as ωB97X-D, the larger fraction of SR-HF exchange included in ωM05-D helps to reduce its remaining SIE. Therefore, the error of ωM05-D is smaller than that of ωB97X-D, especially for larger cations (e.g., Ne₂⁺ and Ar₂⁺). The global hybrid functional M05-2X exhibits undesirable X₂⁺ dissociation curves, displaying a spurious energy barrier at intermediate bond length R.

FIG. 1. Dissociation curve of H₂⁺. Zero level is set to E(H) + E(H⁺) for each method.

D. Frontier orbital energies

Let IP(N) be the ionization potential and EA(N) be the electron affinity of the N-electron system, which are defined as

$\mathrm{IP}(N) = E_{N-1} - E_{N},$ (39)

FIG. 2. Dissociation curve of He⁺₂. Zero level is set to E(He) + E(He⁺) for each method.


FIG. 3. Dissociation curve of Ne⁺₂. Zero level is set to E(Ne) + E(Ne⁺) for each method.

$\mathrm{EA}(N) = E_{N} - E_{N+1},$ (40)

respectively, with $E_N$ being the total energy of the N-electron system. For exact DFT, the vertical ionization potential of a neutral molecule is identical to minus the HOMO (highest occupied molecular orbital) energy of the neutral molecule,^3,81

$\mathrm{IP}(N) = -\epsilon_{N}(N),$ (41)

and the vertical electron affinity of a neutral molecule is identical to minus the HOMO energy of the anion (since EA(N) = IP(N + 1) by definition),

$\mathrm{EA}(N) = -\epsilon_{N+1}(N+1),$ (42)

where $\epsilon_{M}(N)$ is the Mth orbital energy of the N-electron system.

The vertical electron affinity of a neutral molecule may also be approximated by minus the LUMO (lowest unoccupied molecular orbital) energy of the neutral molecule, but it has been proved that there exists a difference between the vertical EA

FIG. 4. Dissociation curve of Ar⁺₂. Zero level is set to E(Ar) + E(Ar⁺) for each method.

TABLE VI. Statistical errors (in eV) for the IP131 database. Error is defined as $-\epsilon_{N}(N) - \mathrm{IP}_{\mathrm{vertical}}$. Experimental geometries and reference values are used for all molecules.

System           Error   ωM05-D   M05-2X   ωB97X-D
Atoms (18)       MSE      −1.48    −2.06     −1.64
                 MAE       1.48     2.06      1.64
                 rms       1.74     2.16      1.98
Molecules (113)  MSE      −0.68    −1.23     −0.92
                 MAE       0.68     1.23      0.92
                 rms       0.76     1.27      1.00
Total (131)      MSE      −0.79    −1.34     −1.02
                 MAE       0.79     1.34      1.02
                 rms       0.96     1.43      1.18

and the minus LUMO energy,

$\Delta_{xc} = \epsilon_{N+1}(N+1) - \epsilon_{N+1}(N),$ (43)

where the difference $\Delta_{xc}$ arises from the discontinuity of exchange-correlation potentials.^82–84 A recent study shows that $\Delta_{xc}$ is close to zero for LC hybrid functionals,⁸⁵ so minus the LUMO energy calculated by an LC hybrid functional should be close to the vertical EA.

To evaluate the performance of the functionals on the HOMO energy of the neutral molecule, we compile a new database, IP131, which consists of experimental vertical IPs of 18 atoms and 113 molecules in their experimental geometries. The geometries and most of the reference values are collected from the NIST database.⁸⁶ Other publications⁸⁷ are adopted for the experimental vertical IPs of some molecules. The DFT calculations are performed with the 6-311++G(3df,3pd) basis set and the EML(75,302) grid. As can be seen in Table VI, ωM05-D gives the best results. The global hybrid M05-2X gives the worst results here due to its incorrect long-range XC-potential behavior.
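The error measures reported for IP131 can be illustrated with a minimal sketch. It assumes that MSE denotes the mean signed error, MAE the mean absolute error, and rms the root-mean-square error, with each error taken as the computed value minus the reference value. The function name `error_statistics` and the sample numbers are hypothetical and are not taken from the database.

```python
import math


def error_statistics(computed, reference):
    """Return MSE, MAE, and rms of (computed - reference), in the input units."""
    if len(computed) != len(reference) or len(computed) == 0:
        raise ValueError("computed and reference must be non-empty and of equal length")
    # Signed error for each system: computed value minus reference value.
    errors = [c - r for c, r in zip(computed, reference)]
    n = len(errors)
    return {
        "MSE": sum(errors) / n,                             # mean signed error
        "MAE": sum(abs(e) for e in errors) / n,             # mean absolute error
        "rms": math.sqrt(sum(e * e for e in errors) / n),   # root-mean-square error
    }


# Hypothetical values (eV): minus the HOMO energies versus reference vertical IPs.
minus_homo = [10.0, 11.5, 9.2]
reference_ip = [10.8, 12.1, 10.3]
stats = error_statistics(minus_homo, reference_ip)
print(stats)
```

When every computed value underestimates its reference, as for all three functionals in the atom and molecule subsets of the IP131 table, the MAE equals the magnitude of the MSE.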

To evaluate the performance of the functionals on the vertical electron affinity, we construct another database called EA115, which consists of 18 atoms and 97 molecules. For the molecular geometries, it is a subset of IP131. Because experimental vertical EAs are not as widely available as experimental vertical IPs, the reference values of vertical EAs are obtained via accurate CCSD(T) calculations (using Eq. (40)). The CCSD(T) correlation energies in the basis-set limit are extrapolated from calculations using the aug-cc-pVTZ and aug-cc-pVQZ basis sets:⁸⁸

$E_{XY}^{\infty} = \dfrac{E_{X}^{\mathrm{corr}} X^{3} - E_{Y}^{\mathrm{corr}} Y^{3}}{X^{3} - Y^{3}},$ (44)

where X = 3 and Y = 4 for the aug-cc-pVTZ and aug-cc-pVQZ basis sets, respectively. The electron affinities are evaluated in two different ways, as shown in Table VII for minus the HOMO energy of the anion, and in Table VIII for minus the LUMO energy of the neutral molecule. Clearly, the LC hybrid functionals outperform the global hybrid M05-2X. The reference values and molecular geometries of IP131 and EA115 are given in the supplementary material⁷¹ along with detailed DFT results.
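As an illustration of the two-point extrapolation in Eq. (44), the following minimal sketch evaluates the basis-set-limit correlation energy from two finite-basis values. The function name `extrapolate_correlation_energy` and the sample energies are hypothetical; only the formula itself is taken from the text.

```python
def extrapolate_correlation_energy(e_corr_x, e_corr_y, x=3, y=4):
    """Two-point X^-3 extrapolation of the correlation energy, Eq. (44).

    e_corr_x, e_corr_y: correlation energies obtained with basis sets of
    cardinal numbers x and y (defaults: 3 for aug-cc-pVTZ, 4 for aug-cc-pVQZ).
    """
    if x == y:
        raise ValueError("the two cardinal numbers must differ")
    return (e_corr_x * x**3 - e_corr_y * y**3) / (x**3 - y**3)


# Hypothetical correlation energies (hartree) for one system.
e_tz = -0.300
e_qz = -0.310
e_limit = extrapolate_correlation_energy(e_tz, e_qz)
print(e_limit)
```

With X = 3 and Y = 4 the expression reduces to (64 E_4 − 27 E_3)/37, so the extrapolated value lies beyond the aug-cc-pVQZ result in the direction in which the correlation energy is converging.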


TABLE VII. Statistical errors (in eV) for the EA115 database. Error is defined as $-\epsilon_{N+1}(N+1) - \mathrm{EA}_{\mathrm{vertical}}$. Experimental geometries and CCSD(T) reference values are used for all molecules.

System           Error   ωM05-D   M05-2X   ωB97X-D
Atoms (18)       MSE      −0.46    −1.21     −0.53
                 MAE       0.49     1.21      0.57
                 rms       0.73     1.35      0.84
Molecules (97)   MSE      −0.54    −1.18     −0.54
                 MAE       0.55     1.18      0.56
                 rms       0.80     1.32      0.82
Total (115)      MSE      −0.53    −1.18     −0.54
                 MAE       0.55     1.18      0.56
                 rms       0.79     1.32      0.82

E. Fundamental gaps

The fundamental gap $E_g$ of a molecule with N electrons is defined as

$E_g = \mathrm{IP}(N) - \mathrm{EA}(N).$ (45)

Following Eqs. (39) and (40) for the definitions of IP and EA, three self-consistent field (SCF) calculations (for the neutral molecule, cation, and anion) are required to obtain the fundamental gap of a molecule. Using Eqs. (41) and (42), the fundamental gap of a molecule can also be obtained from two SCF calculations (for the neutral molecule and anion).

Following Janak’s theorem,⁸⁹ the fundamental gap can be approximated by the HOMO-LUMO gap,⁸⁴

$\Delta_{KS} = \epsilon_{N+1}(N) - \epsilon_{N}(N),$ (46)

and we can obtain the fundamental gap of a system using only one calculation. But from Eqs. (41)–(43), (45), and (46), we know that there exists a difference between the fundamental gap and the HOMO-LUMO gap,

$E_g = \Delta_{KS} + \Delta_{xc}.$ (47)
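The step from Eqs. (41)–(43), (45), and (46) to Eq. (47) can be written out explicitly. The sketch below uses $\epsilon_{M}(N)$ for the Mth orbital energy of the N-electron system, $\Delta_{KS}$ for the HOMO-LUMO gap, and $\Delta_{xc}$ for the derivative-discontinuity term, and simply adds and subtracts $\epsilon_{N+1}(N)$:

```latex
\begin{align*}
E_g &= \mathrm{IP}(N) - \mathrm{EA}(N) \\
    &= -\epsilon_{N}(N) + \epsilon_{N+1}(N+1) \\
    &= \bigl[\epsilon_{N+1}(N) - \epsilon_{N}(N)\bigr]
     + \bigl[\epsilon_{N+1}(N+1) - \epsilon_{N+1}(N)\bigr] \\
    &= \Delta_{KS} + \Delta_{xc}.
\end{align*}
```

The second line holds only for the exact functional, for which Eqs. (41) and (42) are valid.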

As previously mentioned, $\Delta_{xc}$ has been shown to be close to zero for LC hybrid functionals,⁸⁵ so the HOMO-LUMO gap calculated by an LC hybrid functional should be close to the fundamental gap.
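The three ways of estimating the fundamental gap described above (three, two, or one SCF calculation) can be summarized in a minimal sketch. The function names and all numerical inputs are hypothetical; in practice the total and orbital energies would come from an electronic structure program, and all inputs to one function must share the same energy unit.

```python
def gap_from_total_energies(e_cation, e_neutral, e_anion):
    """Three SCF calculations: E_g = IP(N) - EA(N) from total energies."""
    ip = e_cation - e_neutral   # IP(N) = E_{N-1} - E_N
    ea = e_neutral - e_anion    # EA(N) = E_N - E_{N+1}
    return ip - ea


def gap_from_homo_energies(homo_neutral, homo_anion):
    """Two SCF calculations: IP(N) = -HOMO(neutral), EA(N) = -HOMO(anion)."""
    ip = -homo_neutral
    ea = -homo_anion
    return ip - ea


def homo_lumo_gap(homo_neutral, lumo_neutral):
    """One SCF calculation: the HOMO-LUMO gap of the neutral molecule."""
    return lumo_neutral - homo_neutral


# Hypothetical total energies (hartree) and orbital energies (eV).
print(gap_from_total_energies(-75.60, -76.00, -75.98))
print(gap_from_homo_energies(-10.5, 0.4))
print(homo_lumo_gap(-10.5, 0.9))
```

The difference between the second and third estimates is the quantity that the text identifies with the discontinuity of the exchange-correlation potential, which is why the two are expected to agree closely for LC hybrid functionals.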

To evaluate the performance of the functionals on fundamental gaps, we construct another database called FG115,

TABLE VIII. Statistical errors (in eV) of minus the LUMO energy of the neutral molecule for the EA115 database. Experimental geometries and CCSD(T) reference values are used for all molecules.

System           Error   ωM05-D   M05-2X   ωB97X-D
Atoms (18)       MSE      −0.27     0.57     −0.02
                 MAE       0.73     1.02      0.74
                 rms       0.92     1.12      0.89
Molecules (97)   MSE      −0.24     0.60      0.05
                 MAE       0.60     0.75      0.52
                 rms       0.69     0.94      0.60
Total (115)      MSE      −0.24     0.60      0.04
                 MAE       0.62     0.79      0.55
                 rms       0.73     0.97      0.65

TABLE IX. Statistical errors (in eV) of HOMO-LUMO gaps for the FG115 database. The energy gap of each system is evaluated by only one SCF calculation.

System           Error   ωM05-D   M05-2X   ωB97X-D
Atoms (18)       MSE      −1.14    −2.56     −1.55
                 MAE       1.43     2.56      1.79
                 rms       1.62     2.79      2.05
Molecules (97)   MSE      −0.62    −2.00     −1.15
                 MAE       0.73     2.00      1.15
                 rms       0.93     2.13      1.34
Total (115)      MSE      −0.70    −2.08     −1.21
                 MAE       0.84     2.08      1.25
                 rms       1.07     2.24      1.48

which shares the same molecular geometries with the EA115 database. For consistency, the reference values of fundamental gaps are also obtained via the CCSD(T) calculations described in Sec. V D (using Eqs. (39), (40), and (45)).

To examine the performance of density functionals, we evaluate the fundamental gaps using three different estimates, with the 6-311++G(3df,3pd) basis set and the EML(75,302) grid. The results are shown in Tables IX–XI, in order of increasing number of SCF calculations required for each molecule. In the estimate requiring three calculations, the results are similar for the three functionals. ωB97X-D gives worse results than the other functionals in the estimate requiring two calculations. In the simplest estimate, the HOMO-LUMO gap, which requires only one SCF calculation for each system, ωM05-D significantly outperforms the other two functionals. The reference values of FG115 and detailed HOMO-LUMO gap results by DFT methods are given in the supplementary material.⁷¹

F. Excitation energies

To assess the performance of density functionals on excitation energies, we perform TDDFT calculations on five small molecules,⁹⁰ which include nitrogen gas (N₂), carbon monoxide (CO), water (H₂O), ethylene (C₂H₄), and formaldehyde (CH₂O), with the 6-311(2+,2+)G** basis set and the EML(99,590) grid. The molecular geometries and experimental excitation energies are taken from Ref. 90. The detail

TABLE X. Statistical errors (in eV) of fundamental gaps for the FG115 database, each evaluated by the difference of HOMO energies between the neutral molecule and anion. The energy gap of each system is evaluated by two SCF calculations.

System           Error   ωM05-D   M05-2X   ωB97X-D
Atoms (18)       MSE      −0.95    −0.83     −1.04
                 MAE       0.98     0.87      1.08
                 rms       1.17     1.00      1.30
Molecules (97)   MSE      −0.31    −0.42     −0.55
                 MAE       0.56     0.51      0.72
                 rms       0.70     0.60      0.85
Total (115)      MSE      −0.41    −0.48     −0.63
                 MAE       0.62     0.57      0.78
                 rms       0.79     0.68      0.93