Fisher information metric for the Langevin equation and least informative models of continuous stochastic dynamics

(1)

continuous stochastic dynamics

Kevin R. Haas, Haw Yang, and Jhih-Wei Chu

Citation: The Journal of Chemical Physics 139, 121931 (2013); doi: 10.1063/1.4820491 View online: http://dx.doi.org/10.1063/1.4820491

View Table of Contents: http://scitation.aip.org/content/aip/journal/jcp/139/12?ver=pdfcov Published by the AIP Publishing

Articles you may be interested in

Lagrangian dynamics in stochastic inertia-gravity waves Phys. Fluids 22, 126601 (2010); 10.1063/1.3518137

Relation between Langevin type equation driven by the chaotic force and stochastic differential equation AIP Conf. Proc. 519, 359 (2000); 10.1063/1.1291585

From the Langevin equation to the fractional Fokker–Planck equation AIP Conf. Proc. 502, 375 (2000); 10.1063/1.1302409

Stochastic modelling of nonlinear dynamical systems AIP Conf. Proc. 502, 313 (2000); 10.1063/1.1302402

High-accuracy discrete path integral solutions for stochastic processes with noninvertible diffusion matrices. II. Numerical evaluation

J. Chem. Phys. 107, 3505 (1997); 10.1063/1.474690

(2)

Fisher information metric for the Langevin equation and least informative

models of continuous stochastic dynamics

Kevin R. Haas,1_{Haw Yang,}2,a)_{and Jhih-Wei Chu}1,3,4,a)

1_{Department of Chemical and Biomolecular Engineering, University of California-Berkeley, Berkeley,}

California 94720, USA

2_{Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA}

3_{Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan} 4_{Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan} (Received 8 May 2013; accepted 23 August 2013; published online 25 September 2013)

The evaluation of the Fisher information matrix for the probability density of trajectories generated by the over-damped Langevin dynamics at equilibrium is presented. The framework we developed is general and applicable to any arbitrary potential of mean force where the parameter set is now the full space dependent function. Leveraging an innovative Hermitian form of the corresponding Fokker-Planck equation allows for an eigenbasis decomposition of the time propagation probability density. This formulation motivates the use of the square root of the equilibrium probability den-sity as the basis for evaluating the Fisher information of trajectories with the essential advantage that the Fisher information matrix in the specified parameter space is constant. This outcome greatly eases the calculation of information content in the parameter space via a line integral. In the con-tinuum limit, a simple analytical form can be derived to explicitly reveal the physical origin of the information content in equilibrium trajectories. This methodology also allows deduction of least in-formative dynamics models from known or available observables that are either dynamical or static in nature. The minimum information optimization of dynamics is performed for a set of different constraints to illustrate the generality of the proposed methodology. © 2013 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4820491]

I. INTRODUCTION

Complex molecular systems are often studied by track-ing the temporal evolution of important coordinates to reveal the hidden metastable states and to characterize the transi-tions between them. A central objective of many experimental and theoretical endeavors is thus to resolve the system dy-namics from the measured data.1–3 _{In this regard, the} infor-mation content for the parameters of interest in a particular measurement is of prime interest and is the central focus of the present work. For biomolecular conformational changes, the over-damped Langevin equation is often employed as the model for a mechanistic understanding.4–6On top of the deter-ministic potential of mean force (PMF), the Langevin model incorporates the solvent-induced stochastic fluctuations via the diffusion coefficient7 _{to describe system dynamics along} a set of chosen coordinates or order parameters.8–10 _Unless specified otherwise, this work addresses the quantification of information content for Langevin dynamics.

A major difficulty of understanding biomolecular dynam-ics is that the commonly used characterization methods are often limited to processes with distinct temporal and spatial scales. For example, the computer simulation of molecular dynamics can be used to record the coordinates and velocities of the atoms of biomolecules as well as the surrounding sol-vent molecules but has limited ability to access structural tran-sitions on longer time scales (>μs).11–13Different techniques a)_{Authors to whom correspondence should be addressed. Electronic}

addresses: hawyang@princeton.edu and jwchu@nctu.edu.tw

of nuclear magnetic resonance (NMR) can be used to acquire the transition rates of the different aspects of biomolecule conformational changes on different time scales (usually with the necessary assumption of a two-state model)14–16 _{but the} ensemble averaging nature washes away the rich mechanis-tic details in molecular individualities. Single-molecule meth-ods such as those via the Förster resonance energy transfer (FRET)17–19 _{do away with the issues of ensemble averaging} but face the challenge of the low signal-to-noise ratio convo-luted with photon-counting statistics.20,21_{As a result, not only} the data analysis in each category of these measurements is complicated, but also the systematic combination of informa-tion across different techniques is a challenging issue.

We reason that the foundation for quantitatively integrat-ing the knowledge from different data types could be based on an information measure of dynamics parameters from the recorded time trajectories. In this regard, the Fisher informa-tion provides a framework with a clear statistical picture and straightforward linkage to thermodynamics;22–24 _therefore, it is employed here to quantify the information underlying the Langevin dynamics model. With this information met-ric, the determination of the parameters of time propagation can serve as a common objective over different data types for cross validation and knowledge integration between fields.25

In order to extract maximum information of dynamics from time-dependent data, we aim to quantify the Fisher in-formation of trajectories (FIT).26,27This approach of evaluat-ing information content in the path space is different from the typical methods of tracking the statistics of dynamic

(3)

variables,28,29including the rate of changes of Fisher informa-tion matrices in the space of a single or a few time slices.30,31 In this work, we devise numerical and analytical methods to determine the Fisher information of the PMF and diffusion coefficient in the trajectories of Langevin dynamics directly without the need of performing Monte Carlo simulations.27 As a result, an explicit connection between parameters in the equation of motion and FIT is established, which demon-strates how dynamics information is encoded through time propagation.

Since FIT quantifies the disorder inscribed in the prob-ability function of the data, or the likelihood function of the model parameters, it can be applied to deduce the properties of the equation of motion such as the PMF and diffusion co-efficient in Langevin dynamics under the constraints given by the selected statistics of observables.32–34Therefore, minimiz-ing FIT with such constraints would give the least informative dynamics (LID) model. For retrieving parameters relating to time propagation, this constrained optimization approach has been addressed for Markov states systems.35,36 _For continu-ous stochastic dynamics, however, acquiring an explicit func-tional form of the information content in trajectories faces the challenges of infinite dimensionality, non-differentiability with respect to time of the Winer process, and path integral. This work provides numerical and analytical solutions for de-termining the FIT of Langevin dynamics at equilibrium.

The rest of the paper is organized as the following. The application of Fisher information to continuous but not dif-ferentiable trajectories at equilibrium is established in Sec.II

with our rationale of selecting the basis of representation dis-cussed in Sec. III. SectionIVoutlines the numerical proce-dure we developed to calculate FIT for continuous stochastic dynamics and Sec.Vderives the analytical form of FIT in the continuum limit. SectionVIapplies the analytic form of FIT for measuring information content in trajectories and the re-sult of which is used in Sec.VIIto derive the least informative dynamics under various constraints of based on the Langevin equation followed by our conclusion.

II. THE FISHER INFORMATION MATRIX FOR LANGEVIN DYNAMICS

The Fisher information defines a size measure (Rieman-nian manifold) for the volume element of information content for a corresponding set of model parameters. The line integral of a parameter change with the Fisher information matrix is formally the dissipation function for moving in the space of model parameters.23,37,38 _{This metric translates between the} parameter space of system dynamics and the information con-tent of the resulting probability distribution of system trajec-tories. Therefore, Fisher information can be used to assess the manner by which changing the properties of time propagation may vary the information content in system dynamics.26

Using a general coordinate x to describe the dynam-ics of a system, the concern of our analysis is the infor-mation content for the mean force profile F(x) and dif-fusion coefficient D(x) contained in the Langevin trajecto-ries X(t) with ˙x= βD(x)F (x) +√2D(x)dWt; t is time and

β is one over the Boltzmann constant kB multiplying the

system temperature T. The Wiener process specifying the stochastic force in this equation of motion satisfiesdWtdWt

= δ(t − t_{)dt. The profile of the deterministic mean force is} related to the PMF, V (x), as F (x)= −dV (x)/dx. The equi-librium distribution of system states in the continuous space of x is related to the PMF as peq(x)∝ exp(−V (x)/kBT). A trajectory, X(t); t ∈ [0, tobs], in this case is a continuous but non-differentiable function of time. In a measurement, this stochastic trajectory is generally recorded at specific in-stances separated by a time resolution t to create a vector

Xt = [X(0), X(t), X(2t), . . . , X(tobs)]. This vector exists in a trajectory space of dimensionality N = tobs/t with tobs the duration of observation and the coordinates denoted as the set {xτ|τ = 0, 1, 2, . . . , N}. Based on this setup, we aim to

find the Fisher information of the deterministic and stochastic components in the Langevin equation in a multidimensional vector space. Then, limt→ 0 will be performed on the final

results to recover the complete information content in trajec-tories in the continuum limit.

The collection of function parameters of the Langevin equation is now combined into θ= {θi} for the convenience

of derivation. Here, i may go to infinity for describing the parameters associated with the dense set of points for a con-tinuous function. The Fisher information metric is defined as the expectation value for the product of the derivatives of the log probability density of the trajectory with respect to the components in θ Ii,j( θ)= EXt ∂ln P ( Xt) ∂θi ∂ln P ( Xt) ∂θj θ . (1)

The E_X_t[·] in the above equation denotes the expectation eval-uated by the path integration over Xt,

D XtP( Xt)[·]. The

Fisher information is thus a matrix for the (i, j) pairs of pa-rameters evaluated at the current values of θ.

In calculating this path integral, the Markovian nature of the Langevin equation can lead to tremendous simplification. In particular, the probability density of Xtcan be factored via

the probability densities of time propagation that connect two consecutive time slices,

P( Xt)= p(x0)

N−1 τ=0

p(xτ+1|xτ). (2)

In this equation, p(x0) is the static distribution of system states at time zero. For equilibrium trajectories, p(x0)→ peq(x0) is employed for specifying the probability densities of the initial states. Therefore, each component in the Fisher information matrix becomes Iij = tobs/t τ,τ₌₋₁ D Xt ∂ln p(xτ₊₁|xτ) ∂θi ∂ln p(xτ₊₁|xτ) ∂θj P( Xt). (3) The contribution from peq(x0) is included by setting p(x0|x−1) = peq(x0). For the path integral in Eq.(3), the time indices that do not appear in the derivatives are marginalized out so that

P( Xt)→ p(τ, τ + 1, τ, τ+ 1). Furthermore, unless τ = τ,

the other terms in the double sum of Eq. (3)contribute zero toIijdue to the ability to isolate a normalization condition of

(4)

the conditional probability densities p(xτ+ 1|xτ), τ = 0. . . N: dx∂ln p(x) ∂θ p(x)= ∂dxp(x) ∂θ = 0. (4)

The only contributing terms toIij thus come from the

Fisher information matrix of the equilibrium distribution, Ieq, and that of the conditional probability of time propagation,

It: Ieq= dx0 ∂ln peq(x0) ∂ θ ∂ln peq(x0) ∂ θ peq(x0), (5) It = dxtdx0 ∂ln p(xt|x0) ∂ θ ∂ln p(xt|x0) ∂ θ p(xt, x0). (6)

Here, the notation of coordinates was simplified by implying that t= t and noting that there are tobs/t equivalent terms of It. As a result, the Fisher information of P ( Xt) is

I = Ieq+ tobs

tIt. (7)

The connection of Eqs.(5)–(7)between an entire path in-tegral and an inin-tegral over two closely spaced time points is a general result of Markovian dynamics. Therefore, if the FIT for a handful of parameters is of interest, the values can be cal-culated via a short-time expansion of the time propagator with Monte Carlo simulations for computing the expectation.27For cases that the parameterization of dynamics is in the space of continuous functions of infinite dimensionality, though, such as for quantifying the information of PMF in the Langevin equation, the approach of brute-force sampling becomes im-practical. However, this difficulty could be resolved if an analytical expression of FIT was available. One of the main results of this work is Eq. (48)for the FIT of Langevin dy-namics. This expression then allows a line integral to be per-formed in the parameter space for quantifying the dissipation measure of information (Eqs.(54)and(55)).

The calculation of FIT via Eq.(7)relies on evaluating the derivatives and integrals defined for the static distribution in Eq. (5)and the dynamic propagator in Eq.(6). An essential key toward achieving this goal is the selection of the parame-ter set. The strategy we followed is trying to eliminate the de-pendence of the matrix elements on other parameters because a constant Fisher information matrix is convenient for eval-uating the information content via a line integral.23,37,38 We found that the most natural basis for the deterministic com-ponents of the Langevin equation is the square root of the equilibrium probability density39_{(vide infra):}

ρeq(x)= peq(x)∝ exp x F(x)/kBT . (8)

Therefore, we now define Fisher information with respect to a composite quantity of the functions for density and diffu-sion, i.e., θ→ {ρ_eq(x), D(x)}. The Fisher information thus becomes functional derivatives and is a scalar field over the arguments (x, y) of the parameterizing functions rather than a matrix.

We start the evaluation of FIT from the equilibrium term

Ieq(x, y)= dx0 δln peq(x0) δρeq(x) δln peq(x0) δρeq(y) peq(x0). (9) Since peq(x)= ρeq2(x), the functional derivatives in Eq.(9)are

δln ρ2 eq(x0) δρ_eq(x) = 2 δ(x0− x) ρ_eq(x0) . (10)

Therefore, the equilibrium Fisher information is just the inte-gral of delta functions:

Ieq= 4

dx0δ(x0− x)δ(x0− y) = 4δ(y − x). (11) Using ρeq(x) to define the parameter space of the Fisher infor-mation, Ieq is thus constant in the sense that it is not a func-tional of ρeq(x).

As will be shown later, the property of ρeq(x) in mak-ing Ieqconstant also facilitates the evaluation of Itdefined in

Eq.(6). The remaining task of calculating FIT with respect to the deterministic components of the Langevin equation is then evaluating the Fisher information of the conditional probabil-ity densprobabil-ity with respect to ρeq(x):

It(x, y)= dxtdx0 δln p(xt|x0) δρeq(x) δln p(xt|x0) δρeq(y) p(xt, x0). (12) This Fisher information matrix is a rank 2 tensor field over the space coordinate.

For evaluating the functional derivatives of p(xt|x0), we rely on the Fokker-Planck equation (FPE) that governs the temporal evolution of p(xt|x0): ∂p(xt|x0) ∂t = −∇ · J (xt) (13) and J(xt)= − D(xt)∇p(xt|x0)− D(xt)F (xt) kBT p(xt|x0) . (14) In this formula, x can in general be a multidimensional vec-tor and the gradients in the FPE apply to the xt coordinate.

The initial condition of this partial differential equation is

p(xt|x0)|t= 0= δ(xt− x0) and the boundary conditions are zero flux (J (Boundary)= 0) for conserving the total probability.

The FPE can be equivalently expressed in terms of peq(x): ∂p(xt|x0) ∂t = ∇ · D(xt)peq(xt)∇ p(xt|x0) p_eq(xt) . (15)

Next, the unique features of expressing the FPE via the equi-librium density ρeq(x) in Eq.(8)are discussed in preparing for the evaluation of It(x, y).

III. THE HERMITIAN OPERATOR OF THE FPE OF LANGEVIN DYNAMICS

With ρeq(x) being the square root of peq(x), the prob-ability density of time propagation can be symmetrized

(5)

in time: ρ(xt, x0)= p(xt|x0) peq(x0) peq(xt) , (16)

such that the probability density of a trajectory also has the temporal symmetry

P( Xt)= ρeq(x0)

N₋₁ τ=0

ρ(xτ₊₁, xτ)ρeq(xN). (17)

Expressing Eq. (12) via the symmetric time propagator37 gives the factors of It(x, y):

It = I_tρ dxtdx0 δln ρ(xt, x0) δρeq(x) δln ρ(xt, x0) δρeq(y) p(xt, x0) + 2δ(x − y) − 2ρ(x, y). (18)

The details of deriving Eq.(18)are in AppendixA.

In the continuum limit of t → 0+, the fact that ρ(xt,

x0) approaches δ(xt− x0) leads to cancellation of the last two terms in Eq.(18). We will thus focus on the Itρ term, an

in-tegration in the two-dimensional space. After expanding the functional logarithms therein, we will evaluate FIT according to I_tρ(x, y)= dxtdx0 δρ(xt, x0) δρeq(x) δρ(xt, x0) δρeq(y) ρeq(xt)ρeq(x0) ρ(xt, x0) . (19) The FPE of the symmetric propagator ρ(xt, x0) can be found by substituting the expression in Eq.(16)into Eq.(15):

∂ρ(xt, x0) ∂t = −Hρ(xt, x0), H= − 1 ρ_eq(x)∇ · D(x)ρ_eq2(x)∇ 1 ρ_eq(x) . (20) The boldface font is used to denote operators in this work.

The proof detailed in Appendix B shows that H in the FPE of Eq. (20) is Hermitian. As a result, ρ(xt, x0) can be expressed via the matrix elements of H with the Dirac notation

ρ(xt, x0)= xt|e−Ht|x0. (21) In Sec.IV, a procedure for calculating I_tρ based on the Her-mitian version of the FPE of the Langevin equation is devel-oped. This scheme is then employed to evaluate Itρ at

differ-ent values of t. Since the presdiffer-ented formulation is general and does not rely on assuming extreme values of t, it can be used to capture the dynamic behaviors in different temporal resolutions as illustrated by the results presented in Sec.IV.

IV. CALCULATIONS OF FIT VIA AN EIGENBASIS EXPANSION

The Hermitian nature of H allows a complete set of or-thogonal eigenvectors ψi’s to be found with real eigenvalues

λi’s such that for each eigenbasisψi|H|ψj = λiδij. Based

on this property, we performed an eigen-decomposition of the symmetric operator of the FPE (sFPE) of Eq. (20). In

0.7 0.9 1.1 1.3 −2 0 2 4 6 Position x

Potential Mean Force

V( x) D=1 0.7 0.9 1.1 1.3 0 2 4 6 8 10 Position x Eqilibrium Probability p eq (x )

FIG. 1. The reference potential of mean force and the corresponding equi-librium probability density for illustrating the calculations of the Fisher in-formation of trajectories. The model has 3 local minima with approximately 5 kBT barriers separating these metastable states.

particular, we assume a form for the eigenvectors of ψi(x)

= ρeq(x)φi(x) with the modifying function φi(x) determined

from the input of ρeqand diffusion coefficient by using spec-tral elements40 _{to solve the generalized eigenvalue problem} φiρeq|H|ρeqφj = λiδijφiρeq|ρeqφj. By inserting the

com-pleteness property of_i|ψiψi| = 1 that resolves the

iden-tity operator, and noting the orthogonality of the eigenbasis,

ρ(xt, x0) can now be decomposed as

ρ(xt, x0)= i,j xt|ψiψi|e−Ht|ψjψj|x0 = i ψi(xt)e−λitψi(x0). (22)

The eigenvectors and eigenvalues themselves are found via a spectral finite element method40 (sFEM) although the results of FIT calculations are invariant to the particular choice of the numerical method as long as the eigenvectors and eigenvalues are determined accurately.

To illustrate the resulting values of ρ(xt, x0) for a non-trivial model system that contains three wells separated by 5 kBT barriers in the PMF shown in Fig. 1, the two-dimensional density field ρ(xt, x0) calculated from the eigen-decomposition method discussed above is plotted as log-contours in Fig. 2 at an intermediate time resolution of t = 0.025 s which is also on the same order of the relaxation time (t= 0.036 s) out of the middle intermediate state.

For calculating the FIT according to Eq. (19), the func-tional derivative of the density field with respect to ρeqneeds to be determined. By employing an arbitrary test function f(x), the functional derivative is defined as

dxδρ(xt, x0) δρ_eq(x) f(x)= d dεxt|e −t H[ρeq+εf ]|x 0 ε=0 . (23) Since the reference Hamiltonian does not necessarily commute with the operator perturbed by f(x), i.e., [ H0_{, H}_] = 0, the exponential in Eq. (23) is not necessarily factoriz-able for taking the derivative. To overcome this difficulty, we This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP:

(6)

x Δt x 0 0.7 0.9 1.1 1.3 0.7 0.9 1.1 1.3 ln(ρ) −8 −6 −4 −2 0 2

FIG. 2. The contours of ln ρ(xt, x0) at t= 0.025 s for the reference PMF

shown in Fig.1at D= 1 in the corresponding unit. The high-density regions along the diagonal axis around the three local minima are clear. However, off-diagonal densities representing transitions between states within the time window are also clear.

employ the approximate Trotter splitting formalism41 e−t(H0+εH) = e−t H0/2e−εt He−t H0/2+ O(t3), (24) wherein the errors due to approximation decrease as t goes toward the continuum limit. The effect of the perturbation is thus evaluated effectively at the time midpoint between xtand

x0. Performing the derivative with respect to ε as prescribed by Eq.(23)to the approximate form in Eq.(24)leads to

dxδρ(xt, x0) δθ(x) f(x) ≈ −txt|e−t H 0_/₂ He−t H0/2|x0 = −t i,j ψi(xt)e−λit/2ψi|H|ψje−λjt/2ψj(x0). (25) After inserting the completeness property of the eigenbasis of

i|ψiψi| = 1 in between all of the operators in the first

line of Eq.(25) and applying orthogonality relationship be-tween eigenvectors ofψk|H0|ψi = λiδik, the second line of

Eq.(25)can be obtained.

The Hamiltonian of the first order perturbation with re-spect to ρeq, H, in Eq.(25)can be found by tracking ε in the total Hamiltonian H[ρeq+ εf ] = −1 (ρeq+εf )(x) ∇ · D(x)(ρeq+ εf )2(x)∇ 1 (ρeq+ εf )(x) . (26) The coefficient of the first order Taylor expansion of H can then be evaluated to obtain H:

H= d dεH[ρeq+ εf ] ε=0 , (27) H= − 2 ρ_eq(x)∇ · D(x)f (x)ρeq(x)∇ 1 ρ_eq(x) +f(x) ρ2 eq(x) ∇ · D(x)ρ_eq2(x)∇ 1 ρeq(x) + 1 ρeq(x)∇ · D(x)ρ_eq2(x)∇ f(x) ρ2 eq(x) . (28)

With the knowledge of H, the error in Eq.(25)is purely due to the Trotter splitting approximation.

The only remaining factors for calculating FIT are the ψi|H|ψj terms in the second line of Eq.(25). For the first

and third line of Eq. (28), integration by parts can be per-formed to isolate the test function f(x) in the integral. Utiliz-ing the relation of φi(x)= ψi(x)/ρeq(x) and recognizing the existence of terms of the form−φi(x) Hψj(x), one obtains

ψi|H|ψj = dxf(x)[2∇φj(x)· (D(x)ρeq(x)∇φi(x)) −φj(x)λiψi(x)− φi(x)λjψj(x)]. (29) Therefore, δψi|H|ψj δρ_eq(x) = 2∇φi(x)· (D(x)ρeq(x)∇φj(x)) − ρeq(x)(λi+ λj)φi(x)φj(x). (30)

Along the same token, one can find the functional derivative of H with respect to the diffusion coefficient

δψi|H|ψj

δD(x) = ∇φi(x)· ρ 2

eq(x)∇φj(x). (31)

In the case that the diffusion coefficient is not a function of x, i.e., a constant, the functional derivative reduces to

dψj|H|ψi

dD =

λi

Dδij. (32)

Combining the information in Eqs. (25) and (30), the functional derivative of ρ(xt, x0) with respect to ρeq(x) can be calculated as δρ(xt, x0) δρeq(x) ≈ −t i,j ψi(xt)e−λit/2 δψi|H|ψj δρeq(x) e−λjt/2_ψ j(x0). (33) Armed with the knowledge of the functional derivatives from the eigenbasis construction, the key terms of Fisher in-formation in Eq.(19)can be expressed:

I_tρ(x, y) (t)2 = i,j k,l δψi|H|ψj δρeq(x) δψk|H|ψl δρeq(y) ik_{j l}(t). (34) For all of the functional dependence on (xt, x0) that are inte-grated away in Eq. (34)is captured in the “overlap” integral

(7)

ALGORITHM I. FIT calculation using the eigenbasis resolved via apply-ing a spectral finite element method (sFEM) to solve the Hermitian FPE of Eq.(20).

procedure FISHER(ρeq(x), D; t)

Obtain λiand φivia sFEM Perform the 2D integral in ik

j l(t)

Calculate functional derivatives Go through the 4-loop summation for Itρ

end procedure

over two consecutive time slices:

ik_{j l}(t)≡ dxtdx0e−(λi+λk)t/2ψi(xt)ψk(xt) × n φn(xt)e−λntφn(x0) −1 × e−(λj+λl)t/2_ψ j(x0)ψl(x0). (35) According to Eqs. (34)and(35), AlgorithmI below is used to calculate the Fisher information matrix of trajecto-ries for the Langevin equation parametrized by the equilib-rium density ρeqand constant diffusion coefficient D.

For the reference model shown in Fig.1, the general ap-proach of eigenbasis expansion presented above is employed to calculate the Fisher information metric of Langevin tra-jectories at various time resolutions with AlgorithmI. Fig.3

shows a contour plot of Itρ(x, y)/t as well as the cross

derivatives with respect to the spatial coordinates of ρeq(x) and the diffusion coefficient. The matrix element of D alone is also shown as the actual numerical value in the figure. The behaviors of FIT at different time resolutions are discussed in the following to motivate the development of an analytical expression in the continuum limit.

At a low time resolution of t≈ 0.15 s that is on the order of the slowest timescale of system relaxation, the major component affecting the FIT is the low probability transition state of ρeq(x) that is caused by the barrier located at x= 1.1 in the PMF. Therefore, reducing this free energy barrier would lead to a faster system relaxation and hence a less prescriptive or informative dynamics model.

At an intermediate time resolution of t≈ 0.02 s, the de-pendence of FIT on the existence of the low density states at

x= 0.9 and 1.1 is still significant as that in the previous case,

but the off-diagonal negative couplings in the matrix start to emerge. Such pattern indicates that a flatter potential barrier would also result in a less informative time propagator. This result highlights the dependence of FIT on the time resolu-tion used for investigaresolu-tion and the general applicability of our eigenbasis expansion approach as the framework of analysis. At a high time resolution of t ≈ 0.002 s, the impor-tance and details of the underlying equilibrium probability of states vanishes, leading to a nearly tri-banded matrix with pos-itives values along the diagonal terms and negative numbers for the off-diagonal elements. The reason is that FIT begins to capture the diffusion processes within individual wells at this timescale and the specific features of ρeq(x) are not as promi-nent as in the previous cases. The origin of this tri-banded

ρ_eq(x) ρ eq (y) 0.7 0.9 1.1 1.3 0.7 0.9 1.1 1.3 D ′ D 4.04 ρ_eq(x) ρ eq (y) 0.7 0.9 1.1 1.3 0.7 0.9 1.1 1.3 D ′ D 6.40 ρ_eq(x) ρ eq (y) 0.7 0.9 1.1 1.3 0.7 0.9 1.1 1.3 D ′ D 17.26 ρ_eq(x) I Δt ρ_(x,y) −1.5 −1 −0.5 0 0.5 1 1.5 x 104 FIG. 3. The contour of the Fisher information matrix of I_tρ(x, y)/t for t = 0.15, 0.02, and 0.002 s. The upper band in contour is the Fisher information metric with respect to D and ρ,I(D, y). Corner value is the Fisher element for diffusionI(D, D).

(8)

nature of FIT becomes explicit in the continuum limit of

t→ 0+.

V. FISHER INFORMATION OF TRAJECTORIES IN THE CONTINUUM LIMIT

In this derivation, it is convenient to employ the following equivalent expression of the Fisher information of p(xt|x0):39

It(x, y)= dxtdx0 δ2_{ln p(x} t|x0) δρeq(x)δρeq(y) p(xt, x0). (36)

In the continuum limit of t→ 0+, p(xt|x0) approaches the delta function δ(xt− x0) which has no dependence on ρeq(x); therefore, the functional derivatives vanish and It reaches

zero in the continuum limit. On the other hand, the denom-inator of the FIT term of Itin Eq.(7)also goes to zero as t

→ 0+_{. After applying the L’Hopital rule, utilizing again t} = t to simplify the notation, and letting t = 0, Eq. (36)

becomes lim t_→0 It t = dxtdx0 δ2 _{(∂ ln p(x} t|x0)/∂t)|t=0 δρeq(x)δρeq(y) p(xt, x0) + dxtdx0 δ2ln p(xt|x0) δρeq(x)δρeq(y) ∂p(xt, x0) ∂t t=0 . (37) When p(xt|x0)|t=0 → δ(xt − x0) in the continuum limit, the lack of ρeq(x) dependence makes the second term zero. Af-ter applying the functional derivatives to the logarithm and canceling out the factor δ(xt − x0) from p(xt, x0)|t=0 → δ(xt

− x0)peq(x0) , Eq.(37)is simplified to lim t→0 It t = dxtdx0 δ2(∂p(xt|x0)/∂t)|t=0 δρeq(x)δρeq(y) p_eq(x0). (38)

The functional dependence of the time derivative of the conditional probability is then given by the FPE:

∂p(xt|x0) ∂t t=0 = ∇xt· D(xt)ρ_eq2(xt)∇xt δ(xt− x0) ρ2 eq(xt) , (39) where the specific coordinate upon which the gradient opera-tor acts is indicated by the subscript on the gradient operaopera-tor. Applying the product rule on the inner gradient, the FPE be-comes ∂p(xt|x0) ∂t t=0 = −2∇xt· D(xt)δ(xt− x0) ∇xtρeq(xt) ρeq(xt) + ∇xt· D(xt)∇xtδ(xt− x0) . (40)

The second term of Eq.(40)does not depend on ρeq(x). The only contribution to the Fisher information thus comes from the first term in Eq. (40). With this understanding, Eq.(38)becomes lim t→0 It t = −2 dxtdx0 ×δ2[∇xt · (D(xt)δ(xt− x0)∇xtln ρeq(xt))] δρeq(x)δρeq(y) ρ_eq2(x0). (41)

With the exchange of linear operators of space and functional derivatives, the functional derivatives of the logarithm of the equilibrium density described below can be used to evaluate Eq.(41): δ2ln ρeq(xt) δρeq(x)δρeq(y) = −δ(xt− x)δ(xt− y) ρ2 eq(xt) . (42)

After applying this result, Eq.(41)becomes lim t→0 It t = 2 dxtdx0∇xt· D(xt)δ(xt− x0)∇xt ×δ(xt− x)ρ 2 eq(x0)δ(xt− y) ρ2 eq(xt) . (43) The factor ρ2

eq(x0) is now moved inside the spatial derivatives of xt. Integration of Eq. (43)over x0 sets x0 = xt, and this

equation can thus be reduced to lim t_→0 It t = 2 dxt∇xt· (D(xt)∇xtδ(xt− x)δ(xt− y)). (44) In order to obtain an explicit expression for the Fisher in-formation metric from the integral form of Eq.(44), we eval-uate the effect on a twice-differentiable test function G(x, y) that vanishes at domain boundaries. The integral of informa-tion evaluainforma-tion with G(x, y) is

dxdyG(x, y) lim t→0 It t = 2

dxtdxdyG(x, y)∇xt· (D(xt)∇xtδ(xt− x)δ(xt− y)).

(45) After performing integrating by parts twice on xt, the

integra-tion over x yields x= xtfrom the δ(xt− x) term, and a change

of variable of xtto x transforms Eq.(45)into

dxdyG(x, y) lim t→0 It t = 2

dxdyδ(x− y)∇x· (D(x)∇xG(x, y)). (46)

Now reversing operations and performing integration by parts with respect to x twice to bring G(x, y) outside of the gradient operators, we obtain dxdyG(x, y) lim t→0 It t = 2

dxdyG(x, y)[∇x· (D(x)∇xδ(x− y))]. (47)

For an arbitrary G(x, y) that satisfies the regularity con-ditions, the equality of Eq.(47)implies the equivalence be-tween the bracketed items inside the integral. Therefore, one of the main results of this work, the analytical expression for the Fisher information metric of ρeq, is read off from Eq.(48) and applied to the total FIT of Eq.(7)as

I(x, y)[ρeq]= 4δ(x − y) + 2tobs∇x· (D(x)∇xδ(x− y)).

(9)

In the case that the diffusion coefficient is a constant, Eq. (48)indicates that limt→0It/t is formally the

Lapla-cian kernel42 as indeed seen in the numerical example shown in Fig.3with the increase of time resolution (decrease of t). Thus, our Fisher information metric is truly a Fisher informa-tion operator whose net effect can only be realized once the operators are applied in Sec.VI. With the Fisher information metric presented in Eq.(48), the procedure of evaluating FIT is discussed in Sec.VI.

VI. FISHER INFORMATION AS A MEASURE OF CHANGE IN INFORMATION

If a ξ ∈ [0, 1] variable is employed to specify the state of model parameters and interpolate between two different sets of parameters, the Fisher information metric can be used to quantify the information change in varying the parameter set23,43 J = 1 0 dξ i,j ∂θi ∂ξIi,j( θ) ∂θj ∂ξ . (49)

The arc length along the ξ curve in the information space can also be calculated via the Fisher information

S = 1 0 dξ i,j ∂θi ∂ξIi,j( θ) ∂θj ∂ξ . (50)

The triangle inequality states thatJ ≥ S2; here, the equality is true if the integrand is constant along the curve. For FIT via the basis of ρeq, constant Fisher information matrix is satisfied in the continuum limit, Eq.(48).

To perform the line integral of FIT, we define a linear path interpolating an essentially flat reference model with

ρref

eq(x)= limL→∞

√

1/2L and the desired profile that is only denoted as ρ∗_eq(x) for now (with the boundary condition that

ρeq(x)|−L, L= 0). Similarly, a linear path can also be defined for the diffusion coefficient coordinate that goes from a refer-ence value Dref_{to the optimized D}∗_{. Therefore,}

ρeq(ξ )= ξρ∗eq(x)+ (1 − ξ)ρeqref(x), (51)

D(ξ )= ξD∗(x)+ (1 − ξ)Dref(x), (52) where ξ ∈ [0, 1]. In this work, we concern specifically with the case that the diffusion constant is not varying due to the divergence in the information content in the continuum limit for cases in which D*(x)= Dref_(x).44 _{(See Appendix}_D_for a specific example of this divergence between two Gaussian processes.) Thus, we only require the Fisher information as a function of the equilibrium probability density given by Eq.(48).

To calculate the information change, the Fisher informa-tion matrix in the continuum limit given by Eq.(48)is applied to Eq.(49)where the sum is replaced by the corresponding integral in the space of equilibrium density functionsi, j→

dxdy. Performing the line integral over dξ with integration

by parts twice on the x coordinate gives the analytical form of

the information dissipation measure

J =

dxdy ρ_eq(y)δ(x− y)∇x· (D(x)∇xρeq(x)). (53) Taking the integral over y eliminates the delta function to give the 1D integral of

J =

dx ρeq(x)∇ · (D(x)∇ρeq(x)). (54) Equation(54)is a primary result of this work. Based on the analytical expression we derived for the Fisher informa-tion metric in the chosen space for parametrizing the equainforma-tion of motion, we illustrate how the information content of the dy-namic parameters in equilibrium trajectories can be evaluated. This result is analytical and does not require numerical diago-nalization of the time propagation operator. Furthermore, ex-pressing ρeq(x) in terms of the mean force F(x) in the special case of a constant diffusion coefficient, Eq.(54)becomes

J [F (x)] = S2_{[F (x)]}₌ Dβ2 4 F

2_(x)

eq. (55) The same information expression can also be obtained by us-ing an entropy measure of the probability densities of trajec-tories and taking the continuum limit.45 _{Next, Eq.}₍₅₄₎_will serve as the foundation for the analysis of dynamical infor-mation in Langevin trajectories.

VII. THE EQUILIBRIUM DISTRIBUTIONS OF LEAST INFORMATIVE DYNAMICS

With the analytical expression of FIT, the LID models under various constraints can be deduced via the constrained optimization approach.39,46,47 To the best of our knowledge, deriving parameters of the equation of motion using dynamic information as the objective function has not yet been estab-lished for continuous stochastic processes. In this section, we use several examples to illustrate how Eq.(54)can be used to achieve this objective.

Finding the least informative distribution of ρeq(x) under a set of constraints can be performed via minimizing the infor-mation objective function ofJ in Eq.(54)under the specified constraints. The resulting Lagrangian for this optimization is

L(ρeq(x),ω) = tobsDρeq|∇ · ∇|ρeq FIT − i ωi ρeq|Ci|ρeq − Ci Constraint . (56)

In Eq. (56), ωi’s are the Lagrange multipliers for the

con-straints indexed here by i, Ci’s are the constraint operators,

and Ci’s are the desired values for the constraints. For

exam-ple, in the constraint of C1= 1, i.e., the operator of summing all probabilities, C1= 1 is the targeted value to ensure proper normalization of the equilibrium distribution.

Minimization of the Lagrangian of Eq. (56) is accom-plished by setting the functional derivative δL/δρeq(x) to zero to reach a solution ρ∗_eq(x)(x; ω) that depends on the Lagrange This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP:

(10)

multipliers parametrically

L∗₍_{ω) = inf}

ρeq(x)

L(ρeq(x),ω). (57) The solution of Eq. (57) requires the Lagrange multipli-ers to satisfy the Karush–Kuhn–Tucker (KKT) conditions of optimality.48_{Alternatively, the Lagrange multipliers can also} be determined by optimizing the Lagrangian

ω∗= arg min

ω L

∗₍_ω). ₍₅₈₎

This is similar in favor to the derivation of the maximum en-tropy Boltzmann distribution for the canonical ensemble.49 Applications of using the analytical form of FIT to de-rive the least informative parameters for Gaussian pro-cesses with a constant force and diffusion coefficient as well as the Ornstein-Uhlenbeck (OU) process are presented in AppendixesDandE, respectively.

If all of the constraints are expressed in the form of lin-ear operators as in Eq.(56), numerical techniques for solving partial differential equations can be utilized to determine the optimal profiles of LID. Several examples of using the pro-cedure outlined above to deduce continuous PMF profiles is presented next to highlight the generality of the theoretical framework we developed. The diffusion coefficient is treated as a constant in these calculations. The least informative dy-namics models also illustrate how the criterion of reducing trajectory information differs from the static objective of max-imizing the entropy of the state distribution.

A. The LID model constraining on fixed domain bounds

If the only observation regarding the dynamics of a sys-tem is that x is bounded in the fixed domain between [−L,

L], the LID model with a constant diffusion coefficient is

found by setting the functional derivative of the Lagrangian in Eq.(56)with respect to ρ to zero:

δL

δρ(x) = 2tobsD∇

2_ρ_(x)_{− ω}

1ρ= 0. (59) The normalization constraint is achieved by setting C1= 1 and C1 = 1. With the boundary conditions of ρ(x)|−L, L= 0, the solution of this Strum-Liouville problem is

ρ∗(x)= C1cos(ωx), (60)

ω=nπ

2L ; n∈ N. (61)

Here, n’s are natural numbers and C1 is determined by the normalization condition.

After substituting the above form of ρ∗(x) into the defini-tion ofL, the choice of ωcan be achieved by performing the minimization: ω∗ = arg min ω L∗₍_{ω) = arg min} ω C₁ _nπ 2L 2 . (62) 0.7 0.9 1.1 1.3 0 1 2 3 4 x p eq (x ) Bounded FIT S eq 0.7 0.9 1.1 1.3 0 1 2 3 4 x p eq (x ) Mean/Variance FIT S eq

FIG. 4. LID (least informative dynamics) distributions. (Top) For the bounded domain constraint of x∈ [0.7, 1.3]. The flat profile obtained via maximizing the equilibrium entropy Seq= ln peq(x)eqis shown for

compar-ison. The profile obtained by minimizing FIT (Fisher information of trajecto-ries) has the functional form of cos2_((x_{− μ)π/2L) instead. (Bottom) For the}

constraints of fixed averagex = μ and variance (x − μ)2_{= σ}2_{. In this}

case, maximizing Seqand minimizing FIT give the same functional form of

the profile as∼exp ( − (x − μ)2/2/σ2).

This optimization simply selects the smallest possible n. As a result, p_eq∗(x)= 1 Lcos 2xπ 2L , (63) F∗(x)= π Ltan _xπ 2L . (64)

Fig.4plots the distribution of x based on the principle of LID in comparison with the maximum entropy profile of maxi-mum entropy Seq=

peq(x)ln peq(x), which is a flat distribu-tion in the domain with diverging forces at the boundaries. Because the Fisher information of trajectories includes a mea-sure of the deterministic force in dynamics as indicated in Eq. (55), diverging values of F(x) at the boundaries are not desirable. Alternatively, a distribution cos2(x) that smoothly decays to a zero gradient at the domain edges is selected in-stead according to the criterion of LID.

(11)

B. The LID model constraining on fixed values of the mean and variance ofpeq

If the knowledge of mean μ and variance σ2 _{of x are} available, the LID distribution of x can be found by setting the two constraints C1= x1 and C2= x21. This optimiza-tion can be performed in the Fourier space and this approach is adopted here to illustrate the generality offered by having the analytical expression of FIT. Because the Lagrangian is essentially a set of inner products, the Parseval’s theorem50 says that the Fourier transforms (FT) of functions f(x) and

g(x), ˜f(k) and ˜g(k), respectively, satisfy

dx f(x)g†(x)=

dk ˜f(k) ˜g†(k). (65)

The† sign indicates complex conjugation. Because the FT of ∇ρ(x) is ik ˜ρ(k), where i is the imaginary number, the FT of the information measure in Eq.(54)reads

J = −D

dk k2ρ˜2(k). (66) This functional form is isomorphic to the square curvature po-tential when using the Green’s function analysis to determine the power spectrum of a density field, where the magnitude of the Fourier components decays with the wave number k as

˜

ρ(k)∝ 1/(1 + k2_).51

For the constraint on mean, g(x) = xρ(x), g(k) = id ˜ρ(k)/dk, and variance h(x) = x2_ρ_{(x), ˜}_h_(k)_{= −d}2_ρ_˜_(k)/ dk2_{, the FT version of the Lagrangian is thus}

L( ˜ρ(k)) = dkρ˜(k) − tobsDDk2ρ˜(k) − ω1i dρ˜ dk(k)− ω2 d2ρ˜ dk2(k) . (67)

The functional derivative δL/δ ˜ρ(k) in the Fourier space then gives the differential equation for optimization:

ω2 d2_ρ_˜ dk2(k)+ ω1i dρ˜ dk(k)+ tobsDk 2_ρ_˜_(k)_{= 0.} ₍₆₈₎ By utilizing the translation operation in the Fourier space, we assume the functional form of the solution as ˆρ(k)= e−ω1ik/2ω2_ρ_˜_{(k) to eliminate the first derivative in the} equation d2_ρ_ˆ dk2(k)+ t_obsD ω₂ k 2₋ ω21 4ω2 2 ˆ ρ(k)= 0. (69) The solutions of this equation are the Hermite functions, the same as those for the wave function of a quantum harmonic oscillator. The ground state mode with the least information thus has the form of a Gaussian

˜

p(k)∝ e−ω1ik_e−ω2k2_, ₍₇₀₎ where the requirement of constraints is incorporated into ω for simplicity. Inverse Fourier transform then takes Eq. (70)

to the real-space solution:

ρ_eq∗(x)= F−1( ˜p(k))∝ e−ω2(x−ω1)2_. ₍₇₁₎ With the constants determined to satisfy the constraints of the observed mean and variance as well as probability nor-malization, the final result for the equilibrium probability is a

Gaussian as shown in Fig.4:

p∗_eq(x)=√ 1 2π σ2e

−(x−μ)2_/_2σ2

. (72)

This result of LID model is the same as that obtained by maximizing the static entropy Seqdespite the use of FIT. How-ever, this coincidence can be rationalized by inspecting the two different information measures and their corresponding inequality bounds to establish that the Gaussian equilibrium probability density is indeed optimal in both cases:52

FIT : dx x2ρ2(x) dk k2ρ˜2(k) ≥ 1 16π2, (73) Seq: dx ρ2(x) ln ρ2(x)+ dkρ˜2(k) ln ˜ρ2(k)≤ ln 2 − 1. (74) The inequality of Eq. (73)can be viewed as an uncertainty principle of continuous stochastic dynamics.52 _{It contains a} static measure of the variance in position multiplied by the dynamic information ofJ . Here, J plays the analogous role of the variance of velocity in the Heisenberg uncertainty of quantum mechanics.53 _{On the other hand, the inequality of} Eq.(74)is the Hirschman uncertainty principle54_{which adds} a static measure of equilibrium entropy Seq with the Shan-non information measure of the velocity distribution. In both cases, a balance can be reached between a static and dynamic measure of information and the Gaussian distribution is opti-mal and achieves equality in both scenarios among all of the normalized profiles of peq(x)∈ L2.

C. The LID model constraining on fixed values of the mean first passage times (MFPTs)

If the transition between two metastable states locating at

xA and xB with the reaction rate of kA→Bis known, the mean

first passage time between the two states is τrxn(xA→ xB)

= k−1

A→B. This knowledge can be used as the constraint

for finding the corresponding LID model. According to Gardiner,55_{the mean first passage time from x}

Ato xBcan be expressed as τxA→xB = k −1 A_→B= 1 D xB xA dx ρ−2(x) x xL dxρ2(x), τ_x_B_→x_A = k_B−1_→A= 1 D xA xB dx ρ−2(x) x xR dxρ2(x). (75)

Here, we use ρ= √p as the variable within the finite domain

x∈ [xL, xR].

The constraints on MFPTs defined in Eq.(75)are non-linear and do not follow the advantageous non-linear mode of Eq. (56). UsingτxB→xA andτxA→xB to represent the

opera-tion on ρ in Eq.(75), the Lagrangian of LID becomes

L = tobsD dx(∇ρ(x))2+ ω1 dxρ2(x)− 1 + ω2 τxB→xA[ρ]− k −1 B_→A + ω3 τxA→xB[ρ]− k −1 A_→B . (76) The optimization of this Lagrangian is achieved by setting its functional derivative with respect to ρ to zero, and requires This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP:

(12)

the solution of the resulting equation 2tobsD∇2ρ(x)+ ω12ρ(x)+ ω2 δτxB→xA δρ(x) + ω3 δτxA→xB δρ(x) = 0. (77)

However, there is no simple and closed form solution for this equation because the functional derivatives of the MF-PTs with respect to ρ depend on the specific locations within the domain in a nonlinear manner:

δτxA→xB δρ(x) = ⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ 2 Dρ xB xA ρ−2(x)dx x≤ xA 2 Dρ(x) xB x dxρ−2(x)− 2 Dρ −3_(x) x xL dxρ2(x) xA< x≤ xB 0 x > xB. (78)

The derivation of this equation is shown in AppendixF. The nonlinear optimization problem of Eq.(77)can be solved nu-merically by using a quasi-Newton method iteratively.48_The resulting PMFs and equilibrium distributions for three sets of reaction rates are shown in Fig.5.

0.6 0.8 1 1.2 1.4 0 0.5 1 1.5 2 2.5 3 x p eq (x ) 0.6 0.8 1 1.2 1.4 −2 −1 0 1 2 3 4 5 6 x V( x) k AB= 2, kBA= 2 k_AB= 5, k_BA= 3.3 k AB= 1.7, kBA= 5

FIG. 5. Least informative dynamic models for using the mean first passage times as constrains. (Top) The optimal equilibrium probability density p∗_eq(x) for the constraint of rate constant/mean first passage time between two states located at xA= 0.8 and xB= 1.2 with diffusion constant D = 1. (Bottom) The corresponding potential of mean force V∗(x)= − ln peq∗(x). The constrained

values of rate constants in the unit of s−1are labeled as the legend of each profile.

It can be seen from Fig.5that the LID profiles of peq(x) constrained on MFPTs generally follow those of Kramer’s type of reactions56 _{that have quadratic potentials for stable} states connected by an inverted quadratic barrier. This re-sult thus provides a framework for justifying Kramer’s ap-proach of modeling rare event processes. Observation of the LID models with MFPT constraints indicates that the results follow the Hammond-Leffler postulate based on a heuristic arguments, which states that the transition state x‡of an “en-dothermic” reaction is closer to the product.57_{Therefore, the} principle of LID is shown here to be consistent with the out-standing theories of chemical kinetics.

In this example of MFPT constraints, we illustrate that the analytical expression of FIT can be applied with the knowledge of kinetic observables to stitch together a least informative profile of the equilibrium probability density of state distribution. The theoretical framework developed in this work thus provides a systematic way of modeling dynamic systems that evolve with stochasticity. The diverse examples presented in this section establish that the principle of LID can also be used to estimate the underdetermined continuous pa-rameters given the limited number of observables in the mea-sured data.

VIII. CONCLUSION

For trajectories following the Langevin equation at equi-librium, a theoretical framework is developed in this work to evaluate the Fisher information. The corresponding Fokker-Planck equation of Langevin dynamics is transformed into a Hermitian form to allow an eigenbasis decomposition of the time propagator. Essential to the success of this approach is using the square root of the equilibrium probability den-sity as the form on model parameters. This unique choice of basis not only symmetrizes the FPE, it also makes the Fisher information matrix constant, resulting in tremendous simplification in the calculation of the information metric of trajectories in the parameter space. In the continuum limit, we show that the analytical form of the Fisher information matrix for Langevin trajectories is a Laplacian kernel. With this constant-information matrix, a line integral in the pa-rameter space can be devised to give an analytical measure

(13)

of dynamics information given the PMF and diffusion co-efficient of the Langevin equation. An immediate impact of this derivation is enabling the imputation of least informa-tive dynamics model constrained on the observables that are known or available. Although the examples of LID models presented in this work were derived from scalar constraints, generalization to using an arbitrary functional in describing the observable of interest, such as the likelihood of single-molecule FRET measurements,58–60 _{other properties} mea-sured by single-molecule methods, or molecular simulation results, is expected to be equally applicable. The methodol-ogy developed in this work can be used to systematically uti-lize the measurable data to formulate a dynamics model based on the principle of the least informative description.

ACKNOWLEDGMENTS

This work was supported by University of California, Berkeley, USA, the Princeton University, USA, and the Na-tional Chiao Tung University, Taiwan.

APPENDIX A: DERIVATION OF EQ.(18)OF FIT FOR SYMMETRICρ(xt,x0)

The starting point is Eq.(12)and is restated here:

It(x, y)= dxtdx0 δln p(xt|x0) δρeq(x) δln p(xt|x0) δρeq(y) p(xt, x0). (A1) Inputting p(xt|x0) = ρeq(xt)ρ(xt, x0)/ρeq(x0) and ln p(xt|x0) = ln ρeq(xt)+ ln ρ(xt, x0)− ln ρeq(x0) leads to 9 terms: It(x, y)= dxtdx0 δln ρ(xt, x0) δρeq(x) δln ρ(xt, x0) δρeq(y) p(xt, x0) (A2) + dxtdx0 δln ρeq(xt) δρ_eq(x) δln ρeq(xt) δρ_eq(y) p(xt, x0) (A3) + dxtdx0 δln ρeq(x0) δρeq(x) δln ρeq(x0) δρeq(y) p(xt, x0) (A4) − dxtdx0 δln ρeq(x0) δρeq(x) δln ρeq(xt) δρeq(y) p(xt, x0) (A5) − dxtdx0 δln ρeq(xt) δρeq(x) δln ρeq(x0) δρeq(y) p(xt, x0) (A6) + dxtdx0 δln ρeq(xt) δρeq(x) δln ρ(xt, x0) δρeq(y) p(xt, x0) (A7) − dxtdx0 δln ρeq(x0) δρ_eq(x) δln ρ(xt, x0) δρ_eq(y) p(xt, x0) (A8) + dxtdx0 δln ρ(xt, x0) δρ_eq(x) δln ρeq(xt) δρ_eq(y) p(xt, x0) (A9) − dxtdx0 δln ρ(xt, x0) δρeq(x) δln ρeq(x0) δρeq(y) p(xt, x0). (A10) The collection of integrals of that makeup It(x, y) can

be paired down because the density field is symmetric to ex-change of xt↔x0, ρ(xt, x0)= ρ(x0, xt). Performing this

ex-change on Eq. (A10)gives an exact copy and cancels with Eq. (A9). Likewise Eq.(A8)cancels with Eq.(A7). The re-maining 4 integrals containing the ρeqterm require the func-tional derivative δln ρeq(xt) δρeq(x) = δ(xt− x) ρeq(x) . (A11)

Therefore, Eqs. (A5) and(A6) become the integral of two delta functions dxtdx0 δ(xt− x) ρeq(x) δ(x0− y) ρeq(y) p(xt, x0)= ρ(x, y), (A12) for which the property of ρ(x, y) = p(x, y)/ρeq(x)ρeq(y) is utilized as well. Applying the same functional derivative to Eqs.(A3)and(A4)gives the alternative result:

dx₀δ(x0− x) ρeq(x) δ(x0− y) ρeq(y) dxtp(xt, x0)= δ(x − y). (A13) Here, the integration over xtcan be done from the onset due

to the lack of its dependence in the delta functions and noting that p(x0)/ρeq(x0)ρeq(x0)= 1. Combining these results gives Eq.(18): It(x, y)= dxtdx0 δln ρ(xt, x0) δρeq(x) δln ρ(xt, x0) δρeq(y) × p(xt, x0)+ 2δ(x − y) − 2ρ(xt, x0). (A14)

APPENDIX B: PROOF OF THE HERMITIAN PROPERTY OF THE SYMMETRIZED FOKKER-PLANCK EQUATION

The Hermitian property is established by showing that ψi|H|ψj = ψi|H|ψj. For the targeted operator defined in

Eq.(20), the left-hand side is dxψi(x) ρeq(x) ∇ · (D(x)ρ2 eq(x)∇ ψj(x) ρeq(x) ). (B1)

This equation can be transformed by an integration by parts to − dx(D(x)ρ_eq2(x)∇ψj(x) ρeq(x) )· ∇ψi(x) ρeq(x) . (B2)

The zero boundary terms in the space of square integrable functions ψ⊂L2 _{were applied. Another integration by parts} on the∇(ψj(x)/ρeq(x)) term gives

dxψj(x) ρeq(x) ∇ · (D(x)ρ2 eq(x)∇ ψi(x) ρeq(x) ), (B3)

(14)

TABLE I. Interpolation points for V (x).

Position 0.5 0.8 0.9 1 1.1 1.2 1.5

V(x) 60 0 4 ln (4) 5 ln (2) 60

which is recognized asψi|H|ψj. The Hermitian property of

the operator defined in Eq.(20)is thus verified.

APPENDIX C: POTENTIAL OF MEAN FORCE OF THE REFERENCE MODEL

The potential of mean force V (x)/kBT is constructed using the piecewise cubic Hermite interpolation polynomial pchip()in MATLABwith the following points (TableI):

APPENDIX D: FIT FOR THE GAUSSIAN PROCESS OF CONSTANT FORCE

The time propagator for this process is

p(xt|x0)= 1 √ 4π Dtexp −(xt− x0− FDt)2 4Dt . (D1) The calculation of FIT via Eq. (12) starts by taking derivatives of p(xt|x0) with respect to the x-independent force F and diffusion coefficient D

∂ln p(xt|x0) ∂F = (x− FDt) 2 , (D2) ∂ln p(xt|x0) ∂D = F(x−FDt) 2D + (x−FDt)2 4D2_t − 1 2D. (D3) Here, x= xt− x0. Each element of the Fisher information matrix can then be found by multiplying these derivatives by one another and taking the expectation over the joint proba-bilitydxtdx0p(xt, x0) which by rotating the domain, can easily be done as an integration over x and x= xt+ x0:

IF,F = E (x− FDt)2 4 =Dt 2 , (D4) IF,D = ID,F = F t 2 , (D5) ID,D= 1 2D2 + F2_t 2D . (D6)

The following properties for the moments of the Gaussian dis-tribution have also been used to calculate the matrix elements:

E!(X− μ)N"= ⎧ ⎪ ⎪ ⎨ ⎪ ⎪ ⎩ 0 N = 1 σ2 N = 2 0 N = 3 3σ3 N = 4 . (D7)

The mean μ is FDt and the variance σ2_{is 2Dt.}

Taking the continuous limit for the Fisher information of trajectories,I = limt→0+tobsIt/t, the terms with t

can-cels the first factor terms for ID, D. To evaluate the information

change between parameter setsJ1→2, we integrate according to Eq.(49)for the path

F(ξ )= F1+ (F2− F1)ξ, (D8)

D(ξ )= D1+ (D2− D1)ξ, (D9) where ξ∈ [0, 1]. After applying the sum over the two parame-ters and the Fisher informationi,j

∂θi

∂ξIi,j( θ) ∂θj

∂ξ, the

follow-ing integral remains:

J1→2=tobs 2 1 0 dξ (F )2D(ξ ) 2 (D10) +(FD)F (ξ) + (D)2 F2_{(ξ )} 2D(ξ ) + t−1 2D2_{(ξ )} . (D11) Here, F= F2− F1and D= D2− D1. After performing the integration and collecting terms, the information change is J1→2= tobs (F2D2− F1D1)2 2D2 + lim t_→0+ tobs t 2−D2 D1 −D1 D2 . (D12) This information change is 0 for the case when D1= D2 and F1= F2. However, the two groupings have different scal-ing behaviors as t→ 0. The second grouping which solely depends upon the ratio of diffusion constants gives a rate of divergence at∝1/t. This asymptotic behavior for Gaussian processes has also been worked out via the Kolmogorov-Siani entropy measure.61 _{However, the term with the difference in} the square force remains constant through different values of time resolution and grows linearly with the length of the tra-jectory tobs.

APPENDIX E: FIT FOR THE ORNSTEIN-UHLENBECK PROCESS

The OU process is a system governed by a simple har-monic potential V (x)= −1/2kx2 with the spring constant k reflecting the width of the harmonic PMF. The equilibrium probability density of the OU process is Gaussian:

peq(x)= # k 2πe −kx2_/₂ . (E1)

The time propagation probability density for the OU pro-cess is p(xt|x0)= √ k 2π (1− e−2Dkt) × exp −kxt− x0e−kDt 2 2− 2e−2Dkt . (E2) In the continuum limit, we take the Taylor expansion of the exponential exp ( − 2Dkt) 1 − 2Dkt to give the