The Gillespie algorithm - Numerical Approaches for Simulating Stochastic Process

Chapter 3 Numerical Approaches for Simulating Stochastic Process

3.1 The Gillespie algorithm

Consider a system of fixed volume Ω contains N different species of reactants, with M different reactions among those reactants, or reaction channels, denotes

) dynamics of the system can be described by the following ordinary differential equation (O.D.E.) form: describe the system state at time t. In stochastic process, X_i(t) is actually a random variable. To analyze it, define propensity function aj for reaction channel Rj , which is that

Eq. (3.2) is the fundamental premise for numerically solving the master equation.

Previously it was thought to be an ad hoc assumption for stochastization of the deterministic chemical kinetics [20]. Later it has been proved that the Eq. (3.2) has a rigorous microphysical basis [21].

For each reaction channel Rj, we also need to define the state-change vector v , _j whose ith component is given by

channel.

one

by produced of

number in the

change the

i ji

X v ≡

(3.3)

Together, Eq. (3.2) and Eq. (3.3) specify the behavior of each reaction channel, the state-change vector v is a fixed vector with integer value, and it’s component is _j usually +1 for production channel, and -1 for degradation channel, while the propensity function aj changes value from time to time accords to the system state x . From previous argument [17], the function aj has mathematical form

( )

(x _j _j x

j c h

a = (3.4)

Here, cj is the specific probability rate constant for channel Rj, and cj dt gives the probability that “one” random chosen pair of reactant molecules in Rj will react accordingly in the next infinitesimal time. The unit of cj is the inverse of time, and the value can be determined from traditional reaction constant kj. If the system’s volume Ω is given, cj can be derived from kj by transforming the unit of kj into the same unit of cj. For example, consider two reactant molecular species S and S , and a reaction

channel R of which _μ S₁ reacts with S₂ to give the product S ₃

The reaction channel R_μ was conventionally described by reaction rate constant

kμ in chemical kinetics, more precisely, it’s the average reaction rate per unit volume divided by the product of the average densities of the reactants [17]. Put the definition of cj together, we have

Where x₁ and x₂ represents the molecular concentration of reactant species S₁ and S₂. However, in deterministic formulation we do not consider statistical variance, therefore x₁x₂ = x₁ x₂ , hence it gives the relation

μ.

μ c

k =Ω (3.7)

For other types of reaction channel, this relationship can vary, and we summarize it as following

change the value of k of the first three types of reaction after we transfer it to _μ c in _μ stochastic simulation.

The function )h_j(x in Eq. (3.4) is defined to be the number of distinct combinations of Rj reactant molecules available in the state x [19]. For example, the

h for the following types of reaction can be j

The time-dependent stochastic trajectory can thus be simulated once cj and v are _j given. Here, the evolution of Eq. (3.2) implies that the state vector X(t)is a jump-type Markov process non-negative N-dimensional integer lattice [19], and a traditional way to analyze it is to write it in the form of conditional probability function

The time-dependent evolution of this conditional probability function can be derived by advance the system with a small time step dt one at a time, so that we can exclude the probability for two or more reactions to occur in dt . Using Eq. (3.2), together with the previous argument [21], the probability of the system being in state x at time t+ can thus be described dt

[

⁽ ^, ^| ^, ⁾ ⁽ ⁾

]

If we take the limit, the differential form of Eq. (3.11) becomes the chemical master equation fully understand the behavior of this stochastic process. As we have mentioned above, it is rare that an exact analytical solution to Eq. (3.12) can be found; but at least, we could generate many trajectories enough to describe the probability density of P.

To numerically simulate Eq. (3.12), an intuitive way is to give a very small time interval step dt , and iterate the following steps

Table 3-1 An intuitive SSA

Step 0

Start with system state x , calculate the probability ₀ a_j(x₀)dt for each reaction channel

Step 1

Generate uniform random variables, update the system state according to the probability a_j(x₀)dt and state change vector v of each reaction _j channel Rj

Step 2 Advance the system time with dt , back to Step 0

The stochastic simulation result is an exact realization for Eq. (3.12), but the defect is that, since dt is required to be small enough, the probability a_j(x₀)dt that a reaction happens in reaction channel Rj will also be small, it means the computer repeat those three steps frequently, but most of time there’s no reaction happened, this leads to inefficiency of this naive stochastic simulation algorithm, and makes it even unusable if

→0

dt .

Another way to solve this problem of inefficiency is the Gillespie stochastic simulation algorithm, which defines another exact consequence of Eq. (3.2), the next-reaction density functionp(τ, j|x,t) [17], [21], where

The τ in Eq. (3.13) stands for the time that elapses without any reaction occurs in the system, and ^p⁽τ^,^j^|^x^,^t⁾^dτ can also be expressed as the product

by elementary probability argument. Since



(x gives the probability that some

reaction will take place in the next dt according to the definition of Eq. (3.2), we can

divide τ , say, into K equal length subintervals. Let ε denotes the subinterval where ε =τ/K , K is a positive integer, the probability none of the reactions

R ,...,₁ occurs in the first ε subinterval (t, t+ε)is

[

¹ ⁽ ⁾ ⁽ ⁾

]

¹ ⁽ ⁾ ⁽ ⁾^.

1 1

∏

_j^M₌ ⁻^a^j ^x ^ε⁺^o ^ε ⁼ ⁻



^M_j₌ ^a^j ^x ^ε⁺^o ^ε ^(3.15)

Where ^o(ε)describes the probability that more than one reaction occurs in )

(t t+ε , and we don’t really need to worry about it since this term can be very small for small ε. The right side of Eq. (3.15) is also the probability that no reaction occurs in the subsequent intervals(t, t+2ε),(t, t+3ε)and so on, since the system state x

remains if no reaction occurs, hence the term



= M j

aj 1

)

(^x ε remains the same. For the rest

of subintervals, using the same argument, we have

. ) ( ) ( 1

) (

1 0

M K

j o

p 



 



 − +



= ε ε

τ ^x ^(3.16)

For K→∞, together with ε =τ/K, Eq. (3.16) is equal to

. ) ( exp

) (

0 



 



−



= M j

p τ ^xτ ^(3.17)

Hence, after plugging Eq. (3.17) into Eq. (3.14), it becomes

The Eq. (3.18) provides the fundamental basis for Gillespie stochastic simulation algorithm [17], in which a random pair of (τ, j)is generated by Monte Carlo procedure according the joint probability density function. The following procedure briefly describes the Gillespie algorithm [17], [18].

Table 3-2 Standard procedures for Gillespie SSA

Step G₀ the system state vector.

2. Calculate ( ) .

1. Generate two independent uniform distributed random number r₁ and r₂, where r₁∈[0,1] r₂∈[0,1].

2. Determine τ^:

1).

1 ln(

0 r

= a τ

3. Determine the reaction channel R in which the reaction happens: _μ

Take _μ so that .

1 0 2 1



−

≤

< ^μ

j j j

j ra a

Step G3

1. Update the simulation time τ.

= t t

2. Update the system state according to R _μ

μ. v x

x= +

3. If t>t_stop, or if no more reactants left (all h_μ =0), terminate the simulation; otherwise, return to Step G1.

Here, we use a simple example to demonstrate the Gillespie algorithm. Consider a simple birth-and-death model for a species S , where

: :

2 1

φ φ

⎯→

⎯

⎯→

⎯

c c

S R

(3.19)

Reaction channel R₁ describes a zero order constant production of species S , and R₂ describes a first order degradation of S ; c₁, c ₂ is dimensionless probability rate constant, as defined in Eq. (3.8).

For conventional understanding in kinetic description, the ordinary differential equation form of this model will be

2 .

1 k x

dt k

dx = − (3.20)

Where x denotes the concentration of S in system’s volume Ω^, k is the i

conventional reaction rate constant; the unit of k₁ is concentration over time, and the unit of k₂ is the inverse of time. The relationship between k in kinetic description _i and c in Gillespie algorithm has been discussed in Eq. (3.8). The following figure _i demonstrates the simulation result of the model, with the parameter k₁=0.02M/sec and k₂ =0.001/sec , and system’s volume Ω=1L (Hence, the corresponding

sec / 02 .

1=0

c and c₂ =0.001/sec)

Figure 3-1 Three stochastic trajectories generated by Gillespie SSA.

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 0

5 10 15 20 25 30 35

sample trajectories

time(sec)

particle numbers

1st realization 2nd realization 3rd realization O.D.E. solution

Three trajectories are shown in the Figure 3-1, which are generated by repeating Gillespie algorithm three times, using the birth-and-death model in Eq. (3.19), with parameters described in text. The dashed line represent the analytical solution of Eq.

(3.20), which is a continuous function with mathematical form )

1 ( )

( )

( ²

1 e kt

k t k x t

X =Ω =Ω − ⁻ . The capital X denotes the particle number of species

S , and the vertical axis shows the discrete particle number of species S . The

Gillespie algorithm is programmed by MATLAB^®.

In order to reconstruct the probability space of X(t), a large quantities of trajectories should be made. Here, since we are interested in the steady state X , we ^* take value of each trajectory at the end of simulation time (10,000 sec in this case) and analyze statistics of the distribution.

5 10 15 20 25 30 35 40

0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18

Steady state distribution

particle number

probability

mean = 20.0788 Var = 19.9798 Fano = 0.99507

The histogram shown in Figure 3-2 is generated by steady state values(value at 10,000 sec. in this case) of 10,000 trajectories simulated with Gillespie algorithm, using the birth-and-death model described in Eq. (3.19). The statistics is made by collecting the steady state value for each trajectory. The distribution of X has mean = 20.08, ^* variance = 19.98 and the Fano factor(defined as variance/mean) ≅ 1, which represents a Poisson distribution.

在文檔中朗之萬方程式在蛋白質瞬間增量模型之應用 (頁 17-28)