Chapter 5 The Burst Production Model of Single Gene Expression
5.1 Modified Gillespie algorithm in burst production model
5.2.2 Derivation of the Langevin form of burst production
Since the random process can be exactly simulated by Gillespie algorithm, either for two-step model or burst model, and we have shown that for single gene expression, the two-step model can be simplified into burst model in section 5.1. The advantage of burst model is that: the description of mRNA is implicit. It helps to bypass the problem of invalid Langevin approximation to two-step model in low gene expression case. In this section, we use the Langevin equation to approximate the burst model, instead of the two-step model, to describe the stochastic process of a single gene expression.
The concept of approximating Gillespie algorithm with a Langevin equation is very similar to what has been discussed in section 3.3. Here, we use the same notations as in Chapter 3 that describe the Gillespie algorithm. Consider a system with size Ω contains N different reactant species Si (i=1,...,M) and M different reaction channels )Rj (j=1,...,M . Suppose the system’s state X(t) at the current time t is denoted as xt . Let K xj( t,τ) (τ >0) be the firing times of reaction channel
) ,..., 1 (
j M
Rj = within time step τ , as previously defined in Eq. (3.27). Given that the reactant species Sp( p∈[1,N]) is produced in a burst-like production with average burst size β through a burst production channel Rb (b∈[1,M]) with the state change vector vb, which has an exponential random variable Exponential(β) as the pth component, or simplyvbp =Exponential(β). The term Exponentia denotes an l exponentially distributed random variable, and Exponential(β) is an exponentially distributed random variable with mean equals to β , and the probability density function is just the same as that in Eq. (5.4). For the reactant species which are not
),
; ,..., 1 ( ) , ( )
( ) (
1
p i N i
v K
t X t
X M
j j t ji
i
i + = +
= ≠= τ
τ x (5.11)
where X denotes the particle number of species i S . The description of Eq. (5.11) is i just as same as that has been discussed in section 3.3.1.
For the reactant species S that is produced in burst, the increment of particle p number within time step τ from burst production channel, can be described by two random variables: the random number of burst events within τ , denoted as K xb( t,τ), and the random burst size of each burst event. Suppose the burst is “independent” of each other, we can simply sum up all these independent burst to get the net production of burst channel within time step τ . Since we consider the burst size distributed exponentially with mean β, the net production of burst channel is to add up these independent exponentially distributed random variables Exponential(β), and this is just like the case of Erlang distribution that we have discussed in section 5.2.2. The only difference is that: the number of random variables Exponential(β) is another random variable K xb( t,τ). In other words, for the example of buying ticket in the movie theater, as we just discussed in section 5.2.2, this time we don’t know how many people there are in front of us.
Here, we use the notation K to replace b K xb( t,τ) for simplification. The net production of burst channel within time step τ can be described by an Erlang random variable with a random shape parameter K , and a scale parameter b β. The result can be denoted as Erlang(Kb,β). The term Erlang denotes an Erlang random variable, which has the probability density function:
.
Since the random variable K describes the number of burst events, it is a b positive integer, the random variable Erlang(Kb,β) still lies in the domain of Erlang distribution.
Hence, for the reactant species S that is produced in burst, the particle number p of S at time p t+τ is the net effect of non-burst channels, plus the net production of exponentially distributed burst size.
Suppose τ satisfies the leaping condition in Table 3-3, which requires τ to be small enough so that the propensity for each reaction channel only change a little as discussed in Eq. (3.29), then we can use a Poisson random variable to “approximate”
firing times K xj( t,τ) for every reaction channel (including burst production channel in our case). For the same reason, the number of burst events K can be approximated b by a Poisson random variable, and the Eq. (5.13) becomes:
lies in the domain of Erlang distribution.
Suppose we can further find a subset of these τ that can satisfy another condition:
require τ to be large enough so that the Poisson random variables can be approximated by normal random variables with same mean and variance, as we have discussed in section 3.3.2. Under this condition, the Eq. (5.14) becomes
Since the shape parameter Kb which has been approximated by )
), ( (ab t τ
Poisson x is now a real number Normal(ab(xt)τ,ab(xt)τ) rather than an integer, the Erlang random variable in Eq. (5.14) is changed to the Gamma random variable in Eq. (5.15). For simplification, we denote the normal random variable
)
The third term on the right side of Eq. (5.16) forms a new probability space, it becomes an Gamma variate with random shape parameter Kb' and a fixed scale parameter β . Our next move is to analyze Gamma(Kb',β). The following derivation is inspired by Ref.[35]. Let δ denotes a standard Gamma random variable
)
Replacing Gamma(x;k,β)with its explicit form of probability density function in Eq. (5.5), we obtain:
After integration by parts, we have
. ) 1 ( ] [ )
(s Ee s s k
Lδ = −δ = + β − (5.20)
Now, let δb to be an Gamma variate with random shape parameter Kb' and a fixed scale parameter β , where
).
, ' (
~ β
δb GammaKb (5.21)
The Laplace transform of δb will be
].
] '
| [ [
] [ ) (
k K e E E
e E s L
b s s
b b b
=
=
=
−
− δ δ δ
(5.22)
Plugging the result of Eq. (5.20), we have
].
) 1 [(
)
( b'
b
s K
E s
Lδ = + β − (5.23)
The expected value of δb will be the negative first derivative of Lδb(s) evaluated at s=0, where
].
.
Since a Gamma random variable can be approximated by a normal random variable with same mean and variance, if the shape parameter is large enough. Because in condition two, we already require τ to large enough to make the mean of K b' large enough (see Eq. (3.32)). Hence, under the condition two, the random variable
) , ' (Kb β
Gamma can also be approximated by the normal random variable with same mean and variance. According to Eq. (5.27) and Eq. (5.28), it means:
),
if τ is large enough to satisfy the condition two. Plugging Eq. (5.29) into Eq. (5.16), we obtain:
By linear combination theorem for normal random variables (see Eq. (3.35)), we can rewrite Eq. (5.30) into:
)].
Simply substitute the notation τ with dt, and suppose that each dt has same size and satisfies both conditions, we finally arrive at the Langevin equation that describes burst, which has the form:
].
This equation describe the approximated stochastic increment for the reactant species S which is involved in a burst production channel. On the right side of Eq. p (5.31), the terms inside first bracket describe the original Langevin approximation by Ref.[19], which is the same as that discussed in section 3.3. The terms inside second bracket describe the Langevin approximation for burst production channel R , with the b exponentially distributed burst size which has mean β. We found that the difference between burst channel and non-burst channel is that, the burst channel has larger noise term β 2a xb( t)dtNb(0,1) than a non-burst one vjp aj(xt)dtNj(0,1). In other words, if β describes a deterministic uniform size of state change rather than an average of exponential distribution, it will simply go back to the non-burst Langevin form. The contribution of the factor 2 originates from Eq. (5.26), which takes
] factor 2 also characterize the higher noise contribution of a burst channel, compared to the other non-burst channels.
Figur
In short, th
re 5-6 A el (right), an gevin equati
To model t inal two-ste re 5-6) to b oximated L t-Langevin)
times of no
he scheme o
A schematic nd for Gille ion (lower p
the stochas ep model ( bypass the l
Langevin
stic process (upper-left i low copy n
form of b find out the
h than the n
n can be sum
n for two-st thm (upper
of single g in Figure 5 number of m burst mod
the next fig
left) or a bur d their corre
ssion, we f burst model ecules. Nex right in F uted burst eft in Figure
gure:
ursting one-s esponding
first simplif l (upper-rig xt, we deriv Figure 5-6