Chapter 2 Literature Review
2.3 Stochastic Models for Disease Natural History
A sequence of random variables $\{X_\alpha,\ \alpha = 0, 1, \ldots\}$ is called a Markov chain if, for every collection of integers $\alpha_0 < \alpha_1 < \cdots < \alpha_n < \beta$, the conditional distributions of $X_\beta$ satisfy the relation:
$$P\{X_\beta = i_\beta \mid X_{\alpha_0}, \ldots, X_{\alpha_n}\} = P\{X_\beta = i_\beta \mid X_{\alpha_n}\}, \quad \text{for every } i_\beta.$$
Once the present state $X_{\alpha_n}$ is given, the outcome in the future ($X_\beta = i_\beta$) is no longer dependent upon the past states $(X_{\alpha_0}, \ldots, X_{\alpha_{n-1}})$.
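For each $X_\alpha$, the absolute probability is denoted by
$$P\{X_\alpha = i_\alpha\} = a_{i_\alpha}.$$
For every pair of random variables, $X_\alpha$ and $X_\beta$, the conditional probability is denoted by
$$P\{X_\beta = i_\beta \mid X_\alpha = i_\alpha\} = P_{i_\alpha, i_\beta}.$$
The joint probabilities of $X_\alpha, X_\beta, X_\gamma$, for $\alpha < \beta < \gamma$, are given by
$$P\{X_\alpha = i_\alpha, X_\beta = i_\beta, X_\gamma = i_\gamma\} = a_{i_\alpha} P_{i_\alpha, i_\beta} P_{i_\beta, i_\gamma}, \quad \text{and} \quad P\{X_\alpha = i_\alpha, X_\beta = i_\beta\} = a_{i_\alpha} P_{i_\alpha, i_\beta}.$$
Therefore, for any collection of integers $\alpha < \beta < \cdots < \delta < \varepsilon$, the joint probabilities are
$$P\{X_\alpha = i_\alpha, X_\beta = i_\beta, \ldots, X_\delta = i_\delta, X_\varepsilon = i_\varepsilon\} = a_{i_\alpha} P_{i_\alpha, i_\beta} \cdots P_{i_\delta, i_\varepsilon}.$$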
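A Markov chain whose state space is the set of all non-negative integers is completely determined by the initial absolute probability distribution
$$P\{X_0 = i_0\} = a_{i_0}, \quad i_0 = 0, 1, 2, \ldots$$
and the transition probabilities
$$P\{X_{\alpha+1} = i_{\alpha+1} \mid X_\alpha = i_\alpha\} = P_{i_\alpha, i_{\alpha+1}}, \quad i_\alpha, i_{\alpha+1} = 0, 1, 2, \ldots \ \text{for } \alpha = 0, 1, \ldots$$
The transition probabilities of a time-homogeneous chain do not depend on $\alpha$ and are denoted by
$$P\{X_{\alpha+1} = j \mid X_\alpha = i\} = P_{ij}.$$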
The transition probabilities $P_{ij}$ for a three-state Markov model can be arranged in the form of a matrix:
$$P = \begin{pmatrix} P_{00} & P_{01} & P_{02} \\ P_{10} & P_{11} & P_{12} \\ P_{20} & P_{21} & P_{22} \end{pmatrix}$$
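As a small illustration of the definitions above, the following sketch computes the joint probability of a path of consecutive states and an n-step transition matrix for a time-homogeneous chain. The initial distribution `INITIAL`, the matrix `P` and the function names are hypothetical values chosen only for illustration; they are not taken from any of the cited studies.

```python
# Hypothetical three-state chain; all numbers are illustrative only.
INITIAL = [0.90, 0.08, 0.02]      # a_i = P{X_0 = i}
P = [
    [0.85, 0.10, 0.05],           # transitions out of state 0
    [0.00, 0.70, 0.30],           # transitions out of state 1
    [0.00, 0.00, 1.00],           # state 2 is absorbing
]

def path_probability(initial, matrix, path):
    """Joint probability a_{i0} * P_{i0,i1} * ... of a path of consecutive states."""
    prob = initial[path[0]]
    for i, j in zip(path, path[1:]):
        prob *= matrix[i][j]
    return prob

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def n_step(matrix, n):
    """n-step transition matrix of a time-homogeneous chain (matrix power)."""
    size = len(matrix)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result

print(path_probability(INITIAL, P, [0, 0, 1, 2]))
print(n_step(P, 2)[0])
```

For a time-homogeneous chain, the probability of moving between two states over several steps is an entry of the corresponding matrix power, which is what `n_step` returns.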
2.3.2 Three-state Homogeneous Markov Model for Disease Natural History
Chen et al. applied a three-state Markov model to estimate sojourn time in chronic disease screening without data on interval cases [43]. They modeled the disease as a continuous-time Markov process in which X(t), the state of an individual at time t, is a random variable with state space Ω = {0, 1, 2}, where 0 represents no disease, 1 represents the preclinical screen-detectable phase (PCDP) and 2 represents the clinical phase (CP). In the language of Markov processes, the clinical phase is an absorbing state in this model, because the natural history cannot be estimated beyond diagnosis owing to the effect of therapy. They also assumed that the model is progressive, that is, the disease does not regress to an earlier state.
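The transition rates in the three-state model can be expressed as an intensity matrix,
$$\begin{pmatrix} -\lambda_1 & \lambda_1 & 0 \\ 0 & -\lambda_2 & \lambda_2 \\ 0 & 0 & 0 \end{pmatrix}$$
where $\lambda_1$ represents the transition rate from no disease to the PCDP and $\lambda_2$ represents the transition rate from the PCDP to the clinical phase.
Given the transition intensity matrix above, the transition probabilities over a time interval of length $t$ for the three-state model can be expressed as
$$P(t) = \begin{pmatrix} e^{-\lambda_1 t} & \dfrac{\lambda_1 \left(e^{-\lambda_2 t} - e^{-\lambda_1 t}\right)}{\lambda_1 - \lambda_2} & 1 - P_{00}(t) - P_{01}(t) \\ 0 & e^{-\lambda_2 t} & 1 - e^{-\lambda_2 t} \\ 0 & 0 & 1 \end{pmatrix}, \quad \lambda_1 \neq \lambda_2.$$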
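The sketch below evaluates the transition probabilities of the progressive three-state model with constant rates. It uses the standard closed-form solution for a chain 0 → 1 → 2 with rates $\lambda_1$ and $\lambda_2$ (valid when the two rates differ). The function name and the numerical rates are hypothetical and serve only as an illustration.

```python
import math

def three_state_probabilities(lam1, lam2, t):
    """Transition probabilities over an interval of length t for the
    progressive chain 0 -> 1 -> 2 with constant rates (requires lam1 != lam2)."""
    p00 = math.exp(-lam1 * t)
    p01 = lam1 * (math.exp(-lam2 * t) - math.exp(-lam1 * t)) / (lam1 - lam2)
    p11 = math.exp(-lam2 * t)
    return {
        "P00": p00,
        "P01": p01,
        "P02": 1.0 - p00 - p01,   # each row of P(t) sums to one
        "P11": p11,
        "P12": 1.0 - p11,
    }

# Hypothetical rates per person-year, chosen only for illustration.
LAM1, LAM2 = 0.002, 0.4
probs = three_state_probabilities(LAM1, LAM2, t=10.0)
mean_sojourn_time = 1.0 / LAM2    # mean time spent in the PCDP under a constant rate

for name, value in probs.items():
    print(name, round(value, 6))
print("mean sojourn time:", mean_sojourn_time)
```

Under a constant rate $\lambda_2$ the time spent in the PCDP is exponentially distributed, so its mean is $1/\lambda_2$.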
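The likelihood function based on the prevalent screen in a cohort with N individuals is
$$L_1(\cdot) = \prod_{m=1}^{N} \left( \frac{P_{01}(v_m)}{P_{00}(v_m) + P_{01}(v_m)} \right)^{x_m} \left( \frac{P_{00}(v_m)}{P_{00}(v_m) + P_{01}(v_m)} \right)^{1 - x_m} \tag{2-3}$$
where $v_m$ represents the age at first screen for the mth subject, $x_m = 1$ when the mth subject is detected as a positive case, and $x_m = 0$ otherwise.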
However, as mentioned above, this Markov model assumes a homogeneous process, that is, a hazard rate for progression from state to state that is constant over time. This assumption may be unrealistic in medicine and biology.
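A prevalent-screen log-likelihood for the homogeneous model can be sketched as follows. The sketch assumes that each subject's contribution is conditioned on not yet being in the clinical phase at the first screen, so that a positive screen has probability $P_{01}(v)/(P_{00}(v) + P_{01}(v))$. This conditional form, the function names and the data are assumptions made for illustration.

```python
import math

def p00_p01(lam1, lam2, age):
    """P00 and P01 from birth to `age` under constant rates (lam1 != lam2)."""
    p00 = math.exp(-lam1 * age)
    p01 = lam1 * (math.exp(-lam2 * age) - math.exp(-lam1 * age)) / (lam1 - lam2)
    return p00, p01

def prevalent_screen_loglik(lam1, lam2, ages, detected):
    """Log-likelihood of first-screen results; detected[m] is 1 for a positive case."""
    loglik = 0.0
    for v, x in zip(ages, detected):
        p00, p01 = p00_p01(lam1, lam2, v)
        denom = p00 + p01             # probability of not yet being clinical
        loglik += math.log(p01 / denom) if x == 1 else math.log(p00 / denom)
    return loglik

# Hypothetical ages at first screen and screening results.
ages = [52.0, 61.0, 47.0, 68.0]
detected = [0, 1, 0, 0]
print(prevalent_screen_loglik(0.002, 0.4, ages, detected))
```

In practice the rates would be found by maximizing this function numerically over $\lambda_1$ and $\lambda_2$.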
2.3.3 Three-state Model with Weibull Distribution
To deal with non-constant hazards in the stochastic model, Chen et al. proposed a non-homogeneous three-state model for the natural history of oral cancer [44]. They modeled the times of transition from normal to leukoplakia and from leukoplakia to invasive carcinoma with two Weibull distributions. The transition probabilities for staying in the disease-free state (state 0), for moving from normal to leukoplakia (state 1), and for moving from normal to invasive carcinoma (state 2) in a given time interval $[t_1, t_2]$ are expressed as follows:
$$P_{00}(t_1, t_2) = 1 - \int_{t_1}^{t_2} f_1(u)\,du$$
$$P_{01}(t_1, t_2) = \int_{t_1}^{t_2} f_1(u) \left( 1 - \int_{u}^{t_2} f_2(v)\,dv \right) du \tag{2-4}$$
$$P_{02}(t_1, t_2) = \int_{t_1}^{t_2} f_1(u) \int_{u}^{t_2} f_2(v)\,dv\,du$$
$f_1(t)$ and $f_2(t)$ are the probability density functions of the Weibull distributions for the times of transition from state 0 to 1 and from state 1 to 2. The two Weibull distributions are denoted as $W_1(\lambda_1, \gamma_1)$ and $W_2(\lambda_2, \gamma_2)$, where $\lambda_1$ and $\lambda_2$ are scale parameters and $\gamma_1$ and $\gamma_2$ are shape parameters for the two corresponding transitions. The transition rates as a function of time are expressed as follows:
$$\lambda_i(t) = \lambda_i \gamma_i t^{\gamma_i - 1}, \quad \text{where } i = 1 \text{ or } 2.$$
The probability of remaining in state $i-1$ up to time $t$ is
$$S_i(t) = \exp\left\{ -\int_0^t \lambda_i \gamma_i u^{\gamma_i - 1}\,du \right\} = \exp\left(-\lambda_i t^{\gamma_i}\right) \tag{2-5}$$
The corresponding probability density function is
$$f_i(t) = \lambda_i \gamma_i t^{\gamma_i - 1} \exp\left(-\lambda_i t^{\gamma_i}\right).$$
The transition probabilities for staying in state 1 and for moving from state 1 to state 2 are denoted as follows:
$$P_{11}(t_1, t_2) = 1 - \int_{t_1}^{t_2} f_2(u)\,du$$
$$P_{12}(t_1, t_2) = \int_{t_1}^{t_2} f_2(u)\,du \tag{2-6}$$
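The integrals for $P_{00}$, $P_{01}$ and $P_{02}$ can be evaluated numerically. The sketch below follows the expressions as written in this section: the inner integral of $f_2$ over $[u, t_2]$ is taken in closed form as $S_2(u) - S_2(t_2)$, and the outer integral uses a midpoint rule. The function names, the midpoint rule and all parameter values are illustrative assumptions, not estimates from the cited study.

```python
import math

def weibull_survival(t, lam, gamma):
    """S(t) = exp(-lam * t**gamma)."""
    return math.exp(-lam * t ** gamma)

def weibull_pdf(t, lam, gamma):
    """f(t) = lam * gamma * t**(gamma - 1) * exp(-lam * t**gamma)."""
    return lam * gamma * t ** (gamma - 1) * math.exp(-lam * t ** gamma)

def transition_probs_from_normal(t1, t2, w1, w2, steps=2000):
    """Return (P00, P01, P02) over [t1, t2]; w1 and w2 are (scale, shape) pairs."""
    lam1, g1 = w1
    lam2, g2 = w2
    h = (t2 - t1) / steps
    p01 = p02 = 0.0
    for k in range(steps):
        u = t1 + (k + 0.5) * h                      # midpoint of sub-interval
        weight = weibull_pdf(u, lam1, g1) * h       # f1(u) du
        # Inner integral of f2 over [u, t2], evaluated in closed form.
        to_state2 = weibull_survival(u, lam2, g2) - weibull_survival(t2, lam2, g2)
        p01 += weight * (1.0 - to_state2)
        p02 += weight * to_state2
    # 1 minus the integral of f1 over [t1, t2], also in closed form.
    p00 = 1.0 - (weibull_survival(t1, lam1, g1) - weibull_survival(t2, lam1, g1))
    return p00, p01, p02

# Hypothetical (scale, shape) parameters for the two transitions.
W1 = (1e-4, 2.0)
W2 = (5e-3, 1.5)
print(transition_probs_from_normal(0.0, 60.0, W1, W2))
```

The three probabilities sum to one up to the discretization error of the midpoint rule, which gives a quick sanity check on any chosen step count.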
The natural history from state 1 (leukoplakia) to state 2 (invasive carcinoma) is usually unobservable because medical treatment interrupts it. The parameters can therefore be estimated only through $P_{00}$, $P_{01}$ and $P_{02}$ in equation (2-4).
2.3.4 Incorporation of patient specific covariates
The effect of patient-specific covariates, say $x$, on the three-state stochastic model was assessed by an exponential regression model that treats the scale parameter of the Weibull distribution as a function of the patient-specific covariates. It is expressed as follows:
$$\lambda_{im} = \lambda_{i0} \exp(\beta_i x_m)$$
$\lambda_{i0}$ : the scale parameter of the Weibull distribution for state i
$x_m$ : a vector of covariates for subject m
$\beta_i$ : the corresponding vector of regression coefficients
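A minimal sketch of the exponential regression on the scale parameter is given below. The function name, the baseline scale and the coefficients are hypothetical; the covariates are coded as 0/1 indicators purely for illustration.

```python
import math

def subject_scale(baseline_scale, coefficients, covariates):
    """Scale parameter for one subject: baseline * exp(sum of beta_k * x_k)."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_scale * math.exp(linear_predictor)

# Hypothetical baseline scale and regression coefficients for three covariates.
BASELINE = 1e-4
BETA = [0.7, 0.3, 0.5]

print(subject_scale(BASELINE, BETA, [0, 0, 0]))   # equals the baseline scale
print(subject_scale(BASELINE, BETA, [1, 0, 1]))   # baseline * exp(0.7 + 0.5)
```

Because the Weibull hazard $\lambda \gamma t^{\gamma - 1}$ is proportional to the scale parameter, $\exp(\beta)$ for a binary covariate can be read as a hazard ratio at any fixed time.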
2.3.5 Bayesian inversion for a non-standard case-cohort design
For an n-state disease natural history, Chen et al. selected n sets of random samples, one for each transition, in a case-cohort study design. Let $S$ denote an indicator of whether a subject was sampled ($S = 1$). For individual $i$, let $\pi_{j t_i}$ be the sampling fraction for state $j$ at time $t_i$, defined as follows:
$$\pi_{j t_i} = P(S = 1 \mid 0 \to j;\ t_i)$$
If the sampling fractions are assumed to be independent of the individual, the sampling fraction for state $j$ can be written as $\pi_j$. Using Bayesian inversion, the probability of being in state $j$ at time $t_i$, given that the subject was sampled, is
$$P(0 \to j;\ t_i \mid S = 1) = \frac{\pi_j P_{0j}(t_i)}{\sum_k \pi_k P_{0k}(t_i)},$$
where the sum runs over all states $k$. The transition probabilities $P_{0j}(t_i)$ are derived from equation (2-4).
Likelihood function, parameter estimation and model validation
The data from the first oral examination were used to estimate the parameters related to the disease natural history. These data yield three possible observed transitions before the first examination: staying normal (state 0 → 0), normal to leukoplakia (state 0 → 1) and normal to invasive carcinoma (state 0 → 2). Based on the conditional probability $P(0 \to j;\ t_i \mid S = 1)$ above, the likelihood function for the normal-leukoplakia-invasive carcinoma cohort with three covariates is
$$L = \prod_i \frac{\pi_{j_i} P_{0 j_i}(t_i)}{\sum_{k=0}^{2} \pi_k P_{0k}(t_i)},$$
where $j_i$ is the state observed for individual $i$ and $t_i$ is the age of individual $i$ at the first examination.
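The Bayesian inversion step and the resulting log-likelihood can be sketched as follows, assuming sampling fractions that do not depend on the individual. Bayes' theorem gives the probability of state $j$ among sampled subjects as $\pi_j P_{0j}$ divided by the sum of $\pi_k P_{0k}$ over all states. The function names, the sampling fractions and the transition probabilities are hypothetical.

```python
import math

def sampled_state_probs(p0, fractions):
    """P(0 -> j | S = 1) for every state j, given P0j and sampling fractions pi_j."""
    weighted = [pi * p for pi, p in zip(fractions, p0)]
    total = sum(weighted)
    return [w / total for w in weighted]

def case_cohort_loglik(records, fractions, transition_probs):
    """Log-likelihood of (age, observed_state) records.

    `transition_probs(age)` must return [P00, P01, P02] evaluated at that age.
    """
    loglik = 0.0
    for age, state in records:
        posterior = sampled_state_probs(transition_probs(age), fractions)
        loglik += math.log(posterior[state])
    return loglik

# Hypothetical values: 10% of normals sampled, all diseased subjects sampled.
FRACTIONS = [0.1, 1.0, 1.0]
print(sampled_state_probs([0.90, 0.08, 0.02], FRACTIONS))

# Toy transition probabilities that ignore age, used only to exercise the code.
records = [(45.0, 0), (52.0, 1), (60.0, 2)]
print(case_cohort_loglik(records, FRACTIONS, lambda age: [0.90, 0.08, 0.02]))
```

When all sampling fractions are equal they cancel, and the inverted probabilities reduce to the original transition probabilities.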
2.3.6 Five-state non-homogeneous stochastic model
Chen et al. further extended the three-state model to a k-state model [45], using the normal-adenoma-carcinoma sequence of colorectal cancer as an example. The natural history of colorectal cancer is classified by adenoma size, with state space Ω = {0, 1, 2, 3, 4}, where state 0 represents normal, state 1 represents diminutive adenoma, state 2 represents small adenoma, state 3 represents large adenoma, and state 4 represents invasive carcinoma. The hazard rate from normal (state 0) to diminutive adenoma (state 1) is allowed to change with time and is denoted as $\lambda_1(t)$, following a Weibull distribution. The remaining transition rates, $\lambda_2$ to $\lambda_4$, are assumed to be constant (homogeneous Markov property), because the algebra becomes considerably more complex if every transition rate is modeled by a Weibull distribution. The natural history of the above process is therefore divided into two parts: 1. a non-homogeneous part for the hazard rate from normal to diminutive adenoma; 2. a homogeneous Markov part for the remaining transitions. The transition matrix is as follows:
$$\begin{pmatrix} -\lambda_1(t) & \lambda_1(t) & 0 & 0 & 0 \\ 0 & -\lambda_2 & \lambda_2 & 0 & 0 \\ 0 & 0 & -\lambda_3 & \lambda_3 & 0 \\ 0 & 0 & 0 & -\lambda_4 & \lambda_4 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
The time of transition from state 0 to 1 is modeled by $\lambda_1(t)$ with a Weibull distribution. The remaining transition matrix $M$, for states 1 to 4, is as below:
$$M = \begin{pmatrix} -\lambda_2 & \lambda_2 & 0 & 0 \\ 0 & -\lambda_3 & \lambda_3 & 0 \\ 0 & 0 & -\lambda_4 & \lambda_4 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
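For the non-homogeneous part, in which the hazard rate of the onset of diminutive adenoma follows a Weibull distribution, the transition probabilities from state 0 (normal) to states 1-4 can be derived as follows.
The probability that a subject stays normal during $[t_1, t_2]$ is
$$P_{00}(t_1, t_2) = 1 - \int_{t_1}^{t_2} f_1(u)\,du \tag{2-10}$$
where $f_1(t)$ is the probability density function of the Weibull distribution for the transition from state 0 to 1.
The probability that an individual progresses from state 0 to state $j$ during $[t_1, t_2]$ is
$$P_{0j}(t_1, t_2) = \int_{t_1}^{t_2} f_1(u) \times P^{M}_{1j}(u, t_2)\,du \tag{2-11}$$
for $j = 1, 2, 3, 4$, where $P^{M}_{1j}(\cdot)$ are the transition probabilities derived from $P^{M}(a, b)$.
The probability of the transition from state 0 to state $j$ by time $t_i$, given that the subject was sampled, is
$$P(0 \to j;\ t_i \mid S = 1) = \frac{\pi_j P_{0j}(t_i)}{\sum_{k=0}^{4} \pi_k P_{0k}(t_i)}.$$
The likelihood function for the adenoma-carcinoma sequence is
$$L = \prod_i \frac{\pi_{j_i} P_{0 j_i}(t_i)}{\sum_{k=0}^{4} \pi_k P_{0k}(t_i)},$$
where $j_i$ is the state observed for individual $i$.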
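Equations (2-10) and (2-11) can be evaluated numerically as sketched below. The sketch assumes a purely progressive homogeneous part (1 → 2 → 3 → 4 with constant rates) and takes $P^{M}(u, t_2)$ to be the matrix exponential of $M (t_2 - u)$. The matrix exponential is computed with a plain Taylor series plus scaling and squaring, written out here only to keep the example self-contained. All names and rates are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(q, t, terms=20, squarings=8):
    """exp(q * t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scaled = [[q[i][j] * t / 2 ** squarings for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + s for r, s in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def weibull_pdf(t, lam, gamma):
    """Weibull density with scale lam and shape gamma."""
    return lam * gamma * t ** (gamma - 1) * math.exp(-lam * t ** gamma)

def probs_from_normal(t1, t2, lam1, gamma1, m, steps=400):
    """[P00, P01, ..., P04] over [t1, t2], using a midpoint rule for the integral."""
    h = (t2 - t1) / steps
    onward = [0.0] * len(m)
    onset = 0.0
    for s in range(steps):
        u = t1 + (s + 0.5) * h
        weight = weibull_pdf(u, lam1, gamma1) * h   # f1(u) du
        onset += weight
        row = mat_exp(m, t2 - u)[0]                 # P^M_{1j}(u, t2), j = 1..4
        for j, p in enumerate(row):
            onward[j] += weight * p
    return [1.0 - onset] + onward

# Hypothetical constant rates for 1 -> 2, 2 -> 3 and 3 -> 4.
LAM2, LAM3, LAM4 = 0.10, 0.15, 0.05
M = [
    [-LAM2, LAM2, 0.0, 0.0],
    [0.0, -LAM3, LAM3, 0.0],
    [0.0, 0.0, -LAM4, LAM4],
    [0.0, 0.0, 0.0, 0.0],
]
print(probs_from_normal(0.0, 50.0, lam1=2e-4, gamma1=2.0, m=M))
```

A library matrix exponential would normally replace `mat_exp`; the hand-written version is used here only to avoid external dependencies.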
2.3.7 Semi-Markov Model
To include death as an absorbing state, the five-state Markov model (Figure 4-2) is extended to the following model.
Because the transition from the current state to the next state, particularly to the absorbing state (death), depends strongly on how long an individual has stayed in the current state, a six-state semi-Markov model is proposed to model the temporal natural history of H-Y based PD.
The state space Ω = {0, 1, 2, 3, 4, 5} is defined similarly to the above. Let $X = \{X_0, X_1, \ldots, X_n\}$ denote the states visited in $n$ observed successive transitions for an individual during a period of time $t$, where $X_0$ is the initial state and $X_n$ is the final state after $n$ transitions. We assume that the total number of transitions is finite and that each $X_k \in \Omega$. Because a six-state semi-Markov process is applied, in addition to $X$, which forms an embedded Markov chain, we also require a sojourn time distribution to describe the time spent in the current state before the transition to the next state. In parallel with $X$, let $T = \{T_0, T_1, \ldots, T_n\}$, where $T_n$ represents the entry time into state $X_n$ after $n$ transitions. Given $X$ and $T$, a semi-Markov process is characterized by the transition probabilities ($P_{ij}$) and the distributions of sojourn time ($F_{ij}(t)$), expressed by
$$P_{ij} = P(X_{n+1} = j \mid X_n = i) \tag{4-7}$$
where $P_{ij}$ is time-homogeneous, and
$$F_{ij}(t) = P(T_{n+1} - T_n \le t \mid X_{n+1} = j, X_n = i).$$
For example, the transition from the SD early H-Y stage (I & II) ($i = 1$) to death ($j = 5$) is determined by the transition probability $P_{15}$ and by the distribution of the time spent in the SD early H-Y stage, $F_{15}(t)$.
$F_{ij}(t)$ is specified by a generalized Weibull distribution expressed by
$$F_{ij}(t) = 1 - \exp\left( -\left( \frac{t}{\sigma_{ij}} \right)^{\nu_{ij}} \right) \tag{4-8}$$
The parameters $\sigma_{ij}$ and $\nu_{ij}$ can change with time.
$\nu_{ij}$ and $\sigma_{ij}$ are estimated using the maximum likelihood method.
Suppose we have $N$ individuals ($m = 1, \ldots, N$) and subject $m$ had $n_m$ successive transitions. The observed sequence is denoted as $\{x_0^m, \ldots, x_{n_m}^m\}$ and the corresponding entry times into the states are denoted by $\{T_0^m, T_1^m, \ldots, T_{n_m+1}^m\}$.
The likelihood function is
$$L(\sigma, \nu) = \prod_{m=1}^{N} \left\{ \prod_{k=1}^{n_m} \left( P_{x_{k-1}^m x_k^m}\, f_{x_{k-1}^m x_k^m}\!\left(T_k^m - T_{k-1}^m\right) \right) \times \left[ \sum_{j \neq x_{n_m}^m} P_{x_{n_m}^m j}\, S_{x_{n_m}^m j}\!\left(T_{n_m+1}^m - T_{n_m}^m\right) \right]^{\delta_{n_m+1}} \right\} \tag{4-9}$$
where $f_{ij}$ and $S_{ij} = 1 - F_{ij}$ are the density and survival functions of the sojourn time distribution $F_{ij}$.
The latter part is related to right censoring, with censoring indicator $\delta$:
$\delta_{n_m+1} = 1$ if $X_{n_m}$ is not the final state
$\delta_{n_m+1} = 0$ otherwise
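The contribution of one subject to a semi-Markov log-likelihood of this kind can be sketched as follows, using Weibull sojourn times of the form $F(t) = 1 - \exp(-(t/\sigma)^{\nu})$. The data layout (dictionaries keyed by state pairs), the function names and all parameter values are assumptions made for illustration; `final_states` is taken to be the set of states from which no further transition is possible.

```python
import math

def weibull_cdf(t, sigma, nu):
    """F(t) = 1 - exp(-(t / sigma)**nu)."""
    return 1.0 - math.exp(-((t / sigma) ** nu))

def weibull_pdf(t, sigma, nu):
    """Density corresponding to weibull_cdf."""
    return (nu / sigma) * (t / sigma) ** (nu - 1) * math.exp(-((t / sigma) ** nu))

def subject_loglik(states, entry_times, end_time, trans_prob, sojourn, final_states):
    """Log-likelihood contribution of one subject.

    states       x_0, ..., x_n (visited states)
    entry_times  T_0, ..., T_n (entry time into each visited state)
    end_time     T_{n+1}, the end of observation
    trans_prob   {(i, j): P_ij} for the embedded Markov chain
    sojourn      {(i, j): (sigma_ij, nu_ij)} Weibull parameters
    """
    loglik = 0.0
    for k in range(1, len(states)):
        i, j = states[k - 1], states[k]
        duration = entry_times[k] - entry_times[k - 1]
        sigma, nu = sojourn[(i, j)]
        loglik += math.log(trans_prob[(i, j)]) + math.log(weibull_pdf(duration, sigma, nu))
    last = states[-1]
    if last not in final_states:
        # Right censoring: the subject is still in `last` at end_time.
        duration = end_time - entry_times[-1]
        surviving = sum(
            p * (1.0 - weibull_cdf(duration, *sojourn[(i, j)]))
            for (i, j), p in trans_prob.items() if i == last
        )
        loglik += math.log(surviving)
    return loglik

# Hypothetical two-exit example: state 1 can move to state 2 or to death (state 5).
TRANS = {(1, 2): 0.6, (1, 5): 0.4}
SOJOURN = {(1, 2): (4.0, 1.0), (1, 5): (4.0, 1.0)}
print(subject_loglik([1], [0.0], 2.0, TRANS, SOJOURN, final_states={5}))
print(subject_loglik([1, 5], [0.0, 3.0], 3.0, TRANS, SOJOURN, final_states={5}))
```

Summing `subject_loglik` over all subjects gives the full log-likelihood, which would then be maximized over the Weibull parameters.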