Semi- and Hidden Markov Process

Chapter 2 Literature Review

2.3 Semi- and Hidden Markov Process

Continuous-time multistate models are widely used to describe the natural history of chronic diseases. However, if the process can be observed only at discrete time points, we have no information about the times or types of events that occur between observations, and inference becomes difficult. To overcome this issue, the Markov assumption is commonly made. It implies that the sojourn time in each disease state follows an exponential distribution, which has the memoryless property, so that the transition rates between states do not depend on the time since entry into the current state.
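The memoryless property can be checked numerically. The following is a minimal sketch in Python. The function names and the exit rate of 0.3 per year are hypothetical and chosen only for illustration. It shows that, for an exponential sojourn time, the probability of remaining in a state for one more year is the same whatever time has already been spent there.

```python
import math

def exp_survival(t: float, rate: float) -> float:
    """P(T > t) for an exponential sojourn time T with the given exit rate."""
    return math.exp(-rate * t)

def conditional_survival(s: float, t: float, rate: float) -> float:
    """P(T > s + t | T > s): probability of staying t longer after s time units."""
    return exp_survival(s + t, rate) / exp_survival(s, rate)

rate = 0.3  # hypothetical exit rate per year (illustrative value only)
for elapsed in (0.0, 2.0, 10.0):
    # The printed probability does not change with the elapsed time.
    print(elapsed, conditional_survival(elapsed, 1.0, rate))
```

A semi-Markov model relaxes exactly this feature by allowing the conditional probability to change with the elapsed time.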

In practice, however, the transition intensities of the process often depend on the time since entry into the current state. Such a process is called a semi-Markov process. Titman^[6] addressed this problem by developing an approach that uses phase-type sojourn time distributions to fit semi-Markov models to panel-observed data. The approach was also extended to data in which the observed states are subject to measurement error.

Panel-observed data are data in which the observation intervals between measurements are identical for a given patient. At each observation time, the disease state of the patient can therefore be observed. The Markov assumption is then no longer needed, so panel-observed data can make inference easier.

Several previous studies proposed other ways to fit semi-Markov processes:

(1) If the observation scheme is sufficiently frequent, the likelihood for a semi-Markov model can be expressed more easily, because all transitions are observed even though the transition times are interval censored. With panel data, however, multiple transitions may occur between observations, and the likelihood requires a multidimensional integral, which becomes very complicated.

(2) When multiple transitions can occur between observations, the likelihood function becomes complex. In a progressive model, in which there is only one possible path of transitions and transitions cannot be reversed, computation of the integral may still be feasible when the model has a small number of states, such as 3 or 4.

(3) Nonparametric estimation is possible via self-consistent estimators in progressive models.

(4) Progressive models can be fitted semiparametrically with penalized likelihood.

(5) In a two-state recurrent model, which allows reverse transitions so that the process can return to its original state, computation of the likelihood becomes more intractable. For evenly spaced observations, a minimum chi-square estimation approach can be used to overcome this problem.

(6) Stopping-time resampling has been proposed as a simulation-based method of computation.

(7) If at least one state in the model has the Markov property, inference for panel-observed semi-Markov models is much easier. Because of the Markov property, the likelihood for an individual can be factorized into the sojourn times until departure from the Markov state.

(8) For a two-state recurrent disease process with panel-observed data, a latent time-homogeneous birth-death process with state space {0, 1, 2, …} was assumed. A subject in state 0 was considered disease free, and all other states were considered ill. Sojourns in the observable illness state are therefore not exponential, and the observable process is a semi-Markov process. However, the computation becomes straightforward when the latent Markov structure of the model allows the likelihood to be expressed as a hidden Markov model (HMM).

In many clinical studies, the states $x_i = X(t_i)$ at the observation times $t_1, \ldots, t_n$ are determined from measurements of a biomarker or a screening test. These measurements may be subject to error, so that there is a nonzero probability that a state is misclassified. Instead of observing $x_1, \ldots, x_n$ directly, we observe $o_1, \ldots, o_n$. The misclassification probabilities are defined as

$P\{O(t) = s \mid X(t) = r\} = e_{rs}$. (2-6)

That is, $e_{rs}$ is the probability that a subject who is truly in state r at time t is observed in state s. If $e_{rs}$ remains constant through time and X(t) is a Markov process, then the observed states are independent conditional on the true underlying states, and $o_1, \ldots, o_n$ can be modeled by an HMM. Because the true states are not observed, each term of the likelihood contribution for an individual depends on the complete history of the process. For each individual, matrices $\mathbf{M}_1, \ldots, \mathbf{M}_n$ are therefore constructed, where $\mathbf{M}_i$ is an R × R matrix with (r, s) entry $p_{rs}(t_{i-1}, t_i)\, e_{s,o_i}$ and $t_0 = 0$. This entry is the probability that a subject who is in state r at time $t_{i-1}$ is truly in state s at time $t_i$ and is classified as being in state $o_i$. The likelihood contribution for an individual can then be written as

$L = \pi \mathbf{M}_1 \mathbf{M}_2 \cdots \mathbf{M}_n \mathbf{1}$, (2-7)

where $\pi$ is the vector of initial state probabilities and $\mathbf{1}$ is a vector of ones of length R. Covariates affecting the transition rates can be modeled by $\mu_{rs}(t; \mathbf{y}) = \mu_{rs}(t)\exp(\beta_{rs}^{T}\mathbf{y})$, where $\mathbf{y}$ is a vector of explanatory variables. Covariate effects may also be incorporated into the matrix of misclassification probabilities by assuming linearity on a logit scale, $\operatorname{logit}(e_{rs}) = \alpha_{rs}^{T}\mathbf{y}$.
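As an illustration of (2-7), the sketch below evaluates the likelihood contribution of one individual in a hypothetical two-state progressive model (state 0 = healthy, state 1 = diseased and absorbing) with a constant transition rate q and no covariates. The model, the parameter values and all function names are assumptions made for this example and are not taken from Titman^[6]. The product is accumulated from left to right, so that only a vector of length R is stored at any time.

```python
import math

def transition_matrix(dt, q):
    """P(dt) for a two-state progressive Markov model with rate q from state 0 to state 1."""
    stay = math.exp(-q * dt)
    return [[stay, 1.0 - stay], [0.0, 1.0]]

def hmm_likelihood(times, observed, q, e, pi):
    """Evaluate L = pi M_1 ... M_n 1, where M_i[r][s] = p_rs(t_{i-1}, t_i) * e[s][o_i]."""
    n_states = len(pi)
    alpha = list(pi)   # row vector, updated by one matrix product per observation
    t_prev = 0.0       # t_0 = 0
    for t, o in zip(times, observed):
        p = transition_matrix(t - t_prev, q)
        m = [[p[r][s] * e[s][o] for s in range(n_states)] for r in range(n_states)]
        alpha = [sum(alpha[r] * m[r][s] for r in range(n_states)) for s in range(n_states)]
        t_prev = t
    return sum(alpha)  # multiplication by the vector of ones

# Hypothetical inputs: e[r][s] = P(observe s | true state r).
e = [[0.9, 0.1], [0.2, 0.8]]
pi = [1.0, 0.0]
print(hmm_likelihood([1.0, 2.0, 3.5], [0, 1, 1], q=0.5, e=e, pi=pi))
```

Because each row of e sums to one, the likelihoods of all possible observation sequences at fixed times add up to one, which is a convenient check on an implementation.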

To describe the Coxian phase-type distribution, the authors used a simple two-state (alive, dead) survival model as an example, demonstrating how a Coxian phase-type distribution can be applied to the sojourn time distribution of each transient state of a general multistate semi-Markov model. Consider a two-state survival model X(t) with states {1 = alive, 2 = dead}, for which the transition intensity from alive to dead is time inhomogeneous. In a Coxian phase-type model, the sojourn time in the transient state is assumed to be governed by a latent Markov process X*(t) with k transient phases and one absorbing phase k+1 (= dead). The latent process is progressive, so movement from transient phase j ∈ {1, …, k} is either to the adjacent phase j+1 or to the absorbing state k+1, as below.

The solid frame represents the observed process X(t), for which we can observe only whether a subject is alive or dead. The dashed frame represents the latent process X*(t), which cannot be observed. At time zero, the process is in phase 1. There are two types of parameters: $(\lambda_1, \ldots, \lambda_{k-1})$, the transition intensities between transient phases, and $(\mu_1, \ldots, \mu_k)$, the transition intensities from each transient phase to the absorbing state. These parameters are constant over time but differ between phases, which induces time inhomogeneity in the movement between the observable states (from alive to dead).
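The way constant phase-specific intensities produce a time-varying observable hazard can be seen in the simplest case, k = 2. The sketch below uses the closed-form phase occupancy probabilities of a two-phase Coxian model. The function names and parameter values are hypothetical, and the formula assumes that λ₁ + µ₁ differs from µ₂.

```python
import math

def coxian2_phase_probs(t, lam1, mu1, mu2):
    """Probabilities of being in phase 1 and phase 2 at time t, starting in phase 1.

    Assumes lam1 + mu1 != mu2, otherwise the closed form below divides by zero.
    """
    a = lam1 + mu1  # total exit rate from phase 1
    p1 = math.exp(-a * t)
    p2 = lam1 * (math.exp(-mu2 * t) - math.exp(-a * t)) / (a - mu2)
    return p1, p2

def coxian2_survival(t, lam1, mu1, mu2):
    """P(alive at t): the subject is still in one of the two transient phases."""
    return sum(coxian2_phase_probs(t, lam1, mu1, mu2))

def coxian2_hazard(t, lam1, mu1, mu2):
    """Observable alive-to-dead hazard: phase-specific death rates weighted by occupancy."""
    p1, p2 = coxian2_phase_probs(t, lam1, mu1, mu2)
    return (mu1 * p1 + mu2 * p2) / (p1 + p2)

# Hypothetical parameter values, chosen only for illustration.
params = dict(lam1=1.0, mu1=0.1, mu2=0.5)
for t in (0.0, 1.0, 5.0, 20.0):
    print(t, coxian2_survival(t, **params), coxian2_hazard(t, **params))
```

With these values the hazard starts at µ₁ and rises towards µ₂ as more of the surviving subjects occupy phase 2, whereas a single exponential phase would give a constant hazard.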

Consider a semi-Markov process X(t) with state space S = {1, …, R}, where R is an absorbing state and t represents time from entry into the initial state. For each nonabsorbing observable state r ∈ S, we assume that there exists a latent process X*(t) with phases $r_1, \ldots, r_k$, but we observe only that the subject is in state r. The state space S* of the latent process X*(t) is

$S^* = \{1_1, 1_2, \ldots, 1_k\} \cup \{2_1, 2_2, \ldots, 2_k\} \cup \cdots \cup \{(R-1)_1, (R-1)_2, \ldots, (R-1)_k\} \cup \{R\}$,

which has dimension k(R − 1) + 1. The observable states need not all have the same number of latent phases.
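The construction of S* can be sketched in a few lines of Python. The function name and the (state, phase) labelling are assumptions made for this illustration. Passing a separate phase count for each nonabsorbing state also covers the case in which the states have different numbers of phases.

```python
def latent_state_space(phase_counts):
    """List the latent states as (state, phase) pairs, followed by the absorbing state R.

    phase_counts[r - 1] is the number of phases of nonabsorbing state r = 1, ..., R - 1.
    """
    states = [
        (r, j)
        for r, k in enumerate(phase_counts, start=1)
        for j in range(1, k + 1)
    ]
    absorbing = len(phase_counts) + 1
    states.append((absorbing, None))  # the absorbing state has no phases
    return states

R, k = 4, 3
space = latent_state_space([k] * (R - 1))
print(len(space))  # k(R - 1) + 1 = 10
```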

The sojourn time distribution of each nonabsorbing state r of X(t) is assumed to be a k-phase Coxian phase-type distribution with parameters $\lambda_{r_1}, \ldots, \lambda_{r_{k-1}}$, the rates of movement between the phases of state r, and $\mu_{r_1 s}, \ldots, \mu_{r_k s}$, the rates of movement out of state r to state s, as follows.

The likelihood can be expressed as in (2-7), where for an individual the matrix $\mathbf{M}_i$ becomes a $\{k(R-1)+1\} \times \{k(R-1)+1\}$ matrix with (r, s) entry $e_{s,x_i}\, p_{rs}(t_{i-1}, t_i)$, for $r, s \in S^*$. Here $e_{s,x_i} = P\{X(t) = x_i \mid X^*(t) = s\}$ takes the value 1 if s is a phase of the observed state $x_i$ and 0 otherwise.

To incorporate misclassification error, the model is extended to a hidden semi-Markov model (HSMM). Details of the framework are given in Titman et al.^[6].

Suppose that the misclassification probabilities are given by the matrix e defined in (2-6) and that the sojourn time in each state of X(t) has a phase-type distribution. If the latent process satisfies $X^*(t) \in \{r_1, \ldots, r_k\}$, then X(t) = r, for r = 1, …, R. The misclassification probabilities for the latent phases are therefore

$e^*_{r_j s} = P\{O(t) = s \mid X^*(t) = r_j\} = P\{O(t) = s \mid X(t) = r\} = e_{rs}$, (2-8)

for r, s = 1, …, R and j = 1, …, k, so they do not depend on j. The latent Markov process X*(t) thus determines X(t) deterministically, and O(t) | X(t) is multinomial.
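Equation (2-8) amounts to repeating each row of e once for every phase of the corresponding observable state. A minimal sketch follows. The function name, the (state, phase) labels and the numerical values are hypothetical.

```python
def expand_misclassification(e, phase_counts):
    """Build the latent-phase misclassification matrix of (2-8).

    e[r - 1][s - 1] = P(O(t) = s | X(t) = r) for observable states 1, ..., R.
    phase_counts[r - 1] is the number of phases of nonabsorbing state r.
    Returns the latent state labels and the matrix whose row for phase r_j is row r of e.
    """
    labels, e_star = [], []
    for r, k in enumerate(phase_counts, start=1):
        for j in range(1, k + 1):
            labels.append((r, j))
            e_star.append(list(e[r - 1]))  # same row for every phase j of state r
    absorbing = len(phase_counts) + 1
    labels.append((absorbing, None))
    e_star.append(list(e[absorbing - 1]))
    return labels, e_star

# Hypothetical three-state example in which state 3 is absorbing and observed without error.
e = [[0.9, 0.1, 0.0],
     [0.2, 0.8, 0.0],
     [0.0, 0.0, 1.0]]
labels, e_star = expand_misclassification(e, [2, 3])
for label, row in zip(labels, e_star):
    print(label, row)
```

The expanded matrix introduces no new parameters: it has one row per latent state, but only the entries of the original R × R matrix e are free.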

The likelihood contribution from an individual can be calculated as

$L = \pi \mathbf{M}^*_1 \mathbf{M}^*_2 \cdots \mathbf{M}^*_n \mathbf{1}$, (2-9)

where $\mathbf{M}^*_i$ is a $\{k(R-1)+1\} \times \{k(R-1)+1\}$ matrix with $(r^*, s^*)$ entry $e^*_{s^*, o_i}\, p_{r^* s^*}(t_{i-1}, t_i)$, for $r^*, s^* \in S^*$. The difference between the HSMM and the semi-Markov model is that $e_{s,x_i}$ in the semi-Markov case is either 0 or 1, whereas in the hidden semi-Markov case the $e_{rs}$ may lie between 0 and 1 and can be treated as unknown parameters.
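To make (2-9) concrete, the following sketch evaluates the likelihood for a hypothetical alive/dead model in which the alive state has a two-phase Coxian sojourn time, so that the latent states are (alive phase 1, alive phase 2, dead). The closed-form latent transition probabilities assume that λ₁ + µ₁ differs from µ₂. All names and values are illustrative assumptions and do not reproduce the implementation of Titman et al.^[6].

```python
import math

def latent_transition_matrix(dt, lam1, mu1, mu2):
    """P(dt) for latent states (alive phase 1, alive phase 2, dead); needs lam1 + mu1 != mu2."""
    a = lam1 + mu1
    p11 = math.exp(-a * dt)
    p12 = lam1 * (math.exp(-mu2 * dt) - math.exp(-a * dt)) / (a - mu2)
    p22 = math.exp(-mu2 * dt)
    return [
        [p11, p12, 1.0 - p11 - p12],
        [0.0, p22, 1.0 - p22],
        [0.0, 0.0, 1.0],
    ]

def hsmm_likelihood(times, observed, lam1, mu1, mu2, e):
    """Evaluate L = pi M*_1 ... M*_n 1; observed states are coded 0 = alive, 1 = dead."""
    # Expansion (2-8): both alive phases share the misclassification row of "alive".
    e_star = [e[0], e[0], e[1]]
    alpha = [1.0, 0.0, 0.0]  # the process starts in phase 1 at time zero
    t_prev = 0.0
    for t, o in zip(times, observed):
        p = latent_transition_matrix(t - t_prev, lam1, mu1, mu2)
        # alpha M*_i, where M*_i[r][s] = p[r][s] * e_star[s][o]
        alpha = [
            sum(alpha[r] * p[r][s] for r in range(3)) * e_star[s][o]
            for s in range(3)
        ]
        t_prev = t
    return sum(alpha)

# Hypothetical values: an alive subject is recorded as dead with probability 0.05.
e = [[0.95, 0.05], [0.0, 1.0]]
print(hsmm_likelihood([1.0, 2.5, 4.0], [0, 0, 1], lam1=1.0, mu1=0.1, mu2=0.5, e=e))
```

Replacing e by the identity matrix gives the semi-Markov likelihood without misclassification, in which case the likelihood of observing "alive" at every visit reduces to the survival probability at the last visit.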

To study the development of bronchiolitis obliterans syndrome (BOS) in patients after lung transplantation, the authors fitted both the HMM and the HSMM to the data to identify which model was better. The results suggested that the time homogeneity assumed by the HMM might not hold, so the HSMM based on the phase-type methodology could provide a better fit to the data. With these methods, they were able to better characterize the natural history of lung function decline after thoracic transplantation.

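In the document: Queue Thresholds and Coxian Phase-Type Statistical Models for Investigating Time Distributions Related to Early Detection and Hospitalization of Colorectal Cancer (pages 21-29)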