The innovation of combining the queue process, hurdle model and Coxian phase-type distribution
The value of the current thesis lies not only in its applications but also in its methodology, and the reciprocal feedback between the two aspects deserves as much emphasis as either aspect alone. It is natural to ask why the Coxian phase-type model is needed here. Would it be sufficient for researchers merely to apply the queue process and the hurdle model with a Poisson process? The combination of the queue process, the hurdle model, and the Coxian phase-type model is motivated by very large population-based screening data in Taiwan.
We are faced with around five million participants eligible for CRC screening with FIT, which in turn yields a high demand for referral of positive FIT cases to colonoscopy. The conventional queue process evaluates the arrival rate against the departure rate, which relates to the service time distribution. However, the non-compliance (non-susceptibility) problem in the referral of positive FIT cases makes the traditional queue process infeasible, so we resort to the hurdle model. In addition, those who consent to undergo colonoscopy may be classified into different types according to their WT for colonoscopy. This is the rationale for using the Coxian phase-type model: to examine whether it can identify hidden phases during the WT and so provide new insight for health promotion aimed at enhancing the referral rate. Although the current thesis integrates three types of model, we still analyzed the data step by step, from a simple statistical approach to the final new queue hurdle Coxian phase-type model, so that the contrasts between the proposed models could be understood by analyzing each component separately.
After modelling the screening data, we turned to the LOS in hospital for CRC patients. Health policy-makers are also concerned with LOS because different types of LOS may reflect different severity of disease status (including the severity of CRC and co-morbidity) as well as the various costs involved in hospitalization, which motivates modelling the transitions between different hidden phases. The incorporation of relevant covariates is also one of the novelties of the current thesis. The idea of this part is the same as that envisaged in the study by Marshall et al.
Thoughts on Statistical Models
As mentioned above, we took a step-by-step approach with various statistical methods to identify factors affecting WT for undergoing colonoscopy. At first, we used the Cox regression model to elucidate factors affecting WT. This may be inappropriate because it does not take non-compliers into account. We then used a logistic regression model for the non-compliance part together with the Cox regression model for the WT distribution, and compared the results with those of the hurdle model, which is a mixture of the logistic regression model and the truncated Poisson regression model.
However, we found that in the non-hurdle part, some characteristics showed dissimilar rates of undergoing colonoscopic examination between the Cox regression and the truncated Poisson regression. One reason is that, in the presence of covariates, the Cox regression model cannot retain a proportional hazards structure if the covariates are modelled through p via a binomial regression model [7]. The hurdle model provides two sets of results. These results can also be obtained separately by fitting a logistic regression and a Poisson model [8], and we found similar results when doing so. The main difference between the hurdle model and the separate logistic regression and Poisson models is that the hurdle model accounts for the covariance between the parameters. As a result, we decided to use the hurdle model to deal with WT for colonoscopy. There is another model with a similar concept, the zero-inflated model, which has been handled with the COM-Poisson [9] and generalized Poisson models [10]. The zero-inflated model can also deal with the zero part (non-compliers), but it is not appropriate for the screening data because most invited subjects are willing to undergo colonoscopy.
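The two-part structure of the hurdle model described above can be illustrated with a minimal sketch. It assumes a single zero probability and a single rate without covariates, with y = 0 standing for non-compliance and y > 0 for a positive count handled by the zero-truncated Poisson part. The function and parameter names (`hurdle_loglik`, `p_zero`, `lam`) are hypothetical and are not taken from the analysis code of this thesis.

```python
import math


def hurdle_loglik(y, p_zero, lam):
    """Log-likelihood of one observation under a Poisson hurdle model.

    y      : observed count; 0 means the hurdle was not crossed
    p_zero : probability of a zero (illustrative name)
    lam    : rate of the zero-truncated Poisson for positive counts
    """
    if y == 0:
        # Hurdle (binary) part only
        return math.log(p_zero)
    # Ordinary Poisson log-pmf
    log_pois = -lam + y * math.log(lam) - math.lgamma(y + 1)
    # Truncate at zero: divide by P(Y > 0) = 1 - exp(-lam)
    log_trunc = log_pois - math.log1p(-math.exp(-lam))
    # Crossing the hurdle times the truncated count part
    return math.log1p(-p_zero) + log_trunc


def total_loglik(ys, p_zero, lam):
    """Sum of per-observation log-likelihoods over a sample."""
    return sum(hurdle_loglik(y, p_zero, lam) for y in ys)
```

In this simplified form the log-likelihood is a sum of a binary term and a truncated-count term, which is in line with the similar estimates we obtained from the hurdle model and from the separate fits.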
In the analysis of LOS for hospitalization, parameter estimation for the Coxian phase-type distribution was nontrivial. Within the maximum likelihood framework, several algorithms are available, such as the Nelder-Mead algorithm, the quasi-Newton algorithm [11] and the Newton-Raphson algorithm. We used these three methods to estimate the parameters and compared their BIC values to determine which method fitted the Coxian phase-type distribution better. In Table 6-1, both the Newton-Raphson and the Nelder-Mead algorithms indicated that the 3-phase model was the most appropriate because it had the minimum BIC, whereas the quasi-Newton algorithm favoured the 2-phase model. In addition, all three algorithms gave the same estimates for the 1- and 2-phase models but different estimates for the 3- and 4-phase models. For both the 3- and 4-phase models, the Newton-Raphson algorithm gave the smallest BIC. We therefore considered the Newton-Raphson algorithm the most suitable method for parameter estimation.
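For the one-phase model the Newton-Raphson iteration can be written in a few lines, which may help to see why all algorithms agree in the simplest case. The sketch below assumes that the one-phase Coxian model reduces to an exponential distribution with log-likelihood n log(mu) - mu * sum(t), and it uses N = 178 and the mean LOS of 13.78 days from Table 5-14 as inputs. The function name and starting value are hypothetical, and this is not the estimation code used in the thesis.

```python
def newton_raphson_exponential(n, total_time, mu0, tol=1e-12, max_iter=100):
    """Maximise l(mu) = n*log(mu) - mu*total_time by Newton-Raphson."""
    mu = mu0
    for _ in range(max_iter):
        score = n / mu - total_time   # first derivative of l(mu)
        hessian = -n / mu ** 2        # second derivative of l(mu)
        step = score / hessian
        mu -= step                    # Newton-Raphson update
        if abs(step) < tol:
            break
    return mu


# Inputs taken from Table 5-14 (assumed to describe the same data set)
n = 178
mean_los = 13.78
mu_hat = newton_raphson_exponential(n, n * mean_los, mu0=0.05)
```

Under these assumptions the iteration converges to 1 / 13.78, which rounds to the value 0.0726 reported for the one-phase model in Table 6-1. For three or more phases the likelihood surface is more complicated, so different algorithms may stop at different points.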
The Coxian phase-type distribution describes the time to absorption of a finite Markov chain in continuous time. It is adequate for continuous, positively skewed data with a long tail and gives a better understanding of the underlying dynamic hidden phases. However, the real WT distribution also includes non-response data (time = 0) and a queue process, which render the conventional Coxian phase-type model inadequate. To solve these issues, we developed the hurdle model in combination with the Coxian phase-type model. In the queue hurdle Coxian phase-type model, the queue process is used to estimate the arrival rate of eligible screenees, the hurdle model determines whether attendees receive the confirmatory diagnosis, and the WT of those who actually comply with colonoscopy is modelled by the Coxian phase-type distribution. This model makes it possible to consider the three scenarios simultaneously.
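A minimal sketch of the hurdle and Coxian components for the two-phase case is given below. It assumes the usual parameterisation in which mu1 and mu2 are the absorption rates from phases 1 and 2 and lam1 is the transition rate from phase 1 to phase 2, and it treats non-compliance as a point mass with probability p. The arrival-rate (queue) component is left out because its exact form is not restated in this section. All function names are hypothetical.

```python
import math


def coxian2_pdf(t, mu1, mu2, lam1):
    """Density of a two-phase Coxian distribution at time t.

    Requires mu1 + lam1 != mu2 (closed form for distinct exit rates).
    """
    a = mu1 + lam1  # total exit rate from phase 1
    # Absorbed directly from phase 1
    direct = mu1 * math.exp(-a * t)
    # Moved to phase 2 first, then absorbed from phase 2
    via_phase2 = lam1 * mu2 * (math.exp(-mu2 * t) - math.exp(-a * t)) / (a - mu2)
    return direct + via_phase2


def coxian2_mean(mu1, mu2, lam1):
    """Expected time to absorption: time in phase 1 plus expected time in phase 2."""
    a = mu1 + lam1
    return 1.0 / a + (lam1 / a) / mu2


def hurdle_coxian2_loglik(t, complied, p, mu1, mu2, lam1):
    """Log-likelihood of one screenee; p is the non-compliance probability."""
    if not complied:
        return math.log(p)
    return math.log1p(-p) + math.log(coxian2_pdf(t, mu1, mu2, lam1))
```

The functions can be evaluated with the two-phase estimates in Table 5-6, for example `coxian2_mean(0.03040, 0.00590, 0.00043)`, bearing in mind that the result ignores the queue component.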
Given limited clinical resources, the queue hurdle Coxian phase-type model not only provides new insight into the underlying mechanism of WT for early detection and of the duration of hospitalization for CRC, but can also help clinicians and hospital managers improve the quality of service and support decision-making. When the model was applied to the population-based screening program, with its problems of queueing and non-response to colonoscopy, the findings gave clues to the reasons behind such differences. These include provider factors, such as the implementation of the screening program and medical resources, and population factors, such as knowledge of and attitudes toward CRC screening and medical interventions. The findings also provide more insight into promoting the referral of participants with positive FIT results identified in the screening program.
Limitations
This new model assumes that the arrival rate and the probability of non-compliance are independent. In practice, however, the probability of non-compliance may be affected by the arrival rate. To cope with the correlation between these parameters, a hierarchical model could be used, because complicated processes can be modelled by a sequence of relatively simple models placed in a hierarchy.
In conclusion, we developed a new queue hurdle Coxian phase-type model. The queue process addresses compliance with the uptake of screening; the hurdle model addresses non-compliance with the referral of screenees with positive results for confirmatory diagnosis; and the Coxian phase-type model identifies hidden phases during the WT for colonoscopy among those referred. The Coxian phase-type model was also applied to the LOS in hospital for treated patients diagnosed with CRC.
References
1. Chiu HM, Chen LS, Yen MF, Chiu YH, Fann CY, Lee YC, Pan SL, Wu MS, Liao CS, Chen HH, Koong SL, Chiou ST. Effectiveness of Fecal Immunochemical Testing in Reducing Colorectal Cancer Mortality From the One Million Taiwanese Screening Program. Cancer. 2015; 10.1002
2. Zorzi M, Fedeli U, Schievano E, Bovo E, Guzzinati S, Baracco S, Fedato C, Saugo M, Dei Tos AP. Impact on colorectal cancer mortality of screening programmes based on the faecal immunochemical test. GI cancer. 2014; 10.1136
3. Yu D, Hopman WM, Paterson WG. Wait time for endoscopic evaluation at a Canadian tertiary care centre: Comparison with Canadian Association of Gastroenterology targets. Can J Gastroenterol. 2008; 22(7): 621-6.
4. Marshall AH, Shaw B, McClean SI. Estimating the costs for a group of geriatric patients using the Coxian phase-type distribution. Statistics in Medicine. 2007; 26: 2716-2729.
5. Marshall AH, McCrink L. Discrete Conditional Phase-Type Model (DC_Ph) for patient waiting time with a logistic regression component to predict patient admission to hospital. Computer-Based Medical Systems. 2009; 553-556.
6. Titman AC, Sharples LD. Semi-Markov Models with Phase-Type Sojourn Distributions. Biometrics. 2010; 66: 742-752.
7. Ibrahim JG, Chen MH, Sinha D. Bayesian Survival Analysis. 2001, New York: Springer-Verlag.
8. Dwivedi AK, Dwivedi SN, Deo S, Shukla R, Kopras E. Statistical models for predicting number of involved nodes in breast cancer patients. Health (Irvine Calif). 2010; 2(7): 641-651.
9. Conway RW, Maxwell WL. A queuing model with state dependent service rates. Journal of Industrial Engineering. 1962; 12: 132-136.
10. Consul PC, Famoye F. Generalized Poisson regression model. Communications in Statistics, Theory and Methods. 1992; 21: 89-109.
11. Marshall AH, Zenga M. Recent developments in fitting Coxian phase-type distributions in healthcare. The XIIIth International Conference “Applied Stochastic Models and Data Analysis”. 2009; 482-485.
Figure 3-1. Demographics of screening participants in Taiwanese national CRC screening program from 2004 to 2013
Figure 3-2. Time trend of screening participants number and FIT positive rate in Taiwanese nationwide CRC screening program
Figure 3-3. Time trend of referral rate and waiting time for colonoscopy in Taiwanese nationwide CRC screening program
Figure 5-1. Empirical data on Marshall’s study (a) and our simulated data (b)
Figure 5-2. Empirical data on waiting time for colonoscopy
Figure 5-3. Transition probabilities of Coxian two-phase model by risk score
Figure 5-4. Empirical data on LOS
Figure 5-5. Fitted three-phase Coxian phase-type distribution for SKH data set
Figure 5-6. Transition probability over time by gender
Figure 5-7. Transition probability over time by age
Table 3-1. Demographics of screening participants in Taiwanese national CRC screening program from 2004 to 2013
Characteristics
Female 2,936,358 150,080 (5.11)
Age (years) 50-54 1,496,838 78,316 (5.23)
55-59 1,434,402 85,178 (5.94)
60-64 1,143,251 81,531 (7.13)
65-69 903,859 71,839 (7.95)
Geographic area Northern 2,214,345 134,884 (6.09)
Middle 1,179,006 76,333 (6.47)
Southern 1,405,194 93,844 (6.68)
Eastern/offshore islands 179,805 11,803 (6.56)
Type of screening units Hospital 2,574,431 185,103 (7.19)
Public health centers 1,954,430 93,811 (4.80)
Local clinics 449,489 37,950 (8.44)
Urbanization Main urban 3,787,368 241,452 (6.38)
Secondary urban 352,888 21,343 (6.05)
Rural 838,094 54,069 (6.45)
Period 2004-2009 1,254,391 46,151 (3.68)
2010-2013 3,723,959 270,713 (7.27)
Screening round Prevalent 3,027,035 201,128 (6.64)
Subsequent 1,951,315 115,736 (5.93)
Overall 4,978,350 316,864 (6.36)
Table 3-2. Descriptive results of attendees, positive rate, referral rate, the distribution of waiting time (WT)
Year  Number of  Number of  Positive  Referral rate (%)     WT, overall   WT, colonoscopy
      attendees  positive   rate (%)  Overall  Colonoscopy  Median    Q3    Median     Q3
                 attendees
2004     83,756      2,886       3.5     66.7         50.6      26    42        27     43
2005    194,583      6,959       3.6     76.8         60.7      25    44        25     43
2006    210,114      6,576       3.1     82.7         72.8      24    43        24     43
2007    259,450      8,757       3.4     86.4         78.7      24    40        24     40
2008    218,712      7,587       3.5     86.9         77.3      22    35        22     35
2009    287,776     13,386       4.7     84.8         77.3      23    38        23     37
2010    940,241     64,559       6.9     65.7         56.2      25    43        25     43
2011    765,036     57,391       7.5     64.1         57.0      24    42        24     42
2012  1,016,069     72,970       7.2     65.8         59.8      27    47        27     47
2013  1,002,613     75,793       7.6     67.4         63.7      28    51        28     51
Table 3-3. Comparison of referral rate and median WT for colonoscopy in inaugural and rolling out period
Characteristics
Inaugural period (2004-2009) Rolling out period (2010-2014) No. of subjects
Table 3-4. Distributions of discharge types
Type of Discharge                    N  Mean of LOS (day)     SD  Min  Max
1 : Discharge                        2                2.5   0.71    2    3
3 : Discharge with OPD arranged    123              11.21  14.79    1   93
4 : Death                           24              29.86  45.44    3  215
5 : AMAD                            22                 11  12.58    1   40
6 : Transferred                      5               13.2  13.03    3   34
A : AMAD under critical condition    2                 22  18.38    9   35
*OPD : outpatient department ; AMAD : against medical advice discharge
Table 5-1. Results for fitting Coxian phase-type distribution to the simulated data on LOS of Marshall study compared with the original findings
Marshall data Simulated data
No. of
Table 5-2. Univariate analysis of factors affecting the compliance with colonoscopy and WT for undergoing colonoscopy
Characteristics; Hurdle part: Coefficient, OR (95% CI); Non-hurdle part: Coefficient, RR (95% CI); P-value
Gender Male -0.5440 * 1 -3.7709 * 1 0.3679
Female 0.0119 1.012 (0.991,1.033) 0.0083 1.008 (0.996,1.021)
Age (years) 50-54 -0.5628 * 1 -3.7703 * 1 < 0.0001
55-59 0.0076 1.008 (0.984,1.032) 0.0060 1.006 (0.992,1.020)
60-64 0.0185 1.019 (0.995,1.043) 0.0050 1.005 (0.991,1.020)
65-69 0.0770 1.080 (1.054,1.107) 0.0017 1.002 (0.987,1.017)
Geographic area Northern -0.5507 * 1 -3.7521 * 1 < 0.0001
Middle 0.0334 1.034 (1.012,1.057) 0.0370 1.038 (1.024,1.051)
Southern 0.0080 1.008 (0.988,1.029) -0.0680 0.934 (0.923,0.946)
Eastern/offshore islands 0.0492 1.050 (1.004,1.100) -0.0735 0.929 (0.903,0.956)
Type of screening units Hospital 0.3353 1.398 (1.369,1.428) -0.0912 0.913 (0.902,0.924) < 0.0001
Public health centers -0.8399 * 1 -3.6916 * 1
Local clinic 0.8226 2.276 (2.208,2.347) -0.2178 0.804 (0.788,0.821)
Urbanization Main urban 0.0284 1.029 (0.992,1.067) -0.0320 0.968 (0.948,0.990) < 0.0001
Secondary urban -0.5742 * 1 -3.7415 * 1
Rural 0.0829 1.086 (1.043,1.132) -0.0055 0.995 (0.970,1.020)
Period 2004-2009 -1.0216 * 1 -3.5508 * 1 < 0.0001
2010-2013 0.5585 1.748 (1.694,1.804) -0.2551 0.775 (0.762,0.788)
Screening round Prevalent 0.3688 1.446 (1.415,1.478) -0.0083 0.992 (0.979,1.004) < 0.0001
Subsequent -0.7768 * 1 -3.7620 * 1
* : intercept
Table 5-3. Model Selection for the hurdle regression model for the possible interaction assessment of putative factors
Types of Model (additional variables)                                    df  AIC     P-value
H : None                                                                 34  413693  <0.0001
N : period*unittype (Exclude gender)
H : period*unittype, period*area                                         37  413605  <0.0001
N : period*unittype, period*area (Exclude gender)
H : period*unittype, period*area                                         39  413578  <0.0001
N : period*unittype, period*area, period*urban (Exclude gender)
All models contain gender, age, area (Geographic area), unittype (Type of screening units), urban (Urbanization), subs (Screening round) and period effect.
H : Hurdle part N : Non-hurdle part
Table 5-4. Multivariate analysis on main effect and interaction of factors affecting the non-compliance with colonoscopy
Characteristics Coefficient aOR (95% CI) P-value
Gender male -1.0836 * 1 <0.0001
female 0.0744 1.077 (1.061,1.093)
Age 50-54 -1.0836 * 1 <0.0001
55-59 0.1050 1.111 (1.088,1.134)
60-64 0.1369 1.147 (1.123,1.171)
65-69 0.2432 1.275 (1.247,1.303)
Urbanization Main urban -1.0836 * 1 <0.0001
Secondary urban 0.0343 1.035 (1.004,1.066)
Rural 0.1336 1.143 (1.117,1.169)
Screening round Prevalent 0.4464 1.563 (1.537,1.588) <0.0001
Subsequent -1.0836 * 1
Period 2004-2009 Geographic area Northern -1.8200 ** 1 <0.0001
Middle -0.0132 0.987 (0.927,1.046)
Southern 0.1534 1.166 (1.099,1.233)
Eastern/offshore islands 0.4115 1.509 (1.355,1.664)
Type of screening units Hospital 0.9326 2.541 (2.394,2.688) <0.0001
Public health centers -1.8200 ** 1
Local clinic 0.5400 1.716 (1.200,2.232)
2010-2013 Geographic area Northern -1.0836 * 1
Middle 0.0761 1.079 (1.057,1.101)
Southern 0.0360 1.037 (1.017,1.056)
Eastern/offshore islands 0.0051 1.005 (0.957,1.054)
Type of screening units Hospital 0.0768 1.080 (1.058,1.102)
Public health centers -1.0836 * 1
Local clinic 0.5839 1.793 (1.744,1.842)
* : intercept ; ** : intercept and period effect
Table 5-5. Multivariate analysis of main effect and interaction of factors affecting WT for undergoing colonoscopy
Characteristics Coefficient aRR (95% CI) P-value
Age 50-54 0.0217 1.022 (1.009,1.035) 0.0460
Period 2004-2009 Geographic area Northern 0.0321 1.033 (0.968,1.097) <0.0001
Middle 0.1276 1.136 (1.069,1.203)
Southern 0.0770 1.080 (1.014,1.147)
Eastern/offshore islands -3.7554 * 1
Type of screening units
Hospital -0.2038 0.816 (0.657,0.975) <0.0001
Public health centers 0.2006 1.222 (0.987,1.458)
* : intercept ; ** : intercept and period effect
Table 5-6. The estimated results of Coxian phase-type models
No. of phases
  $\hat{p} = 0.26472$ (0.00205) (non-compliance)
  $\hat{\mu}_1 = 0.02870$ (0.00016) (referral rate)
  769183
2
  $\hat{\nu} = 0.00021$ ($9.7 \times 10^{-7}$) (arrival rate)
  $\hat{p} = 0.26472$ (0.00205) (non-compliance)
  $\hat{\mu}_1 = 0.03040$ (0.00019) (referral rate)
  $\hat{\mu}_2 = 0.00590$ (0.00046) (referral rate)
  $\hat{\lambda}_1 = 0.00043$ (0.00006) (transition rate)
  768284
3
  $\hat{\nu} = 0.00021$ ($9.7 \times 10^{-7}$) (arrival rate)
  $\hat{p} = 0.26472$ (0.00205) (non-compliance)
  $\hat{\mu}_1 = 0.03037$ (0.00019) (referral rate)
  $\hat{\mu}_2 = 0$ (0.00701) (referral rate)
  $\hat{\mu}_3 = 0.00633$ (0.00062) (referral rate)
  $\hat{\lambda}_1 = 0.00031$ (0.00011) (transition rate)
  $\hat{\lambda}_2 = 0.01708$ (0.00899) (transition rate)
  768308
Table 5-7. The expected WT calculated with queue hurdle Coxian two-phase phase-type model
Table 5-8. Estimated results of queue hurdle one-phase Coxian phase-type model with the covariate of risk score affecting WT for the referral of colonoscopy
No. of phases
  $\hat{p} = 0.26471$ (0.00205) (non-compliance)
  $\hat{\mu}_{01} = 0.02604$ (0.00021)
  $\hat{\beta}_1 = 0.19073$ (0.01096)
  $WT_{low} = 38$, $WT_{high} = 32$
  768868
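The two expected waiting times in Table 5-8 can be checked with a short calculation. The sketch assumes a log-linear effect of the risk score on the one-phase rate, that is, a rate of mu01 * exp(beta1 * x) with x = 0 for the low-risk group and x = 1 for the high-risk group, and an expected WT equal to the reciprocal of the rate. This coding of the covariate is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def expected_wt(mu0, beta, x):
    """Expected WT for a one-phase model with rate mu0 * exp(beta * x)."""
    return 1.0 / (mu0 * math.exp(beta * x))


# Estimates taken from Table 5-8
wt_low = expected_wt(0.02604, 0.19073, 0)   # assumed coding: low risk, x = 0
wt_high = expected_wt(0.02604, 0.19073, 1)  # assumed coding: high risk, x = 1
```

Under these assumptions the two values round to 38 and 32, in agreement with the WT values listed in the table.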
Table 5-9. Estimated results of queue hurdle two-phase Coxian phase-type model with the covariate of risk score affecting WT for the referral of colonoscopy
Table 5-10. Estimated results of fitting Coxian phase-type distribution to SKH data set
No. of phases  Parameters (SD)                  BIC
1              $\hat{\mu}_1 = 0.0726$ (0.0054)  1295
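The one-phase row of Table 5-10 can be reproduced from summary statistics alone. The sketch assumes that the SKH data set is the sample of N = 178 patients with a mean LOS of 13.78 days summarised in Table 5-14, that the one-phase model is an exponential distribution, and that BIC is defined as -2 log L + k ln(n) with k = 1. The function name is hypothetical.

```python
import math


def exponential_bic(n, mean_time, n_params=1):
    """BIC and rate estimate for an exponential model from n and the sample mean."""
    mu_hat = 1.0 / mean_time                     # closed-form MLE of the rate
    # Log-likelihood at the MLE: n*log(mu) - mu * (sum of times)
    loglik = n * math.log(mu_hat) - mu_hat * n * mean_time
    bic = -2.0 * loglik + n_params * math.log(n)
    return bic, mu_hat


# Summary statistics taken from Table 5-14 (assumed to be the same sample)
bic, mu_hat = exponential_bic(n=178, mean_time=13.78)
```

Under these assumptions the rate rounds to 0.0726 and the BIC rounds to 1295, matching the one-phase row of the table.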
Table 5-11. The expected LOS in phase i (days) among the three-phase Coxian Phase-type models
Table 5-12. The comparison of two 3-phase Coxian models assuming three and two absorbing rates
Original Model (3-phase model) (three absorbing rates)
Alternative Model
Table 5-13. Model selections for Coxian phase-type model
No. of phases  LOS (days)               BIC
1              LOS=14                   1470
2              LOS1=9 LOS2=36           1451
3              LOS1=6 LOS2=2 LOS3=30    1439
Table 5-14. Descriptive results of length of stay (LOS) by gender
Variable    N  Mean of LOS (day)  Median of LOS (day)     SD  Min  Max
Female     70              16.04                    9  19.21    1  103
Male      108              12.31                    5  23.68    1  215
Total     178              13.78                    7  22.05    1  215
Table 5-15. Estimated results on transition rates and regression coefficients regarding the effect of gender in two-phase Coxian phase-type model
Table 5-16. Descriptive results of length of stay (LOS) by age
Variable    N  Mean of LOS (day)  Median of LOS (day)     SD  Min  Max
< 60       53              17.45                    7  35.29    1  215
60-74      65              10.18                    5  12.54    1   73
> 74       60              14.43                  8.5  12.80    2   60
Total     178              13.78                    7  22.05    1  215
Table 5-17. Estimated results on transition rates and regression coefficients regarding the effect of age in two-phase Coxian phase-type model
Parameter
Table 6-1. The estimated results of Coxian phase-type models with three approaches
Method         Newton-Raphson                Quasi-Newton                  Nelder-Mead Simplex
No. of phases  Parameters              BIC   Parameters              BIC   Parameters              BIC
1              $\hat{\mu}_1 = 0.0726$  1300  $\hat{\mu}_1 = 0.0726$  1300  $\hat{\mu}_1 = 0.0726$  1300