Chapter 2 Literature Review
2.3 Intervention programs for HAI
2.3.3 Bundle-based intervention
In Cheng et al’s study in ICU in Chi Mei medical center, they introduced a
catheter-associated urinary tract infection (CAUTI) bundle since August, 2013. They found that the CAUTI was reduced to zero 52. Interestingly, they also reported that the bundle effect could affect other infection sites and decreased the incidence, including ventilator-associated pneumonia (from 3.69 to 2.90/1,000 ventilator days), central line-associated bloodstream infection (from 2.08 to 1.92/1,000 catheter days) and HAI (from 7.30 to 4.91 per 1,000 inpatient days) 53.
16 2.4 HAIs modelling
Many longitudinal follow-up health issues can be quantified and represented as time series tracings. Time series analysis have been well applied in many areas, such as mechanics, physics, meteorology, genetics and finances. In the area of epidemiology, incidence and mortality studies using time series analysis have become commonplace in these decades.
Multivariable Poisson regression was used for the analysis of Escherichia coli BSIs incidence by Al-Hassam et al. in a population-based study and K. pneumoniae by Anderson et al. 34,35. Chazan et al. used Poission generalized linear models with the assumption of constant dispersion over time to evaluate Escherichia coli BSIs 33.
Multivariable ARIMA time series with trend and lagged temperature was applied in the Kaier et al.’s study on ESBL-producing E.coli and Klebsiella species 16.
The incidences of infectious diseases have the characteristics of seasonality, autocorrelations, cyclic pattern which were noted in several studies 15. Observations close in time tend to have higher correlations with each other. The cardinal assumption of traditional regression model, chi-square test and ANOVAs is independent of the observations, which violates the characteristics of clinical infectious diseases. Moreover, the intervention policy lunched needs time to take
17
effect. When effects persist over time, an appropriate model will include lagged variables.
Interventional time-series were used in many studies to evaluate the intervention impact. Vernaz et al. studied the effects of antibiotics use and hand rub consumption on the incidence of methicillin-resistant Staphylococcus aureus (MRSA) and Clostridium difficile. They used ARIMA model with transfer functions and
demonstrated that the lagged effects were seen in different antibiotics classes and the consumption of alcohol-based hand rub. No seasonal variation was detected in the monthly incidence of MRSA. This study did not distinguish the community-acquired or healthcare-associated MRSA infections 54. Another study of MRSA incidence in neonate intensive care unit by Sakamoto et al. also applied ARIMA (0,1,1) model and included 4 lag zero covariates. Although they showed that there were no significant association seen in the patient-to-nurse ratio, bed occupancy rates, and MRSA colonization pressure, there was possibility that the delayed effect of these factors could affect the incidence of MRSA. Still there was no seasonal variation noted in their study 55.
The prior studies have been limited by their inability to incorporate covariates that might be important confounding factors 56, to assess the dependence structure of
18
consecutive observations 15,17,35, or limited to some specific microorganisms 15,35,56. Time series models, however, take correlations of the successive observations (incidences) into account and give more valid inferences. Time series applying Box-Jenkins model and state-space model can manage these problems.
Some empirical methods can also be used in the time series forecasting.
Exponential weighted moving average is very popular for economical use, such as forecasting inventory level, monthly sales, etc. In a non-seasonal time series without trend, the future value of the observation can be obtained.
𝑦̂𝑡+1|𝑡 = 𝜆 ∑(1 − 𝜆)𝑗𝑦𝑡−𝑗 (0 ≤ 𝜆 < 1)
𝑡−1
𝑗=0
(2-1)
It shows that the future value is the linear combinations of the past
observations, with different weights. It is known as exponential smoothing.
𝑦̂𝑡+1|𝑡 = λ𝑦𝑡+ (1 − 𝜆)𝑦̂𝑡|𝑡−1 (2-2)
𝑦̂𝑡+1|𝑡 is a weighted average of 𝑦𝑡 and 𝑦̂𝑡|𝑡−1 , and the updating of the time t-1
forecast is achieved when the new observation 𝑦𝑡 is available. Though it is friendly for the user, the forecast uncertainty or estimate uncertainty cannot be assessed easily by exponentially weighted moving average method.
19 2.4.1 Poisson models
Since the incidence of HAIs and associated covariates were of interest, the Poisson model can be expressed as follows. log{𝑛𝑖} = log{𝑁𝑖} + x𝑖′𝜷 , where ni
denotes number of events of interest (episodes of HAIs), Ni denotes patient-days (offset) in each level, x is the covariates vector, and 𝛃 was the corresponding regression coefficients. Poisson regression models were used to calculate the relative risk of developing HAIs 57. The study by Shen showed that attack rate together with case-fatality rate led to an increase in mortality, particularly seen in E.coli and fungus
57.
2.4.2 Negative binomial models
Negative binomial regression models were used to estimate the relative risk of developing HAIs in the cases of over-dispersion noted in the Poisson model.
The negative binomial distribution describes the number of failures before the rth success. The probability of the number of failures required before having the rth
success given 𝑟, 𝑝 is expressed as P(Y = 𝑦|𝑟, 𝑝) = (𝑦 + 𝑟 − 1
𝑦 ) 𝑝𝑟(1 − 𝑝)𝑦, 𝑦=0,1,2,… (2-3)
20
Y is the negative binomial random variable and it denotes the number of failures before the rth success; p is the probability of success and r is the number of success events 58. Let 𝑌1, 𝑌2, … , 𝑌𝑛be a set of count data. The data is negative binomial distributed and can be expressed as
P(Y𝑖 = 𝑦𝑖|𝛼, 𝜇𝑖) =Γ(𝑦𝑦 𝑖+1 𝛼⁄ )
𝑖!Γ(1 𝛼⁄ ) (1+𝛼𝜇1
𝑖)1 𝛼⁄ (1 −1+𝛼𝜇1
𝑖)𝑦𝑖 (2-4) In Poisson distribution, the variance of 𝑌𝑖, var(𝑌𝑖), is equal to the expectation of 𝑌𝑖, say E(𝑌𝑖). When the var(𝑌𝑖) is larger than E(𝑌𝑖), it is called the over-dispersion.
Negative binomial model is well known for its advantage for adjusting
over-dispersion by introducing an additional parameter. It provides a model with var(𝑌𝑖)=αE(𝑌𝑖). The α is the parameter that can be estimated and α is larger than one
59. Over-dispersion occurs in many situations, such as lack of independence between
the observations, inappropriate link functions between the random component and the systemic components, and the existence of unmeasured covariates. In ecological bird migration studies, Linden et al. showed that sampling, flocking behavior or aggregation, environmental variability, or combinations of these factors could results in over-dispersion 60. Luo J. et al. uses negative binomial distribution to model hypoglycemic events with baseline hypoglycemic event rate adjustment 61.
21 2.4.3 Autoregressive models
Let 𝑦𝑡 denote the random component of generalized linear model (GLM) for a stationary time series data. A pth order autoregressive regression model under the
context of GLM is expressed as follows.
𝑌𝑡= ∅1𝑌𝑡−1+ ∅2𝑌𝑡−2+ ⋯ + ∅𝑝𝑌𝑡−𝑝+ 𝑢𝑡 𝑡 = 1,2, … , T
(2-5)
When p = 1 , it is often called the first-order autoregressive model denoted by AR(1)
with the following expression
𝑌𝑡= ∅𝑌𝑡−1+ 𝑢𝑡 (2-6)
If the data are centered, the equation is re-expressed as
(𝑌𝑡− 𝜇) = ∅(𝑌𝑡−1− 𝜇) + 𝑢𝑡 (2-7)
𝑢 is the mean of the series.
Note that |∅| < 1 is required to meet the stationarity. From the equation (2-5), we have
V(𝑌𝑡) = ∅2V(𝑌𝑡−1) + 𝑉(𝑢𝑡)
22
𝑉0 = ∅2𝑉0+ 𝜎𝑢2 (2-8)
𝑉0 = 𝜎𝑢2 1 − ∅2
E(𝑌𝑡𝑌𝑡−𝑚) = ∅𝐸(𝑌𝑡−1𝑌𝑡−𝑚) + 𝐸(𝑌𝑡−𝑚𝑢)
Applying the property of stationarity of the series and the independence of 𝑌𝑡−1and 𝑢𝑡 By induction, we have autocorrelation function
𝑉𝑚= ∅𝑚𝑉0
𝜌𝑚 =𝑉𝑉𝑚
0 = ∅𝑚 (2-9)
As |∅| < 1 , the autocorrelation function is exponentially decreasing in m.
For 0 < ∅ < 1, all 𝜌𝑚 are positive.
For −1 < ∅ < 0, 𝜌1 < 0, the sign of subsequent autocorrelations alternate.
23
When = 2 , a stationary series 𝑌𝑡 is the second-order autoregressive model denoted
by AR(2) with the following expression.
𝑌𝑡= ∅1𝑌𝑡−1+ ∅2𝑌𝑡−2+ 𝑢𝑡 (2-10)
Using a similar derivation, as done for the AR(1), we have
𝑉𝑘 = ∅1𝑉𝑘−1+ ∅2𝑉𝑘−2 (2-11)
𝜌𝑚 = ∅1𝜌𝑚−1+ ∅2𝜌𝑚−2 (2-12)
To meet stationarity, the coefficients ∅1 and ∅2 have to satisfy the following
constraints
∅1+ ∅2 < 1, ∅2− ∅1 < 1, −1 < ∅2 < 1
for the AR(2) model.
The equation (4.3.8) is so called Yule-Walker equation.
For 𝑚 = 1,
𝜌1 = ∅1𝜌0+ ∅2𝜌−1 𝜌0 = 1 and 𝜌−1= 𝜌1, we have
𝜌1 = ∅1 1 − ∅2 For 𝑚 = 2,
𝜌2 = ∅1𝜌1+ ∅2
24
𝜌2 = ∅2+1−∅∅12
2
From (2-10), we also have
V(𝑌𝑡) = ∅12V(𝑌𝑡−1) + ∅22𝑉(𝑌𝑡−2) + 2∅1∅2𝐶𝑜𝑣(𝑌𝑡−1, 𝑌𝑡−2) + 𝜎𝑢2 (2-13)
Following (2-11), we have
𝑉1 = ∅1𝑉0+ ∅2𝑉−1
= ∅1𝑉0+ ∅2𝑉1
𝑉1 = ∅1 𝑉0 1 − ∅2 Substituting of this equation into (2-13)
𝑉0 =(1−∅ 𝜎𝑢2(1−∅2)
2)(1−∅12−∅22)−2∅1∅2 (2-14)
The use of AR(1) and AR(2) mentioned above is assumed with a stationary time-series. However, most of time series are not-stationary. To solve this problem, we first define.
B(𝑦𝑡) = 𝑦𝑡−1
B( . ) represents backward shift parameter 𝑦𝑡− ∅1𝑦𝑡−1− ∅2𝑦𝑡−2− ⋯ − ∅𝑝 𝑦𝑡−𝑝
25
= 𝑦𝑡(1 − ∅1𝐵 − ∅2𝐵2− ⋯ − ∅𝑃𝐵𝑃) = 𝑢𝑡
⇒ ∅(B)𝑦𝑡 = 𝑢𝑡
A non-stationary time series can be transformed to stationarity by differencing (of order d).
For instance,
𝑍𝑡− 𝑍𝑡−1 = (1 − 𝐵)𝑍𝑡 = (1 − 𝐵)2𝑦𝑡 is stationary.
⇒d=2
2.4.4 Moving average models
yt = μt− θ1 μt−1− θ2 μt−2− ⋯ − θq μt−q
The moving average of order q is denoted by MA(q) with the following expression
yt= μt− θ μt−1 (2-15)
E(yt) = 0
ν0 = ν(yt) = σ2+ θ2σμ2 = σ2(1 + θ2)
Cov(yt, yt−1) = E(yt yt−1)
= E[(μt− θ μt−1)(μt−1− θ μt−2)]
= E(μt μt−1) − θ[E(μt−12 ) + E(μt μt−2)] + θ2E(μt−1 μt−2)
∵ μ1 , μ2, … are independent with E(μt) = 0 for all t.
ν1 = Cov(yt, yt−1) = −θσμ2
26
Cov(yt, yt−m) = 0, m = 2,3, …
The autocorrelation function is expressed as follows ρ1 = ν1
ν0 = − θ
1 + θ2 (2-16)
ρm = 0, m = 2,3, …
Replacing θ with 1θ , (2-16) gives the same autocorrelation.
The re-arrangement of 2-15 gives
μt= yt+ θ μt−1
= yt+ θ(yt−1+ θ μt−2) = yt+ θ yt−1+ θ2 μt−2
μt = yt+ θ yt−1+ θ2 yt−2+ ⋯
⟹ yt= −(θ yt−1+ θ2 yt−2+ ⋯ ) + μt
If |θ| < 1, it can be seen that MA(1) model is easily inverted to an infinite-order AR model.
27
Second-order Moving Average Model MA(2)
yt = μt− θ1 μt−1− θ2 μt−2 (2-17)
The auto covariance functions are
ν1 = Cov(yt, yt−1)
= −θ1 σμ2+ θ1 θ2 σμ2 = −(θ1+ θ2)σμ2 ν2 = Cov(yt, yt−2) = −θ2 σμ2 ν0 = ν(yt) = σμ2+ θ12σμ2 + θ22σμ2
= (1 + θ12+ θ22) σμ2
The autocorrelation functions for MA(2) process are ρ1 =ν1
2.4.5 Autoregressive moving average models
With the application of the p-order of AR and the q-order of MA, the general form of combining AR with MA is expressed as follows
yt = (ϕ1 yt−1+ ϕ2 yt−2+ ⋯ + ϕp yt−p)
+ (μt− θ1μt−1− θ2 μt−2− ⋯ − θqμt−q)
(2-20)
which is denoted as ARMA(p,q).
28
When p=1 and q=1, we have ARMA(1,1), which is expressed by
yt= ϕ yt−1+ μt− θ μt−1
The auto covariance for the ARMA(1,1) is
E(yt yt−m) = ϕ E(yt−1 yt−m) + E(μt yt−m) − θE(μt−1 yt−m)
⇒ νk = ϕ νk−1+ E(μt yt−m) − θ E(μt−1 yt−m) When k=0 ⇒ ν0 = ϕ ν1+ E(μt yt) − θ E(μt−1 yt) − (4.3.6)
E(μt yt) = σμ2 Note that
E(μt−1 yt) = ϕ E(μt−1 yt−1) + E(μt μt−1) − θE(μt−12 )
= ϕσμ2− θσμ2 = (ϕ − θ)σμ2
⇒ ν0 = ϕ ν1+ σμ2 − θ(ϕ − θ)σμ2 When k=1
ν1 = ϕ ν0+ E(μt yt−1) − θ E(μt−1 yt−1)
= ϕ𝑣0+ E(μt yt−1) − θ σμ2
E(μt yt) = ϕ E(μt yt−1) + E(μt2) − θE(μt μt−1) σμ2 = ϕ E(μt yt−1) + μa2
29
2.4.6 Dynamic linear models
In the early sixties, state-space models originated in the engineering fields. Its application in time series were noted in Akalke, Harrison and Stevens. It became established during the eighties by Aoki, Harvey, West and Harrison. It becomes more popular in the recent years and more applications are found in the area of molecular biology, ecology, space technology and finance 62-64.
The dynamic linear model can be a special case of a general state-space models,
30
being Gaussian and linear. Dynamic linear models have flexibility to deal with nonstationary time series, parameter uncertainty, structural change, and statistical analysis. The dynamic linear models are often more interpretable than ARMA models, especially in the regression relationships. When the time series have discontinuity or shifts, the state-space model can deal with it directly.
In the case of nonstationary time series, one should do preliminary
transformations in order to reach stationarity in ARMA analysis. State-space models can deal with nonstationary time series without preliminary transformation of the data.
If we consider a system filled with disturbances, as in the real world, we may apply a time series with the build-in disturbances or noises.
The dynamic linear model is expressed as
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
θ𝒕 = 𝐆𝒕𝜽𝒕−𝟏+ 𝜔𝑡 (2-22)
F𝒕 is a p-dimensional vector of regression vector or known constants.
𝜽𝒕 = (𝜃𝑡,1, 𝜃𝑡,2,…, 𝜃𝑡,𝑝)′ , a 𝑝x1 state vector at time t.
𝑣𝑡~N(0, V𝒕 ) is observation noise
G𝒕 is a p x p state evolution matrix at time t.
31
𝜔𝑡~N(0, W𝒕) is the state evolution noise at time t.
The dynamic linear model is a model with two sub-models. The first equation is an observation process. It is determined by a latent process called systemic process, the second equation. If we knew the latent process at
successive time points, the Y𝑡′𝑠 would be independent. The second equation is the systemic process, which cannot be observed directly. This equation assumed that θ𝒕 depends on itself of the previous time point, 𝜽𝒕−𝟏, through an
evolution matrix, 𝐆𝒕. In the dynamic linear model, the assumption is linear relationship and Gaussianity. Then the forecasting of the future Y𝑡+𝑘′𝑠 can be obtained sequentially.
The simple form of dependence among the Y𝑡′𝑠 is the Markov property.
π(𝑦𝑡|𝑦1:𝑡−1) = π(𝑦𝑡|𝑦𝑡−1) for any t > 1.
(𝑌𝑡)𝑡≥1 is a Markov chain with Markov property. The information about 𝑦𝑡
carried by 𝑦1:𝑡−1 is the same with that carried by 𝑦𝑡−1. This means that 𝑦𝑡 depends on 𝑦𝑡−1 alone.
The finite-dimensional joint distributions can be expressed as
π(𝑦1:𝑡) = π(𝑦1) ∙ ∏ π(𝑦𝑗|𝑦𝑗−1)
𝑡
𝑗=2
(2-23)
in a Markov chain.
32
Then the state-space model is specified as
π(𝜃0:𝑡, 𝑦1:𝑡) = π(𝜃0) ∙ ∏ π(𝜃𝑗|𝜃𝑗−1)
𝑡
𝑗=1
∙ π(𝑦𝑗|𝜃𝑗) (2-24)
In the linear state space model, one can specify linear distributions. For instance, in normal linear model, one can use normal distributions in the following
functions ℎ𝑡 and 𝑔𝑡.
A general state-space model:
𝑌𝑡= ℎ𝑡(𝜃𝑡, 𝑣𝑡) (2-25)
𝜃𝑡 = 𝑔𝑡(𝜃𝑡−1, 𝑤𝑡) (2-26)
2.4.7 Regression models and ARMA models in DLM representation
Using state-space formulation, ARMA models can be expressed as dynamic
linear models65,66. (See equation (2-21), (2-22))
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
θ𝒕 = 𝐆𝒕𝜽𝒕−𝟏+ 𝜔𝑡 (2-22)
F𝒕 is a p-dimensional vector of regression vector or known constants.
𝜽𝒕 = (𝜃𝑡,1, 𝜃𝑡,2,…, 𝜃𝑡,𝑝)′ , a 𝑝x1 state vector at time t.
𝑣𝑡~N(0, V𝒕 ) is observation noise
33 G𝒕 is a p x p state evolution matrix at time t.
𝜔𝑡~N(0, W𝒕) is the state evolution noise at time t.
(a) Static regression model: (From equation (2-21), (2-22))
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
θ𝒕 = 𝐆𝒕𝜽𝒕−𝟏+ 𝜔𝑡 (2-22)
We have
G𝒕 = I𝒑= [1 ⋯ 0
⋮ ⋱ ⋮
0 ⋯ 1] and W𝒕 = 𝟎 for all t. (2-27)
For example, we can include an explanatory variable, x, as
𝑌𝑡 = 𝜃1+ 𝜃2𝑥𝑡+ 𝑣𝑡, 𝑣𝑡 𝑖𝑖𝑑 ~N(0, 𝜎2) (2-28)
A DLM with (𝜃𝑡 = [𝜃1
𝜃2], 𝐹𝑡 = [1
𝑥𝑡], G𝒕 = I, W𝒕 = 𝟎 ) is a static regression with explanatory variable x, and the vector (𝑌𝑡, 𝑥𝑡), t=1,2,… is observed over time.
(b) Time-varying regression model: (From equation (2-21), (2-22))
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
θ𝒕 = 𝐆𝒕𝜽𝒕−𝟏+ 𝜔𝑡 (2-22)
34 G𝒕 = I𝒑= [1 ⋯ 0
⋮ ⋱ ⋮
0 ⋯ 1] and W𝒕 ≠ 𝟎 , random walk. (2-29) For example, we can include an explanatory variable, x, as
𝑌𝑡= 𝜃𝑡,1+ 𝜃𝑡,2𝑥𝑡+ 𝑣𝑡, 𝑣𝑡 𝑖𝑖𝑑 ~N(0, 𝜎2) (2-30) 𝜃𝑡 = [𝜃𝑡,1
𝜃𝑡,2], 𝐹𝑡 = [1 𝑥𝑡].
(c) Linear growth model (local linear trend with time-varying slope in the dynamics
for 𝜇𝑡) (From equation (2-21), (2-22),
(d) Static autoregression, AR(p) model:
Y𝑡 = 𝜇𝑡+ 𝑣𝑡 (2-31)
μ𝑡 = μt−1+ 𝛽𝑡−1+ 𝜔𝑡,1 (2-32)
𝛽𝑡 = 𝛽𝑡−1+ 𝜔𝑡,2 (2-33)
35 (From equation (2-21),
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
F𝑡′ = (𝑦𝑡−1, 𝑦𝑡−2, … , 𝑦𝑡−𝑝), 𝜽′= (∅1, ∅2, … , ∅𝑝), 𝑣𝑡 = 𝜖𝑡, for all t.
(d-1) Alternatively, the Gaussian ARMA and DLM models are the same in the
time-invariant case. (Hannan and Deistler, 1988). From equation (2-21) and (2-22),
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
(e) Time-varying autoregressive coefficients (TVAR) model: (also called random coefficient autoregressive (RCAR) model). (Congdon p145)
36 The lag coefficients 𝜽𝒕 can vary over time.
From equation (2-21) and (2-22),
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
θ𝒕 = 𝐆𝒕𝜽𝒕−𝟏+ 𝜔𝑡 (2-22)
In which
F𝑡′ = (𝑦𝑡−1, 𝑦𝑡−2, … , 𝑦𝑡−𝑝) and W𝒕 ≠ 𝟎 for all t. (2-36)
(f) Autoregressive moving average (ARMA) model,
(f-1) μ = 0, ARMA(p,q)
37
ω𝑡= (1, 𝜓1, … , 𝜓𝑚−1)′𝜖𝑡= 𝐑𝜖𝑡 V = 0, and W = 𝐑𝐑′𝜎2
𝜃𝑡 = (𝜃1,𝑡, … , 𝜃𝑚,𝑡)′
𝜎2: the variance of the error sequence (𝜖𝑡)
(f-2) observational noise-free DLM: 𝑣𝑡 = 𝟎 in equation (2-21)
Y𝑡 = F𝒕′𝜽𝒕 (2-39)
θ𝒕 = 𝐆𝒕𝜽𝒕−𝟏+ 𝜔𝑡 (2-22)
Dimension: m
Evolution variance matrix
U = U(1, 𝜓1, … , 𝜓𝑚−1)′(1, 𝜓1, … , 𝜓𝑚−1).
(g) Polynomial trend model:
In equation (2-21) and (2-22),
Y𝑡 = F𝒕′𝜽𝒕+ 𝑣𝑡 (2-21)
θ𝒕 = 𝐆𝒕𝜽𝒕−𝟏+ 𝜔𝑡 (2-22)
We have
F = (1, 1, 0, … ,0)′
38
𝜽′ = (𝜇𝑡, ∅2, … , ∅𝑝) (2-40)
2.4.8 Stationary time series Stationarity definition:
1. Strict Stationary if for any integer n, the distribution function of any random vector 𝑍𝑡,𝑛+1 = (𝑍𝑡, … , 𝑍𝑡+𝑛)′ is independent of the time t.
2. Weakly stationary if both E[𝑍𝑡,𝑛+1] and V[𝑍𝑡,𝑛+1] are independent of the time t.
3. Gaussian (normal) stationary if it is weakly stationary and the distribution of every 𝑍𝑡,𝑛+1 is normal.
A DLM {F𝑡, 𝐺𝑡, Vt, 𝑊𝑡} is a zero mean weakly stationary DLM if and only if
(1) It is equivalent to an observable constant DLM 67;
(2) The eigenvalues of G lie inside the unit circle, i.e., |𝜆𝑖| < 1, 𝑓𝑜𝑟 𝑖 = 1, … , 𝑛
2.4.9 Filtering, Smoothing, and Forecasting in dynamic linear models Filtering means to use the recent data to revise the previous values of the
previous state. The filtering distribution is π(𝜃𝑡|𝑦1:𝑡). And π(𝜃𝑡−𝑘|𝐷𝑡), 𝑓𝑜𝑟 𝑘 ≥ 1 is called the k-step time t filtered distribution for the state vector. Smoothing is a
39
retrospective reconstruction of the time series system using the filtered distribution.
Forecasting is to compute the value or the state of the future based on the filtering density of time t state, θ𝒕. π(𝑦𝑡+1|𝑦1:𝑡)is the one-step-ahead predictive density.
2.4.10 Generalized autoregressive moving-average models
The problems of non-Gaussian time series is encountered in many research fields such as biology, epidemiology, and medical studies, in that the observations are likely to be positive and sometimes are integers or counts data. Time series models for counts have been used by several authors 68. The extension of the normal distributed time series to the suitable distribution is reasonable. Poisson autoregressive model is therefore considered.
Hsu et al. applied generalized Poisson autoregressive model on the Japanese encephalitis to evaluate the secular time trend, seasonal variation pattern, autoregressive order of Japanese encephalitis incident cases and temporal relationship of lagged temperature and precipitation13. They also adjusted vaccination rate, pig
density, and geographic area variation. The model form is written as follows.
log (𝐸(𝑦𝑡|𝑦𝑡−1, … 𝑦𝑡−𝑙, 𝑥1, 𝑥2, 𝑥3, … , 𝑥𝑝, 𝑡, 𝑚1, … , 𝑚11)) = ln(𝑃𝑌) + 𝛼 + ∑𝑛𝑘=1𝑣𝑘𝑇𝑀𝑡−𝑘+ ∑𝑚𝑖=1𝜋𝑖𝑅𝑎𝑡−𝑖+ 𝛿1𝑡 + 𝛼3+ 𝛽1𝑥1+ ⋯ + 𝛽𝑝𝑥𝑝+
(2.41)
40 𝜔1𝑚1+ ⋯ + 𝜔11𝑚11+ ∑𝑙𝑗=1𝛾𝑗𝑦𝑡−𝑗
Where 𝑦𝑡 is the number of Japanese encephalitis cases by month at time 𝑡.
𝑥1, … , 𝑥𝑝 are covariates such as pig density and geographic area. 𝑚1, … , 𝑚11 are
dummy seasonal variables. 𝑦𝑡−𝑗 are autoregressive orders and ln(𝑃𝑌) = ln(Person − Years) is the offset. ∑𝑛𝑘=1𝑣𝑘𝑇𝑀𝑡−𝑘+ ∑𝑚𝑖=1𝜋𝑖𝑅𝑎𝑡−𝑖 is the
combinations of lagged temperature and precipitation. 𝑣𝑘 , 𝜋𝑖 , 𝛿1 , 𝛽1, … , 𝛽𝑝, 𝜔1, … , 𝜔11are the corresponding regression coefficients 13.
The proposed model, albeit it has been extended form Gaussian data to non-Gaussian data, is the failure of taking into account the uncertainty of parameters, and correlated property and heterogeneity property due to hierarchical data.
Due to the limitation of the application of previous models for the infection data, we developed a novel Bayesian generalized linear mixed ARIMA model and applied it to HAIs.
41 Chapter 3 Materials and Methods
3.1 Setting
A cohort of healthcare-associated infections was followed during the period of January 1, 1994 and December 31, 2011 in an urban tertiary medical center in northern Taipei with 921-bed and approximately 27,000 inpatient admission annually.
3.2 Patient enrollment and Definition
Patients who fulfilled the criteria of healthcare-associated infections (HAIs) were eligible for this HAIs cohort since 1994. All confirmed cases and related factors were prospectively collected in the database.
Hospital information system
The nationwide health insurance has been launched since 1-March 1995 in Taiwan 69. Before this significant change in medical system, the hospital information system, including inpatient, outpatient, emergency, prescription, bio-laboratory, etc., was established for automatic insurance reimbursement and payment. Besides those systems, the hospital-associated infection registry system also was separately set up to systematically collect the HAIs for quality control of hospital care which was
independently from insurance payment. But, those could link with admission of
42
hospitalized status and laboratory examination using chart number linkage. Therefore, the all hospitalized admission and infection status could be completed accessed since 1994. The database contains information about every admitted patient, includes, date of admission, date of discharge, and date of transfer, ward of admission, physician attending service, operation information if available, discharge condition, and the International Classification of Diseases (Ninth Revision) Clinical Modification (ICD-9-CM) codes of diagnoses, admission types (from emergency or outpatient
department),
Hospital-associated infection registry system
The infection control team established a nosocomial infection registry system since 1993. Ever since then, a central infection committee has coordinated the infection control and confirmed enrolled cases every week. HAIs were defined according to the U.S. Centers for Disease Control and Prevention standards. The U.S.
CDC defined an HAI as a localized or systemic condition resulting from an adverse reaction to the presence of an infectious agents or its toxins70. HAIs were classified as urinary tract infection (UTI), surgical site infection (SSI), pneumonia (PNEU), bloodstream (Bacteremia), skin and soft tissue (SST), gastrointestinal system (GI), eye, ear, nose, throat, or moth (EENT), and other (central nervous, reproductive tract,
43
bone and joint, and cardiovascular system infections) 70,71. The database contains comprehensive information about age, gender, infection sites, culture sampled date, indwelling bladder catheter, central venous catheter, admitted medical departments, and numbers of HAIs episodes during each admission. This registry system are used for clinical infectious diseases studies 72,73.
Incident cases: Incident cases were those free of HAIs hospitalized patients attacked
by a new episode of HAI at least 48 hours after their admission in each month.
HAI episode: A HAI episode was defined as a new infection acquired in the hospital
documented after at least 48 hours admission in the hospital.
HAI-related 30-day death: For the incident cases who died within 30 days and
caused by the HAI were defined.
Patient-days: It was calculated since admission entrance and ended by the first
infection occur. The patient-days of non-infectious patients was reckoned the duration of hospitalization.
Incidence rate: The incidence rate was expressed as number of infection episodes per
1,000 patient-days.
Death rate: Death rate was the number of deaths due to HAIs per 100 HAIs patients.
44
Mortality rate: Mortality rate was calculated as the death cases per 1,000
patient-days which was also the product of the incidence rate and the death rate.
Culture: Microbiological specimens were collected as recommended by the CDC.
Thresholds of positive culture was defined according to the CDC for different infection sites70,71.
3.3 Study design
This study was designed as an incidence-death follow-up cohort. The current study focused on the incidence part of the cohort (Figure 1). Those incident cases were followed up for 30 days, and those who died within 30 days and caused by the HAI episode were defined as HAIs related 30-day death. Since the seasonal variations exist in hospital admissions 17, this study use disease incidence as a measurement unit instead of counts.
During the study period of twenty-years, the hospital policy-maker conducted infection control interventions. The intervention programs of the hospital were Plan-Do-Check-Act (PDCA) program, Hygiene Programs, Centers for Disease Control, R.O.C. (Taiwan) (CDC) Hand-hygiene project and the urinary tract infection Quality Improvement Program of Taiwan Joint Commission on Hospital Accreditation
45
(TJCHA), and bundle care program were conducted by the hospital infection control team. The details of each intervention were described as follows.
PDCA program: The hospital infection control decision-makers initiated a program
using Edward Deming’s Plan-Do-Check-Act (PDCA) cycle-combined monitoring for
each warning outbreak of HAI incidence since 2005. It was a warning-feedback system that the infection control team monitored the infection rate of wards. If the infection rate was above the 95% confidence interval, the ward-based unit should launch the act to improve the infection rate.
Hygiene programs: They were the combined facility improvement and continuing
education programs. The infection control team both introduced an alcohol-based
education programs. The infection control team both introduced an alcohol-based