Aim of this study - Literature review - Detecting Pediatric Obstructive Sleep Apnea Using Subjective and Objective Clinical Assessments

2. Literature review

2.7 Aim of this study

The aim of this study is to elucidate the diagnostic abilities of objective measures (i.e., adenoid size, tonsil size, and obesity), subjective measures (i.e., clinical symptoms), and a combination of both in detecting pediatric OSAS.

The study was conducted in 2 steps: (1) using the objective measures as a basic model, diagnostic performance was compared when each subjective measure was added individually; and (2) using the objective measures as a basic model, diagnostic performance was compared when the subjective measures were added together (combined model). Finally, OSAS risk reclassification (basic model vs. combined model) was assessed to provide additional insight into clinical usefulness.

3. Methods

3.1 Study population

The study protocol was approved by the Ethics Committee of the National Taiwan University Hospital. From June 2012 to January 2014, children aged 2 to 18 years were recruited. Children were included if they had signs or symptoms suggestive of sleep-disordered breathing, including snoring, mouth breathing, and witnessed breath holding, for at least 1 month.¹¹ The exclusion criteria were (1) prior tonsil, adenoid, or pharyngeal surgery; (2) craniofacial anomalies; (3) genetic disorders, neuromuscular diseases, cognitive deficits, or mental retardation; (4) suboptimal sleep studies (total sleep time <4 hours or sleep efficiency <60%); and (5) age younger than 12 months. Basic data, including age, gender, and history of nasal allergy, otitis media with effusion, sinusitis, or asthma, were recorded.

3.2 Objective measures

Objective measures included measures of tonsil size, adenoid size, and obesity.

Tonsil grade

Tonsils were graded according to the scheme proposed by Brodsky:⁷⁷ grade I, small tonsils confined to the tonsillar pillars; grade II, tonsils that extend just outside the pillars; grade III, tonsils that extend outside the pillars but do not meet in the midline; grade IV, large tonsils that meet in the midline. Tonsil hypertrophy was defined as grade III or IV tonsils.⁷⁷

Adenoid size

Adenoid size was determined using lateral cephalometric radiographs to measure the adenoidal-nasopharyngeal (AN) ratio. The AN ratio was measured as the ratio of adenoidal depth to nasopharyngeal diameter according to the method of Fujioka et al.⁵³; an AN ratio ≥0.67 was considered adenoid hypertrophy.^5,6,29

Obesity

Obesity was determined from the body mass index (BMI) percentile of each child. The weight and height of each child were measured at the sleep lab before the PSG study, and the BMI was calculated. The age- and gender-corrected BMI of each child was converted to a BMI percentile using established guidelines.⁷⁸ The guidelines for BMI in Taiwanese children and adolescents were established by Chen et al.⁷⁸ Obesity was defined as a BMI higher than the 95th percentile for a child’s age and gender.^5,78

3.3 Subjective measures

Detailed clinical symptoms were recorded using a standardized clinical data sheet adapted from that used in Xu’s study.¹¹ The data sheet consisted of questions regarding the child’s snoring patterns, nighttime and daytime clinical symptoms, and other clinical symptoms associated with OSAS. Snoring patterns comprised the snoring frequency and snoring duration of a child. Other nighttime and daytime symptoms included diaphoresis, bedwetting, nighttime awakening, nightmares, breathing pauses, nasal speech, mouth breathing, weight gain, weight loss, daytime sleepiness, poor attention, depression, low self-esteem, shyness, hyperactivity, and low academic performance. The questionnaires were administered by clinical physicians, and the children’s caregivers were asked to complete the standard questionnaire form. All clinical data were verified and recorded during the follow-up visit at a sleep clinic before the patient underwent the PSG study.

3.4 Polysomnography (PSG)

Full-night attended PSG (Embla, Medcare, Iceland) was performed at the sleep lab, recording electroencephalographic activity (C4-A1, C3-A2, O2-A1, and O1-A2); electro-oculogram; chin and tibia electromyogram; oro-nasal airflow by thermocouples and nasal pressure; thoracic and abdominal excursions (respiratory inductive plethysmography); electrocardiogram; snoring sound; body position; and oxygen saturation, following a protocol described elsewhere.^5,6,41,79-82 Sleep stages and respiratory events were scored based on the 2007 American Academy of Sleep Medicine standard.^21,22 All of the sleep studies were analyzed by the principal author to maximize inter- and intra-scorer reliability. Obstructive apnea was defined as the presence of continued inspiratory effort associated with a >90% decrease in airflow for a duration of ≥2 breaths. Hypopnea was defined as a ≥50% decrease in airflow for a duration of ≥2 breaths associated with arousal, awakening, or a reduction in arterial oxygen saturation of ≥3%. Disease severity was defined as primary snoring (AHI <1) or OSAS (AHI ≥1).^2,4,6,7,14,15,18,44-46

3.5 Statistical methods

Data were analyzed using SAS software version 9.3 (SAS Institute, Cary, NC, USA).

A P value of less than 0.05 was considered statistically significant. Demographic data, including age, gender, adenoid size, and tonsil size in all subjects were analyzed. Also, parameters in sleep studies, including AHI, mean oxygen saturation (MeanSaO2), and minimum oxygen saturation (MinSaO2) in all subjects were analyzed. Categorical data were expressed as the number and percentage. Continuous data were expressed as mean, standard deviation, minimum, maximum, the first quartile, the second quartile, and the third quartile.

Objective measures (i.e., tonsil hypertrophy, adenoid hypertrophy, obesity) and subjective measures (e.g., snoring frequency, snoring duration, breathing pause) in all subjects were recorded and analyzed. Data were expressed as the number and percentage.

Children with AHI ≥1 were categorized into the OSAS group, and those with AHI <1 into the non-OSAS group. Demographics, sleep study parameters, objective measures, and subjective measures were compared between the OSAS and non-OSAS groups. Categorical variables were compared between the two groups using the Chi-square test, and continuous variables were compared using the independent-samples t-test.

The OSAS risk associated with the objective and subjective measures was analyzed. Objective or subjective measures that were significantly correlated with pediatric OSAS were entered into the multiple logistic regression model. The B value, P value, adjusted odds ratio, and 95% confidence interval of each clinical measure in detecting the risk of pediatric OSAS were estimated by the multiple logistic regression model.

Collinearity diagnostics of the objective and subjective measures in detecting pediatric OSAS were analyzed. Multicollinearity refers to a situation in which two or more explanatory variables in a multiple regression model are highly linearly related. The variance inflation factor (VIF) and the tolerance quantify the severity of multicollinearity. The VIF is an index of how much the variance (the square of the estimate's standard deviation) of an estimated regression coefficient is increased because of collinearity. The tolerance is the reciprocal of the VIF. The magnitude of multicollinearity was assessed by the size of the VIF; a common rule is that a VIF >10 indicates high multicollinearity.⁸³
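The VIF and tolerance described above can be illustrated with a minimal Python sketch. It relies on the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The function names and the toy indicator data are illustrative assumptions; they are not the study's SAS procedure or data.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length predictor columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def invert(matrix):
    """Gauss-Jordan inverse (with partial pivoting) of a small square matrix."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical 0/1 indicators for two predictors in eight children.
tonsil_hypertrophy = [1, 0, 1, 1, 0, 0, 1, 0]
breathing_pause = [1, 0, 0, 1, 0, 1, 1, 0]
vifs = vif([tonsil_hypertrophy, breathing_pause])
tolerances = [1.0 / v for v in vifs]
```

With two predictors the result reduces to 1/(1 - r²), where r is their correlation, which gives a quick way to sanity-check the output.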

3.5.1 Objective model vs. subjective model vs. combined model

Objective measures (i.e., tonsil size, adenoid size, and obesity) that were significantly correlated with pediatric OSAS were entered into the objective model. Similarly, clinical symptoms that were significantly correlated with pediatric OSAS were entered into the subjective model. The combined model included both the subjective and the objective measures that were significantly correlated with pediatric OSAS.

3.5.2 Global measures of model fit

Global model fit was assessed using the likelihood ratio Chi-square statistic and the Nagelkerke R square statistic, with a higher value indicating a better model fit. Additionally, the Akaike information criterion and the Bayes information criterion were calculated for each model;⁸⁴ these are statistical estimates of the trade-off between the likelihood of a model and its complexity, with a lower value indicating a better model fit. Global model fit was estimated for the objective model, the objective model with one added subjective measure, the subjective model, and the combined model.
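As a rough illustration of how these global fit measures relate to a model's log-likelihood, the sketch below uses the standard textbook definitions. The function name is an assumption, and the fitted log-likelihood of -125.0 is a hypothetical value, not a result from the study. Only the intercept-only log-likelihood is derived from the reported 116 OSAS cases among 222 children.

```python
import math


def global_fit_measures(ll_null, ll_model, n_params, n_obs):
    """Summarize the global fit of a logistic model from its log-likelihoods.

    ll_null  : log-likelihood of the intercept-only model
    ll_model : log-likelihood of the fitted model
    n_params : number of estimated coefficients, intercept included
    n_obs    : number of subjects
    """
    lr_chi2 = 2.0 * (ll_model - ll_null)
    cox_snell = 1.0 - math.exp(-lr_chi2 / n_obs)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n_obs)
    return {
        "lr_chi2": lr_chi2,
        "nagelkerke_r2": cox_snell / max_cox_snell,  # rescaled to a 0-1 range
        "aic": -2.0 * ll_model + 2.0 * n_params,
        "bic": -2.0 * ll_model + n_params * math.log(n_obs),
    }


# Intercept-only log-likelihood for 116 OSAS cases among 222 children.
n, cases = 222, 116
ll_null = cases * math.log(cases / n) + (n - cases) * math.log((n - cases) / n)

# Hypothetical fitted log-likelihood for a three-predictor model (assumed value).
measures = global_fit_measures(ll_null, ll_model=-125.0, n_params=4, n_obs=n)
```

Because the BIC penalty per parameter is ln(n) rather than 2, it penalizes added predictors more heavily than the AIC whenever n exceeds about 7.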

3.5.3 Discrimination

Discrimination is the ability of a model to separate those with OSAS from those without OSAS. The C index is an estimate of the area under the receiver operating characteristic (ROC) curve for a logistic regression model,⁸⁵ which is an overall summary of diagnostic accuracy. Additionally, the difference between the two ROC curves derived from two different models applied to the same set of patients was tested, and the P value for the difference was estimated. A P value <0.05 indicates that the two compared areas are significantly different.⁸⁵
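The C index can be read as the proportion of all (OSAS, non-OSAS) pairs in which the child with OSAS receives the higher predicted probability, with ties counted as one half. The sketch below computes it by direct pair counting. The function name and the toy probabilities are illustrative assumptions, and the significance test for the difference between two correlated ROC curves is not reproduced here.

```python
def c_index(probs, labels):
    """Concordance (C) index: area under the ROC curve by pair counting."""
    cases = [p for p, y in zip(probs, labels) if y == 1]
    controls = [p for p, y in zip(probs, labels) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted probabilities; label 1 = OSAS, 0 = non-OSAS.
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = c_index(probs, labels)  # 8 of the 9 case-control pairs are concordant
```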

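The integrated discrimination improvement (IDI), proposed more recently, takes into account the difference in predicted risks. The discrimination slope measures the separation between subjects with OSAS and subjects without OSAS in terms of the average predicted risks for these 2 groups.

The IDI is the difference in discrimination slopes between the new and old models.

The IDI is estimated by the formula from Pencina et al.⁷⁶

\[
\widehat{\mathrm{IDI}} = \left(\bar{p}_{\mathrm{new,OSAS}} - \bar{p}_{\mathrm{new,non\text{-}OSAS}}\right) - \left(\bar{p}_{\mathrm{old,OSAS}} - \bar{p}_{\mathrm{old,non\text{-}OSAS}}\right)
\]

where

\(\bar{p}_{\mathrm{new,OSAS}}\) = the mean of the new model-based predicted probabilities for the OSAS group

\(\bar{p}_{\mathrm{new,non\text{-}OSAS}}\) = the mean of the new model-based predicted probabilities for the non-OSAS group

\(\bar{p}_{\mathrm{old,OSAS}}\) = the mean of the old model-based predicted probabilities for the OSAS group

\(\bar{p}_{\mathrm{old,non\text{-}OSAS}}\) = the mean of the old model-based predicted probabilities for the non-OSAS group

The standard error for the OSAS group (\(\widehat{se}_{\mathrm{OSAS}}\)) is calculated as the standard error of the paired differences between the new and old model-based predicted probabilities across the OSAS subjects (\(\hat{p}_{\mathrm{new}} - \hat{p}_{\mathrm{old}}\)). The corresponding estimator (\(\widehat{se}_{\mathrm{non\text{-}OSAS}}\)) was obtained for the non-OSAS subjects. The null hypothesis (IDI = 0) is tested using the Z test:

\[
z = \frac{\widehat{\mathrm{IDI}}}{\sqrt{\widehat{se}_{\mathrm{OSAS}}^{2} + \widehat{se}_{\mathrm{non\text{-}OSAS}}^{2}}}
\]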
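The IDI and its Z test can be sketched as follows, assuming the estimator of Pencina et al. as described in this section: the difference in discrimination slopes, with a standard error built from the paired differences within each group. The function names and the six toy subjects are illustrative assumptions, not study data.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _se_of_paired_differences(new, old):
    """Standard error of the mean paired difference (new - old)."""
    diffs = [a - b for a, b in zip(new, old)]
    m = _mean(diffs)
    variance = sum((d - m) ** 2 for d in diffs) / (len(diffs) - 1)
    return math.sqrt(variance / len(diffs))


def idi(old_probs, new_probs, labels):
    """Return (IDI, z, two-sided P) for a new model versus an old model."""
    ev = [i for i, y in enumerate(labels) if y == 1]   # OSAS
    ne = [i for i, y in enumerate(labels) if y == 0]   # non-OSAS
    new_ev, old_ev = [new_probs[i] for i in ev], [old_probs[i] for i in ev]
    new_ne, old_ne = [new_probs[i] for i in ne], [old_probs[i] for i in ne]
    slope_new = _mean(new_ev) - _mean(new_ne)
    slope_old = _mean(old_ev) - _mean(old_ne)
    estimate = slope_new - slope_old
    se = math.sqrt(_se_of_paired_differences(new_ev, old_ev) ** 2
                   + _se_of_paired_differences(new_ne, old_ne) ** 2)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return estimate, z, p_value


# Hypothetical predicted probabilities for six subjects.
labels = [1, 1, 1, 0, 0, 0]
old = [0.6, 0.5, 0.7, 0.4, 0.5, 0.3]
new = [0.7, 0.7, 0.8, 0.3, 0.5, 0.2]
estimate, z, p_value = idi(old, new, labels)
```

In this toy example the old discrimination slope is 0.2 and the new one is 0.4, so the IDI is 0.2.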

3.5.4 Calibration

Calibration evaluates the degree of correspondence between the predicted probability of OSAS based on a model and the observed OSAS status, and is typically evaluated with the Hosmer-Lemeshow statistic.⁷⁴ The Hosmer-Lemeshow test statistic follows a Chi-square distribution, and a small value indicates good calibration. A P value <0.05 indicates a significant lack of calibration.
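A minimal sketch of the Hosmer-Lemeshow calculation is given below. It assumes the common convention of grouping subjects by ranked predicted risk and using groups - 2 degrees of freedom; the grouping used by the study's software is not stated in the text. The helper for the Chi-square tail probability handles only even degrees of freedom, which covers the usual 10-group case (8 degrees of freedom). All names and data are illustrative assumptions.

```python
import math


def hosmer_lemeshow(probs, labels, groups=10):
    """Return (statistic, degrees of freedom) for risk-ranked groups.

    Assumes predicted probabilities lie strictly between 0 and 1.
    """
    ranked = sorted(zip(probs, labels))
    n = len(ranked)
    statistic = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed OSAS count
        expected = sum(p for p, _ in chunk)   # expected OSAS count
        statistic += (observed - expected) ** 2 / expected
        statistic += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return statistic, groups - 2


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a Chi-square variable with even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical, perfectly calibrated data: 4 risk levels, 5 children each.
probs = [0.2] * 5 + [0.4] * 5 + [0.6] * 5 + [0.8] * 5
labels = [1, 0, 0, 0, 0] + [1, 1, 0, 0, 0] + [1, 1, 1, 0, 0] + [1, 1, 1, 1, 0]
statistic, df = hosmer_lemeshow(probs, labels, groups=4)
p_value = chi2_sf_even_df(statistic, df)
```

Here the observed and expected counts agree in every group, so the statistic is essentially zero and the P value is close to 1, which is the pattern expected of a well-calibrated model.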

3.5.5 Reclassification

The reclassification of OSAS risk was evaluated by comparing predicted risk estimates based on the objective model with and without an added subjective measure. Reclassification rates were evaluated separately in individuals who had OSAS and in those who did not.^75,76,86 The predicted OSAS probabilities were grouped into risk categories of >50% and <50% based on the selected models. Upward movement between categories in individuals who had OSAS indicates improved classification, whereas any downward movement in those with OSAS implies worse reclassification. Similarly, downward movement in individuals who did not have OSAS indicates improved classification, whereas any upward movement in those without OSAS implies worse reclassification. The net reclassification improvement (NRI) is then calculated by summing the reclassification improvements for those with OSAS and those without OSAS. The statistical significance of the NRI is tested according to the formula of Pencina et al.,⁷⁶ which is listed below.

The NRI is estimated as

\[
\widehat{\mathrm{NRI}} = \left(\hat{p}_{\mathrm{up,OSAS}} - \hat{p}_{\mathrm{down,OSAS}}\right) + \left(\hat{p}_{\mathrm{down,non\text{-}OSAS}} - \hat{p}_{\mathrm{up,non\text{-}OSAS}}\right)
\]

where \(\hat{p}_{\mathrm{up}}\) and \(\hat{p}_{\mathrm{down}}\) are the proportions of subjects in the indicated group who moved to a higher or a lower risk category under the new model.

Assuming independence between event (OSAS) and non-event (non-OSAS) individuals and following McNemar’s logic for significance testing in correlated proportions (and using the properties of the multinomial distribution), the null hypothesis of NRI = 0 is tested with a simple asymptotic Z test:

\[
z = \frac{\widehat{\mathrm{NRI}}}{\sqrt{\dfrac{\hat{p}_{\mathrm{up,OSAS}} + \hat{p}_{\mathrm{down,OSAS}}}{n_{\mathrm{OSAS}}} + \dfrac{\hat{p}_{\mathrm{up,non\text{-}OSAS}} + \hat{p}_{\mathrm{down,non\text{-}OSAS}}}{n_{\mathrm{non\text{-}OSAS}}}}}
\]

A two-sided P value of <0.05 was considered statistically significant.
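The two-category NRI and its asymptotic Z test can be sketched as below, following the description of the method of Pencina et al. in this section. The text does not say how a predicted probability of exactly 50% is handled, so treating it as high risk is an assumption, as are the function name and the toy data.

```python
import math


def nri(old_probs, new_probs, labels, cutoff=0.5):
    """Return (NRI, z, two-sided P) for two risk categories split at cutoff."""
    up_ev = down_ev = up_ne = down_ne = 0
    n_ev = n_ne = 0
    for p_old, p_new, y in zip(old_probs, new_probs, labels):
        old_high = p_old >= cutoff   # assumption: exactly 50% counts as high
        new_high = p_new >= cutoff
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:
            n_ev += 1
            up_ev += moved_up
            down_ev += moved_down
        else:
            n_ne += 1
            up_ne += moved_up
            down_ne += moved_down
    estimate = (up_ev - down_ev) / n_ev + (down_ne - up_ne) / n_ne
    # (p_up + p_down) / n for each group, written with raw counts.
    variance = (up_ev + down_ev) / n_ev ** 2 + (up_ne + down_ne) / n_ne ** 2
    z = estimate / math.sqrt(variance) if variance > 0 else 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, z, p_value


# Hypothetical predictions; label 1 = OSAS, 0 = non-OSAS.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
old = [0.40, 0.60, 0.45, 0.70, 0.55, 0.30, 0.60, 0.20]
new = [0.60, 0.70, 0.55, 0.40, 0.45, 0.20, 0.65, 0.10]
estimate, z, p_value = nri(old, new, labels)
```

In this toy example two children with OSAS move up and one moves down (net +0.25), and one child without OSAS moves down (net +0.25), giving an NRI of 0.5.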

3.5.6 Validation

Internal validation of the model was conducted by using the leave-one-out method, bootstrapping method, and k-fold cross-validation.

Leave-one-out cross-validation

Leave-one-out cross-validation involves using a single observation from the original sample as the validation data, and the remaining observations as the training data. This is repeated such that each observation in the sample is used once as the validation data.

The concordance C-index of the combined model was internally validated by the leave-one-out cross-validation method.⁸⁷

Bootstrap cross-validation

The C-index, as a measure of the area under the receiver operating characteristic (ROC) curve, represents the diagnostic accuracy of the model. Internal validation of the concordance C-index of the combined model was performed by the bootstrapping method with 100, 200, and 500 iterations.⁸⁸
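The text does not describe how the bootstrap replicates were summarized, so the sketch below shows only one simple variant: (predicted probability, label) pairs are resampled with replacement, and the mean and a percentile interval of the C-index are reported. A fuller optimism-corrected validation would refit the model in every resample. The function names, the seed, and the toy data are assumptions.

```python
import random


def c_index(probs, labels):
    """Concordance index by pair counting (ties count one half)."""
    cases = [p for p, y in zip(probs, labels) if y == 1]
    controls = [p for p, y in zip(probs, labels) if y == 0]
    score = sum(1.0 if pc > pn else 0.5 if pc == pn else 0.0
                for pc in cases for pn in controls)
    return score / (len(cases) * len(controls))


def bootstrap_c_index(probs, labels, iterations=200, seed=0):
    """Return (mean, lower, upper) of the C-index over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(probs)
    estimates = []
    while len(estimates) < iterations:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both OSAS and non-OSAS subjects
            estimates.append(c_index([probs[i] for i in idx], ys))
    estimates.sort()
    mean = sum(estimates) / iterations
    lower = estimates[int(0.025 * iterations)]
    upper = estimates[int(0.975 * iterations) - 1]
    return mean, lower, upper


# Hypothetical predicted probabilities; label 1 = OSAS, 0 = non-OSAS.
probs = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.5]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
results = {b: bootstrap_c_index(probs, labels, iterations=b) for b in (100, 200, 500)}
```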

K-fold cross-validation

In k-fold cross-validation, the original sample is randomly partitioned into k subsamples of equal size. Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k-1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples used exactly once as the validation data.⁸⁹ K-fold cross-validation of the concordance C-index of the combined model was performed by partitioning the original sample into 3, 5, and 10 subsamples (3-fold, 5-fold, and 10-fold cross-validation).
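The partitioning step shared by k-fold and leave-one-out cross-validation can be sketched as below; leave-one-out is the special case in which k equals the number of subjects. The function name and the seed are assumptions. When the sample size is not divisible by k, as with 222 subjects and 5 folds, the fold sizes differ by at most one. Fitting the logistic model on each training set is left out.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (training indices, validation indices) for each of k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)          # random partition
    folds = [indices[i::k] for i in range(k)]     # sizes differ by at most 1
    for i, validation in enumerate(folds):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, validation


# 222 subjects, as in the final analysis of this study.
five_fold = list(k_fold_indices(222, 5))
leave_one_out = list(k_fold_indices(222, 222))
```

Each subject appears in exactly one validation fold, so pooling the out-of-fold predictions gives one cross-validated prediction per child from which a C-index can be computed.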

4. Results

4.1 Study population

Initially, 287 subjects were identified for possible inclusion. Fifty-three children were excluded due to incomplete records or PSG studies. Twelve children were excluded due to co-morbidities that met exclusion criteria, including 7 children with craniofacial anomaly and 5 children with neuromuscular disease. In total, 222 subjects were enrolled into the final analysis (Figure 1).

Tables 1, 2, and 3 listed the demographics of all subjects. Table 1 listed the categorical variables. In this study group, boys comprised 67.1% (149/222). Forty-eight children (21.6%) were obese, and 174 (78.4%) were non-obese; 126 (56.8%) subjects had tonsil hypertrophy, and 134 (60.4%) subjects had adenoid hypertrophy. Fourteen (6.3%) subjects had grade I tonsils, 82 (36.9%) had grade II tonsils, 82 (36.9%) had grade III tonsils, and 44 (19.8%) had grade IV tonsils. Among all 222 subjects, 106 (47.7%) met the criteria for primary snoring, while 116 (52.3%) met the criteria for pediatric OSAS.

Table 2 showed the continuous variables in all subjects. The mean age of the study participants was 7.3 ± 3.7 years (median: 6.5 years; 25th to 75th percentile: 4.7 to 9.4 years). The youngest subject was 1.4 years old, and the oldest was 17.8 years old. The mean weight of all subjects was 29.2 ± 16.2 kg (median: 22.9 kg; 25th to 75th percentile: 18.0 to 34.7 kg), with a range of 10 to 93 kg. The mean height was 122.2 ± 21.7 cm (median: 119.5 cm; 25th to 75th percentile: 106.8 to 136.3 cm), with a range of 79 to 185 cm. The mean BMI was 18.1 ± 3.9 kg/m² (median: 16.8 kg/m²; 25th to 75th percentile: 15.3 to 20.4 kg/m²), with a range of 11.4 to 31.2 kg/m². The mean BMI percentile was 62.2 ± 30.7 (median: 64.7; 25th to 75th percentile: 37.8 to 92.3), with a range of 2 to 99. The mean AN ratio was 0.69 ± 0.16 (median: 0.73; 25th to 75th percentile: 0.58 to 0.83), with a range of 0.30 to 0.95. Sleep data from the PSG studies showed that the mean AHI was 5.4 ± 13.0 events/hour (median: 1.0 events/hour; 25th to 75th percentile: 0.3 to 3.4 events/hour), with a range of 0 to 130.5 events/hour. The mean of the mean oxygen saturation (MeanSaO2) was 97.2 ± 2.2% (median: 97.7%; 25th to 75th percentile: 97 to 98%), with a range of 70.0 to 99.4%. The mean of the minimum oxygen saturation (MinSaO2) was 88.8 ± 6.1% (median: 91%; 25th to 75th percentile: 86 to 93%), with a range of 50 to 97%.

Objective and subjective measures in all subjects were shown in Table 3. Of the objective measures, 56.8% of subjects had tonsil hypertrophy, 60.4% had adenoid hypertrophy, and 21.6% were obese. Of the subjective measures, the three most common symptoms were snoring (93.2%), mouth breathing (80.6%), and nasal speech (80.2%). Regarding snoring frequency and snoring duration, 144 (64.9%) children snored more than 5 nights per week and 166 (74.8%) children had snored for more than 3 months. Other nighttime and daytime symptoms included poor attention (41.9%), nighttime awakening (32.0%), breathing pauses witnessed by caregivers (27.9%), nightmares (27.0%), diaphoresis (23.9%), hyperactivity (22.5%), low academic performance (17.6%), weight gain (17.1%), daytime sleepiness (15.8%), shyness (14.0%), bedwetting (12.2%), low self-esteem (8.1%), weight loss (6.3%), and depression (1.4%).

Table 4 listed the correlation between age (in years) and adenotonsillar size. Tonsil size and age were not significantly correlated in either the OSAS (P = 0.062) or the non-OSAS group (P = 0.3). In contrast, adenoid size was inversely correlated with age in both the OSAS (P <0.001) and the non-OSAS group (P <0.001).

Table 5 compared adenotonsillar size across age groups in the OSAS and non-OSAS groups. Tonsil size did not differ significantly among age groups in either the OSAS (P = 0.086) or the non-OSAS group (P = 0.612). In contrast, adenoid size was inversely correlated with age group in both the OSAS (P <0.001) and the non-OSAS group (P <0.001).

Table 6 listed comparisons of the demographic data between the OSAS group and the non-OSAS group. Age (7.6 ± 4.0 vs. 7.1 ± 3.2 years, p = 0.355), gender (boys 63.8% vs. 70.8%, p = 0.270), height (123.0 ± 23.3 vs. 121.3 ± 19.9 cm, p = 0.543), and BMI percentile (64.9 ± 31.0 vs. 59.3 ± 30.3, p = 0.175) did not differ significantly between the two groups. The OSAS group had significantly higher weight (31.5 ± 18.9 vs. 26.8 ± 12.4 kg, p = 0.03) and higher BMI (18.9 ± 4.6 vs. 17.3 ± 2.8 kg/m², p = 0.002) than the non-OSAS group. Among the sleep parameters recorded by overnight PSG, the OSAS group had a higher AHI (9.9 ± 16.8 vs. 0.4 ± 0.3 events/hour, p <0.001) than the non-OSAS group. Children with OSAS also had lower MeanSaO2 (96.6 ± 2.8 vs. 97.7 ± 0.9%, p <0.001) and MinSaO2 (86.1 ± 6.8 vs. 91.7 ± 3.4%, p = 0.001) than children without OSAS.

4.2 Clinical measures in detecting pediatric obstructive sleep apnea syndrome

Table 7 compared objective and subjective measures between the OSAS and the non-OSAS groups, and listed the sensitivity, specificity, positive predictive value, negative predictive value, and odds ratio of each objective and subjective clinical measure in detecting pediatric OSAS. For objective measures, tonsil hypertrophy (76.7% vs. 34.9%, p <0.001) and adenoid hypertrophy (75.0% vs. 44.3%, p <0.001) were more prevalent in children with OSAS than in those without OSAS. Obesity was also correlated with OSAS (28.4% vs. 14.2%, p = 0.011). Tonsil hypertrophy and adenoid hypertrophy had a high sensitivity (76.7% and 75.0%, respectively), whereas obesity had a low sensitivity (28.4%) but a high specificity (85.8%) in predicting pediatric OSAS. For subjective measures, the three leading clinical symptoms were snoring (93.2%), mouth breathing (80.6%), and nasal speech (80.2%). Snoring more than 5 nights per week, snoring for more than 3 months, breathing pauses, and nighttime awakening were significantly correlated with pediatric OSAS (76.7% vs. 51.9%, p <0.001; 83.6% vs. 65.1%, p = 0.002; 42.2% vs. 12.3%, p <0.001; 37.9% vs. 25.5%, p = 0.048, respectively). Snoring more than 5 nights per week and snoring for more than 3 months had a high sensitivity in detecting OSAS (76.7% and 83.6%, respectively). Witnessed breathing pauses and nighttime awakening had a high specificity (87.7% and 74.5%, respectively) but a low sensitivity (42.2% and 37.9%, respectively) in detecting OSAS. Diaphoresis, bedwetting, mouth breathing, sleepiness, shyness, and low academic performance were also more frequent in the OSAS group than in the non-OSAS group, but the differences were not statistically significant.
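The relation between the group percentages and the diagnostic indices can be illustrated with the breathing pause symptom. The counts below are back-calculated from the reported percentages (42.2% of 116 children with OSAS, 12.3% of 106 children without OSAS) and are therefore an assumption, not values copied from Table 7. The function name is also illustrative, and the odds ratio computed this way is the crude (unadjusted) one.

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Diagnostic indices from a 2x2 table of symptom status by OSAS status."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "odds_ratio": (tp * tn) / (fp * fn),
    }


# Back-calculated counts for witnessed breathing pauses (assumed):
# 49 of 116 children with OSAS and 13 of 106 children without OSAS.
breathing_pause = diagnostic_summary(tp=49, fn=116 - 49, fp=13, tn=106 - 13)
sensitivity_pct = round(100 * breathing_pause["sensitivity"], 1)
specificity_pct = round(100 * breathing_pause["specificity"], 1)
```

These counts reproduce the reported sensitivity of 42.2% and specificity of 87.7%, and together they give 62 of 222 children (27.9%) with witnessed breathing pauses, matching the overall frequency reported for this symptom.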

A multiple logistic regression model was applied to analyze the associations between clinical measures and OSAS risk (Table 8). In the multiple logistic regression model, tonsil hypertrophy (OR = 7.2; 95% CI 3.5-14.8, p <0.001), adenoid hypertrophy (OR = 2.0; 95% CI 1.0-3.9, p = 0.047), awaken (OR = 2.1; 95% CI 1.0-4.4, p = 0.043), and breathing pause (OR = 5.7; 95% CI 2.4-13.5, p <0.001) significantly increased the risk of OSAS in children, whereas obesity (OR = 2.1; 95% CI 0.9-4.8, p = 0.068), snoring >5 nights/week (OR = 1.4; 95% CI 0.7-2.9, p = 0.382), and snoring >3 months (OR = 1.3; 95% CI 0.6-3.1, p = 0.475) were not significantly correlated with pediatric OSAS.

Multicollinearity of the models in detecting pediatric OSAS was examined using the tolerance and the VIF. Table 9 listed the collinearity diagnostics of the objective and subjective measures in detecting pediatric OSAS. Collinearity is not considered a problem if none of the VIFs is greater than 10. The VIF values ranged from 1.02 to 1.29 (Table 9). Since all VIFs were far below 10, collinearity in the objective or subjective model in detecting pediatric OSAS was considered unlikely.

4.3 Effects of adding each subjective measure on objective measures

The objective model included the objective measures significantly correlated with pediatric OSAS (i.e., tonsil hypertrophy, adenoid hypertrophy, and obesity) and was used as the basic model. Of note, only subjective measures that were significantly correlated with pediatric OSAS (i.e., snoring >5 nights/week, snoring >3 months, breathing pause, and awaken) were added to the basic model. Table 10 showed the global model fit when one subjective measure was added to the objective model in detecting pediatric OSAS.

The likelihood ratio chi-square test was highly statistically significant for the basic model as well as for the models consisting of the basic model plus one subjective measure (P <0.001). Among the models with one added subjective measure, the R² was highest and the Bayes information criterion was lowest when “breathing pause” was added to the basic model. The C-index for the basic model was 0.775, and ranged from 0.788 to 0.822 when one subjective measure was added (Table 11 and Figure 2). As expected, the C-index was highest when “breathing pause” was added to the basic model. Compared with the basic model, the differences in the C-index were around 0.01 for adding “snoring >5 nights/week”, “snoring >3 months”, and “awaken”, but 0.047 for adding “breathing pause”. Additionally, the P value for the difference in the C-index was significant only when “breathing pause” was added to the basic model (P = 0.001). The Hosmer–Lemeshow test for the basic model and for the basic model plus one subjective measure showed adequate fit in detecting pediatric OSAS (P >0.05) (Table 11).

Table 12 showed the IDI of each subjective measure added to the objective model to detect pediatric OSAS. Using the objective model, the mean predicted probability of OSAS was 0.636 in the OSAS group and 0.398 in the non-OSAS group. The discrimination slope of the objective model between the OSAS and the non-OSAS subjects was therefore 0.238. The discrimination slope of the objective model plus “snoring frequency” was 0.259, giving an IDI of 2.1% (P = 0.029). The discrimination slope of the objective model plus “snoring duration” was 0.255, giving an IDI of 1.7% (P = 0.062). The discrimination slope of the objective model plus “awaken” was 0.255, giving an IDI of 1.7% (P = 0.048).
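The reported slopes and IDI values can be checked by hand. The arithmetic below uses only the figures quoted above and assumes that 0.636 is the mean predicted probability in the OSAS group and 0.398 the mean in the non-OSAS group under the objective model.

```latex
\begin{align*}
\text{slope}_{\text{objective}} &= 0.636 - 0.398 = 0.238 \\
\mathrm{IDI}_{\text{snoring frequency}} &= 0.259 - 0.238 = 0.021 \quad (2.1\%) \\
\mathrm{IDI}_{\text{snoring duration}} &= 0.255 - 0.238 = 0.017 \quad (1.7\%) \\
\mathrm{IDI}_{\text{awaken}} &= 0.255 - 0.238 = 0.017 \quad (1.7\%)
\end{align*}
```

The same IDI of 1.7% is significant for “awaken” (P = 0.048) but not for “snoring duration” (P = 0.062), which is consistent with the Z test depending on the standard error of the paired differences as well as on the size of the IDI.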
