Responder Definition of a Patient-Reported Outcome Instrument for Laryngopharyngeal Reflux Based on the US FDA guidance

Authors: Han-Chung Lien,^1,2,3 Chi-Sen Chang,^1,4 Chen-Chi Wang,^5 Jeng-Yuan Hsu,^6 Hong-Zen Yeh,^1,2 Shou-Wu Lee,^1,4 and Wen-Miin Liang^7

Affiliations:

1 Division of Gastroenterology, Taichung Veterans General Hospital, Taichung, Taiwan;

2 Department of Internal Medicine, National Yang-Ming University, Taipei, Taiwan;

3 Department of Public Health, China Medical University and Hospital, Taichung, Taiwan;

4 Department of Internal Medicine, Chung Shan Medical University, Taichung, Taiwan;

5 Department of Otolaryngology, Taichung Veterans General Hospital, Taichung, Taiwan;

6 Division of Chest Medicine, Taichung Veterans General Hospital, Taichung, Taiwan;

7 Biostatistics Center, China Medical University and Hospital, Taichung, Taiwan.

Short running title: responder definition in laryngopharyngeal treatment

Correspondence to:

Wen-Miin Liang, PhD

Biostatistics Center, China Medical University and Hospital,

91, Hsueh-Shih Road, Taichung 40402, Taiwan.

E-mail: [email protected]

Fax: +886-4-23741331

Tel: +886-4-23592525 ext. 3315

Dear Professor El-Omar,

We would like to submit our recently completed manuscript entitled “Responder Definition of a Patient-Reported Outcome Instrument in Laryngopharyngeal Reflux Based on the US FDA guidance,” including 3 tables, 2 figures, and 1 appendix, to be considered for publication as an original article in Gut.

We determined the responder definition of a patient-reported outcome instrument, the Reflux Symptom Index, in patients with laryngopharyngeal reflux based on the US FDA guidance. This paper extends two of our recent publications, “Composite pH predicts esomeprazole response in laryngopharyngeal reflux without typical reflux syndrome” (The Laryngoscope 2013) and “Classical reflux symptoms, hiatus hernia and overweight independently predict pharyngeal acid exposure in patients with suspected reflux laryngitis” (Alimentary Pharmacology & Therapeutics 2011). Both are submitted as “Supplementary files not for review”. We also submit the previous Editor’s and reviewers’ comments as Supplementary material along with our responses to those comments. We hope these may facilitate the review process and that this work will contribute to future clinical trials and clinical practice in this common disease, which remains a considerable challenge to both gastroenterologists and otolaryngologists.

None of the material in this manuscript has been published previously in any form, and none of it is currently under consideration for publication elsewhere. The authors declare no conflict of interest.

Thank you for your consideration.

Sincerely yours,

Wen-Miin Liang, PhD

Biostatistics Center, China Medical University and Hospital,

91, Hsueh-Shih Road, Taichung 40402, Taiwan.

E-mail: [email protected]

Fax: +886-4-23741331

Tel: +886-4-23592525 ext. 3315

ABSTRACT

Objective

Different endpoint measures may contribute to the inconsistent therapeutic responses reported among studies on relief of laryngopharyngeal reflux (LPR) symptoms. The U.S. Food and Drug Administration (FDA) has recommended an a priori responder definition for patient-reported outcome measures to interpret the treatment benefit for individuals. The aim of this study was to determine a responder definition for a disease-specific questionnaire, the Reflux Symptom Index (RSI), based on the U.S. FDA guidance.

Design

Patients with symptoms suggestive of LPR received esomeprazole 40 mg twice daily for 12 weeks. We used a ≧ 50% reduction of the primary laryngeal symptoms at week 12 as the anchor to assess the effect of treatment response on the change in RSI score. The responder definition of the RSI score change was determined by an optimal cut-off point based on the maximal Youden index.

Results

The mean reduction of the RSI score was significantly greater in subjects with a ≧ 50% reduction in the primary laryngeal symptoms than in those without (-11 ± 7.8 vs. -3.1 ± 8.3, p < 0.0001). A ≧ 6-point reduction of the RSI score was considered to be the responder definition at week 12, with a sensitivity of 0.79 and a specificity of 0.70.

Conclusions

Using the empirical responder criterion as the anchor, as recommended by the U.S. FDA guidance, the definition of responder was determined to be a reduction in the Reflux Symptom Index score of 6 points or more from baseline at week 12, which provides a clinically meaningful endpoint measure that can be applied at both the individual and group level.

Significance of this study

What is already known about this subject?

► Although anti-secretory agents are recommended for relieving laryngopharyngeal reflux (LPR) symptoms, different endpoint measures may contribute to inconsistent therapeutic responses.

► The Reflux Symptom Index questionnaire is a disease-specific and psychometrically validated instrument used to evaluate the symptom severity of patients with laryngopharyngeal reflux.

► The U.S. Food and Drug Administration has recommended developing an a priori responder definition for a patient-reported outcome instrument as an endpoint measure based on an empirical responder criterion to interpret treatment benefit at both the individual and group level.

What are the new findings?

► The Chinese version of the Reflux Symptom Index questionnaire is a reliable and responsive patient-reported outcome instrument for Taiwanese patients with LPR.

► The mean reduction of Reflux Symptom Index score from baseline after the 12-week esomeprazole treatment correlated with an empirical responder criterion, i.e., a 50% or more reduction in the primary laryngeal symptom.

► We propose an a priori responder definition for interpretation of esomeprazole treatment benefit according to the FDA guidance: a ≧ 6-point reduction on the RSI instrument at week 12.

How might this impact on clinical practice in the foreseeable future?

► A ≧ 6-point reduction in Reflux Symptom Index score from baseline after 12 weeks of PPI treatment may be used as the endpoint measure in both clinical practice and clinical trials to interpret individual and group treatment benefit.

INTRODUCTION

Laryngopharyngeal reflux (LPR) or reflux laryngitis is an established extraoesophageal manifestation of gastro-oesophageal reflux disease (GERD).[1, 2] However, management of LPR is controversial because, while treatment with proton pump inhibitors (PPIs) is often recommended,[3, 4] the treatment efficacy remains inconsistent among controlled trials.[5-7] Inappropriate instruments and/or inconsistent endpoint measures of patient-reported outcomes (PRO) may explain, at least in part, the failure to demonstrate any treatment benefit.[7]

The Reflux Symptom Index (RSI) is a disease-specific self-administered questionnaire for evaluation of LPR symptom severity. Its reliability, validity, and responsiveness were evaluated in a sample of LPR patients,[8] and it has been translated into several languages.[9-12] In addition, two recent randomized controlled trials used the RSI and found that PPIs were more effective than placebo.[6, 7]

Despite the use of a disease-specific instrument to measure PRO, a statistically significant score change may not be clinically relevant.[13] Moreover, minimally important differences, defined as the smallest meaningful difference derived from point estimates of mean differences among groups, may mask important changes for individuals.[14] Recently, the U.S. Food and Drug Administration (FDA) released a PRO guidance,[15] which recommended determining an a priori responder definition for PRO instruments, i.e., the change in an individual patient’s PRO score over a predetermined time period that can be considered to constitute a treatment benefit. The responder definition is derived from a clinical and empirical responder criterion using the anchor-based approach. In addition, the proportion of treatment responders can be calculated to compare groups, alongside the conventional mean differences between groups. Therefore, this approach is advantageous in both clinical practice and clinical trials for the interpretation of individual and group treatment benefits, respectively.

In this study, we first tested the reliability and responsiveness of the Chinese version of the RSI in Taiwanese patients with LPR. Second, we assessed whether the target concept of the RSI correlates with an empirical treatment endpoint, i.e., a 50% or more reduction in the primary laryngeal symptoms.[5, 16] Finally, using a 50% or more reduction in the primary laryngeal symptoms as the anchor, we attempted to determine the responder definition using the change in RSI score during PPI treatment based on the FDA guidance.

METHODS

This study was a single-center, multidisciplinary, open-label therapeutic trial conducted at the Voice & Laryngeal Pathology Laboratory and the Gastrointestinal Physiology & Motility Laboratory in Taichung Veterans General Hospital, Taiwan. All patients signed an informed consent form before the study.

Patient selection

Patients (aged > 20 years) with chronic laryngeal signs and symptoms suspected to be reflux-related and referred from the Department of Otolaryngology clinic between January 2007 and December 2011 were assessed for study eligibility.

The inclusion criteria and exclusion criteria are described in our recent publication.[17] In short, subjects with laryngeal symptoms as the major complaint for more than 3 months after laryngoscopic exam were included. Patients with common etiologies of chronic laryngitis other than reflux were excluded.

Screening period

Patients who met the eligibility criteria were enrolled in a 2- to 4-week run-in period to ensure compliance and to confirm that the severity of the primary laryngeal symptom was consistent. Each participant identified the most bothersome laryngeal symptom as the primary laryngeal symptom from the symptoms in the laryngeal symptom complex, which includes globus, sore throat, hoarseness, cough, and throat clearing. The severity was assessed on a four-point Likert scale (0 = none, 1 = mild, 2 = moderate, and 3 = severe), and was required to be ≧2 points in two assessments performed 7-14 days apart.

Study design

Patients with suspected LPR were instructed to take an oral esomeprazole 40 mg tablet (Nexium; AstraZeneca Pharmaceuticals, Sweden) 30 minutes before breakfast and again 30 minutes before dinner. During the treatment period, patients’ adherence to treatment, adverse events, and concomitant medication were evaluated and documented at the 4-, 8- and 12-week follow-up visits.

Laryngoscopy and upper gastrointestinal endoscopy

Laryngoscopy was performed using a flexible nasolaryngoscope (VNL-1171K; Pentax, Tokyo, Japan) at enrollment by the same laryngologist (C.C. Wang). The laryngeal signs were documented at baseline based on the reflux finding score.[18] The presence of reflux oesophagitis and other mucosal lesions was evaluated by upper gastrointestinal endoscopy before treatment. Reflux oesophagitis was defined according to the Los Angeles classification.

The Reflux Symptom Index questionnaire

The Reflux Symptom Index (RSI) is a self-administered 9-item symptom questionnaire for the assessment of symptoms in patients with LPR.[8] The nine items are: hoarseness or voice problems; throat clearing; excess mucus or postnasal drip; difficulty swallowing; coughing after a meal or when lying down; breathing difficulties or choking episodes; troublesome cough; sensation of something sticking or a lump in the throat; and a single combined item covering heartburn, chest pain, and regurgitation. The scale for each individual item ranges from 0 (no problem) to 5 (severe problem), with a maximum total score of 45. The RSI can be completed in less than 3 minutes.
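
The scoring rule described above can be sketched as follows. This is a minimal illustration only, assuming the nine item ratings are supplied as integers in questionnaire order; the function name `rsi_total` is hypothetical and is not part of the published instrument.

```python
def rsi_total(item_scores):
    """Sum nine RSI item ratings (each 0-5) into a total score (0-45).

    Hypothetical helper for illustration; not part of the published RSI.
    """
    if len(item_scores) != 9:
        raise ValueError("the RSI has exactly 9 items")
    if any(not isinstance(s, int) or s < 0 or s > 5 for s in item_scores):
        raise ValueError("each item is rated as an integer from 0 to 5")
    return sum(item_scores)
```

Rejecting incomplete or out-of-range questionnaires, rather than silently summing them, keeps the 0 to 45 range of the total score meaningful.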

Linguistic translation

Linguistic translation of the RSI from English to Chinese was performed using a forward-backward procedure involving two forward translators and one backward translator. Two bilingual translators independently translated the RSI into Chinese. One was a gastroenterologist (Lien HC), who was aware of the objectives of the questionnaire. The other translator (Lien SP) was unaware of the objectives and did not have specialized knowledge of medicine. The two versions were reconciled in a joint discussion by both translators to detect errors and divergent interpretations of items, and to obtain a unified final version. The back-translation was performed by a native English speaker who has lived in Taiwan for 20 years and speaks fluent Chinese but does not have specialized knowledge of medicine.[19] The back-translated English version was compared with the original English version of the RSI by a committee, and no semantic differences were detected. Subsequently, the patients who participated in the pretesting endorsed the comprehensibility of the Chinese version RSI.

(Supplementary Table)

Outcome measures

Two outcome measures were used to evaluate the esomeprazole treatment response. The first measure was the response to esomeprazole, defined as a 50% or more reduction in primary laryngeal symptoms, which empirically differentiated responders from non-responders.[20] It was measured using a 10-cm visual analogue scale by asking, “Compared to the baseline status (before treatment), what is the percentage of improvement in your primary laryngeal symptom?” (0 cm, no improvement or worse; 10 cm, 100% improvement) at week 12. The second measure was the change in total RSI score measured from baseline to week 12 during the treatment period. The first outcome measure, a clinical and empirical responder criterion, was used as the anchor to explore the association with the concept measured by the second outcome measure, and subsequently to determine the responder definition of the second measure.
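
The anchor described above can be expressed as a simple rule. The sketch below assumes the visual analogue scale reading is recorded in centimetres (0 to 10); the names `SCALE_CM` and `meets_anchor` are hypothetical and chosen only for illustration.

```python
SCALE_CM = 10.0  # length of the visual analogue scale described in the text

def meets_anchor(vas_cm):
    """Return True when the self-rated improvement is 50% or more.

    vas_cm: mark on the 10-cm scale (0 = no improvement or worse,
    10 = 100% improvement). Hypothetical helper for illustration.
    """
    if not 0.0 <= vas_cm <= SCALE_CM:
        raise ValueError("VAS reading must lie between 0 and 10 cm")
    return vas_cm / SCALE_CM >= 0.5
```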

Statistical analyses

Validation of the Chinese version RSI

Descriptive statistics were used to summarize demographic data. To examine overall score distributions, the proportions of respondents with the lowest and highest possible total RSI scores were calculated to assess floor and ceiling effects. The Chinese version RSI was validated by evaluating its reliability and responsiveness. Reliability was examined by internal consistency and test-retest reliability. Internal consistency was assessed by Cronbach’s α, with an overall value between 0.7 and 0.9 considered acceptable.[19] Test-retest reliability was calculated by comparing the results of the questionnaire between the two baseline visits, 7-14 days apart, before starting esomeprazole treatment in a randomly selected subgroup of 43 subjects, and was expressed as the intra-class correlation coefficient (ICC). The ICC may range from 0 to 1, and values > 0.7 are considered acceptable.[21] Responsiveness to change during treatment was evaluated by effect size, a standardized difference of means that represents the extent to which the scale is sensitive to change. It was calculated by dividing the mean difference between baseline and week 12 by the standard deviation at baseline, and was interpreted against benchmarks for the relative size of change: an effect size of 0.2 was considered small, 0.5 medium, and 0.8 or greater large.[22, 23]
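
Two of the statistics described above can be sketched in a few lines. This is an illustrative sketch using the standard textbook formula for Cronbach’s α and the effect size as defined in the text; the function names and the input layout (one list of item scores per respondent) are assumptions, and the study’s actual software is not stated here.

```python
from statistics import mean, stdev, variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])  # number of items
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def effect_size(baseline, followup):
    """Mean change from baseline divided by the baseline standard deviation.

    Positive values indicate a score reduction (improvement).
    """
    return (mean(baseline) - mean(followup)) / stdev(baseline)
```

Both functions use sample (n - 1) variances; the ICC is omitted because its form depends on the model chosen, which the text does not specify.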

Conceptual association between the anchor and the RSI

The relationship between the targeted concept of the RSI instrument and the concept measured by the anchor was evaluated by determining the association between the mean change of RSI scores from baseline and a 50% or more reduction in the primary laryngeal symptom at week 12 (a clinical and empirical responder criterion) using the t-test and effect size.

Anchor-based method to determine responder definition for the RSI

To determine the responder definition of the PRO instrument, we used a ≧ 50% reduction in the primary laryngeal symptoms at week 12, a clinical and empirical responder criterion which served as the anchor,[15] to dichotomize patients into persons with marked improvement and persons with no marked improvement or deterioration. The receiver operating characteristic (ROC) curve was plotted for the RSI score change against the anchor and the area under the ROC curve was calculated. We used the maximal Youden index to determine the cut-off point of RSI score change which served as the definition of responder. The responder definition of RSI score change was used to calculate the sensitivity and the specificity in all subjects. This method integrates both anchor-based and distribution-based approaches.[24] The sensitivity and the specificity were also calculated in subgroups with different baseline scores, i.e., < 12, 12-28, and > 28. A baseline score < 12 was considered to be a low baseline because two previous studies showed that the 95% upper limits of the norm were 11.4[11] and 13.6,[8] respectively. A baseline score > 28 was arbitrarily chosen as the high baseline from the top 10% of subjects. We also plotted the cumulative percentage of patients at various RSI score reduction points in patients with and without a ≧ 50% reduction in the primary laryngeal symptoms to show the sensitivity and specificity of the various cut-off points of RSI score change.
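
The cut-off selection described above can be sketched as follows. This is a minimal illustration, assuming score reductions are expressed as positive numbers (baseline minus week 12) and the anchor as a list of booleans; the function name `youden_cutoff` is hypothetical, and the data in the usage note are invented, not study data.

```python
def youden_cutoff(reductions, anchor):
    """Pick the score-reduction cut-off with the maximal Youden index.

    reductions: RSI score reduction per patient (baseline minus week 12).
    anchor: True where the patient met the anchor criterion.
    A patient is classified as a responder when reduction >= cut-off.
    Returns (cutoff, youden_index, sensitivity, specificity).
    """
    positives = sum(1 for a in anchor if a)
    negatives = len(anchor) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both anchor groups must be non-empty")
    best = None
    for c in sorted(set(reductions)):
        tp = sum(1 for r, a in zip(reductions, anchor) if a and r >= c)
        tn = sum(1 for r, a in zip(reductions, anchor) if not a and r < c)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1  # Youden index at this cut-off
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best
```

For example, with the invented reductions `[12, 9, 7, 6, 2, 8, 5, 3, 1, 0]`, of which the first five patients meet the anchor, the function selects a cut-off of 6. Ties are resolved in favour of the smallest cut-off, which is a design choice of this sketch rather than something the text specifies.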

RESULTS

Flow of patients

A total of 228 subjects were assessed for eligibility. The intent-to-treat population consisted of 96 subjects. Twelve subjects dropped out, so 84 subjects were included in the per-protocol analysis.

There were no differences in the baseline characteristics between participants and non-participants (Table 1).

Validation of the Chinese version RSI

At baseline, there was neither a floor effect nor a ceiling effect for total RSI scores (Table 2). Total RSI scores changed from 17.5 ± 7.5 at baseline to 10.8 ± 7.2 at week 12 (p < 0.0001). The Chinese version RSI showed good internal consistency with a Cronbach’s α of 0.74, good test-retest reliability as demonstrated by an ICC of 0.79, and good responsiveness as documented by an effect size of 0.92 (Table 2).

The conceptual association between the anchor and the RSI

At week 12, the mean RSI score reduction in subjects with a ≧ 50% reduction in the primary laryngeal symptoms was significantly greater than that in those without (-11 ± 7.8 vs. -3.1 ± 8.3, p < 0.0001, effect size 0.99; Figure 1).

The anchor-based method to determine the responder definition for the RSI

Using a clinical and empirical responder criterion, i.e., a ≧ 50% reduction in the primary laryngeal symptom at week 12 as the anchor, the subjects were dichotomized into subjects with and without marked improvement. The ROC curve was plotted using the RSI score change against the anchor. The area under the ROC curve was 0.771. A reduction in RSI score from baseline of at least 6 points was determined to be the definition of responder, as shown by the maximal Youden index of 0.49. The sensitivity was 0.79 and the specificity was 0.70 for the prediction of a ≧ 50% reduction in the primary laryngeal symptom at week 12. The cumulative percentage of patients at various cut-off points of RSI score reduction showed distinct distributions between patients with vs. without a ≧ 50% reduction in the primary laryngeal symptom at week 12 (Figure 2). For the prediction of a ≧ 50% reduction in the primary laryngeal symptom at week 12, the responder definition of a reduction in RSI score of at least 6 points had a low sensitivity in the subgroup with a baseline RSI < 12 and a low specificity in the subgroup with a baseline > 28, but had a sensitivity of 0.86 and a specificity of 0.70 in the subgroup with a baseline in the range of 12-28 (Table 3).
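
As a worked check of the reported figures, the Youden index at the 6-point cut-off follows directly from the overall sensitivity and specificity given above. The symbol J is introduced here only for illustration.

```latex
J = \text{sensitivity} + \text{specificity} - 1 = 0.79 + 0.70 - 1 = 0.49
```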

DISCUSSION

In this study, we applied a disease-specific PRO instrument, the RSI, in the assessment of patients with LPR to determine an a priori responder definition for outcome measure using the anchor-based method, in accordance with the final version of the PRO guidance released by the U.S. FDA in 2009.

There are 3 major findings in this study. First, the Chinese version of the RSI, like the original English version, was shown to be reliable and responsive to change during treatment in Taiwanese patients with LPR. Second, the clinical and empirical responder criterion, a 50% or more reduction in the primary laryngeal symptom, was strongly associated with the change in RSI scores in response to PPI treatment, as demonstrated by the large effect size of 0.99. Third, the responder definition of a reduction in RSI score of at least 6 points, determined by the maximal Youden index using the anchor-based approach, was predictive of a 50% or more reduction in primary laryngeal symptoms at the end of treatment at week 12, reaching a sensitivity of 0.79 and a specificity of 0.70.

The FDA guidance emphasizes the importance of taking the patient’s perspective into account, which is in concert with the Montreal definition of GERD, relying on symptoms elicited directly from the patient. To capture optimal information from the patient’s experience, the PRO instruments used in clinical trials should be specific to the target population, i.e., all relevant symptoms and concepts must be included and measured accurately. Although various instruments have been used to measure PRO in patients with LPR, few controlled trials have used disease-specific validated questionnaires.[5-7] The RSI is a disease-specific instrument developed and validated among patients with LPR by Belafsky et al.[8] In this study, we found that the Chinese version RSI was reliable and was able to detect change over time during the treatment among Taiwanese patients with LPR, and thus is suitable for use as a PRO instrument for this patient population.

The U.S. FDA also recommended that an a priori responder definition be identified for the PRO instrument to support labeling claims in medical product development at an individual patient level, such that the individual patient PRO score change over a predetermined time period can be interpreted as a treatment benefit.[15] In the FDA guidance, the anchor-based method using a clinical and empirical responder criterion is recommended to determine the responder definition for the PRO instrument. To be useful, the anchors chosen should have intuitive meaning, be easier to interpret than the PRO measure itself, and should correlate well with the target concept of the PRO instrument. In this study, we chose a 50% or more reduction in the primary laryngeal symptom as the anchor because it was the major complaint selected by the patient from multiple laryngeal symptoms and has been used as an endpoint measure in previous clinical trials.[25] The high correlation found between the mean RSI score reduction from baseline and a 50% or more reduction in primary laryngeal symptom at treatment end may justify the application of the anchor chosen in this study to determine the responder definition for the RSI instrument. Alternatively, a laryngoscopic signs scoring system, such as the Reflux Finding Score,[26] or a 50% or more reduction in global laryngeal symptom may be considered as possible anchor candidates. However, the former may require a longer duration to observe any change,[18] and may not correlate well with symptoms;[27] while the latter averages a complex of changes in symptoms over a long period of time and is subject to recall bias.

Recently, Vakil et al. reviewed current PRO instruments in GERD and found only 5 have been used as endpoints in clinical trials.[28] Among them, either no responder was defined or responder was defined as freedom from symptoms. However, it may be impractical to identify a definition of complete symptom resolution in patients with LPR. Alternatively, the ReQuest-GI subscale, which uses a score below 1.73 derived from the 95th percentile of healthy controls to define symptomatic response, could be applied.[29] The FDA guidance, however, considers such distribution-based methods supportive but not appropriate as the sole basis for determining a responder definition.[15] Traditionally, the disadvantage of the distribution-based approach is that it does not provide information pertaining to the clinical importance of the observed change, while the disadvantage of the anchor-based method is its inability to take into account the variability of the instrument and/or the sample. In this study, we combined the anchor-based and distribution-based methods to take advantage of both an external criterion and a measure of variability.[24] With the anchor-based method, the patients can be dichotomized into those with and without marked improvement. With these two sample distributions, the optimal cut-off value of the score change in the PRO instrument can be determined using a receiver operating characteristic curve with the maximal Youden index. Using these approaches, we found that the cut-off value was a reduction in RSI score of 6 points or more from baseline, which had a sensitivity of 79% and a specificity of 70% for prediction of a 50% or more reduction in the primary laryngeal symptom (Figure 2), and therefore it may be suitable for use as the responder definition for RSI in future clinical trials.

The responder definition of the RSI instrument in our study may be advantageous in both clinical trials and clinical practice. First, it appears to be capable of assessing a complex of LPR symptoms without averaging their condition, and has little or no recall bias over a long treatment period, in comparison to a patient’s change in global rating. Second, using the responder definition of the PRO instrument as the endpoint measure enables the interpretation of the treatment benefit at the patient level, thereby facilitating patient-doctor communication regarding treatment efficacy. This method is likely to be superior to the use of the conventional minimally important difference,[14] which was derived from point estimates of intra-patient mean group change, possibly masking individual important change, and therefore was not included in the revised 2009 FDA guidance.

However, there may be some practical limitations. First, the responder definition of RSI based on a score reduction of 6 or more may not be invariant across baseline values and other characteristics. For example, a subset of patients with a low baseline score may have a low sensitivity for prediction of the anchor. Not surprisingly, such patients with a baseline RSI score of less than 12, which is within the range of normal controls in previous studies, may have either fewer or less severe symptoms, and thus are less likely to have a reduction of 6 points despite adequate or complete relief of the primary laryngeal symptom (Table 3). For those with a high baseline RSI score, the specificity in the prediction of the anchor is low, presumably due to the phenomenon of “regression to the mean”. Because such patients may have either more symptom items or more severe symptoms, they are likely to have a reduction of 6 points or more despite inadequate relief of the primary laryngeal symptom. A baseline RSI score larger than 28 was arbitrarily chosen based on the top 10% of our study subjects to indicate an outlier value, and may vary from study to study. This may be an inherent limitation of using the PRO instrument for determination of a responder definition. Future large-scale studies may clarify this issue. Second, the RSI instrument mainly consists of 8 throat symptom items and one classical reflux symptom item, and thus should only be used in patients whose major complaint is reflux-related throat symptoms, but not in those with typical GERD symptoms as the major complaint, if a responder definition is to be used as the endpoint of PRO measures. Finally, it is unclear whether the responder definition determined in a tertiary center in this study can be applied in the primary care setting.

In conclusion, we found that a reduction in RSI score from baseline of 6 or more was sufficiently sensitive and specific for use as the responder definition to interpret the treatment benefit among patients with laryngopharyngeal reflux in clinical practice and in clinical trials. This responder definition meets the U.S. FDA guidance for industry to evaluate PRO in medical product development to support labeling claims and may be used as the endpoint measure in both clinical practice and clinical trials to interpret individual and group treatment benefit.

Contributors

Funding The work was supported by Taiwan’s National Science Council (Grant number: NSC102-2314-B-075A-004-MY2).

Competing interest None.

Ethics approval The study was approved by Taichung Veterans General Hospital Institutional Review Board (#C06254).

Provenance and peer review Not commissioned; externally peer reviewed.

REFERENCES

1. Koufman JA, Aviv JE, Casiano RR, et al. Laryngopharyngeal reflux: position statement of the Committee on Speech, Voice, and Swallowing Disorders of the American Academy of Otolaryngology–Head and Neck Surgery. Otolaryngol Head Neck Surg 2002;127:32-5.

2. Vakil N, van Zanten SV, Kahrilas P, et al. The Montreal definition and classification of gastroesophageal reflux disease: A global evidence-based consensus. Am J Gastroenterol 2006;101:1900-20.

3. Kahrilas PJ, Shaheen NJ, Vaezi MF, et al. American Gastroenterological Association Medical Position Statement on the Management of Gastroesophageal Reflux Disease. Gastroenterology 2008;135:1383-91.

4. Ford CN. Evaluation and management of laryngopharyngeal reflux. JAMA 2005;294:1534-40.

5. Qadeer MA, Phillips CO, Lopez AR, et al. Proton pump inhibitor therapy for suspected GERD-related chronic laryngitis: A meta-analysis of randomized controlled trials. Am J Gastroenterol 2006;101:2646-54.

6. Reichel O, Dressel H, Weideranders K, et al. Double-blind, placebo-controlled trial with esomeprazole for symptoms and signs associated with laryngopharyngeal reflux. Otolaryngol Head Neck Surg 2008;139:414-20.

7. Lam PK, Ng ML, Cheung TK, et al. Rabeprazole is effective in treating laryngopharyngeal reflux in a randomized placebo-controlled trial. Clin Gastroenterol Hepatol 2010;8:770-6.

8. Belafsky PC, Postma GN, Koufman JA. Validity and reliability of reflux symptom index (RSI). J Voice 2002;16:274-7.

9. Schindler A, Mozzanica F, Ginocchio D, et al. Reliability and clinical validity of the Italian reflux symptom index. J Voice 2010;24:354-8.

10. Printza A, Kyrgidis A, Oikonomidou E, et al. Assessing laryngopharyngeal reflux symptoms with the reflux symptoms index: validation and prevalence in Greek population. Otolaryngol Head Neck Surg 2011;145:974-80.

11. Farahat M, Malki KH, Mesallam TA, et al. Development of the Arabic version of reflux symptom index. J Voice 2012;26:814.e15-9.

12. Cohen JT, Gil Z, Fliss DM. The reflux symptom index--a clinical tool for the diagnosis of laryngopharyngeal reflux. Harefuah 2005;144:826-9.

13. Wright JG. The minimal important difference: who’s to say what is important. J Clin Epidemiol 1996;49:1221-2.

14. Hays RD, Farivar SS, Liu H. Approaches and recommendations for estimating minimally important differences for health-related quality of life measures. Int J Chron Obstruct Pulmon Dis 2005;2:63-7.

15. U.S. Department of Health and Human Services Food and Drug Administration. Guidance for industry. Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. 2009. http://www.fda.gov/downloads/Drugs/Guidances/UCM193282.pdf (accessed 20 Dec 2013).

16. Park W, Hicks DM, Khandwala F, et al. Laryngopharyngeal reflux: prospective cohort study evaluating optimal dose of proton-pump inhibitor therapy and pretherapy predictors of response. Laryngoscope 2005;115:1230-8.

17. Lien HC, Wang CC, Liang WM, et al. Composite pH predicts esomeprazole response in laryngopharyngeal reflux without typical reflux syndrome. Laryngoscope 2013;123:1483-9.

18. Belafsky PC, Postma GN, Koufman JA. Laryngopharyngeal reflux symptoms improve before changes in physical findings. Laryngoscope 2001;111:979-81.

19. Guillemin F, Bombardier C, Beaton D. Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines. J Clin Epidemiol 1993;46:1417-32.

20. Qadeer MA, Phillips CO, Lopez AR, et al. Proton pump inhibitor therapy for suspected GERD-related chronic laryngitis: a meta-analysis of randomized controlled trials. Am J Gastroenterol 2006;101:2646-54.

21. Nunnally JC, Bernstein IH. Psychometric Theory. New York: McGraw-Hill, 1994.

22. Cohen J. Statistical Power Analysis for the Behavioural Sciences. New York: Academic Press, 1977.

23. Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989;27(3 Suppl):178-89.

24. de Vet HC, Ostelo RW, Terwee CB, et al. Minimally important change determined by a visual method integrating an anchor-based and a distribution-based approach. Qual Life Res 2007;16:131-42.

25. Vaezi MF, Richter JE, Stasney CR, et al. Treatment of chronic posterior laryngitis with esomeprazole. Laryngoscope 2006;116:254-60.

26. Belafsky PC, Postma GN, Koufman JA. The validity and reliability of the Reflux Finding Score (RFS). Laryngoscope 2001;111:1313-17.

27. Belafsky PC, Postma GN, Amin MR, et al. Symptoms and findings of laryngopharyngeal reflux. Ear Nose Throat J 2002;81(9 Suppl 2):10-3.

28. Vakil NB, Halling K, Becher A, et al. Systematic review of patient-reported outcome instruments for gastroesophageal reflux disease symptoms. Eur J Gastroenterol Hepatol 2013;25:2-14.

29. Stanghellini V, Armstrong D, Monnikes H, et al. Determination of ReQuest-based symptom thresholds to define symptom relief in GERD clinical studies. Digestion 2005;71:145-51.

The Reflux Symptom Index (RSI) 逆流症狀指數

Within the last month, how did the following problems affect you?

過去一個月內，以下的難題如何影響你？

Circle the appropriate response.

圈選適當的反應。

0= No Problem 0=沒問題

5= Severe Problem 5=很嚴重

1. Hoarseness or a problem with your voice 沙啞或你的聲音有問題

0 1 2 3 4 5

2. Clearing your throat 清你的喉嚨

0 1 2 3 4 5

3. Excess throat mucus or postnasal drip 過多喉嚨黏液或鼻倒流

0 1 2 3 4 5

4. Difficulty swallowing food, liquids, or pills 吞嚥食物，液體或藥丸困難

0 1 2 3 4 5

5. Coughing after you ate or after lying down 進食或躺下後咳嗽

0 1 2 3 4 5

6. Breathing difficulties or choking episodes 呼吸困難或嗆到事件

0 1 2 3 4 5

7. Troublesome or annoying cough 令人討厭或惱人的咳嗽

0 1 2 3 4 5

8. Sensation of something sticking in your throat or a lump in your throat

有東西黏在你喉嚨或有塊狀物在你喉嚨的感覺

0 1 2 3 4 5

9. Heartburn, chest pain, indigestion, or stomach acid coming up 心灼熱，胸痛，消化不良或胃酸跑上來

0 1 2 3 4 5

Total 總分

Supplementary Figure. The Chinese-English version of the Reflux Symptom Index (RSI)
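As a minimal sketch of how the total score on the form above could be tallied, the following assumes, as the form implies, nine items each scored from 0 to 5, so that the total ranges from 0 to 45. The function name `rsi_total` and the example responses are hypothetical and are not taken from the study data.

```python
def rsi_total(item_scores):
    """Sum the nine RSI item scores (each 0-5) into a total score (0-45)."""
    if len(item_scores) != 9:
        raise ValueError("The RSI form has nine items")
    for score in item_scores:
        # Reject booleans, non-integers, and values outside the 0-5 scale
        if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 5:
            raise ValueError("Each item must be an integer from 0 to 5")
    return sum(item_scores)


# Hypothetical responses to items 1-9, for illustration only
example_responses = [3, 2, 4, 0, 1, 0, 2, 3, 2]
print(rsi_total(example_responses))  # 17
```

Validating the item count and range before summing guards against partially completed forms being scored as if they were complete.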

Table 1. Demographic variables and clinical baseline characteristics of 96 subjects with suspected laryngopharyngeal reflux

| Variable | Total (N = 96) | Completed^a at week 12 (N = 84) | Not completed at week 12 (N = 12) |
|---|---|---|---|
| Age (years) (mean ± SD) | 49.5 ± 12.6 | 49.3 ± 12.6 | 50.9 ± 13.4 |
| BMI (kg/m²) (mean ± SD) | 23.7 ± 3.8 | 23.6 ± 3.8 | 23.7 ± 3.6 |
| Gender (male) (n (%)) | 54 (56.3) | 50 (59.5) | 4 (33.3) |
| Primary laryngeal symptom (n (%)) | | | |
| Globus sensation | 28 (29.2) | 25 (29.8) | 3 (25.0) |
| Throat pain | 15 (15.6) | 14 (16.7) | 1 (8.3) |
| Hoarseness | 29 (30.2) | 25 (29.8) | 4 (33.3) |
| Cough | 15 (15.6) | 11 (13.1) | 4 (33.3) |
| Throat clearing | 9 (9.4) | 9 (10.7) | 0 (0.0) |
| Typical reflux syndrome (n (%)) | 59 (61.5) | 52 (61.9) | 7 (58.3) |
| Erosive oesophagitis (n (%)) | 21 (22.1) | 20 (24.1) | 1 (8.3) |
| Reflux finding scale (median (IQR)) | 6 (4-7) | 6 (4-7) | 6 (4.5-7.5) |
| RSI scale (median (IQR)) | 17 (12-22.5) | 17 (12-22) | 17.5 (10-26) |

Typical reflux syndrome, heartburn or acid regurgitation; RSI, reflux symptom index; SD, standard deviation; IQR, interquartile range.

a. Completed: patients with no missing data at week 12.

Table 2. Internal consistency, test-retest reliability, and responsiveness for the RSI scale

| Measure | Criterion | Range | Ideal | Result |
|---|---|---|---|---|
| Internal consistency | Cronbach's α | 0 to 1 | > 0.7 | 0.74 |
| | Floor effect (%) | 0 to 100 | | 0.0 |
| | Ceiling effect (%) | 0 to 100 | | 0.0 |
| Test–retest reliability | Intraclass correlation coefficient | 0 to 1 | > 0.7 | 0.79 |
| Responsiveness | Effect size^a | | > 0.8 | 0.92 |
| | Paired t-test (P value) | 0 to 1 | < 0.05 | < 0.0001 |

RSI, reflux symptom index.

a. Effect size, based on Cohen's definition: $d = \dfrac{\bar{x}_1 - \bar{x}_2}{s}$, where $s = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2}}$.
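The effect size in footnote a can be sketched in a few lines. This is an illustration only: it assumes the two groups are the baseline and week-12 RSI totals, and it takes the pooled standard deviation with the denominator as it reads in the footnote ($n_1 + n_2$). The more widely used convention divides by $n_1 + n_2 - 2$, which the hypothetical parameter `df_correction=2` selects. The function name and the scores are invented for the example.

```python
import math
from statistics import mean, variance


def cohens_d(group1, group2, df_correction=0):
    """Difference in means divided by a pooled standard deviation.

    df_correction=0 divides the pooled sum of squares by n1 + n2;
    df_correction=2 divides by n1 + n2 - 2 (the common convention).
    """
    n1, n2 = len(group1), len(group2)
    s1_sq = variance(group1)  # sample variance (n - 1 denominator)
    s2_sq = variance(group2)
    pooled_sd = math.sqrt(
        ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - df_correction)
    )
    return (mean(group1) - mean(group2)) / pooled_sd


# Hypothetical RSI totals, for illustration only (not study data)
baseline_scores = [20, 15, 18, 22, 12]
week12_scores = [12, 10, 14, 15, 9]
print(cohens_d(baseline_scores, week12_scores))
print(cohens_d(baseline_scores, week12_scores, df_correction=2))
```

With small samples the two denominators give noticeably different values, so whichever convention is used should be stated alongside the result.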

Table 3. Sensitivity and specificity of an RSI score reduction of 6 points as the responder definition for predicting an empirical responder criterion, i.e., ≥ 50% improvement in the primary laryngeal symptom, among all subjects and among subgroups with different baselines.

| | All subjects (N = 84) | Baseline < 12 (n = 19) | Baseline 12-28 (n = 56) | Baseline > 28 (n = 9) |
|---|---|---|---|---|
| Sensitivity | 0.79 (30/38) | 0.17 (1/6) | 0.86 (25/29) | 1.00 (3/3) |
| Specificity | 0.70 (32/46) | 1.00 (13/13) | 0.70 (19/27) | 0.17 (1/6) |

RSI, reflux symptom index.

Sensitivity = the number of subjects with both RSI score reduction ≥ 6 points and ≥ 50% improvement in primary laryngeal symptom at week 12, divided by the number of subjects with ≥ 50% improvement in primary laryngeal symptom at week 12.

Specificity = the number of subjects with both RSI score reduction < 6 points and < 50% improvement in primary laryngeal symptom at week 12, divided by the number of subjects with < 50% improvement in primary laryngeal symptom at week 12.
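The two definitions above can be expressed as a short sketch. The function name `sensitivity_specificity` and its arguments are hypothetical. The per-patient values below are placeholders chosen only to reproduce the all-subjects counts in Table 3 (30/38 and 32/46); they are not the study's patient-level data.

```python
def sensitivity_specificity(rsi_reductions, improved_50pct, threshold=6):
    """Compare the RSI responder definition with the empirical criterion.

    rsi_reductions: RSI score reduction from baseline for each patient.
    improved_50pct: True if the primary laryngeal symptom improved >= 50%.
    """
    tp = fn = tn = fp = 0
    for reduction, improved in zip(rsi_reductions, improved_50pct):
        responder = reduction >= threshold
        if improved and responder:
            tp += 1
        elif improved:
            fn += 1
        elif responder:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Placeholder values that reproduce the all-subjects counts in Table 3:
# 30 of 38 improved patients and 14 of 46 non-improved patients had a
# reduction of at least 6 points.
reductions = [6] * 30 + [5] * 8 + [5] * 32 + [6] * 14
improved = [True] * 38 + [False] * 46

sens, spec = sensitivity_specificity(reductions, improved)
print(f"{sens:.2f} {spec:.2f}")  # 0.79 0.70
```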

Figure 1. Mean changes in RSI scores from baseline between patients with ≥ 50% and < 50% reduction in the primary laryngeal symptom at week 12. (The “I” bars represent standard error.)

Figure 2. Illustrative cumulative distribution function showing a distinct difference in cumulative percentage at an RSI score reduction of 6 points from baseline between patients with ≥ 50% and < 50% reduction in the primary laryngeal symptom at week 12. (X-axis: RSI score change from baseline; Y-axis: cumulative percentage of patients; Sensitivity = cumulative percentage for patients with ≥ 50% reduction; Specificity = 1 - cumulative percentage for patients with < 50% reduction.)
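The reading of sensitivity and specificity from the two cumulative curves can be sketched as follows. This is a rough illustration under stated assumptions: change is taken as week-12 score minus baseline score, so a reduction of 6 points corresponds to a change of -6, and the cumulative fraction counts patients whose change is at or below the cutoff. The function names and the change values are hypothetical.

```python
def cumulative_fraction(changes, cutoff):
    """Fraction of patients whose change from baseline is <= cutoff."""
    return sum(1 for change in changes if change <= cutoff) / len(changes)


def sweep_cutoffs(changes_improved, changes_not_improved, cutoffs):
    """Return (cutoff, sensitivity, specificity) for each candidate cutoff."""
    rows = []
    for cutoff in cutoffs:
        sensitivity = cumulative_fraction(changes_improved, cutoff)
        specificity = 1 - cumulative_fraction(changes_not_improved, cutoff)
        rows.append((cutoff, sensitivity, specificity))
    return rows


# Hypothetical RSI changes (week 12 minus baseline), for illustration only
improved_group = [-10, -8, -6, -2]      # >= 50% symptom reduction
not_improved_group = [-7, -3, 0, 1]     # < 50% symptom reduction

for cutoff, sens, spec in sweep_cutoffs(improved_group, not_improved_group, [-8, -6, -4]):
    print(cutoff, sens, spec)
```

Scanning several cutoffs in this way shows the trade-off that the figure displays graphically: a stricter cutoff lowers sensitivity while raising specificity.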
