Statistical Methods for Clinical Evaluation of Biochip Products

(1)

Statistical Methods for Clinical Evaluation of Biochip Products

Jen-pei Liu, PhD

Division of Biometry, Department of Agronomy, National Taiwan University
and
Division of Biostatistics and Bioinformatics, National Health Research Institutes

At the Workshop on Applications of Statistical Methods to Evaluation of Diagnostic Biochip Products

November 26, 2005, Taipei, Taiwan

(2)

Outline

• Introduction
• Methods when clinical truth (gold standard) is present
• Methods when clinical truth is absent
• Issues on experimental design
• Discussion and summary

(3)

Introduction

• Post-HGP (Human Genome Project) Era
• Pharmacogenetics
• Pharmacogenomics
• Biochip Products
• Targeted Clinical Trials
• Personalized Medicine

(4)

Introduction

• The US FDA guidance
  - Multiplex Tests for Heritable DNA Markers, Mutations and Expression Pattern
  - Pharmacogenomic Data Submission
• Collection of samples from clinical trials in Taiwan by global pharmaceutical companies for microarray and pharmacogenomic analysis
• Guidance for Registration Review of Multi-Target Array Platform Genetic Diagnostic Reagents (多標的陣列平台基因診斷試劑–查驗登記審查指引), March 2005, Department of Health (衛生署), Taiwan

(5)

Introduction

• Article 8 of the Guidance: comparison studies using clinical specimens (第八條 用臨床檢體進行比較研究)
  - Comparison to a reference method: sensitivity and specificity for clinical diagnosis
  - Comparison to another device: percent agreement
  - Report of discrepancies, false positive and false negative results

(6)

Introduction

• Accuracy in diagnostic performance: a measure of how faithfully the information obtained using a diagnostic device reflects the truth as measured by a truth standard or gold standard.
• Comparator: an established test (device) against which a proposed test is compared to evaluate the effectiveness of the proposed test.

(7)

Methods When Clinical Truth (Gold Standard) Is Present

                              Clinical Truth of Diagnosis ("Gold Standard")
Diagnosis Made from
New Marker Test               Present (+)    Absent (-)    Total
Positive (+)                       a              b          m1
Negative (-)                       c              d          m2
Total                              n1             n2         N

(8)

Methods When Clinical Truth (Gold Standard) Is Present

• Example 1 (FDA, 2003)

                              True Diagnosis
New Marker
Test Result                   Positive    Negative    Total
Positive                         44           1         45
Negative                          7         168        175
Total                            51         169        220

(9)

Indexes of Diagnostic Accuracy

• Sensitivity (true positive rate): capacity for making a correct diagnosis in subjects with the disease
  - Estimated sensitivity: 100% x a/(a+c)
• Specificity (true negative rate): capacity for making a correct diagnosis in subjects without the disease
  - Estimated specificity: 100% x d/(b+d)

(10)

Indexes of Diagnostic Accuracy

Data from Example 1

• Estimated sensitivity = 100% x 44/51 = 86.3%
  - Exact 95% confidence interval based on the binomial distribution: (73.7%, 94.3%)
• Estimated specificity = 100% x 168/169 = 99.4%
  - Exact 95% confidence interval based on the binomial distribution: (96.8%, 100%)
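The point estimates and exact intervals for Example 1 can be checked numerically. The sketch below assumes that "exact" refers to the Clopper-Pearson interval, which the slides do not state explicitly. The helper names (`binom_cdf`, `clopper_pearson`) are illustrative only. The bounds are found by bisection on the binomial distribution function, so the last digit may differ slightly from the values quoted above.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _root(f, iters=60):
    """Bisection root of a function f that increases from negative to positive on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Two-sided exact (Clopper-Pearson) interval for x successes out of n."""
    # Lower bound: P(X >= x | p) = alpha/2
    lower = 0.0 if x == 0 else _root(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2
    upper = 1.0 if x == n else _root(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper

# Example 1 counts: a = true positives, b = false positives,
# c = false negatives, d = true negatives
a, b, c, d = 44, 1, 7, 168
sens = a / (a + c)
spec = d / (b + d)
sens_ci = clopper_pearson(a, a + c)
spec_ci = clopper_pearson(d, b + d)

print(f"sensitivity = {sens:.1%}, 95% CI = ({sens_ci[0]:.1%}, {sens_ci[1]:.1%})")
print(f"specificity = {spec:.1%}, 95% CI = ({spec_ci[0]:.1%}, {spec_ci[1]:.1%})")
```

Bisection is used here only to keep the sketch free of external libraries; a beta-quantile routine from a statistics package would give the same bounds.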

(11)

Indexes of Diagnostic Accuracy

• Positive predictive value (positive predictive accuracy): the proportion of subjects with the disease among those with positive results
  = 100% x a/(a+b)
• Negative predictive value (negative predictive accuracy): the proportion of subjects without the disease among those with negative results
  = 100% x d/(c+d)
• False positive rate: the proportion of subjects without the disease among those with positive results
  = 1 – positive predictive value = 100% x b/(a+b)
• False negative rate: the proportion of subjects with the disease among those with negative results
  = 1 – negative predictive value = 100% x c/(c+d)

(12)

Methods When Clinical Truth (Gold Standard) Is Present

Example 2 (Feinstein, 2002)

New Marker                    Diseased    Nondiseased
Test Result                   Cases       Controls       Total
Positive                         46            2           48
Negative                          4           48           52
Total                            50           50          100

(13)

Indexes of Diagnostic Accuracy

Data from Example 2 (Feinstein, 2002)

• Sensitivity = 100% x 46/50 = 92.0%
• Specificity = 100% x 48/50 = 96.0%
• Prevalence = 100% x 50/100 = 50.0%
• Positive Predictive Value = 100% x 46/48 = 95.8%
• Negative Predictive Value = 100% x 48/52 = 92.3%
• False Positive Rate = 100% x 2/48 = 4.2%

(14)

Example 3 (Feinstein, 2002)

New Marker                    Diseased    Nondiseased
Test Result                   Cases       Controls       Total
Positive                         46           38           84
Negative                          4          912          916
Total                            50          950         1000

(15)

Indexes of Diagnostic Accuracy

Data from Example 3 (Feinstein, 2002)

• Sensitivity = 100% x 46/50 = 92.0%
• Specificity = 100% x 912/950 = 96.0%
• Prevalence = 100% x 50/1000 = 5.0%
• Positive Predictive Value = 100% x 46/84 = 54.8%
• Negative Predictive Value = 100% x 912/916 = 99.6%
• False Positive Rate = 100% x 38/84 = 45.2%
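Examples 2 and 3 have the same sensitivity and specificity and differ only in prevalence. The relation between the predictive values and prevalence follows from Bayes' theorem. The symbols Se (sensitivity), Sp (specificity), and π (prevalence) are notation introduced here for illustration and do not appear on the slides.

$$
\mathrm{PPV} = \frac{Se\,\pi}{Se\,\pi + (1 - Sp)(1 - \pi)},
\qquad
\mathrm{NPV} = \frac{Sp\,(1 - \pi)}{Sp\,(1 - \pi) + (1 - Se)\,\pi}
$$

With Se = 0.92 and Sp = 0.96:

$$
\pi = 0.50:\ \mathrm{PPV} = \frac{0.46}{0.46 + 0.02} = 0.958,
\qquad
\pi = 0.05:\ \mathrm{PPV} = \frac{0.046}{0.046 + 0.038} = 0.548
$$

These values match the 95.8% and 54.8% obtained directly from the two tables.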

(16)

Error rates associated with screening test (Fleiss, 1981)

Prevalence     False Positive Rate    False Negative Rate
1/million            .9999                  0
1/100,000            .9991                  0
1/10,000             .9906                  .00001
1/1000               .913                   .00005
1/500                .840                   .00010
1/200                .677                   .00025
1/100                .510                   .00051
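The table can be reproduced numerically. The slide does not state the sensitivity and specificity behind these figures. The tabulated rates appear to be consistent with a sensitivity of 95% and a specificity of 99%, so those two values are an inference used as defaults in this sketch, and the function name `error_rates` is illustrative. The false positive and false negative rates are taken in the predictive sense defined earlier (1 – PPV and 1 – NPV).

```python
def error_rates(prevalence, sensitivity=0.95, specificity=0.99):
    """Return (false positive rate, false negative rate) as 1 - PPV and 1 - NPV.

    The default sensitivity and specificity are assumptions inferred from the
    tabulated values, not figures stated on the slide.
    """
    p = prevalence
    false_pos = (1 - specificity) * (1 - p)   # test +, disease absent
    true_pos = sensitivity * p                # test +, disease present
    false_neg = (1 - sensitivity) * p         # test -, disease present
    true_neg = specificity * (1 - p)          # test -, disease absent
    fpr = false_pos / (false_pos + true_pos)
    fnr = false_neg / (false_neg + true_neg)
    return fpr, fnr

for denom in (1_000_000, 100_000, 10_000, 1_000, 500, 200, 100):
    fpr, fnr = error_rates(1 / denom)
    print(f"prevalence 1/{denom:<9,} FPR = {fpr:.4f}  FNR = {fnr:.5f}")
```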

(17)

Indexes of Diagnostic Accuracy

• The false positive rate is high when the prevalence of the disease is low, and vice versa. The false negative rate is high when the prevalence of the disease is high, and vice versa.
• However, sensitivity and specificity do not depend on the relative numbers of diseased and non-diseased subjects used to evaluate the test (i.e., they are independent of the prevalence rate).

(18)

Indexes of Diagnostic Accuracy

Types of Diagnostic Tests (Feinstein, 1977)

• Screening or discovery tests (e.g., mammogram, fasting blood sugar): require high sensitivity, which leads to a high false positive rate.
• Exclusion tests: to rule out the presence of the disease (e.g., colonoscopic examination); require extremely high sensitivity.
• Confirmation tests: to verify the suspicion of the presence of the disease (e.g., biopsy for lung cancer); require extremely high specificity with very few false positives.

(19)

Indexes of Diagnostic Accuracy

Types of Diagnostic Markers

• Binary test results (+, -)
• Multiple categorical results
  - Abnormality rating
  - Severity rating
  - Urine test: none, trace, 1+, 2+
  - HER2 test: 0, 1+, 2+, 3+
• Continuous test results
  - PSA
  - Intraocular pressure
  - Glucose tolerance test
  - Gene expression level

(20)

Example 4: Results in Diagnostic Marker Study of Coronary Artery Disease and Level of S-T Depression in Exercise Stress Test

                                    Definitive State of Disease
Patients with S-T Segment       Cases of            Controls Without     Demarcation at
Depression of                   Coronary Disease    Coronary Disease     Lower Boundary
≥ 3.0 mm                             31                    0                   A
≥ 2.5 mm but < 3.0 mm                15                    0                   B
≥ 2.0 mm but < 2.5 mm                27                    7                   C
≥ 1.5 mm but < 2.0 mm                30                    8                   D
≥ 1.0 mm but < 1.5 mm                32                   39                   E
≥ 0.5 mm but < 1.0 mm                12                   43                   F
< 0.5 mm                              3                   53
TOTAL                               150                  150

Source: Feinstein (2002)

(21)

Indexes of Diagnostic Accuracy

• To convert a ranking scale or a continuous measurement into a binary outcome (+, –), we need a cutoff point or threshold.
• Example: fasting blood glucose (FBG)
  - FBG > 126 mg/dL: DM (+)
  - FBG ≤ 126 mg/dL: DM (–)
• Example: S-T depression in exercise stress test
  - Class D, ≥ 1.5 mm: CAD (+)

(22)

Indexes of Diagnostic Accuracy

At a specific threshold, the relationship among sensitivity, specificity, and the false positive and false negative rates can be interpreted through hypothesis testing:

H0: Absence of the disease
H1: Presence of the disease

α = Pr[Type I Error] = Pr[test positive | no disease]
β = Pr[Type II Error] = Pr[test negative | disease]

(23)

From Affymetrix Technical Note “GeneChip® arrays provide optimal sensitivity

(24)

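Indexes of Diagnostic Accuracy

[Figure: overlapping distributions of the variable X for normal subjects (mean μN) and diseased subjects (mean μD). A threshold on X marks the tail areas α and β, with specificity = 1 – α and sensitivity = 1 – β.]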

(25)

Indexes of Diagnostic Accuracy

Sensitivity = Pr[test positive | disease] = 1 – β = power of the statistical procedure
Specificity = Pr[test negative | no disease] = 1 – α

• α↑ ⇒ β↓ ⇒ (1 – β)↑
• A test with a high sensitivity also has a high incorrect positive rate but a low incorrect negative rate. A test with a high specificity also has a high incorrect negative rate but a low incorrect positive rate.

(26)

Indexes of Diagnostic Accuracy

• At each individual threshold (cutoff), sensitivity and specificity can be computed.
• A receiver operating characteristic (ROC) curve is a graphic presentation of sensitivity against 1 – specificity.
• It is a path in the unit square from the lower left corner to the upper right corner. In fact, it can be viewed as a cumulative distribution function.

(27)

Summary of Nosologic Sensitivity and Specificity Calculated for Demarcations of Example 4

               Location of      Number of                   Number of
               Boundary for     Cases                       Controls
Demarcation    Abnormal         Included     Sensitivity    Included     Specificity    1 – Specificity
A              ≥ 3.0 mm            31           0.21            0           1              0
B              ≥ 2.5 mm            46           0.31            0           1              0
C              ≥ 2.0 mm            73           0.49            7           0.95           0.05
D              ≥ 1.5 mm           103           0.69           15           0.90           0.10
E              ≥ 1.0 mm           135           0.90           54           0.64           0.36
F              ≥ 0.5 mm           147           0.98           97           0.35           0.65
TOTAL                             150           ––            150           ––             ––
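The sensitivity and specificity at each demarcation follow from cumulative counts over the S-T depression classes of Example 4. The sketch below recomputes them and adds a trapezoidal estimate of the area under the empirical ROC curve. The trapezoidal rule is one common choice and is an assumption here, since the slides do not say how the area is estimated. Variable names are illustrative.

```python
# Counts from Example 4, ordered from the most abnormal class (>= 3.0 mm)
# down to the least abnormal class (< 0.5 mm).
cases = [31, 15, 27, 30, 32, 12, 3]
controls = [0, 0, 7, 8, 39, 43, 53]
labels = ["A", "B", "C", "D", "E", "F"]

n_cases, n_controls = sum(cases), sum(controls)

# ROC points as (1 - specificity, sensitivity), starting at the origin.
roc = [(0.0, 0.0)]
cum_cases = cum_controls = 0
for k in range(len(cases)):
    cum_cases += cases[k]
    cum_controls += controls[k]
    roc.append((cum_controls / n_controls, cum_cases / n_cases))

# Trapezoidal area under the empirical ROC curve.
auc = sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(roc, roc[1:]))

for label, (fpr, tpr) in zip(labels, roc[1:]):
    print(f"{label}: sensitivity = {tpr:.2f}, specificity = {1 - fpr:.2f}")
print(f"trapezoidal AUC = {auc:.3f}")
```

The last ROC point, which classifies every subject as abnormal, is (1, 1) and closes the curve at the upper right corner.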

(28)

[Figure: ROC curve for Example 4, plotting sensitivity against 1 – specificity (both axes from 0.0 to 1.0), with points labeled A through F for the demarcations. Source: Feinstein (2002)]

(29)

Indexes of Diagnostic Accuracy

• For a useless marker test, the ROC curve is a straight line at a 45° angle.
• The area under the ROC curve provides a summary index of diagnostic accuracy across all possible threshold values.
• The area under the ROC curve ranges from 0.5 (50%) to 1.0 (100%).
• For a useless marker test, the area under the ROC curve is 50%, which is the same as flipping a fair coin.
• For non-inferiority or equivalence tests based on paired ROC curve areas, see Liu, et al. (2005, Statistics in Medicine).

(30)

[Figure: ROC curves for a perfect test, an ordinary test, and a useless test, plotting sensitivity against 1 – specificity (both axes from 0.0 to 1.0). Source: Feinstein (2002)]

(31)
(32)

Indexes of Diagnostic Accuracy

• Other indices
  - Likelihood ratios: independent of prevalence
    - Positive test
    - Negative test
  - Odds ratio
    - Pretest
    - Posttest, positive
    - Posttest, negative
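The slide lists these indices without formulas. The sketch below uses the standard textbook definitions, which are an assumption about what the talk intended: LR+ = sensitivity / (1 – specificity), LR– = (1 – sensitivity) / specificity, and posttest odds = pretest odds x likelihood ratio. It is applied to the counts of Example 3, and the function names are illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios (standard definitions)."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

def posttest_probability(prevalence, lr):
    """Convert a pretest probability to a posttest probability through the odds."""
    pretest_odds = prevalence / (1 - prevalence)
    posttest_odds = pretest_odds * lr
    return posttest_odds / (1 + posttest_odds)

# Example 3: sensitivity 46/50, specificity 912/950, prevalence 50/1000
sens, spec, prev = 46 / 50, 912 / 950, 50 / 1000
lr_pos, lr_neg = likelihood_ratios(sens, spec)
post_pos = posttest_probability(prev, lr_pos)   # probability of disease after a positive test
post_neg = posttest_probability(prev, lr_neg)   # probability of disease after a negative test

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"posttest probability, positive test = {post_pos:.3f}")
print(f"posttest probability, negative test = {post_neg:.4f}")
```

The posttest probability after a positive test equals the positive predictive value of Example 3 (46/84), and the one after a negative test equals 1 minus the negative predictive value (4/916).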

(33)

Methods When Clinical Truth Is Absent

                              Diagnosis Made from Another Device
Diagnosis Made from
Marker Test                   Positive (+)    Negative (-)    Total
Positive (+)                       a               b           a+b
Negative (-)                       c               d           c+d
Total                             a+c             b+d           N

(34)

Methods When Clinical Truth Is Absent

• Overall percent agreement = 100% x (a+d)/N
• Agreement of new test with another device, positive = 100% x a/(a+c)
• Agreement of new test with another device, negative = 100% x d/(b+d)

(35)

Methods When Clinical Truth Is Absent

                              Diagnosis Made from Another Device
Diagnosis Made from
New Marker Test               Positive (+)    Negative (-)    Total
Positive (+)                      40               5            45
Negative (-)                       4             171           175
Total                             44             176           220

(36)

Methods When Clinical Truth Is Absent

• Overall percent agreement = 100% x (40+171)/220 = 95.9%
  95% CI = (92.4%, 98.1%)
• Agreement of new test with another device, positive = 100% x 40/44 = 90.9%
• Agreement of new test with another device, negative = 100% x 171/176 = 97.2%
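The agreement measures for the 220-specimen comparison can be recomputed from the four cell counts. This is a minimal sketch: the function name `agreement` and the dictionary keys are illustrative, and the cell labels follow the layout used above (rows for the new marker test, columns for the other device).

```python
def agreement(a, b, c, d):
    """Percent agreement measures for a 2x2 table without a gold standard.

    a: both positive          b: new test +, other device -
    c: new test -, other +    d: both negative
    """
    n = a + b + c + d
    return {
        "overall": (a + d) / n,
        "positive": a / (a + c),   # relative to the other device's positives
        "negative": d / (b + d),   # relative to the other device's negatives
    }

result = agreement(a=40, b=5, c=4, d=171)
for name, value in result.items():
    print(f"{name} agreement = {value:.1%}")
```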

(37)

Methods When Clinical Truth Is Absent

Disadvantages of agreement measurements:

• "Agreement" does not mean "correct"
• Agreement changes depending on disease prevalence
• For evaluation of non-inferiority or equivalence of two diagnostic tests based on the proportions of positive results, see Liu, et al. (2002, Statistics in Medicine)

(38)

HercepTest

• HER2 (human epidermal growth factor receptor 2) is a member of the HER (erbB) family of transmembrane tyrosine kinases.
• An enhanced level of HER2 is associated with mammary epithelial cell transformation and shorter survival in patients with breast cancer.
• ≈ 25% of invasive breast cancers exhibit HER2 gene amplification.
• The rate of HER2 gene amplification or protein overexpression in ductal carcinoma in situ (DCIS) is higher than in invasive cancer ⇒ pathogenic role in the initiation of mammary carcinoma.
• Treatment with Herceptin: requirement of screening

(39)

HercepTest

• HercepTest™ is an immunohistochemical (IHC) test intended to aid in the assessment of patients being considered for Herceptin treatment.
• A Class III device: requires clinical studies.
• Interpretation of results
  - Negative for HER2 over-expression: 0 or 1+
  - Positive for HER2 over-expression: 2+ or 3+

(40)

PATHWAY™ Her 2 (Clone CB11)

• A mouse monoclonal antibody
• Semi-quantitative detection of c-erbB-2 antigen
• Binding of an antibody to an antigen of interest
• Visualization of the bound primary antibody by an indirect biotin-avidin system coupled to an enzyme
• Interpretation of results
  - Negative for HER2 over-expression: 0 or 1+
  - Positive for HER2 over-expression: 2+ or 3+

(41)

PATHWAY™ Her 2 (Clone CB11)

• Potential adverse effects
  - False positive
    - The benefit of Herceptin to patients with a normal or lower level of HER2 is unknown.
    - The risks of Herceptin include infusion toxicity (chills, fever, pain, asthenia, nausea, vomiting and headache) and cardiotoxicity.

(42)

Clinical Studies

• Compared to DAKO HercepTest™
• Goal: at least 75% agreement with 95% confidence
  H0: P ≤ 0.75 vs. Ha: P > 0.75
• One central laboratory
• Multi-center: 3 sites
• 50 positive and 100 negative specimens by HercepTest for each site

(43)
(44)

Clinical Studies

• Observed overall agreement = 92.4% (416/450), with an exact 95% CI of (89.6%, 94.7%)
• P-value for testing H0: P ≤ 0.75 is < 0.0001
• The observed kappa statistic = 0.83, with a p-value for testing no agreement < 0.0001
• P-value for the McNemar test for equal proportions of clinically positive results is 1.00
• Assuming that HercepTest is the gold standard:
  - Sensitivity: 88.7% (134/151); 95% CI: 82.6% - 93.3%
  - Specificity: 94.3% (282/299); 95% CI: 91.1% - 96.7%
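Two of the p-values above can be checked with a few lines of code. The slides do not say which test procedures were used, so this sketch assumes an exact one-sided binomial test for H0: P ≤ 0.75 and an exact (binomial) McNemar test on the 17 + 17 discordant specimens. The function names are illustrative.

```python
from math import comb

def binom_upper_tail(x, n, p0):
    """P(X >= x) for X ~ Binomial(n, p0): exact one-sided p-value against H0: P <= p0."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(x, n + 1))

def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from the two discordant counts."""
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Overall agreement: 416 agreeing specimens out of 450, tested against 75%.
p_agreement = binom_upper_tail(416, 450, 0.75)

# Discordant specimens in the 2x2 comparison: 17 in each direction.
p_mcnemar = mcnemar_exact(17, 17)

print(f"p-value for H0: P <= 0.75 is {p_agreement:.2e}")
print(f"exact McNemar p-value = {p_mcnemar:.2f}")
```

With equal discordant counts the McNemar p-value is 1, in line with the value reported above.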

(45)
(46)

Clinical Studies

• Observed overall agreement = 86.4% (389/450), with an exact 95% CI of (82.9%, 89.5%)
• P-value for testing H0: P ≤ 0.75 is < 0.0001
• The observed kappa statistic = 0.73, with a p-value for testing no agreement < 0.0001
• P-value for the test for marginal homogeneity is 1.00
• Assuming that HercepTest is the gold standard:
  - Sensitivity for intermediate category: 46.2% (24/52); 95% CI: 32.3% - 60.5%
  - Sensitivity (+): 83.8% (83/99); 95% CI: 75.1% - 90.5%
  - Specificity (-): 94.3% (282/299); 95% CI: 91.1% - 96.7%

(47)

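Kappa Statistic

• $\kappa = \dfrac{p_0 - p_e}{1 - p_e}$
• where $p_0 = \sum_i p_{ii}$ and $p_e = \sum_i p_{i\cdot}\, p_{\cdot i}$
• $p_{ij}$ is the proportion in the $(i,j)$ entry of an $r \times r$ contingency table
• $p_{i\cdot}$ is the sum of $p_{ij}$ over $j$, and $p_{\cdot j}$ is the sum of $p_{ij}$ over $i$, with $i, j = 1, \ldots, r$
• Standard error for testing no agreement: $SE = \dfrac{1}{(1 - p_e)\sqrt{n}} \sqrt{C}$
• $C = p_e + p_e^2 - \sum_i p_{i\cdot}\, p_{\cdot i}\,(p_{i\cdot} + p_{\cdot i})$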

(48)

Kappa Statistic

Example

                              PATHWAY
HercepTest          -                 +                 Margin
-               282 (0.6267)      17 (0.0378)       299 (0.6644)
+                17 (0.0378)     134 (0.2978)       151 (0.3356)
Margin          299 (0.6644)     151 (0.3356)       450 (1.0000)

p0 = 0.6267 + 0.2978 = 0.9245
pe = 0.6644*0.6644 + 0.3356*0.3356 = 0.5541
κ = (0.9245 – 0.5541)/(1 – 0.5541) = 0.8307
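The kappa calculation generalizes to any r x r table. The sketch below computes kappa together with a standard error under the hypothesis of no agreement, using the form C = pe + pe² – Σ pi. p.i (pi. + p.i) given in Fleiss (1981). The function name `cohen_kappa` is illustrative. Because it works with unrounded proportions, the result differs from the hand calculation above in the fourth decimal place.

```python
from math import sqrt

def cohen_kappa(table):
    """Kappa and its standard error under no agreement for a square table of counts."""
    r = len(table)
    n = sum(sum(row) for row in table)
    p = [[cell / n for cell in row] for row in table]
    row_m = [sum(p[i]) for i in range(r)]                        # p_i.
    col_m = [sum(p[i][j] for i in range(r)) for j in range(r)]   # p_.j
    p_o = sum(p[i][i] for i in range(r))                         # observed agreement
    p_e = sum(row_m[i] * col_m[i] for i in range(r))             # chance agreement
    kappa = (p_o - p_e) / (1 - p_e)
    c = p_e + p_e**2 - sum(
        row_m[i] * col_m[i] * (row_m[i] + col_m[i]) for i in range(r)
    )
    se0 = sqrt(c) / ((1 - p_e) * sqrt(n))
    return kappa, se0

# HercepTest vs. PATHWAY, two categories (negative, positive)
table = [[282, 17],
         [17, 134]]
kappa, se0 = cohen_kappa(table)
z = kappa / se0

print(f"kappa = {kappa:.4f}, SE under no agreement = {se0:.4f}, z = {z:.1f}")
```

The large z statistic is consistent with the reported p-value below 0.0001 for testing no agreement.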

(49)

Methods When Clinical Truth Is Absent

                              Diagnosis Made from Another Device
Diagnosis Made from
New Marker Test               Positive (+)    Negative (-)    Total
Positive (+)                      40               5            45
Negative (-)                       4             171           175
Total                             44             176           220

(50)

Methods When Clinical Truth Is Absent

Table A

New        Another     Total         True Diagnosis
Marker     Device      Patients        +        -
+          +              40           39        1
+          -               5            5        0
-          +               4            1        3
-          -             171            6      165
Total                    220           51      169

The new marker and the other device agree for 211 patients.

(51)

Methods When Clinical Truth Is Absent

                              Diagnosis Made from Another Device
Diagnosis Made from
New Marker Test               Positive (+)      Negative (-)
Positive (+)                      40            5 (retest)
Negative (-)                   4 (retest)         171

(52)

Methods When Clinical Truth Is Absent

Table C

New        Another     Total         True Diagnosis
Marker     Device      Patients        +        -
+          +              40          N/A      N/A
+          -               5            5        0
-          +               4            1        3
-          -             171          N/A      N/A

(53)

Methods When Clinical Truth Is Absent

• The new marker agrees with the resolver for 8 specimens
• The other device agrees with the resolver for 1 specimen
• Impossible to estimate the relative magnitude of this difference unless we know the true state for all specimens (Table A) or the

(54)

Methods When Clinical Truth Is Absent

Common Mistakes:

• When the original results agree, assuming that both are correct and making no change to the table
• When the original results disagree and the other device disagrees with the resolver, changing the result of the other device to the resolver result

(55)

Methods When Clinical Truth Is Absent

Table D

New        Another     Total         True Diagnosis      Revised
Marker     Device      Patients        +        -        Total
+          +              40          40*                  45
+          -               5            5        0          0
-          +               4            1        3          1
-          -             171                   171*       174
Total                    220                              220

(56)

Methods When Clinical Truth Is Absent

Revised Results (Incorrect)

                              Diagnosis Made from Another Device
Diagnosis Made from
New Marker Test               Positive (+)    Negative (-)    Total
Positive (+)                      45               0            45
Negative (-)                       1             174           175
Total                             46             174           220

Percent Agreement = 99.5% (219/220)
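The inflation caused by this revision can be traced step by step. The sketch below encodes Table A as counts by (new marker, other device, true diagnosis) and applies the incorrect rule described under "Common Mistakes": when the two results disagree and the other device disagrees with the resolver, the other device's result is overwritten. The data layout and function names are illustrative, not part of the talk.

```python
# Each row: (new marker result, other device result, true diagnosis, count), from Table A.
table_a = [
    ("+", "+", "+", 39), ("+", "+", "-", 1),
    ("+", "-", "+", 5),  ("+", "-", "-", 0),
    ("-", "+", "+", 1),  ("-", "+", "-", 3),
    ("-", "-", "+", 6),  ("-", "-", "-", 165),
]

def percent_agreement(rows):
    """Proportion of specimens on which the two tests give the same result."""
    n = sum(count for *_, count in rows)
    agree = sum(count for new, other, _, count in rows if new == other)
    return agree / n

def discrepant_resolution(rows):
    """Apply the (incorrect) revision: overwrite the other device's result with
    the resolver result, but only for discordant specimens where it was wrong."""
    revised = []
    for new, other, truth, count in rows:
        if new != other and other != truth:
            other = truth
        revised.append((new, other, truth, count))
    return revised

original = percent_agreement(table_a)
revised = percent_agreement(discrepant_resolution(table_a))

print(f"original agreement = {original:.1%}")
print(f"agreement after discrepant resolution = {revised:.1%}")
```

Only discordant specimens can be changed and every change creates an agreement, so the revised figure can never fall below the original one. The concordant specimens stay counted as agreements even though Table A shows that 7 of them (1 + 6) were wrong on both tests.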

(57)

Issues on Experimental Design

• Objective: to achieve the maximal accuracy with the best precision at the minimal cost
• Use "least burdensome approaches"
• Collection of specimens (samples)
• Assays of specimens (samples)

(58)

Issues on Experimental Design

Collection of specimens (samples)

Characteristics of diagnostic trials:

• A number of diagnostic tests can be applied simultaneously to the same person without the need for washout periods
• Two diagnostic tests are usually

(59)

Issues on Experimental Design

Characteristics of diagnostic trials:

• Subjects serve as their own controls
• Between-subject variation is greater than within-subject variation
• A paired design increases power and efficiency
• Sometimes the order of diagnostic tests can be randomly assigned

(60)

Issues on Experimental Design

Assays of Specimens:

• Blinding is of the utmost importance
• Fully blinded evaluation: blinded to patient status, clinical information, and results of other tests or diagnosis by the gold standard, etc.
• Separate and unpaired evaluation: the order of assays should be randomly arranged

(61)

Discussion and Summary

• Indexes of diagnostic accuracy
• Presence vs. absence of clinical truth
• Quantitative method comparison in the absence of a true standard
  - Systematic error (y = x): Deming regression
  - Random error: Bland-Altman Difference

(62)
(63)
(64)

Discussion and Summary

• Determination of cutoffs (thresholds)
• Evaluation of classification error rate: leave-one-out method, bootstrap method
• Feature selection of differentially expressed genes should be repeated for each leave-one-out training set to obtain a correct error rate (Simon, et al., 2003)
• Sensitivity, specificity, positive and negative predictive value, percent agreement, and ROC curve

(65)

Discussion and Summary

Four factors in QA/QC of diagnostic tests based on microarray or other pharmacogenomic technology

• Technical: manufacturing of microarrays, sample collection, RNA extraction, cDNA/cRNA synthesis, labeling with fluorescent dye, and hybridization
• Instrumental: image acquisition and quantification
• Computational: data preprocessing, normalization, and analysis
• Interpretative: biological reasoning

(66)

Guidance for Registration Review of Multi-Target Array Platform Genetic Diagnostic Reagents (多標的陣列平台基因診斷試劑 – 查驗登記審查指引) (March 2005) Department of Health (衛生署), Taiwan.

Drug-Diagnostic Co-Development Concept Paper, draft preliminary concept paper (April 2005) The US FDA.

Draft Guidance on Multiplex Tests for Heritable DNA Markers, Mutations, and Expression Pattern (Feb. 2003) The US FDA.

Statistical Guidance on Reporting Results from Studies Evaluating Diagnostic Tests (March 2003) The US FDA.

Campbell, G. (2004) Some statistical and regulatory issues in the evaluation of genetic and genomic tests, Journal of Biopharmaceutical Statistics 14:539-552.

Shi, L., et al. (2004) QA/QC: challenges and pitfalls facing the microarray community and regulatory agencies, Expert Rev. Mol. Diagn. 4(6):761-777.

(67)

Yerushalmy, J (1947) Statistical problems in assessing methods on medical diagnosis, with special reference to X-ray technique, Pub. Health Research 62:1432-1449.

Feinstein, AR(1977) Clinical Biostatistics, Mosby, St Louis.

Feinstein, AR (2002) Principles of Medical Statistics, Chapman and Hall/CRC, Boca Raton, FL.

Fleiss, JL (1981) Statistical Methods for Rates and Proportions, Wiley, New York.

Armitage, P and Berry, G (1987) Statistical Methods in Medical Research, Blackwell, Oxford.

Zhou, XH, Obuchowski, NA, McClish, DK (2002) Statistical Methods in Diagnostic Medicine, Wiley, New York.

Pepe, MS (2003) The Statistical Evaluation of Medical Tests for Classification and Prediction, Oxford University Press, New York.

(68)

Swets, JA(1979) ROC analysis applied to the evaluation of medical imaging techniques, Investigative Radiology, 14:109-121.

Hanley, JA and McNeil, BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve, Diagnostic Radiology, 142:29-36.

Hanley, JA and McNeil, BJ (1982) A method of comparing the areas under receiver operating characteristic curves derived from the same cases, Radiology, 148:839-843.

Begg, C.B. (1987) Bias in the assessment of diagnostic test, Statistics in Medicine 6: 411-423

Hujoel, PP, Moulton, LH, and Loesche(1990) Estimation of sensitivity and specificity of site-specific diagnostic tests, Journal of Periodontal Research, 25:193-196.

Smith PJ, and Hadgu, A(1992) Sensitivity and specificity for correlated observations, Statistics in Medicine, 11:1503-1509.

(69)

Hui, SL, and Walter, SD(1980) Estimating the error rates of diagnostic tests, Biometrics, 36:167-171.

Thibodeau, LA (1981) Evaluating diagnostic tests, Biometrics, 37:801-804.

Lachenbruch, PA (1988) Multiple reading procedures: the performance of diagnostic tests, Statistics in Medicine, 7:549-557.

Lachenbruch PA(1992) On the sample size for studies based upon McNemar’s Test, Statistics in medicine,11:1521-1523

Connor, RJ(1978) Sample size for testing differences in proportions for the paired-sample design, Biometrics, 43:629-638.

Feuer, EJ, and Kessler, LG(1989) Test statistics and sample size for a two-sample McNemar test, Biometrics, 45:629-638.

Metz CE(1979) Basic principles of ROC analysis, Seminar Nuclear medicine, 8:283-298.

(70)

Metz, CE, and Kronman, HB(1980) Statistical significance tests for binomial ROC curve, Journal of Mathematical psychology, 22:234-245.

Metz, CE(1989) Some practical issues of experimental design and data analysis in radiological ROC studies, Investigative Radiology; 24:234-245.

Begg, CB, and McNeil, BJ(1988) Assessment of radiological tests: control of bias and other design considerations; Radiology:167:565-569.

Hsueh, H.M., Liu, J.P., Chen, J.J. (2001) Unconditional exact tests for equivalence or non-inferiority for paired binary data, Biometrics, 57(2):478-483.

Liu, J.P., Hsueh, H.M., Hsieh, E., Chen, J.J. (2002) Tests for equivalence or non-inferiority for paired binary data, Statistics in Medicine, 21:231-245.

Liu, J.P., Ma, M.C., Wu, C.Y., Tai, J.Y. (2005) Tests of equivalence or non-inferiority for diagnostic accuracy based on the paired areas under ROC curves, Statistics in Medicine, in press.
