問卷中單複選題的選項排序方法的探討

(1)

國立交通大學

統計學研究所

碩士論文

問卷中單複選題的選項排序方法的探討

Ranking Responses of a Single Response Question or a

Multiple Response Question

研究生：林育駿

指導教授：王秀瑛教授

(2)

問卷中單複選題的選項排序方法的探討

Ranking Responses of a Single Response Question or a

Multiple Response Question

研究生：林育駿 Student: Yu-Chun Lin

指導教授：王秀瑛教授 Advisor: Dr. Hsiuying Wang

國立交通大學

統計學研究所

碩士論文

A Thesis

Submitted to Institute of Statistics College of Science National Chiao Tung University

in partial Fulfillment of the Requirements for the Degree of

Master in Statistics

June 2014

Hsinchu, Taiwan, Republic of China

(3)

問卷中單複選題中的選項排序方法的探討

學生: 林育駿指導教授: 王秀瑛教授

國立交通大學統計學研究所

摘

要

問卷調查在許多研究中是一個最常用的工具，而複選題是一個最常被設計在問卷中的題型。近年來，許多研究提出了一些方法對於複選題的資料作分析，其中，複選_{題的選項排序的問題是一個重要且我們感興趣的議題。在本篇文章中，我們應用} 了Wang (2008)，Wang和Huang (2014) 所提出的檢定方法，以及Hunter DR (2004) 中討論的Bradley-Terry模型並用Minorization–Maximization演算法來對單選題或複選題的選項做排序。我們將這些方法寫成一個R package，而這個package已經被收錄在R 軟體的程式套件，讓使用者們能更方便的用這些方法來對單選題或複選題的選項做排序。

(4)

Ranking Responses of a Single Response Question or a Multiple

Response Question

Student: Yu Chun Lin Advisor: Hsiuying Wang

Institute of Statistics

National Chiao Tung University

Abstract

In many studies, the questionnaire is a common tool for surveying. A multiple response question is a commonly-used question designed in a questionnaire. Recently, many methods had been proposed to analyze data of a multiple responses question in the literature. And ranking responses is one of important issues in analyzing data of a multiple responses question. In this thesis, we use the methods in Wang (2008), Wang and Huang (2014) and Bradley-Terry model with MM (Minorization–Maximization) algorithm in Hunter DR (2004) to develop a R package. It can provide a useful and convenient tool to rank responses in a signal response or a multiple response question.

Keywords : Questionnaire, Single response question, Multiple response question, Rank-ing

(5)

iii

誌謝

很高興當初考研究所的時候可以錄取交大統研所，是我理想的學校；在這邊兩年的時間學到很多我喜歡的事物，不僅僅是在課業上，生活上也跟同學們過得很快樂。首先，要先感謝我的指導教授─王秀瑛老師，老師在我作研究、學習的路上給予我很多的幫助，我不清楚、不知道的概念請教老師，老師都會很仔細的講解讓我真正的了解問題所在並找出解決方法，也很謝謝老師讓我早早就口試，可以為下一個人生階段做更多的準備；也要謝謝我的口試委員: 蔡明田教授、蕭金福教授以及陳鄰安教授，在口試的時候提多許多想法及建議，讓我的論文更加的完整。碩士兩年的生活過得很多采多姿，跟同學們常常出遊、吃大餐，在研究室也時常有歡笑聲。還記得碩一的時候為了數理統計的期中考，幾乎有一半的人在研究室讀通霄，當時很享受大家這樣的凝聚力，讓我覺得有這群同學真的不是一般的福氣。不只在課業上大家有這樣的向心力，在享樂方面大家也都很積極的參與，雖然說沒有人討厭玩樂，但是大家也都會把事情排開就為了要跟班上的人一起出遊，這部分就要很感謝姿琪，她根本就是我們班的玩樂王，常常揪大家一起出去玩，安排行程、規劃路線全都是靠她，我能在交大過的這麼開心有一部分是她的功勞。另外還要特別感謝驛為和民翰，分別是我們碩一碩二時的班代，為我們處理許多事，讓我們只要做簡單的回報就可以把事情完成。班上的其他同學明燕、昱均、兆慶、翔之、奇煒、青樺以及其他人，也都讓我的碩士生活有更多色彩。兩年的時間過得真的很快，也很捨不得跟這群同學分開，但我相信在以後我們依然能像現在一樣，友情永遠不滅。林育駿謹誌于國立交通學統計學研究所中華民國一零三年六月

(6)

List of Tables

Table 1 The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.33, k =5 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test: . . . 14

Table 2 The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, k =6 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test: . . . 15

Table 3 The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, k =7 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test: . . . 15

Table 4 The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, π8 = 0.5, k =8 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test: 16

Table 5 The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, π8 = 0.5, π9 = 0.9, π10 = 0.62, k =10 and =0.05 for all methods and α=0.05 for the Wald test and the

Generalized Score test: . . . 16

Table 6 The consistent rates of the Wald test and the Generalized Score test when

π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, k =5 and =0.05: . . . 17 Table 7 The consistent rates of the Wald test and the Generalized Score test when

π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 =

(8)

Table 8 The consistent rates of the Wald test and the Generalized Score test when

π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, π8 = 0.5, π9 = 0.9, π10= 0.62, k =10 and =0.05: . . . 18

(9)

1 Introduction

Questionnaires are a commonly used tool for surveying in many fields. They are

es-pecially important in marketing or management studies. There are usually two kinds of

questions : response questions and multiple-response questions. The analysis of

single-response questions have been investigated in literature. Approaches of analyzing

multiple-response questions have been lacking until recently. Several researchers propose some

meth-ods about analyzing the dependence between a single-response question and a

multiple-response question (Umesh 1995,Loughin and Scherer 1998, Decady and Thomas 1999, Bilder,

Loughin and Nettleton 2000). However, most researchers are also interested in ranking

re-sponses in a question. Wang (2008) and Wang and Huang (2014) had proposed methods for

ranking responses in a multiple-response question under the frequentist or Bayesian setup.

Thus, in this thesis, we provide some methods to discuss the problem. They include Wald

test and Generalized score test in Wang (2008), Bayesian ranking response method in Wang

and Huang (2014) and Bradley-Terry model with MM algorithm in Hunter DR (2004).

For example, a company is designing a marketing survey to help develop a drink product.

The researchers will design a multiple response question and list several factors, including

price, packaging, capacity, taste that could attract consumers to buy this product. Suppose

that a group of individuals are surveyed on purchasing a drink product. They are asked to

write questionnaires which list all the questions. The following is a multiple-response

ques-tion in the quesques-tionnaire:

Question 1 : Which factor are important to you when considering the purchase of a

(10)

According to the number of each response which is chosen, many respondents more care

about price and taste than other factors. Then we can easily rank the response ”price”

first and ”taste” second according to the number of responses which are selected. But

it is based on the response selected numbers to rank, is not statistically significant and

we cannot clearly distinguish ”price” is more important than ”taste”. In this thesis, we

base on several methods, Wald test, Generalized Score test, Bayesian ranking response

method and Bradley-Terry model with MM algorithm in the literature to develop

rank-ing procedures. These rankrank-ing procedures has been written as a package RankResponse

for R. RankResponse is available from the Comprehensive R Archive Network at http:

//CRAN.R-project.org/package=RankResponse, which include code function rank.wald,

rank.gs, LR, LN, L2R, btmm, btqn and btnr. We review these methods in the

Prelimi-nary section. In Section 3, we propose rules to rank responses. In Section 4, we compare

these methods by a simulation study. In Section 5, we introduce details of these R codes.

2 Preliminary

In this section, we introduce some methods in the literature to rank responses.

2.1 The Wald Test and the Generalized Score Test

First, we consider ranking two specific responses. For the general case, assume that

a multiple-response question has k responses, v1, · · · , vk, and we interview n respondents. Each respondent is asked to choose at least one and at most s answers for this question,

(11)

where 0 < s 6 k. If s=1, it is a single-response question. There are total of c = C1k+ · · · + Csk possible kinds of answers that respondents will choose. Let ni1···ik denote the number of

respondents selecting the responses vh and not selecting vh0 if i_h=1 and i_h0=0, and p_i 1···ik

denotes the corresponding probability. For example, when k =5, n10100 denotes the number of respondents selecting the first and the third responses and not selecting the other responses.

Thus, the p.m.f function of ni1···ik is

fs(ni1···ik) = I(0 < k X j=1 ij ≤ s) n! Q ij=0or1 ni1···ik! Y ij=0or1 pn_i₁i1···ik_···i k , ni1···ik ≥ 0, 0 ≤ pi1···ik ≤ 1, (1)

where I(·) denotes the indicator function. Let mj denote the sum of the number ni1···ik

such that the j th response is selected, and πj denote the corresponding probability, that is mj = P

ij=1

ni1···ik and πj =

P ij=1

pi1···ik. Note πj is called a marginal probability of response j.

Also let mjldenote the sum of number ni1···ik such that the jth and lth responses are selected,

and πjldenote the corresponding probability. Then mjl= P ij=il=1 ni1···ik and πjl= P ij=il=1 pi1···ik.

For ranking the important of two specified responses, say response 1 and response i in

Ques-tion j from the survey data, we will consider the two sided test:

H0 : πi = πj vs H1 : πi 6= πj (2)

which is equivalent to

H₀∗ : πi− πij = πj − πij vs H1∗ : πi− πij 6= πj− πij (3)

If (2) is rejected, then we ca rank the response with larger mj first. The methods for testing (2) are given in Wang(2008).

(12)

2.1.1 Wald Test

A Wald test is a test based on a statistic of the form

Zn =

Wn− (πi− πj) Sn

where Wn is an estimator of πi − πj, and Sn is a standard error for Wn. An unbiased

estimator of pi1···ik is ni1···ik/n, which is also a maximum likelihood estimator(MLE). Let

ˆ

πi = mi/n,ˆπj = mj/n and ˆπij = mij/n. We can use ˆπi and ˆπj as estimators of πi and πj respectively, and we have

V ar(ˆπi− ˆπj) =                πi(1 − πi)/n + πj(1 − πj+ 2πi/n) if s = 1, (πi− πij)(1 − πi+ 2πj− πij)/n+ (πj − πij)(1 − πj+ πij)/n otherwise. (4)

Under the null hypothesis H0 in (2) and based on central limit theorem, the statistics

ˆ πi− ˆπj pV ar(ˆπi− ˆπj)

(5)

converges in distribution to standard normal random variable when n large. Since πi,πj and πij are unknown,we can use ˆπi, ˆπj and ˆπij to substitute πi, πj and πij in (4). Thus, for testing (2), H0 is rejected if absolute value of (5) is greater than zα/2, where zα/2 is upper α/2 cutoff

point of the standard normal distribution.

2.1.2 Generalized Score Test

In section 2.1.1, πi, πj and πij in V ar(ˆπi− ˆπj) are replaced by ˆπi, ˆπj and ˆπij in the test statistic. In this section, we consider the variance under the null hypothesis in (2), that is,

(13)

πi = πj. Thus we have V arπi=πj(ˆπi− ˆπj) =        2πi/n if s = 1 2(πi − πij)/n otherwise. (6)

By the central limit theorem, under H0, the statistic

ˆ πi− ˆπj

pV arπi=πj(ˆπi− ˆπj)

converges to a standard normal distribution when n is large. We can use (ˆπi + ˆπj)/2 and

ˆ

πij as substitutes for πi and πij in the variance. Hence for testing (2), the null hypothesis is rejected if        √ n|ˆπi−ˆπj| √ ˆ πi+ˆπj > zα/2 if s = 1, √ n|ˆπi−ˆπj| √ ˆ πi+ˆπj−2ˆπij > zα/2 otherwise

This approach is similar to the score test of testing a marginal probability equal to a specified

value. Hence we call this approach a generalized score test.

2.2 Bayesian Ranking Responses

In this section we review the Bayesian Ranking methods in Wang and Huang (2014). We

assume parameters pi1···ik have a prior distribution. Thus, we consider the conjugate prior

η(p) = I(0 < k X j=1 ij ≤ s) Γ( P ij=0 or 1 αi1···ik) Q ij=0 or 1 Γ(αi1···ik) Y ij=0 or 1 pα_i₁i1···ik_···i k , αi1···ik > 0, 0 ≤ pi1···ik ≤ 1, (7)

which is a Dirichlet distribution. The prior information is assumed to be known here.

Under this assumption, we have the posterior distribution

η(p|n) = fs(n|p)η(p) = I(0 < k P j=1 ij ≤ s) Γ    P ij =0 or 1 α_i1···ik+n_i1···ik    Q ij =0 or 1 Γ(α_i1···ik+n_i1···ik) Q ij=0 or 1 pα_ii1···ik+ni1···ik 1···ik (8)

(14)

Through the form of the posterior distribution, we can derive the Bayes estimator for

each pi1···ik under the squared error loss function. The Bayes estimator ˆπj of πj is equal to the

summation of the Bayes estimator of pi1···ik, where ij = 1. We can use the Bayes estimator

of πj to rank the significant of πj. Moreover, if we can associate a testing approach with the Bayes estimator to rank πj, this method can lead to a more accurate result. Therefore, we propose several multiple-testing methods for testing the relationship of πj.

Assume we have k responses and we are interested in testing

H01: π(2) ≤ π(1) versus H01: π(2) > π(1), H02: π(3) ≤ π(2) versus H01: π(3) > π(2), .. . H0k−1: π(k)≤ π(k−1) versus H0k−1: π(k)> π(k−1)                        (9)

For testing expressions (9), the decision rules that are considered here are to control

the posterior false discovery rate. The concept of the false discovery rate was proposed

by Benjamini and Hochberg (1995) to determine optimal thresholds for a multiple-testing

setting.

Then we define the false discovery rate, posterior false discovery rate, false negative

rate and posterior false negative rate for the frequentist and Bayesian setting based on the

literature as follows.

First, some notation and definitions are given. Let zi denote an indicator that the ith

hypothesis H0i is false and let ui = P (zi = 1|n) denote the marginal posterior probability of πi+1 > πi. The rejection of H0i is denoted by di=1; otherwise di=0. Let the false discovery

(15)

rate and false negative rate are denoted by F DR(d, z) and F N R(d, z) respectively,where F DR(d, z) = k−1 P i=1 di(1 − zi) D + ε and F N R(d, z) = k−1 P i=1 (1 − di)zi n − D + ε , D = k−1 P i=1

di and ε is a small constant to avoid a zero denominator, so we choose ε=0.00001. Let the posterior expected false discovery rate denoted by F DR(d, n) and the posterior

expected false negative rate F N R(d, n), where

F DR(d, n) = k−1 P i=1 di(1 − ui) D + ε and F N R(d, n) = k−1 P i=1 (1 − di)ui n − D + ε

Let posterior expected false discovery count F D(d, n) and the posterior expected false

negative count F N (d, n) are defined as

F D(d, n) = k−1 X i=1 di(1 − ui) and F N (d, n) = k−1 X i=1 (1 − di)ui

Then we propose three multiple-testing procedures from Berger(1985) and Muller et

(16)

2.2.1 Method 1

The decision to accept or reject the null hypothesis according to a loss function that was

proposed by Berger (1985), which is defined as

0 if the decision taken is right,

c if we reject H0i when it is true,

1 if we accept H0i when it is false,               

where c(≥0) and 1 represent the losses for making a wrong decision because of a false positive

and a false negative error respectively. In this criterion, the loss function can be written as

LN(d, n) = cF D + F N (10)

2.2.2 Method 2

The second method is consider the loss function

LR(d, n) = cF DR + F N R (11)

2.2.3 Method 3

We consider the third loss function is a bivariate function

L2R(d, n) = (F DR, F N R) (12)

We can define the optimal decisions under L2R as the minimization of F N R subject to F DR ≤ e2R.

From Muller et al.(2004), under the three loss functions, the optimal decision that

min-imizes the loss function takes the form

(17)

where t are tN = c/(c + 1), tR(n) = u(n−D∗₎ and t_2R(n)=min{s : F DR(s, n) ≤ e_2R} under the

loss function LN,LR and L2R respectively. To obtain the value of µi, the code is available in

http://www.stat.nctu.edu.tw/hwang/ranking.htm. In the expressions for tR and t2R, ui is the i th order statistic of {u1, . . . , un}, and D∗ is the optimal number of discoveries found by the function in Muller et al.(2004). Then we according to these di to decide expressions (9) whether reject or not.

2.3 Ranking Responses with Bradley-Terry Models

In this section, we introduce Bradley-Terry models with MM method (Hunter DR 2004).

Then according to result of the method, we can rank responses sequentially. We review this

method in Section 2.3.1 and 2.3.2.

2.3.1 Bradley-Terry Models

In a situation in which the individuals in a group are repeatedly compared with one

another in pairs, Bradley-Terry(1952) suggested the model

P (individual i beats individual j) = γi γi+ γj

, (13)

where γiis positive-values parameter associated with individual i, for each of the comparisons

pitting individual i against individual j. As a concrete example, consider the individuals to

be sports teams, where γi represent the overall skill of team i.

Suppose we observe a number of pairing s among m individuals or teams and we wish

to estimate the parameters γ1, . . . , γm using maximum likelihood estimation. If outcomes of different pairings are assumed to be independent, the log-likelihood based on the

(18)

Bradley-Terry model (13) is `(γ) = m X i=1 m X j=1 [wijln γi− wijln(γi+ γj)], (14)

where wij denotes the number of times individual i gas beaten individual j and we assume

wii=0 by convention. Since `(γ)=`(aγ) for a>0, the parameter space should be regarded as the set of equivalence classes Rm+, where two vectors are equivalent if one is a scalar multiple of the other. This is most easily accomplished by putting a constraint on the parameter

space; to this end, we assume that P iγi=1.

Now we describe an iterative algorithm to maximize `(γ). Start with an initial parameter

vector γ(1)_{. There are many ways to select starting points,in this paper, we assume that γ}(1)

is chosen arbitrarily. For k=1,2,. . .,let

γ_i(k+1) = Wi " X j6=i Nij γ_i(k)+ γ_j(k) #−1 , (15)

where Wi denotes the number od wins by individual i and Nij = wij + wji is the number of pairing between i and j. If the resulting γ(k+1) _{vector does not satisfy the constraint} P

iγ (k+1)

i = 1, it should simply be renormalized.

2.3.2 Minorizing functions and MM algorithm

The strict concavity of the logarithm function implies for positive x and y that

− ln x ≥ 1 − ln y − (x/y) with equality if and only if x = y (16)

Therefore, as shown in Lange, Hunter and Yang (2000), if we fix γ(k)and define the function

Qk(γ) = m X i=1 m X j=1 wij " ln γi− γi+ γj γ_i(k)+ γ_j(k) − ln(γ (k) i + γ (k) j ) + 1 # , (17)

(19)

we may conclude that

Qk(γ) ≤ `(γ) with equality if γ = γ(k) (18)

where `(γ) is the log-likehood of (14). A function Qk(γ) satisfying conditions (18) is said minorize `(γ) at the point γ(k)_. _{It is easy to verify that, for any Q}

k(γ) satisfying the minorizing conditions (18),

Qk(γ) ≥ Qk(γ(k)) implies `(γ) ≥ `(γ(k)) (19)

Property (19) suggests an iterative algorithm in which we let γ(k) _{denote the value of the} parameter vector before the k th iteration and define γ(k+1) _{to be the maximizer of Q}

k(γ); thus γ(k+1) of (15) maximizes Qk(γ). Since this algorithm consists of alternately creating a minorizing function Qk(γ) and then maximizing it, Hunter and Lange (2000) call it an MM algorithm.

3 Ranking Rule

In this section we introduce a criterion to rank the responses using the methods

intro-duced in Section 2. The ranking rule for the Wald test, the Generalized Score test and the

Bayesian ranking methods are similar, but they are different from the rank method with

Bradley-Terry model.

Now we first illustrate the ranking rule of the Wald test, the Generalized Score test

(20)

value, j = 1, . . . , k. Let m(j) be the order statistics, that is, m(1) ≤ m(2) ≤, . . . , m(k). Let v(j) be the response corresponding to m(j). It is natural to rank the importance of responses in order of m(j). That is, the most important response is v(k), and the second important

response is v(k−1). The proposed testing methods in Section 2.1 and Section 2.2 can be used to rank the responses. If the hypothesis π(k) = π(k−1) is rejected, we may claim that v(k) is the most important response. If it is not rejected, the two responses have same rank, that

is, the response v(k) is as important as the response v(k−1), and next to test the hypothesis π(k−1) = π(k−2). Similarly, if it is rejected, we rank response v(k−1) first and response v(k−2) second. If it is not rejected, response v(k−1) and v(k−2) have same rank.

We use Question 1 as an example to illustrate the rule. For example, let m1=54, m2=49, m3=28, m4=71, m5=23, and then we have m(1)=23, m(2)=28, m(3)=49, m(4)=54, m(5)=71. It is natural to rank the importance of responses in order of m(j). We may claim that price

is more important than taste for consumers to purchase. However, this rank method based

on the order of m(j) is not statistically significant. Hence, we follow the proposed rule and

use one of these methods to rank all response. First, we rank response v(5) and response v(4), i.e. rank the response ”taste” and response ”price”. If H01 : π(5) = π(4) is not reject, it means that the response ”taste” and the response ”price” are equally important. Then we

test H02: π(4) = π(3) versus H12: π(4) 6= π(3). If H02 is rejected ,we rank responses v(5)andv(4) first and response v(3) third. And when hypothesis H03 : π(3) = π(2) and H04 : π(2) = π(1), we reject H03 and do not reject H04. Then we denote the ranking notations for the above

result as ”taste” 1, ”capacity” 3, ”packaging” 4, ”price” 1, ”other” 4. Hence, we know that

the response ”taste” and the response ”price” are top priorities when consumers purchase a

(21)

the response ”other” are relatively unimportant for consumers.

Next, we illustrate the ranking rule of the method with Bradley-Terry model. It is easier

than above rule. In this method, we compute all γ values and according to the size of values,

we can obtain an order with descending. Hence, the order is the rank of these responses.

4 Simulation

In this section, a simulation study is conducted to evaluate the performance of these

methods in this section. Because the Bayesian ranking method has prior distribution

as-sumption, it is different from other methods. Hence, we do not discuss Bayesian

rank-ing method in the simulation study. The true rank of these responses is according to

the order of π(j). In this study, we regard two responses have the same rank if |π(i) − π(j)| ≤ where is a constant in a tolerance region. We compare the ranks of the 5

methods in terms of consistent rate, which is defined as the proportion that the rank

of these methods and the true rank are consistent for n respondents in 1000 replicates.

For example, let π10000 = 0.032, π01000 = 0.015, π00100 = 0.087, π00010 = 0.061, π00001 =

0.009, π11000 = 0.008, π10100= 0.0082, π10010= 0.068, π10001= 0.002, π01100= 0.002, π01010= 0.00005, π01001= 0.0005, π00110= 0.0005, π00101 = 0.00006, π00011= 0.00319 ,others equal to 0.044 and = 0.01, resulting π1 = 0.626, π2 = 0.501, π3 = 0.585, π4 = 0.617, π5 = 0.479.

Thus the true is 1 4 3 1 5. We obtain a sample, and use the Wald test to obtain the

(22)

than the true rank, we call this phenomenon is consistent. Here, the result of the

ex-ample is consistent. The codes to run γ in the Bradley-Terry models can be download

in http://sites.stat.psu.edu/~dhunter/code/btmatlab/. We apply these programs to

rank responses in a multiple response question. Since there are three codes for this method in

http://sites.stat.psu.edu/~dhunter/code/btmatlab/. The first code of using

Bradly-Terry with MM method is denoted as btmm in Tables 1-6 and in the R code section. The

second code of using Bradly-Terry with quasi-Newton accelerated MM method is denoted as

btqn in Tables 1-6 and in the R code section. The third code of using Bradly-Terry with a

Newton-Raphson method is denoted as btnr in Tables 1-6 and in the R code section.

Then the following table is the consistent rate for different k and n:

Table 1: The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.33, k =5 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test:

Sample size

Method

Wald test G.S. test btmm btqn btnr

n=100 0.997 0.996 0.682 0.682 0.682 n=200 0.999 0.999 0.814 0.814 0.814 n=300 0.999 0.999 0.879 0.879 0.879 n=500 1 1 0.951 0.951 0.951 n=800 1 1 0.968 0.968 0.968 n=1000 1 1 0.992 0.992 0.992

(23)

Table 2: The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, k =6 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test:

Sample size

Method

n=100 0.998 0.998 0.578 0.578 0.578 n=200 0.999 0.999 0.796 0.796 0.796 n=300 1 1 0.869 0.869 0.869 n=500 1 1 0.95 0.95 0.95 n=800 1 1 0.99 0.99 0.99 n=1000 1 1 0.992 0.992 0.992

Table 3: The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, k =7 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test:

Sample size

Method

n=100 0.999 0.999 0.528 0.528 0.528 n=200 0.999 0.999 0.772 0.772 0.772 n=300 1 1 0.865 0.865 0.865 n=500 1 1 0.946 0.946 0.946 n=800 1 1 0.979 0.979 0.979 n=1000 1 1 0.99 0.99 0.99

(24)

Table 4: The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, π8 = 0.5, k =8 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test:

Sample size

Method

n=100 0.999 0.998 0.355 0.355 0.355 n=200 0.999 0.999 0.626 0.626 0.626 n=300 1 1 0.796 0.796 0.796 n=500 1 1 0.91 0.91 0.91 n=800 1 1 0.972 0.972 0.972 n=1000 1 1 0.985 0.985 0.985

Table 5: The consistent rates of the 5 methods when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, π8 = 0.5, π9 = 0.9, π10 = 0.62, k =10 and =0.05 for all methods and α=0.05 for the Wald test and the Generalized Score test:

Sample size

Method

n=100 0.998 0.998 0.251 0.251 0.251 n=200 0.999 0.999 0.524 0.524 0.524 n=300 1 1 0.684 0.684 0.684 n=500 1 1 0.879 0.879 0.879 n=800 1 1 0.957 0.957 0.957 n=1000 1 1 0.985 0.985 0.985

Next, we compare the Wald test and the Generalized Score test for different α. Then the following tables are the consistent rates of the Wald test and the Generalized Score test for different α:

(25)

Table 6: The consistent rates of the Wald test and the Generalized Score test when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, k =5 and =0.05: Sample size n=100 n=200 Significant level α Method

Wald test G.S. test Wald test G.S. test

α=0.15 0.984 0.984 0.996 0.996 α=0.1 0.99 0.991 0.997 0.997 α=0.05 0.997 0.996 0.999 0.999 α=0.01 0.999 0.999 1 1

Table 7: The consistent rates of the Wald test and the Generalized Score test when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, k =7 and =0.05:

Sample size

n=100 n=200

Significant level α

Method

α=0.15 0.992 0.992 0.997 0.997 α=0.1 0.994 0.994 0.997 0.997 α=0.05 0.999 0.999 0.999 0.999 α=0.01 1 1 1 1

(26)

Table 8: The consistent rates of the Wald test and the Generalized Score test when π1 = 0.77, π2 = 0.28, π3 = 0.56, π4 = 0.21, π5 = 0.34, π6 = 0.43, π7 = 0.12, π8 = 0.5, π9 = 0.9, π10 = 0.62, k =10 and =0.05: Sample size n=100 n=200 Significant level α Method

α=0.15 0.992 0.992 0.998 0.998 α=0.1 0.997 0.998 0.999 0.999 α=0.05 0.998 0.998 0.999 0.999 α=0.01 1 1 1 1

According to above results, we find that consistent rate decreases as the number of

responses increase for the methods with Bradley-Terry model when n is not enough large.

When the sample size is large, the results of these methods are almost consistent. In

com-paring the Wald test and the Generalized Score test for different α, consistent rate increases

when α decreases. Although the consistent rate is not high for small sample size case, it

still has good result for large sample size case. It reveals that these methods are feasible in

ranking responses when the sample size is not small.

5 R code

These ranking procedures has been written as a package RankResponse for R.

(27)

org/package=RankResponse, which include code function rank.wald, rank.gs, rank.LR, rank.LN,

rank.L2R, rank.btmm, rank.btqn and rank.btnr.

rank.wald Rank responses based on the Wald test

Description

Rank responses of a single response question or a multiple response question by the Wald

test procedure.

Usage

rank.wald(data,alpha,type=2)

Argument

data A m × n matrix (dij), where dij = 0 or 1. If the i th respondent selects the j th response, then dij = 1, otherwise dij = 0.

alpha The significance level used in the Wald test.

type type=1 for a single response question;

type=2 for a multiple response question. Value

The rank.wald returns the estimated probabilities of the responses being selected and

the ranks of the responses by the Wald test procedure.

References

Wang, H. (2008). Ranking Responses in Multiple-Choice Questions. Journal of Applied

Statistics, 35, 465-474.

(28)

## This is an example to rank three responses in a multiple response

## question when the number of respondents is 1000 and the

signifi-## cance level is 0.05.In this example,we do not use a real data, but

## generate data in the first three lines.

A <-sample(c(0,1),1000,p=c(0.21,0.79),replace=T)

B <-sample(c(0,1),1000,p=c(0.86,0.14),replace=T)

C <-sample(c(0,1),1000,p=c(0.42,0.58),replace=T)

D <-cbind(A,B,C)

data <-matrix(D,nrow=1000,ncol=3)

# or upload the true data

alpha<-0.05

rank.wald(data,alpha,type=2)

rank.gs Rank responses based on the Generalized score test

Description

Rank responses of a single response question or a multiple response question by the

generalized score test procedure.

Usage

rank.gs(data,alpha,type=2)

(29)

data A m × n matrix (dij), where dij = 0 or 1. If the ith respondent selects the jth response, then dij = 1, otherwise dij = 0.

alpha The significance level used in the Generalized score test.

type type=1 for a single response question ;

type=2 for a multiple response question .

Value

The rank.gs returns the estimated probabilities of the responses being selected and the

ranks of the responses by the Generalized score procedure.

References

Wang, H. (2008). Ranking Responses in Multiple-Choice Questions. Journal of Applied

Statistics, 35, 465-474.

Examples

## question when the number of respondents is 1000 and the

signifi-## cance level is 0.05.In this example,we do not use a real data, but

## generate data in the first three lines.

D <-cbind(A,B,C)

alpha<-0.05

(30)

rank.btmm Rank responses based on the Bradley-Terry model with the

MM method

Description

Adopt the Bradley-Terry model to rank responses in a single response question or in

a multiple response question with the MM method. This method associates each response

with a value γ, and use the γ value to rank responses.

Usage

rank.btmm(data)

Argument

Value

The rank.btmm returns the associated γ values in the first line and the ranks ofthe

responses in the second line.

References

Hunter DR (2004). MM algorithms for generalized Bradley-Terry models. The Annals

of Statistics, 32, 384-406.

Examples

## question when the number of respondents is 1000. In this example,

## we do not use a real data, but generate data in the first three lines.

(31)

D <-cbind(A,B,C)

rank.btmm(data)

rank.btqn Rank responses based on the Bradley-Terry model with the

quasi-Newton accelerated MM method

Description

Adopt the Bradley-Terry model to rank responses in a single response question orin a

multiple response question with quasi-Newton and the MM method. This method associates

each response with a value γ, and use the γ value to rank responses.

Usage

rank.btqn(data)

Argument

Value

The rank.btqn returns the associated γ values in the first line and the ranks of the

re-sponses in the second line.

(32)

Examples

D <-cbind(A,B,C)

rank.btqn(data)

rank.btnr Rank responses based on the Bradley-Terry model with

New-ton Raphson method

Description

Adopt the Bradley-Terry model to rank responses in a single response question or in

a multiple response question with Newton-Raphson method. This method associates each

response with a value γ, and use the γ value to rank responses.

Usage

(33)

Argument

Value

The rank.btnr returns the associated γ values in the first line and the ranks of the

re-sponses in the second line.

References

Examples

D <-cbind(A,B,C)

rank.btnr(data)

rank.LN Rank responses under the Bayesian framework according to

(34)

Description

Rank responses of a single response question or a multiple response question under the

Bayesian framework according to the loss function LN(d, n) = cF D + F N . Usage

LN(data,response.number,prior.parameter,c)

Argument

data A m × n matrix (dij), where dij = 0 or 1. If the i th res-pondent selects the j th response, then dij = 1, otherwise

dij = 0.

response.number The number of the responses

prior.parameter The parameter vector of the Dirichlet prior distribution,

where the vector dimension is 2response.number.

c The value of c in the loss function Value

The rank.LN returns the estimated probabilities of the responses being selected in the

first line and the ranks of the responses in the second line.

References

Wang, H. and Huang, W. H. (2014). Bayesian Ranking Responses in Multiple Response

Questions. Journal of the Royal Statistical Society: Series A (Statistics in Society), 177,

191-208.

Examples

(35)

##question when the number of respondents is 1000 and the value c is

##1. In this example, we do not use a real data, but generate data in

##the first three lines.

D <-cbind(A,B,C)

response.number <-3

prior.parameter <- c(5,98,63,7,42,7,7,7)

c <-1

rank.LN(data,response.number,prior.parameter,c)

rank.LR Rank responses under the Bayesian framework according to

the loss LR(d, n) = cF DR + F N R

Description

Bayesian framework according to the loss function LR(d, n) = cF DR + F N R.

Usage

rank.LR(data,response.number,prior.parameter,c)

(36)

data A m × n matrix (dij), where dij = 0 or 1. If the i th res-pondent selects the j th response, then dij = 1, otherwise dij = 0.

response.number The number of the responses

where the vector dimension is 2response.number_.

c The value of c in the loss function Value

The rank.LR returns the estimated probabilities of the responses being selected in the

References

191-208.

Examples

##This is an example to rank three responses in a multiple response

##question when the number of respondents is 1000 and the value c is

##0.33. In this example, we do not use a real data, but generate data

##in the first three lines.

(37)

response.number <-3

c <-0.33

rank.LR(data,response.number,prior.parameter,c)

rank.L2R Rank responses under the Bayesian framework according to

the loss L2R(d, n) = (F DR, F N R)

Description

Bayesian framework according to the loss function L2R(d, n) = (F DR, F N R).

Usage

rank.L2R(data,response.number,prior.parameter,e)

Argument

data A m × n matrix (dij), where dij = 0 or 1. If the i th res-pondent selects the j th response, then dij = 1, otherwise dij = 0.

(38)

where the vector dimension is 2response.number.

e A cut point used in the loss function which depends on

the economic costs. Value

The rank.L2R returns the estimated probabilities of the responses being selected in the

References

191-208.

Examples

##This is an example to rank three responses in a multiple response

##question when the number of respondents is 1000 and the value e is

##0.15. In this example, we do not use a real data, but generate data

##in the first three lines.

D <-cbind(A,B,C)

(39)

e <-0.15

rank.L2R(data,response.number,prior.parameter,e)

6 Conclusion

In this thesis, our goal is ranking responses of a single response question or a multiple

response question. We proposed some methods to solve this problem. For ranking responses,

the simulation results in Section 4 show that the proposed methods have good performance.

Although these methods are not consistent for small sample size, their performances are very

good for large sample size. In real applications, the sample size of responses is usually not

very small, these methods are feasible in apply to real applications. According to the

simu-lation results, the consistent rates of the Wald test and the generalized score test are larger

than the method with Bradley-Terry model. Hence, we conclude that the Wald test and

the generalized score test are more powerful than the method with Bradley-Terry model on

ranking responses. We do not discuss Bayesian ranking responses method in the simulation

study, because the other methods are under frequentist setup, but the Bayesian method is

under the Bayesian framework. The codes of these methods has been written as an R package

(40)

7 References

[1] Agresti, A. and Liu, I.-M. (1999) Modeling a categorical variable allowing arbitrarily

many category choices. Biometrics 55, 936-943.

[2] Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rates: a practical

and powerful approach to multiple testing. J. R. Statist. Soc. B, 57, 289–300.

[3] Berger, J. O. (1985) Statistical Decision Theory and Bayesian Analysis. 2nd edn. New

York: Springer.

[4] Bilder, C. R., Loughin, T. M.and Nettleton, D. (2000) Multiple marginal independence

testing for pick any/c variables. Communications in Statistics-Simulation and

Compu-tation, 29(4), 1285-1316.

[5] Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I.

Themethod of paired comparisons. Biometrika 39, 324–345.

[6] Decady, Y. J. and Thomas, D. H. (2000). A simple test of association for contingency

tables with multiple column responses. Biometrics 56, 893-896.

[7] Hunter DR (2004), MM algorithms for generalized Bradley-Terry models, Annals of

Statistics , 32, 386-408

[8] J.-L. Ke (2011) Algorithms for Ranking Responses in Multiple Response Questions.

Na-tional Chiao Tung University

[9] Lange, K., Hunter, D.R., Yang I. (2000) Optimization transfer using surrogate objective

(41)

[10] Loughin, T. M. and Scherer, P. N. (1998). Testing for association in contingency tables

with multiple column responses. Biometrics 54, 630-637.

[11] Umesh, U. N. (1995). Predicting nominal variable relationships with multiple responses.

Journal of Forecasting 14, 585-596.

[12] Wang, H. (2008). Ranking responses in multiple-choice questions. Journal of Applied

Statistics, 35, 465-474

[13] Wang, H. and Huang, W. H. (2014). Bayesian Ranking Responses in Multiple Response

Questions. Journal of the Royal Statistical Society, Series A (Statistics in Society),

問卷中單複選題的選項排序方法的探討

國 立 交 通 大 學

統計學研究所

碩士論文

問卷中單複選題的選項排序方法的探討

Ranking Responses of a Single Response Question or a

Multiple Response Question

研 究 生：林育駿

指導教授：王秀瑛 教授

問卷中單複選題的選項排序方法的探討

Ranking Responses of a Single Response Question or a

Multiple Response Question

研 究 生：林育駿 Student: Yu-Chun Lin

指導教授：王秀瑛 教授 Advisor: Dr. Hsiuying Wang

國 立 交 通 大 學

統計學研究所

碩 士 論 文

問卷中單複選題中的選項排序方法的探討

國立交通大學統計學研究所

摘

要

Ranking Responses of a Single Response Question or a Multiple

Response Question

誌謝

Contents

List of Tables

1

Introduction

2

Preliminary

2.1

The Wald Test and the Generalized Score Test

2.2

Bayesian Ranking Responses

2.3

Ranking Responses with Bradley-Terry Models

3

Ranking Rule

4

Simulation

5

R code

6

Conclusion

7

References

國立交通大學

研究生：林育駿

指導教授：王秀瑛教授

研究生：林育駿 Student: Yu-Chun Lin

指導教授：王秀瑛教授 Advisor: Dr. Hsiuying Wang

國立交通大學

碩士論文