Quantile–DEA classifiers with interval data

Quanling Wei · Tsung-Sheng Chang · Song Han

Published online: 2 March 2014

© Springer Science+Business Media New York 2014

Abstract This research develops classifiers for dealing with binary classification problems with interval data, a data type that is well recognized, regardless of the field, as difficult to tackle. The proposed classifiers use the ideas and techniques of both quantiles and data envelopment analysis (DEA), and are thus referred to as quantile–DEA classifiers. That is, the classifiers first use the concept of quantiles to generate a desired number of exact-data sets from a training-data set comprising interval data. Then, the classifiers adopt the concept and technique of an intersection-form production possibility set in the DEA framework to construct acceptance domains, each corresponding to an exact-data set and thus to a quantile. Here, an intersection-form acceptance domain is actually represented by a linear inequality system, which enables the quantile–DEA classifiers to efficiently discover the groups to which large volumes of data belong. In addition, the quantile feature enables the proposed classifiers not only to help reveal patterns, but also to tell the user the value or significance of these patterns.

Keywords: Data envelopment analysis · Classifier · Quantile · Production possibility set · Interval data

1 Introduction

Lack of data used to be a major challenge faced by large numbers of researchers and practitioners in a great many domains. Nowadays, however, they quite often face a

Q. Wei S. Han

Institute of Operations Research and Mathematical Economics, Renmin University of China, Beijing, China

T.-S. Chang (corresponding author)

Department of Transportation and Logistics Management, National Chiao Tung University, 1001 University Rd., Hsinchu 30010, Taiwan


new challenge of generating useful information from an explosive growth of data volume, due to the advanced technologies for generating and collecting data. Therefore, it has become necessary to develop new technologies and tools to intelligently and rapidly process massive amounts of data into useful information and knowledge, which has resulted in data mining becoming increasingly important and receiving much attention in many fields. Data mining involves the use of sophisticated data analysis tools such as statistical models, mathematical algorithms, and machine learning methods to search for valuable information in large volumes of data (Seifert 2004). To date, data mining has been used for a variety of purposes in both the private and public sectors, such as association, sequence or path analysis, classification, clustering, and forecasting (see, e.g., Han and Kamber 2007). In addition, as pointed out by Seifert (2004), data mining has not only been used by different industries, e.g., banking, insurance, medicine, and retailing, to reduce costs, enhance research, and increase sales, but has also been used in the public sector to detect fraud and waste, and to measure and improve program performance. However, there are some limitations to the capability of data mining. For instance, data mining helps reveal patterns and relationships, but it does not tell the user the value or significance of these patterns; the user, therefore, still needs specialists to interpret the created output (Seifert

2004). That is, more user-friendly and efficient data mining tools are called for. It is noteworthy that operations research methods have been shown to be promising for improving data mining techniques (Corne et al. 2012).

As indicated above, data mining includes several main functions. This research focuses on the function of classification, which is to judge whether a piece of data belongs to a particular group by evaluating a set of characteristic values. Of particular interest in this study are binary classification problems, which are also referred to as two-group discriminant analysis problems. These problems, in which a piece of data belongs to one of two groups (more specifically, inside or outside an acceptance domain), have been applied to a wide variety of fields such as economics, finance, insurance and risk, for credit scoring, bankruptcy prediction, insurance underwriting, management fraud detection and so on (Sinha and Zhao 2008). Up to now, there have been a few popular algorithms for data classification such as decision tree induction, Bayesian classification, rule-based classification and support vector machines (see, e.g., Han and Kamber 2007). In addition, data envelopment analysis (DEA)-based approaches have been receiving attention in academia very recently; the relevant works, which show that DEA-based methods are quite promising in practice, are reviewed as follows. It is well known that the function of conventional DEA theories, models and methods is to evaluate the relative efficiency among a given number of decision making units (DMUs) with multiple inputs and multiple outputs (Cooper et al. 2006). Nonetheless, Troutt et al. (1996) pioneered the use of DEA models in binary classification (more precisely, by developing an acceptance boundary); they propose a sample-based decision system to decide whether or not to accept a credit risk based on samples predetermined by experts. Seiford and Zhu (1998) extend the work to develop a DEA-type linear programming model to decide whether a new case is acceptable; the model also determines the location of the case relative to the previously classified samples. Pendharkar et al. (2000) apply the method of Troutt et al. to discover breast cancer patterns; their empirical results show that the DEA-based approach statistically outperforms linear discriminant analysis. Instead of determining the acceptability of a new case as in Troutt et al. (1996) and Seiford and Zhu (1998), Pendharkar (2002) deals with an inverse classification problem where the objective is to find out how to change a DMU's inputs so that it can be classified into


the desired group. Pendharkar (2011) combines DEA with a radial basis function network (RBFN) to develop a hybrid RBFN–DEA neural network for a binary classification problem with negative inputs and non-linearly separable classes. Pendharkar (2012) uses DEA for fuzzy classification, where the classification output is a fuzzy membership function. He empirically shows that his DEA-based fuzzy classification system outperforms the adaptive neuro-fuzzy inference system, fuzzy rule-based classification system and logistic regression. However, it is noted that this research deals with exact data instead of fuzzy or interval data. Finally, Yan and Wei (2011) propose a DEA classification machine with exact data that includes an acceptance domain (i.e., a production possibility set under the DEA framework) and a classification function. The acceptance domain is constructed as an explicit system of linear inequalities, which makes the classification process very efficient.

To our knowledge, the DEA-based methods for data classification proposed in the literature assume that data are measured by exact values. However, as pointed out in Cooper et al. (1999), it is quite common in many applications for some data to be known only within specified intervals while other data may be known only in terms of ordinal relations; such a data type is commonly referred to as "imprecise data." So far, there have been a few works that incorporate imprecise data into DEA models in the literature (see, e.g., Cooper et al. 1999; Despotis and Smirlis 2002; Zhu 2003; Kao 2006). It is important to note that the DEA models dealing with imprecise data in the literature are used to evaluate the relative efficiency among DMUs, not to perform data classification. That is, to our knowledge, there is so far no research in the literature on developing DEA-based methods for classifying imprecise data. Therefore, this research proposes DEA classifiers to deal with binary classification problems whose classification data are known to have either exact values or values only within bounded intervals; note that it is well recognized, regardless of the field, that interval data are difficult to tackle. The proposed classifiers use the concepts and techniques of both quantiles and DEA, and are thus referred to as quantile–DEA classifiers. Here we briefly introduce the proposed classifiers, and elaborate on them in the succeeding sections. It is well known that the conventional way to deal with a binary classification problem with exact data is to construct a corresponding acceptance domain such that classification data are either inside (i.e., accepted by) or outside (i.e., rejected by) the acceptance domain. A classifier embedded with a single acceptance domain, while efficient, is usually unable to provide the user with the degree of acceptance or rejection. In addition, due to the inherent complexity of interval data, it seems necessary to construct classifiers that are embedded with multiple acceptance domains for handling binary classification problems with interval data. Therefore, we adopt the idea of quantiles in statistics to tackle the issue of multiple acceptance domains. That is, given n training interval data, we create n exact data by specifying a quantile for each of the original interval data; it follows that, by specifying t quantiles, we can obtain t exact-data sets, each containing n exact data. Then, we construct t acceptance domains, each corresponding to one of the t exact-data sets, by applying the concept and technique of a production possibility set in the DEA framework. Clearly, whether or not a classifier with multiple acceptance domains is efficient and effective largely depends on the construction and presentation of the acceptance domains. In this research, we first use the idea and technique of a conventional sum-form production possibility set to construct the acceptance domains. Then, we adopt the techniques proposed in Wei and Yan (2001) and Yan and Wei (2000) to transform the sum-form acceptance domains into ones with an intersection form, i.e., a linear inequality system. The intersection-form acceptance domain enables the quantile–DEA classifiers to


efficiently discover the groups to which large volumes of data belong. In addition, the feature of quantiles makes the proposed classifiers not only able to reveal patterns, but also able to tell the user the value or significance of these patterns.

The remainder of this paper is organized as follows: Sect. 2 shows how the acceptance domains are constructed; Sect. 3 introduces the quantile–DEA classifier for dealing with exact data; Sect. 4 elaborates the quantile–DEA classifier for tackling interval data; Sect. 5 extends the classifiers proposed in Sects. 3 and 4 to deal with general cases; and, finally, Sect. 6 concludes the paper.

2 Construction of acceptance domains

The technique of acceptance domains lies at the heart of the proposed classifiers. Hence, this section introduces how we construct the acceptance domains by using n training interval data. Assume that each of the training interval data, denoted as interval DMU-$\bar{x}_j$ (j = 1, ..., n), is associated with m inputs. We consider the setting where all training data are known only within specified bounds, with values drawn from uniform distributions. Denote the interval DMUs as

$$\bar{x}_j = (\bar{x}_{1j}, \bar{x}_{2j}, \ldots, \bar{x}_{mj}), \quad j = 1, \ldots, n, \quad \text{where } \bar{x}_{ij} \in [a_{ij}, b_{ij}],\ i = 1, \ldots, m;\ j = 1, \ldots, n.$$

That is,

$$\bar{x}_j = ([a_{1j}, b_{1j}], [a_{2j}, b_{2j}], \ldots, [a_{mj}, b_{mj}]), \quad j = 1, \ldots, n.$$

Define

$$a_j = (a_{1j}, a_{2j}, \ldots, a_{mj})^T, \quad b_j = (b_{1j}, b_{2j}, \ldots, b_{mj})^T, \quad j = 1, \ldots, n.$$

Furthermore, denote the training data set as

$$\bar{T} = \{\bar{x}_j \mid j = 1, \ldots, n\}.$$

Moreover, define

$$x_j^\beta = (x_{1j}^\beta, x_{2j}^\beta, \ldots, x_{mj}^\beta), \quad j = 1, \ldots, n, \quad \text{where } x_{ij}^\beta = a_{ij} + \beta(b_{ij} - a_{ij}) > 0,\ i = 1, \ldots, m;\ j = 1, \ldots, n,$$

and $\beta \in (L, +\infty)$ with

$$L = \max_{1 \le i \le m,\, 1 \le j \le n} \frac{-a_{ij}}{b_{ij} - a_{ij}}.$$

It is noted that, since the values of all training data are uniformly distributed within specified bounds, theoretically $\beta \in [0, 1]$. However, it is quite common in practice that some of the classification data are positioned outside the range formed by the n training interval data. It follows that the acceptance domains formed by the n training interval data under the condition $\beta \in [0, 1]$ cannot classify those classification data. Hence, we need to extend the range of β to construct acceptance domains that can handle classification data with values spreading outside the range formed by the n training interval data. In actual fact, since $x_{ij}^\beta > 0$ for all i, j, we can derive the lower bound of β, i.e., L, which is obviously less than 0; however, there is no systematic way to confine β from above, and thus the upper bound of β is defined as +∞. The purpose and usefulness of extending the range of β will become clearer later.
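To make this construction concrete, the following is a minimal Python sketch (our illustration, not the authors' code; the names `a`, `b` and `quantile_points` are ours) that computes L and an exact-data set $x_j^\beta$ from the interval endpoints, using the data of Example 1 below.

```python
import numpy as np

# Interval training data of Example 1: a[i, j] and b[i, j] are the lower and
# upper endpoints of characteristic i of training DMU j.
a = np.array([[1, 2, 4, 4],
              [4, 2, 1, 4]], dtype=float)
b = np.array([[6, 6, 10, 7],
              [7, 9, 4, 8]], dtype=float)

# Lower bound L = max_{i,j} -a_{ij} / (b_{ij} - a_{ij}); here L = -0.2.
L = np.max(-a / (b - a))

def quantile_points(beta):
    """Exact-data set x_{ij}^beta = a_{ij} + beta * (b_{ij} - a_{ij})."""
    if beta <= L:
        raise ValueError("beta must lie in (L, +inf) so that x^beta > 0")
    return a + beta * (b - a)

print(L)                     # -0.2
print(quantile_points(0.5))  # columns are x_j^{0.5}, e.g. x_1^{0.5} = (3.5, 5.5)
```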

In addition, we make the following assumptions in order to formally define the considered binary classification problem. First, all training data $\bar{x}_j$, j = 1, ..., n in training data set $\bar{T}$ are accepted, though to different degrees. Second, the bigger the values of the data, the higher the probability that the data will be accepted, which is referred to as the conditional monotonicity/non-satiety assumption by Pendharkar and Troutt (2011); we will relax this assumption to deal with more general cases in Sect. 5. Third, $x_{ij}^\beta > 0$, i = 1, ..., m; j = 1, ..., n for practical applications; it follows that β > L. Fourth, if two data are both accepted (rejected), then a data point that can be represented by a convex combination of the two is also accepted (rejected). As a result, let $T_\beta$ represent the acceptance domain constructed by $x_j^\beta = (x_{1j}^\beta, x_{2j}^\beta, \ldots, x_{mj}^\beta)^T$, j = 1, ..., n, given a specified $\beta \in (L, +\infty)$, that satisfies the above assumptions. It is easy to check that $T_\beta$ satisfies the following postulates (note that an acceptance domain is uniquely determined by its system of postulates):

Postulate 1 (Ordinary postulate) The observed $x_j^\beta \in T_\beta$ for all j = 1, ..., n.

Postulate 2 (Convexity postulate) If $x \in T_\beta$ and $\hat{x} \in T_\beta$, then $\lambda x + (1 - \lambda)\hat{x} \in T_\beta$ for $\lambda \in [0, 1]$.

Postulate 3 (Monotonicity postulate) If $x \in T_\beta$ and $\hat{x} \ge x$, then $\hat{x} \in T_\beta$.

Postulate 4 (Minimum extrapolation postulate) $T_\beta$ is the intersection set of all $\tilde{T}$ satisfying Postulates 1–3.

In actual fact, the acceptance domain that satisfies Postulates 1–4 defined above can be represented as follows:

$$T_\beta = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^\beta \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}.$$

Note that acceptance domain $T_\beta$ has the same structure as the production possibility set corresponding to the classical CCR model (Charnes et al. 1978) with reference set $\{(x_j^\beta, 1) \mid j = 1, \ldots, n\}$ in DEA research. Hence, in this study, the boundary of $T_\beta$ is, for convenience, also referred to as the frontier of $T_\beta$.

Definition 1 Let $\beta \in (L, +\infty)$ and
$$T_\beta = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^\beta \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}.$$
$T_\beta$ is referred to as the acceptance domain with β-quantile.

Here, a piece of data on the frontier of $T_\beta$ is regarded as an acceptance if β ≥ 0. The value of β is referred to as the acceptance degree; the larger the value of β, the higher the acceptance degree of the data. On the contrary, a piece of data on the frontier of $T_\beta$ is considered to be a rejection if L < β < 0. The rationale for using β to represent the degree of acceptance is as follows. Recall that $x_{ij}^\beta = a_{ij} + \beta(b_{ij} - a_{ij}) > 0$, i = 1, ..., m; j = 1, ..., n, and that the values within interval $[a_{ij}, b_{ij}]$ are drawn from uniform distributions. Therefore, for any $\beta \in [0, 1]$,
$$\beta = \Pr\{a_{ij} \le \hat{x}_{ij} \le x_{ij}^\beta\} = \frac{x_{ij}^\beta - a_{ij}}{b_{ij} - a_{ij}},$$
which represents the probability that $\hat{x}_{ij}$ falls into the interval $[a_{ij}, x_{ij}^\beta]$. (For instance, if $\bar{x}_{ij} \in [2, 6]$ and β = 0.25, then $x_{ij}^{0.25} = 3$ and $\Pr\{2 \le \hat{x}_{ij} \le 3\} = 0.25$.) It follows that β can naturally be used to represent the degree of acceptance. On the other hand, if $\beta \in (L, 0) \cup (1, +\infty)$, then β is not associated with the above probability property. However, based on the assumption that the bigger the values of the data, the higher the probability that the data are accepted (i.e., the monotonicity postulate), it is appropriate to extend this interpretation from $\beta \in [0, 1]$ to $\beta \in (L, +\infty)$. That is, $\beta \in (0, +\infty)$ and $|\beta|$ with $\beta \in (L, 0)$ can be used to represent the degrees of acceptance and rejection, respectively.

Theorem 1 Let $L < \bar\beta < \hat\beta$, and
$$T_{\bar\beta} = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^{\bar\beta} \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}, \quad T_{\hat\beta} = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^{\hat\beta} \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}.$$
Then,

(i) $T_{\hat\beta} \subset T_{\bar\beta}$;

(ii) there is no intersection between the frontiers of $T_{\hat\beta}$ and $T_{\bar\beta}$.

Proof See Appendices 1 and 4 for the proofs of (i) and (ii), respectively. □

Example 1 Consider a sample training-data set of $\bar{x}_1, \bar{x}_2, \bar{x}_3$ and $\bar{x}_4$ in which m = 2. Their corresponding characteristic values are as follows:
$$\bar{x}_1 = ([1, 6], [4, 7]), \quad \bar{x}_2 = ([2, 6], [2, 9]), \quad \bar{x}_3 = ([4, 10], [1, 4]), \quad \bar{x}_4 = ([4, 7], [4, 8]).$$
It is easy to obtain that L = -0.2, and
$$x_{11}^\beta = 1 + 5\beta, \quad x_{21}^\beta = 4 + 3\beta; \qquad x_{12}^\beta = 2 + 4\beta, \quad x_{22}^\beta = 2 + 7\beta;$$
$$x_{13}^\beta = 4 + 6\beta, \quad x_{23}^\beta = 1 + 3\beta; \qquad x_{14}^\beta = 4 + 3\beta, \quad x_{24}^\beta = 4 + 4\beta.$$
The acceptance domain with β-quantile, $T_\beta$, is given as follows:
$$T_\beta = \left\{ x = (x_1, x_2) \;\middle|\; \left( \sum_{j=1}^4 x_{1j}^\beta \lambda_j,\ \sum_{j=1}^4 x_{2j}^\beta \lambda_j \right) \le (x_1, x_2),\ \sum_{j=1}^4 \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, 4 \right\}.$$


Let the values of β be -0.1, 0, 0.5 and 1; the corresponding $T_{-0.1}$, $T_0$, $T_{0.5}$ and $T_1$ are depicted in Fig. 1, which clearly shows that $T_{-0.1} \supset T_0 \supset T_{0.5} \supset T_1$, and that there is no intersection between any pair of frontiers.
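As a quick check of Definition 1, the sum-form membership question "is $\hat{x} \in T_\beta$?" is itself a small linear feasibility problem. Below is a hedged Python sketch (our illustration; the function name `in_T_beta` is ours) using `scipy.optimize.linprog`, applied to the β = 0 domain of Example 1.

```python
import numpy as np
from scipy.optimize import linprog

def in_T_beta(x, X):
    """Test x in T_beta = {x | sum_j X[:, j] l_j <= x, sum_j l_j >= 1, l >= 0}
    by checking feasibility of the linear system with a zero objective.
    X is the m x n matrix whose columns are the points x_j^beta."""
    m, n = X.shape
    A_ub = np.vstack([X, -np.ones((1, n))])              # X l <= x, -sum(l) <= -1
    b_ub = np.concatenate([np.asarray(x, dtype=float), [-1.0]])
    res = linprog(c=np.zeros(n), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * n, method="highs")
    return res.status == 0                               # status 0: feasible

# beta = 0 for Example 1: the columns are the lower endpoints a_j.
X0 = np.array([[1, 2, 4, 4],
               [4, 2, 1, 4]], dtype=float)
print(in_T_beta([2, 10], X0))  # True:  (2, 10) lies inside T_0
print(in_T_beta([1, 2], X0))   # False: (1, 2) lies outside T_0
```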

3 Quantile–DEA classifier with exact data

The focus of this study is on developing a quantile–DEA classifier for dealing with binary classification problems with interval data. Here, however, we first introduce the classifier for dealing with exact data for the following two reasons. First, we do not rule out the possibility that some classification data may have crisp values. Second, the methods for developing the classifier for handling exact data can be the building blocks for developing the one for tackling interval data. Nonetheless, it is important to note that the training data sets with respect to (wrt) both types of classifiers consist of only interval data.

Denote an exact data $\hat{x}$ as DMU-$\hat{x}$, and consider the following linear program wrt DMU-$\hat{x}$ (i.e., data $\hat{x} \in \hat{T}$, a set of classification data) with a specified $\beta \in (L, +\infty)$, where $\hat{T} \subset \Re_+^m$:

$$(P_\beta) \qquad \hat\theta(\beta) = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n x_j^\beta \lambda_j \le \theta \hat{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$

In actual fact, $\hat\theta(\beta)$, the optimal objective function value of problem $(P_\beta)$ given a specified value of $\beta \in (L, +\infty)$, is a function of β.

Fig. 1 $T_\beta$ wrt different values of β

Here, the function is directly denoted as $\hat\theta(\beta)$ and is referred to as the quantile function of DMU-$\hat{x}$. It is not difficult to show that $\hat\theta(\beta) > 0$, and that it is possible that $\hat\theta(\beta) > 1$ (see Lemma 2 and its proof in Appendix 2). Figure 2 demonstrates an example in which $0 < \hat\theta(0) < 1$ and $\hat\theta(1) > 1$. It is noted that, in Fig. 2, the acceptance domain between $T_0$ and $T_1$ (including the boundaries corresponding to $T_0$ and $T_1$) can be represented by the set $\{x \mid x \in T_0,\ x \notin \text{Int}\, T_1\}$. The following theorem gives the properties of $\hat\theta(\beta)$:

Theorem 2 Let $\hat{x} \in \hat{T} \cap \text{Int}\{x \mid \sum_{j=1}^n x_j^L \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n\}$, and $\hat\theta(\beta)$ be the quantile function of DMU-$\hat{x}$. Then,

(i) $\hat\theta(\beta)$ is a continuous function defined over $(L, +\infty)$.

(ii) $\hat\theta(\beta)$ is a strictly monotonically increasing function over $(L, +\infty)$.

Proof See Appendix 2. □

Definition 2 Let $\hat{x} \in \hat{T} \cap \text{Int}\{x \mid \sum_{j=1}^n x_j^L \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n\}$, and $\hat\theta(\beta)$ be the quantile function of DMU-$\hat{x}$. The $\beta^* \in (L, +\infty)$ that satisfies $\hat\theta(\beta^*) = 1$ is referred to as the quantile of DMU-$\hat{x}$ (the existence and uniqueness of β* are shown in Appendices 3 and 4, respectively); to facilitate subsequent discussion, the quantile of DMU-$\hat{x}$ is further denoted $\beta^*(\hat{x})$.

The quantile actually denotes the degree of acceptance. For instance, in Fig. 2, the quantile of DMU-$\hat{x}$ is 0.5, which indicates that the degree of acceptance corresponding to DMU-$\hat{x}$ is 0.5. Likewise, if a DMU is on the boundary of the acceptance domain $T_0$ ($T_1$), then its degree of acceptance is 0 (1). It is clear that there is an infinite number of possible acceptance degrees that a DMU might take, since $\beta^* \in (L, +\infty)$. It follows that, to determine the acceptance domain $T_{\beta^*}$ corresponding to DMU-$\hat{x}$, we may need to solve, according to Theorem 2, an infinite number of linear programs $(P_\beta)$, one for each $\beta \in (L, +\infty)$.

Fig. 2 Quantile function of DMU-$\hat{x}$

Fortunately, in practice, it may not be necessary to find the exact value of β*; a value close to β* is usually sufficient. Therefore, we consider only t different values of β (instead of an infinite number), i.e., $\beta = \beta_1, \beta_2, \ldots, \beta_{t'-1}, \beta_{t'}, \ldots, \beta_{t''-1}, \beta_{t''}, \ldots, \beta_{t-1}, \beta_t$, such that
$$L < \beta_1 < \beta_2 < \cdots < \beta_{t'-1} < \beta_{t'} < \cdots < \beta_{t''-1} < \beta_{t''} < \cdots < \beta_{t-1} < \beta_t, \quad \beta_{t'} = 0 \text{ and } \beta_{t''} = 1.$$
However, it is noted that the larger the value of t, the stronger the classification power of the classifiers and the more computation time needed.
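The grid idea can be sketched directly in code. The following Python function (our illustration using `scipy.optimize.linprog`, not the authors' implementation; the name `theta_hat` is ours) solves $(P_\beta)$ for the data of Example 1 over a small β grid; because $\hat\theta(\beta)$ increases strictly with β (Theorem 2), β* is bracketed between the consecutive grid values where $\hat\theta$ crosses one.

```python
import numpy as np
from scipy.optimize import linprog

def theta_hat(beta, a, b, x_hat):
    """Solve (P_beta): min theta s.t. sum_j x_j^beta l_j <= theta * x_hat,
    sum_j l_j >= 1, l >= 0. Decision vector is (theta, l_1, ..., l_n)."""
    X = a + beta * (b - a)                 # m x n matrix of x_j^beta
    m, n = X.shape
    c = np.zeros(n + 1); c[0] = 1.0        # minimize theta
    # Rows: sum_j X[i, j] l_j - theta * x_hat[i] <= 0, and -sum_j l_j <= -1.
    A_ub = np.vstack([np.hstack([-np.asarray(x_hat, float).reshape(m, 1), X]),
                      np.hstack([[0.0], -np.ones(n)])])
    b_ub = np.zeros(m + 1); b_ub[m] = -1.0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.fun

a = np.array([[1, 2, 4, 4], [4, 2, 1, 4]], dtype=float)
b = np.array([[6, 6, 10, 7], [7, 9, 4, 8]], dtype=float)
betas = [-0.1, 0.0, 0.5, 1.0]
# theta_hat rises through 1 between beta = 0.5 and beta = 1 for x_hat = (6, 6),
# so the approximate quantile of (6, 6) falls between those grid values.
print([round(theta_hat(bt, a, b, [6, 6]), 3) for bt in betas])
```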

Example 2 Consider an example in which there is only one training data with m = 2, i.e., $\bar{x}_1 = ([3, 5], [3, 5])$, and thus $x_{11}^\beta = 3 + 2\beta$, $x_{21}^\beta = 3 + 2\beta$, and $L = -\frac{3}{2}$. In addition, let t = 5, t' = 2, and t'' = 4, and assume $\beta_1 = -\frac{1}{2}$, $\beta_2 = 0$, $\beta_3 = \frac{1}{2}$, $\beta_4 = 1$ and $\beta_5 = \frac{3}{2}$. Thus,
$$x_1^{\beta_1} = (2, 2), \quad x_1^{\beta_2} = (3, 3), \quad x_1^{\beta_3} = (4, 4), \quad x_1^{\beta_4} = (5, 5), \quad x_1^{\beta_5} = (6, 6).$$
In addition,
$$T_\beta = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^\beta \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\} = \left\{ x \;\middle|\; (3 + 2\beta,\ 3 + 2\beta)^T \lambda_1 \le x,\ \lambda_1 \ge 1 \right\}.$$
Hence, if the value of β is taken as β1, β2, β3, β4 and β5, then
$$T_{\beta_1} = \{(x_1, x_2) \mid x_1 \ge 2,\ x_2 \ge 2\}, \quad T_{\beta_2} = \{(x_1, x_2) \mid x_1 \ge 3,\ x_2 \ge 3\}, \quad T_{\beta_3} = \{(x_1, x_2) \mid x_1 \ge 4,\ x_2 \ge 4\},$$
$$T_{\beta_4} = \{(x_1, x_2) \mid x_1 \ge 5,\ x_2 \ge 5\}, \quad T_{\beta_5} = \{(x_1, x_2) \mid x_1 \ge 6,\ x_2 \ge 6\}.$$
Figure 3 graphically demonstrates $T_{\beta_1}, T_{\beta_2}, \ldots, T_{\beta_5}$.

Now, consider a classification data set $\hat{T} = \{\hat{x} \mid \hat{x} \in \Re_+^m\}$, and define the approximate quantile of each DMU-$\hat{x} \in \hat{T}$ as

$$\beta^*(\hat{x}) = \begin{cases} \beta_1, & \text{if } \hat{x} \notin T_{\beta_1}, \\ \beta_i, & \text{if } \hat{x} \text{ is located on the frontier of } T_{\beta_i},\ 1 \le i \le t, \\ \dfrac{\beta_i + \beta_{i-1}}{2}, & \text{if } \hat{x} \in (\text{Int}\, T_{\beta_{i-1}}) \setminus T_{\beta_i},\ 2 \le i \le t, \\ \beta_t, & \text{if } \hat{x} \in \text{Int}\, T_{\beta_t}. \end{cases}$$

Here, if $\beta^*(\hat{x}) < 0$, then $\beta^*(\hat{x})$ represents the rejection degree wrt DMU-$\hat{x}$; otherwise, $\beta^*(\hat{x})$ denotes the acceptance degree wrt DMU-$\hat{x}$.

It is clear that, to implement the above defined approximate quantile $\beta^*(\hat{x})$, we need to repeatedly check the following four classification conditions, which could be quite time-consuming: (a) $\hat{x} \notin T_{\beta_1}$; (b) $\hat{x}$ is located on the frontier of $T_{\beta_i}$, $1 \le i \le t$; (c) $\hat{x} \in (\text{Int}\, T_{\beta_{i-1}}) \setminus T_{\beta_i}$, $2 \le i \le t$; and (d) $\hat{x} \in \text{Int}\, T_{\beta_t}$. Hence, to efficiently classify the data included in the classification data set $\hat{T}$ by using conditions (a)–(d), we transform the sum-form acceptance domain into the intersection-form one. The transformation method is detailed in Wei and Yan (2001) and Yan and Wei (2000). Recall that the sum-form acceptance domain $T_\beta$ is as follows:
$$T_\beta = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^\beta \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}.$$
On the other hand, the intersection-form acceptance domain $T_\beta$ is as follows:
$$T_\beta = \left\{ x \;\middle|\; (x_\beta^k)^T x - \mu_\beta^k \ge 0,\ k = 1, \ldots, l_\beta \right\},$$
where $x_\beta^k \ge 0$, $x_\beta^k \ne 0$, $\mu_\beta^k \ge 0$, k = 1, ..., $l_\beta$. Note that, in both types of acceptance domain, β = β1, ..., βt-1, βt. It follows that $\text{Int}\, T_\beta = \{x \mid (x_\beta^k)^T x - \mu_\beta^k > 0,\ k = 1, \ldots, l_\beta\}$. Furthermore, it is noticed that the intersection-form acceptance domain $T_\beta$ is actually a linear inequality system, and that the number of linear inequalities is less than or equal to $m \times n$, which happens when all n DMUs are extreme points of $T_\beta$ such that each extreme point is the single intersection point of m (the number of inputs associated with the DMUs) hyperplanes in $\Re^m$. It follows that in practical applications, it takes a reasonable amount of time to perform the procedure of transforming the sum-form acceptance domain into the intersection-form one, because the main task of the transformation procedure is to search for the linear inequalities.

Fig. 3 $T_{\beta_1}, T_{\beta_2}, \ldots, T_{\beta_5}$

Moreover, it is important to note that the classification power of the proposed classifiers described below and in the next section is not affected by the values of m and n. That is, once the intersection-form acceptance domains $T_{\beta_i}$, i = 1, ..., t are constructed, the classification data can be efficiently and effectively classified. In short, the values of m and n affect the time needed to transform the sum-form acceptance domains into the intersection-form ones, but do not affect the classification power of the quantile–DEA classifiers.

The quantile–DEA classifier with exact data can be formally described as follows:

Step 1 Select training data set $\bar{T} = \{[a_j, b_j],\ j = 1, \ldots, n\}$, where $a_j = (a_{1j}, a_{2j}, \ldots, a_{mj})^T$ and $b_j = (b_{1j}, b_{2j}, \ldots, b_{mj})^T$, j = 1, ..., n.

Step 2 Set first the value of t (t ≥ 1) and then the values of β such that $L < \beta_1 < \beta_2 < \cdots < \beta_{t'} < \cdots < \beta_{t''} < \cdots < \beta_{t-1} < \beta_t$, with $\beta_{t'} = 0$ and $\beta_{t''} = 1$. Compute
$$x_j^{\beta_i} = a_j + \beta_i (b_j - a_j), \quad i = 1, \ldots, t;\ j = 1, \ldots, n.$$

Step 3 Construct the intersection-form acceptance domains $T_{\beta_i} = \{x \mid (x_{\beta_i}^k)^T x - \mu_{\beta_i}^k \ge 0,\ k = 1, \ldots, l_{\beta_i}\}$ and $\text{Int}\, T_{\beta_i} = \{x \mid (x_{\beta_i}^k)^T x - \mu_{\beta_i}^k > 0,\ k = 1, \ldots, l_{\beta_i}\}$, where i = 1, ..., t.

Step 4 Implement the approximate quantile $\beta^*(\hat{x})$ defined above to classify (accept or reject) every piece of data $\hat{x} \in \hat{T}$ and, at the same time, give its corresponding degree.
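Steps 3–4 reduce classification to evaluating linear inequalities. The following Python sketch (our illustration; `approx_quantile` and the domain encoding are our names) implements the approximate quantile $\beta^*(\hat{x})$ given intersection-form domains; the domains listed are those derived in Example 3 below.

```python
import numpy as np

def approx_quantile(x_hat, domains, betas, tol=1e-9):
    """Approximate quantile beta*(x_hat) of Sect. 3. Each domains[i] = (W, mu)
    encodes the intersection-form T_{beta_i} = {x | W x - mu >= 0} (rows of W
    are the inequality normals). betas must be increasing."""
    x_hat = np.asarray(x_hat, dtype=float)

    def position(W, mu):            # 1: interior, 0: on frontier, -1: outside
        s = W @ x_hat - mu
        if np.all(s > tol):
            return 1
        return 0 if np.all(s > -tol) else -1

    pos = [position(W, mu) for W, mu in domains]
    if pos[0] == -1:                         # outside T_{beta_1}
        return betas[0]
    for i, p in enumerate(pos):
        if p == 0:                           # on the frontier of T_{beta_i}
            return betas[i]
    for i in range(1, len(betas)):
        if pos[i - 1] == 1 and pos[i] == -1: # strictly between two frontiers
            return 0.5 * (betas[i - 1] + betas[i])
    return betas[-1]                         # interior of T_{beta_t}

# Intersection-form domains of Example 3 (rows of W and entries of mu).
domains = [
    (np.array([[240, 110], [1, 0], [2, 6], [0, 1]], float),
     np.array([527, 0.5, 11, 0.7])),          # T_{beta_1}, beta = -0.1
    (np.array([[2, 1], [1, 0], [1, 2], [0, 1]], float),
     np.array([6, 1, 6, 1], dtype=float)),    # T_{beta_2}, beta = 0
    (np.array([[12, 14], [1, 0], [0, 1]], float),
     np.array([119, 3.5, 2.5])),              # T_{beta_3}, beta = 0.5
    (np.array([[3, 4], [1, 0], [0, 1]], float),
     np.array([46, 6, 4], dtype=float)),      # T_{beta_4}, beta = 1
]
betas = [-0.1, 0.0, 0.5, 1.0]
for p in ([1, 2], [1, 6], [2, 10], [10, 10]):
    print(p, approx_quantile(p, domains, betas))  # -0.1, 0.0, 0.25, 1.0
```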

Example 3 Consider again the training data set given in Example 1; that is, $\bar{T} = \{\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4\}$. Set t = 4, and let β1 = -0.1, β2 = 0, β3 = 0.5 and β4 = 1. Based on the values of $x_{ij}^\beta$, i = 1, 2; j = 1, 2, 3, 4 (see Example 1), we can construct $T_{\beta_1}$, $T_{\beta_2}$, $T_{\beta_3}$ and $T_{\beta_4}$ in both sum and intersection forms.

(i) β1 = -0.1:
$$T_{\beta_1} = \left\{ (x_1, x_2) \;\middle|\; \binom{0.5}{3.7}\lambda_1 + \binom{1.6}{1.3}\lambda_2 + \binom{3.4}{0.7}\lambda_3 + \binom{3.7}{3.6}\lambda_4 \le \binom{x_1}{x_2},\ \sum_{j=1}^4 \lambda_j \ge 1,\ \lambda_j \ge 0 \right\} \quad \text{(sum form)}$$
$$= \{(x_1, x_2) \mid 240x_1 + 110x_2 - 527 \ge 0,\ x_1 - 0.5 \ge 0,\ 2x_1 + 6x_2 - 11 \ge 0,\ x_2 - 0.7 \ge 0\} \quad \text{(intersection form)}.$$

(ii) β2 = 0:
$$T_{\beta_2} = \left\{ (x_1, x_2) \;\middle|\; \binom{1}{4}\lambda_1 + \binom{2}{2}\lambda_2 + \binom{4}{1}\lambda_3 + \binom{4}{4}\lambda_4 \le \binom{x_1}{x_2},\ \sum_{j=1}^4 \lambda_j \ge 1,\ \lambda_j \ge 0 \right\} \quad \text{(sum form)}$$
$$= \{(x_1, x_2) \mid 2x_1 + x_2 - 6 \ge 0,\ x_1 - 1 \ge 0,\ x_1 + 2x_2 - 6 \ge 0,\ x_2 - 1 \ge 0\} \quad \text{(intersection form)}.$$

(iii) β3 = 0.5:
$$T_{\beta_3} = \left\{ (x_1, x_2) \;\middle|\; \binom{3.5}{5.5}\lambda_1 + \binom{4}{5.5}\lambda_2 + \binom{7}{2.5}\lambda_3 + \binom{5.5}{6}\lambda_4 \le \binom{x_1}{x_2},\ \sum_{j=1}^4 \lambda_j \ge 1,\ \lambda_j \ge 0 \right\} \quad \text{(sum form)}$$
$$= \{(x_1, x_2) \mid 12x_1 + 14x_2 - 119 \ge 0,\ x_1 - 3.5 \ge 0,\ x_2 - 2.5 \ge 0\} \quad \text{(intersection form)}.$$

(iv) β4 = 1:
$$T_{\beta_4} = \left\{ (x_1, x_2) \;\middle|\; \binom{6}{7}\lambda_1 + \binom{6}{9}\lambda_2 + \binom{10}{4}\lambda_3 + \binom{7}{8}\lambda_4 \le \binom{x_1}{x_2},\ \sum_{j=1}^4 \lambda_j \ge 1,\ \lambda_j \ge 0 \right\} \quad \text{(sum form)}$$
$$= \{(x_1, x_2) \mid 3x_1 + 4x_2 - 46 \ge 0,\ x_1 - 6 \ge 0,\ x_2 - 4 \ge 0\} \quad \text{(intersection form)}.$$

Let classification data set $\hat{T} = \{\hat{x}_1, \hat{x}_2, \hat{x}_3, \hat{x}_4, \hat{x}_5, \hat{x}_6, \hat{x}_7\}$, where
$$\hat{x}_1 = \binom{1}{2}, \quad \hat{x}_2 = \binom{1}{6}, \quad \hat{x}_3 = \binom{2}{10}, \quad \hat{x}_4 = \binom{3.5}{8}, \quad \hat{x}_5 = \binom{6}{6}, \quad \hat{x}_6 = \binom{12}{4}, \quad \hat{x}_7 = \binom{10}{10}.$$

The resulting classification wrt each element of $\hat{T}$ is as follows:

(1) $\hat{x}_1$: since $240\hat{x}_{11} + 110\hat{x}_{21} - 527 < 0$, $\hat{x}_1 \notin T_{\beta_1}$ and thus $\beta^*(\hat{x}_1) = -0.1$.

(2) $\hat{x}_2$: since $2\hat{x}_{12} + \hat{x}_{22} - 6 > 0$, $\hat{x}_{12} + 2\hat{x}_{22} - 6 > 0$, $\hat{x}_{12} = 1$ and $\hat{x}_{22} > 1$, $\hat{x}_2$ is located on the frontier of $T_{\beta_2}$ and thus $\beta^*(\hat{x}_2) = 0$.

(3) $\hat{x}_3$: since $2\hat{x}_{13} + \hat{x}_{23} - 6 > 0$, $\hat{x}_{13} + 2\hat{x}_{23} - 6 > 0$, $\hat{x}_{13} > 1$ and $\hat{x}_{23} > 1$, $\hat{x}_3 \in \text{Int}\, T_{\beta_2}$. In addition, since $\hat{x}_{13} - 3.5 < 0$, $\hat{x}_3 \notin T_{\beta_3}$. Hence, $\beta^*(\hat{x}_3) = \frac{\beta_2 + \beta_3}{2} = 0.25$.

(4) $\hat{x}_4, \hat{x}_5, \hat{x}_6$: similar to the analysis in (1)–(3), we can obtain that $\hat{x}_4$ is located on the frontier of $T_{\beta_3}$ and thus $\beta^*(\hat{x}_4) = 0.5$; that $\hat{x}_5 \in (\text{Int}\, T_{\beta_3}) \setminus T_{\beta_4}$ and thus $\beta^*(\hat{x}_5) = \frac{\beta_3 + \beta_4}{2} = 0.75$; and that $\hat{x}_6$ is located on the frontier of $T_{\beta_4}$ and thus $\beta^*(\hat{x}_6) = 1$.

(5) $\hat{x}_7$: since $2\hat{x}_{17} + \hat{x}_{27} - 6 > 0$, $\hat{x}_{17} + 2\hat{x}_{27} - 6 > 0$, $\hat{x}_{17} > 1$ and $\hat{x}_{27} > 1$, $\hat{x}_7 \in \text{Int}\, T_{\beta_2}$. Furthermore, since $12\hat{x}_{17} + 14\hat{x}_{27} - 119 > 0$, $\hat{x}_{17} - 3.5 > 0$ and $\hat{x}_{27} - 2.5 > 0$, $\hat{x}_7 \in \text{Int}\, T_{\beta_3}$. Moreover, since $3\hat{x}_{17} + 4\hat{x}_{27} - 46 > 0$, $\hat{x}_{17} - 6 > 0$ and $\hat{x}_{27} - 4 > 0$, $\hat{x}_7 \in \text{Int}\, T_{\beta_4}$. As a consequence, $\beta^*(\hat{x}_7) = 1$.

Figure 4 graphically shows the resulting classification wrt each element of $\hat{T}$.

4 Quantile–DEA classifier with interval data

This section introduces the quantile–DEA classifier with interval data (interval DMUs), which is mainly built on the methods proposed in the preceding section for developing the classifier that handles exact data. Let interval DMU-$\hat{x} \in \Re_+^m$, where
$$\hat{x} = ([\hat{a}_1, \hat{b}_1], [\hat{a}_2, \hat{b}_2], \ldots, [\hat{a}_m, \hat{b}_m]), \quad \hat{a} = (\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_m)^T, \quad \hat{b} = (\hat{b}_1, \hat{b}_2, \ldots, \hat{b}_m)^T.$$

Recall that the training data set $\bar{T}$ used to construct the acceptance domains is defined as $\bar{T} = \{[a_j, b_j],\ j = 1, \ldots, n\}$, where $a_j = (a_{1j}, a_{2j}, \ldots, a_{mj})^T$ and $b_j = (b_{1j}, b_{2j}, \ldots, b_{mj})^T$, j = 1, ..., n. Here, we consider t different values of $\beta \in (L, +\infty)$ such that $L < \beta_1 < \beta_2 < \cdots < \beta_{t'-1} < \beta_{t'} < \cdots < \beta_{t''-1} < \beta_{t''} < \cdots < \beta_{t-1} < \beta_t$, with $\beta_{t'} = 0$ and $\beta_{t''} = 1$. The corresponding approximate quantiles of $\hat{a}$ and $\hat{b}$, i.e., $\beta^*(\hat{a})$ and $\beta^*(\hat{b})$, respectively, can be calculated by applying the following formula, which is defined in the preceding section:

$$\beta^*(\hat{a}\ (\text{or } \hat{b})) = \begin{cases} \beta_1, & \text{if } \hat{a}\ (\text{or } \hat{b}) \notin T_{\beta_1}, \\ \beta_i, & \text{if } \hat{a}\ (\text{or } \hat{b}) \text{ is located on the frontier of } T_{\beta_i},\ 1 \le i \le t, \\ \dfrac{\beta_i + \beta_{i-1}}{2}, & \text{if } \hat{a}\ (\text{or } \hat{b}) \in (\text{Int}\, T_{\beta_{i-1}}) \setminus T_{\beta_i},\ 2 \le i \le t, \\ \beta_t, & \text{if } \hat{a}\ (\text{or } \hat{b}) \in \text{Int}\, T_{\beta_t}. \end{cases}$$

It is noted that the frontier of $T_0$, i.e., β = 0, is constructed from the lower endpoints $a_j = (a_{1j}, a_{2j}, \ldots, a_{mj})$, j = 1, ..., n, in $\bar{T}$; therefore, if $\hat{x}$ is located outside $T_0$, then we can confidently consider $\hat{x}$ as being "rejected". By contrast, the frontier of $T_1$, i.e., β = 1, is constructed from the upper endpoints $b_j = (b_{1j}, b_{2j}, \ldots, b_{mj})$, j = 1, ..., n, in $\bar{T}$; hence, if $\hat{x}$ is located inside $T_1$, then we can be confident in considering $\hat{x}$ as being "accepted". However, if $\hat{x}$ is located inside $T_0$ and outside $T_1$, then we can only consider $\hat{x}$ as being "accepted with risk". It follows that we can divide $\Re_+^m$, based on $T_0$ and $T_1$, into three regions, i.e., "rejection region" I, "risky acceptance region" II, and "acceptance region" III, which are graphically demonstrated in Fig. 5.

Fig. 4 Resulting classification with respect to each piece of data

As a result, we can classify each interval DMU-$\hat{x}$ by the three degrees of "rejection", "risky acceptance" and "acceptance", represented by $\hat{z}_1(\hat{x})$, $\hat{z}_2(\hat{x})$ and $\hat{z}_3(\hat{x})$, respectively. In what follows, we consider six possible scenarios for interval DMU-$\hat{x}$, and show the formulas used to calculate $\hat{z}_1(\hat{x})$, $\hat{z}_2(\hat{x})$ and $\hat{z}_3(\hat{x})$ in each scenario.

Scenario 1. Let $\hat{b} \notin \text{Int}\, T_0$ (see Fig. 6), and thus $\beta^*(\hat{a}) < 0$ and $\beta^*(\hat{b}) \le 0$. Then, define
$$\hat{z}_1(\hat{x}) = |\beta^*(\hat{a})| + |\beta^*(\hat{b})|, \quad \hat{z}_2(\hat{x}) = 0, \quad \hat{z}_3(\hat{x}) = 0.$$

Scenario 2. Let $\hat{a} \in T_1$ (see Fig. 7), and thus $\beta^*(\hat{a}) \ge 1$ and $\beta^*(\hat{b}) > 1$. Then, define
$$\hat{z}_1(\hat{x}) = 0, \quad \hat{z}_2(\hat{x}) = 0, \quad \hat{z}_3(\hat{x}) = \beta^*(\hat{a}) + \beta^*(\hat{b}).$$

Scenario 3. Let $\hat{a} \in T_0$ and $\hat{b} \notin \text{Int}\, T_1$ (see Fig. 8), and thus $\beta^*(\hat{a}) \ge 0$ and $\beta^*(\hat{a}) < \beta^*(\hat{b}) \le 1$. Then, define
$$\hat{z}_1(\hat{x}) = 0, \quad \hat{z}_2(\hat{x}) = \beta^*(\hat{a}) + \beta^*(\hat{b}), \quad \hat{z}_3(\hat{x}) = 0.$$

Scenario 4. Let $\hat{a} \notin T_0$ and $\hat{b} \in (\text{Int}\, T_0) \setminus T_1$ (see Fig. 9), and thus $\beta^*(\hat{a}) < 0$ and $0 < \beta^*(\hat{b}) < 1$. It follows that the line segment connecting points $\hat{a}$ and $\hat{b}$ intersects the frontier of $T_0$ (see Fig. 9); note that a point on this line segment can be denoted as
$$\hat{x}_\alpha = (1 - \alpha)\hat{a} + \alpha\hat{b}, \quad \alpha \in [0, 1].$$

Fig. 5 Three classification regions separated by $T_0$ and $T_1$

Let $\hat{x}_{\alpha^*}$ represent the intersection point; thus, $\beta^*(\hat{x}_{\alpha^*}) = 0$. In addition, define
$$\hat{z}_1(\hat{x}) = |\beta^*(\hat{a})| + \beta^*(\hat{x}_{\alpha^*}) = |\beta^*(\hat{a})|, \quad \hat{z}_2(\hat{x}) = \beta^*(\hat{x}_{\alpha^*}) + \beta^*(\hat{b}) = \beta^*(\hat{b}), \quad \hat{z}_3(\hat{x}) = 0.$$

Scenario 5. Let $\hat{a} \in (\text{Int}\, T_0) \setminus T_1$ and $\hat{b} \in \text{Int}\, T_1$ (see Fig. 10), and thus $0 < \beta^*(\hat{a}) < 1$ and $\beta^*(\hat{b}) > 1$. It follows that the line segment connecting points $\hat{a}$ and $\hat{b}$ intersects the frontier of $T_1$ (see Fig. 10). Let $\hat{x}_{\alpha^{**}}$ represent the intersection point; thus, $\beta^*(\hat{x}_{\alpha^{**}}) = 1$. Define

Fig. 6 Scenario 1 wrt interval DMU-$\hat{x}$

Fig. 7 Scenario 2 wrt interval DMU-$\hat{x}$

$$\hat{z}_1(\hat{x}) = 0, \quad \hat{z}_2(\hat{x}) = \beta^*(\hat{a}) + \beta^*(\hat{x}_{\alpha^{**}}) = \beta^*(\hat{a}) + 1, \quad \hat{z}_3(\hat{x}) = \beta^*(\hat{x}_{\alpha^{**}}) + \beta^*(\hat{b}) = 1 + \beta^*(\hat{b}).$$

Scenario 6. Let $\hat{a} \notin T_0$ and $\hat{b} \in \text{Int}\, T_1$ (see Fig. 11), and thus $\beta^*(\hat{a}) < 0$ and $\beta^*(\hat{b}) > 1$. It follows that the line segment connecting points $\hat{a}$ and $\hat{b}$ intersects both the frontier of $T_0$ and the frontier of $T_1$ (see Fig. 11). Let $\hat{x}_{\alpha^*}$ and $\hat{x}_{\alpha^{**}}$ represent the intersection points wrt $T_0$ and $T_1$, respectively; we thus have $\beta^*(\hat{x}_{\alpha^*}) = 0$ and $\beta^*(\hat{x}_{\alpha^{**}}) = 1$. Define

Fig. 8 Scenario 3 wrt interval DMU-$\hat{x}$

Fig. 9 Scenario 4 wrt interval DMU-$\hat{x}$

$$\hat{z}_1(\hat{x}) = |\beta^*(\hat{a})| + \beta^*(\hat{x}_{\alpha^*}) = |\beta^*(\hat{a})|, \quad \hat{z}_2(\hat{x}) = \beta^*(\hat{x}_{\alpha^*}) + \beta^*(\hat{x}_{\alpha^{**}}) = 1, \quad \hat{z}_3(\hat{x}) = \beta^*(\hat{x}_{\alpha^{**}}) + \beta^*(\hat{b}) = 1 + \beta^*(\hat{b}).$$

Note that it is easy to check, according to the formulas of $\hat{z}_1$, $\hat{z}_2$ and $\hat{z}_3$ in Scenarios 1–6, that $\hat{z}_1(\hat{x}) \in (0, 2|L|)$, $\hat{z}_2(\hat{x}) \in (0, 2)$ and $\hat{z}_3(\hat{x}) \in (2, +\infty)$; it is defined that the larger the value of $\hat{z}_1(\hat{x})$, $\hat{z}_2(\hat{x})$ or $\hat{z}_3(\hat{x})$, the higher the degree of rejection, risky acceptance or acceptance, respectively. However, because, as indicated above, only a β belonging to [0, 1] is associated with a well-defined probability property, the degrees $\hat{z}_1(\hat{x})$, $\hat{z}_2(\hat{x})$ and $\hat{z}_3(\hat{x})$, which are respectively derived from $\beta \in (L, 0)$, $\beta \in [0, 1]$ and $\beta \in (1, +\infty)$, can be compared only with themselves, not with each other.

Based on the six scenarios and their corresponding formulas for calculating the degrees of classification data $\hat{x}$ described above, we can formally define the quantile–DEA classifier with interval data as follows:

Step 1 Select training data set $\bar{T} = \{\bar{x}_j \mid j = 1, \ldots, n\}$, where $\bar{x}_j = ([a_{1j}, b_{1j}], [a_{2j}, b_{2j}], \ldots, [a_{mj}, b_{mj}])$, $a_j = (a_{1j}, a_{2j}, \ldots, a_{mj})^T$, and $b_j = (b_{1j}, b_{2j}, \ldots, b_{mj})^T$, j = 1, ..., n.

Step 2 Set first the value of t (t ≥ 1), and then the values of β such that $L < \beta_1 < \beta_2 < \cdots < \beta_{t'} < \cdots < \beta_{t''} < \cdots < \beta_{t-1} < \beta_t$, with $\beta_{t'} = 0$ and $\beta_{t''} = 1$. Compute
$$x_j^{\beta_i} = a_j + \beta_i (b_j - a_j), \quad i = 1, \ldots, t;\ j = 1, \ldots, n.$$

Step 3 Construct the intersection-form acceptance domains $T_{\beta_i} = \{x \mid (x_{\beta_i}^k)^T x - \mu_{\beta_i}^k \ge 0,\ k = 1, \ldots, l_{\beta_i}\}$ and $\text{Int}\, T_{\beta_i} = \{x \mid (x_{\beta_i}^k)^T x - \mu_{\beta_i}^k > 0,\ k = 1, \ldots, l_{\beta_i}\}$, where i = 1, ..., t.

Step 4 Check the corresponding scenario wrt classification data $\hat{x} \in \hat{T}$.

Step 5 Implement the approximate quantiles $\beta^*(\hat{a})$ and $\beta^*(\hat{b})$ to calculate $\hat{z}_1$, the "rejection degree", $\hat{z}_2$, the "risky acceptance degree", and $\hat{z}_3$, the "acceptance degree", using the formulas corresponding to the scenario in which $\hat{x}$ is involved.
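Steps 4–5 amount to a scenario dispatch on the two endpoint quantiles. The following Python sketch (our illustration; `interval_degrees` is our name, and the threshold tests are our encoding of the six scenarios) computes $(\hat{z}_1, \hat{z}_2, \hat{z}_3)$ from $\beta^*(\hat{a})$ and $\beta^*(\hat{b})$.

```python
def interval_degrees(beta_a, beta_b):
    """Degrees (z1, z2, z3) = (rejection, risky acceptance, acceptance) of an
    interval DMU, given the approximate quantiles beta_a = beta*(a_hat) and
    beta_b = beta*(b_hat) of its endpoint vectors (Scenarios 1-6, Sect. 4).
    Boundary cases are assigned to the lower-numbered scenario."""
    if beta_b <= 0:                                  # Scenario 1: b_hat outside Int T_0
        return (abs(beta_a) + abs(beta_b), 0.0, 0.0)
    if beta_a >= 1:                                  # Scenario 2: a_hat inside T_1
        return (0.0, 0.0, beta_a + beta_b)
    if beta_a >= 0 and beta_b <= 1:                  # Scenario 3
        return (0.0, beta_a + beta_b, 0.0)
    if beta_a < 0 and beta_b <= 1:                   # Scenario 4
        return (abs(beta_a), beta_b, 0.0)
    if beta_a >= 0:                                  # Scenario 5 (beta_b > 1)
        return (0.0, beta_a + 1.0, 1.0 + beta_b)
    return (abs(beta_a), 1.0, 1.0 + beta_b)          # Scenario 6

# Example 4: x_hat_4 = ([2,4],[2,4]) has beta*(a)=-1/2, beta*(b)=1/2.
print(interval_degrees(-0.5, 0.5))   # (0.5, 0.5, 0.0), matching Scenario 4
print(interval_degrees(0.5, 1.5))    # (0.0, 1.5, 2.5), matching Scenario 5
```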

Fig. 10 Scenario 5 wrt interval DMU-$\hat{x}$

Example 4 Consider the intersection-form acceptance domains $T_{\beta_1}, \ldots, T_{\beta_5}$ constructed in Example 2, where $\beta_1 = -\frac{1}{2}$, $\beta_2 = 0$, $\beta_3 = \frac{1}{2}$, $\beta_4 = 1$ and $\beta_5 = \frac{3}{2}$. Let classification data set $\hat{T} = \{\hat{x}_1, \hat{x}_2, \hat{x}_3, \hat{x}_4, \hat{x}_5, \hat{x}_6\}$, where the interval DMU-$\hat{x}_i$, i = 1, ..., 6 are as follows:
$$\hat{x}_1 = ([\hat{a}_{11}, \hat{b}_{11}], [\hat{a}_{21}, \hat{b}_{21}]) = ([1, 2], [1, 2]), \quad \hat{a}_1 = (1, 1)^T, \quad \hat{b}_1 = (2, 2)^T;$$
$$\hat{x}_2 = ([6, 7], [6, 7]), \quad \hat{a}_2 = (6, 6)^T, \quad \hat{b}_2 = (7, 7)^T;$$
$$\hat{x}_3 = ([4, 4.5], [4, 4.5]), \quad \hat{a}_3 = (4, 4)^T, \quad \hat{b}_3 = (4.5, 4.5)^T;$$
$$\hat{x}_4 = ([2, 4], [2, 4]), \quad \hat{a}_4 = (2, 2)^T, \quad \hat{b}_4 = (4, 4)^T;$$
$$\hat{x}_5 = ([4, 6], [4, 6]), \quad \hat{a}_5 = (4, 4)^T, \quad \hat{b}_5 = (6, 6)^T;$$
$$\hat{x}_6 = ([2, 7], [2, 7]), \quad \hat{a}_6 = (2, 2)^T, \quad \hat{b}_6 = (7, 7)^T.$$
The positions of $\hat{a}_i, \hat{b}_i$, i = 1, ..., 6 relative to $\beta_i$, i = 1, ..., 5 are graphically shown in Fig. 12.

In what follows, we calculate the values of $\hat{z}_1(\hat{x})$, $\hat{z}_2(\hat{x})$ and $\hat{z}_3(\hat{x})$ wrt each classification data $\hat{x} \in \hat{T}$ in sequence.

(i) The condition of $\hat{x}_1$ satisfies Scenario 1. Its corresponding $\hat{z}_1(\hat{x}_1)$, $\hat{z}_2(\hat{x}_1)$ and $\hat{z}_3(\hat{x}_1)$ are as follows:
$$\hat{z}_1(\hat{x}_1) = |\beta^*(\hat{a}_1)| + |\beta^*(\hat{b}_1)| = |\beta_1| + |\beta_1| = 1, \quad \hat{z}_2(\hat{x}_1) = 0, \quad \hat{z}_3(\hat{x}_1) = 0.$$

Fig. 11 Scenario 6 wrt interval DMU-$\hat{x}$

Hence, interval DMU-$\hat{x}_1$ can obviously be classified as "rejection" with degree 1.

(ii) The condition of $\hat{x}_2$ satisfies Scenario 2. Its corresponding $\hat{z}_1(\hat{x}_2)$, $\hat{z}_2(\hat{x}_2)$ and $\hat{z}_3(\hat{x}_2)$ are as follows:
$$\hat{z}_1(\hat{x}_2) = 0, \quad \hat{z}_2(\hat{x}_2) = 0, \quad \hat{z}_3(\hat{x}_2) = \beta^*(\hat{a}_2) + \beta^*(\hat{b}_2) = \beta_5 + \beta_5 = 3.$$
Hence, interval DMU-$\hat{x}_2$ can evidently be classified as "acceptance" with degree 3.

(iii) The condition of $\hat{x}_3$ satisfies Scenario 3. Its corresponding $\hat{z}_1(\hat{x}_3)$, $\hat{z}_2(\hat{x}_3)$ and $\hat{z}_3(\hat{x}_3)$ are as follows:
$$\hat{z}_1(\hat{x}_3) = 0, \quad \hat{z}_2(\hat{x}_3) = \beta^*(\hat{a}_3) + \beta^*(\hat{b}_3) = \beta_3 + \frac{1}{2}(\beta_4 + \beta_3) = \frac{5}{4}, \quad \hat{z}_3(\hat{x}_3) = 0.$$
Hence, interval DMU-$\hat{x}_3$ can be classified as "risky acceptance" with degree $\frac{5}{4}$.

(iv) The condition of $\hat{x}_4$ satisfies Scenario 4. Its corresponding $\hat{z}_1(\hat{x}_4)$, $\hat{z}_2(\hat{x}_4)$ and $\hat{z}_3(\hat{x}_4)$ are as follows:
$$\hat{z}_1(\hat{x}_4) = |\beta^*(\hat{a}_4)| = |\beta_1| = \frac{1}{2}, \quad \hat{z}_2(\hat{x}_4) = \beta^*(\hat{b}_4) = \beta_3 = \frac{1}{2}, \quad \hat{z}_3(\hat{x}_4) = 0.$$

Fig. 12 Positions of $\hat{a}_i, \hat{b}_i$, i = 1, ..., 6: $\hat{a}_1 = (1, 1)^T$; $\hat{b}_1 = \hat{a}_4 = \hat{a}_6 = (2, 2)^T$; $\hat{a}_3 = \hat{b}_4 = \hat{a}_5 = (4, 4)^T$; $\hat{b}_3 = (4.5, 4.5)^T$; $\hat{a}_2 = \hat{b}_5 = (6, 6)^T$; $\hat{b}_2 = \hat{b}_6 = (7, 7)^T$, plotted against the frontiers for $\beta_1 = -1/2$, $\beta_2 = 0$, $\beta_3 = 1/2$, $\beta_4 = 1$, $\beta_5 = 3/2$

Since interval DMU-$\hat{x}_4$ crosses over the conflicting regions of "rejection" and "risky acceptance", a classification of $\hat{x}_4$ may inevitably result in a "Type I" or "Type II" error. Hence, we do not provide the user with a definite classification, but rather provide both the "rejection" and "risky acceptance" degrees, i.e., $\hat{z}_1(\hat{x}_4)$ and $\hat{z}_2(\hat{x}_4)$, for his/her reference.

(v) The condition of $\hat{x}_5$ satisfies Scenario 5. Its corresponding $\hat{z}_1(\hat{x}_5)$, $\hat{z}_2(\hat{x}_5)$ and $\hat{z}_3(\hat{x}_5)$ are as follows:
$$\hat{z}_1(\hat{x}_5) = 0, \quad \hat{z}_2(\hat{x}_5) = \beta^*(\hat{a}_5) + 1 = \beta_3 + 1 = \frac{3}{2}, \quad \hat{z}_3(\hat{x}_5) = 1 + \beta^*(\hat{b}_5) = 1 + \beta_5 = \frac{5}{2}.$$
The range associated with interval DMU-$\hat{x}_5$ crosses over two regions, which may cause difficulty in classifying it. However, since the two regions are the "risky acceptance" and "acceptance" regions, we can be confident in classifying interval DMU-$\hat{x}_5$ as "acceptance with low risk".

(vi) The condition of $\hat{x}_6$ satisfies Scenario 6. Its corresponding $\hat{z}_1(\hat{x}_6)$, $\hat{z}_2(\hat{x}_6)$ and $\hat{z}_3(\hat{x}_6)$ are as follows:
$$\hat{z}_1(\hat{x}_6) = |\beta^*(\hat{a}_6)| = |\beta_1| = \frac{1}{2}, \quad \hat{z}_2(\hat{x}_6) = 1, \quad \hat{z}_3(\hat{x}_6) = 1 + \beta^*(\hat{b}_6) = 1 + \beta_5 = \frac{5}{2}.$$
In actual fact, an interval data with a range crossing over three regions, as in Scenario 6, is unusual if not impossible. Here we consider such an interval data, e.g., interval DMU-$\hat{x}_6$, to be unclassifiable due to the inherently uninformative values of $\hat{z}_1(\hat{x}_6)$, $\hat{z}_2(\hat{x}_6)$ and $\hat{z}_3(\hat{x}_6)$, and suggest that the user re-check the data.

5 Extensions

Recall that in Sect. 2, we assumed that each training interval data, denoted as interval DMU-$\bar{x}_j$ (j = 1, ..., n), is solely associated with m inputs, and that the bigger the values of the data, the higher the probability that the data are accepted. Based on that setting, we constructed the acceptance domain $T_\beta$ in both sum form and intersection form. In this section, we consider two new settings: (1) each interval DMU-$\bar{y}_j$ (j = 1, ..., n) is solely associated with s outputs, and (2) each interval DMU-$(\bar{x}_j, \bar{y}_j)$ (j = 1, ..., n) is associated with both m inputs and s outputs. Accordingly, we construct the acceptance domain $T_\beta$ in both sum form and intersection form wrt each new setting.

First, consider the setting in which each interval DMU-$\bar{y}_j$ (j = 1, ..., n) is associated with s outputs, and the smaller the values of the data, the higher the probability that the data will be accepted. Denote the interval DMUs as

$$\bar{y}_j = (\bar{y}_{1j}, \bar{y}_{2j}, \ldots, \bar{y}_{sj}), \quad j = 1, \ldots, n, \quad \text{where } \bar{y}_{rj} \in [c_{rj}, d_{rj}],\ r = 1, \ldots, s;\ j = 1, \ldots, n, \text{ and } d_{rj} > c_{rj} > 0.$$
In addition, denote the training data set as $\bar{T} = \{\bar{y}_j \mid j = 1, \ldots, n\}$, and define
$$y_j^\beta = (y_{1j}^\beta, y_{2j}^\beta, \ldots, y_{sj}^\beta), \quad j = 1, \ldots, n, \quad \text{where } y_{rj}^\beta = d_{rj} - \beta(d_{rj} - c_{rj}) > 0,\ r = 1, \ldots, s;\ j = 1, \ldots, n,$$
and $\beta \in (-\infty, R)$ with
$$R = \min_{1 \le r \le s,\, 1 \le j \le n} \frac{d_{rj}}{d_{rj} - c_{rj}} > 1.$$
Let $T_\beta$ represent the acceptance domain constructed by $y_j^\beta = (y_{1j}^\beta, y_{2j}^\beta, \ldots, y_{sj}^\beta)^T$, j = 1, ..., n, given a specified $\beta \in (-\infty, R)$. It follows that $T_\beta$ satisfies the following postulates:

Postulate 1 (Ordinary postulate) The observed $y_j^\beta \in T_\beta$ for all j = 1, ..., n.

Postulate 2 (Convexity postulate) If $y \in T_\beta$ and $\hat{y} \in T_\beta$, then $\lambda y + (1 - \lambda)\hat{y} \in T_\beta$ for $\lambda \in [0, 1]$.

Postulate 3 (Monotonicity postulate) If $y \in T_\beta$ and $\hat{y} \le y$, then $\hat{y} \in T_\beta$.

Postulate 4 (Minimum extrapolation postulate) $T_\beta$ is the intersection set of all $\tilde{T}$ satisfying Postulates 1–3.

The sum-form acceptance domain that satisfies Postulates 1–4 defined above can be represented as follows:
$$T_\beta = \left\{ y \;\middle|\; \sum_{j=1}^n y_j^\beta \lambda_j \ge y,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}.$$
Note that the sum-form acceptance domain $T_\beta$ has the same structure as the production possibility set corresponding to the classical CCR model with reference set $\{(1, y_j^\beta) \mid j = 1, \ldots, n\}$ in DEA research. In addition, the intersection-form acceptance domain $T_\beta$ is as follows:
$$T_\beta = \left\{ y \;\middle|\; \mu_\beta^k - (x_\beta^k)^T y \ge 0,\ k = 1, \ldots, l_\beta \right\},$$

where $x_\beta^k \ge 0$, $\mu_\beta^k \ge 0$, $\mu_\beta^k \ne 0$, k = 1, ..., $l_\beta$. Denote an exact classification data $\hat{y}$ as DMU-$\hat{y}$, and consider the following linear program wrt DMU-$\hat{y}$ with a specified $\beta \in (-\infty, R)$:
$$(\bar{P}_\beta) \qquad \hat\varphi(\beta) = \max \varphi \quad \text{s.t.} \quad \sum_{j=1}^n y_j^\beta \lambda_j \ge \varphi \hat{y}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$

Second, consider the setting in which each interval DMU-$(\bar{x}_j, \bar{y}_j)$ (j = 1, ..., n) is associated with both m inputs and s outputs. Denote the interval DMUs as

Second, consider the setting in which each interval DMU-ðxj; yjÞ (j = 1, …, n) is associated with both m inputs and s outputs. Denote the interval-DMUs as

 xj¼ x1j; x2j; . . .; xmj   ; j¼ 1; . . .; n;  yj¼ y1j; y2j; . . .; ysj   ; j¼ 1; . . .; n; where  xij2 aij; bij   ; i¼ 1; . . .; m; j¼ 1; . . .; n;  yrj2 crj; drj   ; r¼ 1; . . .; s; j¼ 1; . . .; n:

Note that the smaller the values of the outputs and the larger the values of the inputs, the higher the probability that the data will be accepted. In addition, define

 xbj ¼ xb1j; x2jb; . . .; xbmj ; j¼ 1; . . .; n;  ybj ¼ yb1j; y2jb; . . .; ybsj ; j¼ 1; . . .; n; where  xbij¼ aijþ b bij aij   [ 0; i¼ 1; . . .; m; j¼ 1; . . .; n;  ybrj¼ drj b drj crj   [ 0; r¼ 1; . . .; s; j¼ 1; . . .; n: and b [ (L, R) with L¼ max 1 i  m;1  j  n aij bij aij \0; R¼ min 1 r  s;1  j  n drj drj crj [ 1: Let Tbrepresent the acceptance domain constructed byðx b j; y

b

jÞ; j = 1, …, n given a specified b [ (L, R). It follows that that Tbsatisfies the following postulates:

Postulate 1 (Ordinary postulate) the observedðxbj; ybjÞ 2 Tb for all j = 1,…, n. Postulate 2 (Convexity postulate) if (x, y) [ Tb, and ð^x; ^yÞ 2 Tb; then kðx; yÞ þ ð1 

kÞð^x; ^yÞ 2 Tb; for k [ [0, 1].

Postulate 3 (Monotonicity postulate) if (x, y) [ Tb, ^x x and ^y y; then ð^x; ^yÞ 2 Tb: Postulate 4 (Ray unbounded postulate) if (x, y) [ Tb, then að^x; ^yÞ 2 Tb for all a C 0.

(23)

Postulate 5 (Minimum extrapolation postulate) Tb is the intersection set of all ~T satisfying Postulates 1–4.

The sum-form acceptance domain that satisfies Postulates 1–5 defined above can be represented as follows: Tb¼ ðx; yÞ Xn j¼1 xbjkj x; Xn j¼1 ybjkj y; kj 0; j ¼ 1; . . .; n ( ) :

Note that sum-form acceptance domain Tbhas the same structure as the production pos-sibility set corresponding to the classical CCR model with reference set fðxbj; ybjÞjj ¼ 1; . . .; ng in DEA research. In addition, intersection-form acceptance domain Tb is as follows: Tb¼ ðx; yÞ xkb  T x lk b  T y 0; k ¼ 1; . . .; lb  ; where x k lk  0; xk lk

6¼ 0: Denote an exact classification data ð^x; ^yÞ as DMU-ð^x; ^yÞ; and consider the following linear program wrt DMU-ð^x; ^yÞ with a specified b [ (L, R).

^ hðbÞ ¼ min h; Pb   s.t. P n j¼1 xbjkj h^x; Pn j¼1 ybjkj ^y; kj 0; j¼ 1; . . .; n:

It is easy to check that once the intersection-form acceptance domains Tbcorresponding to the two new settings are constructed, the quantile–DEA classifiers introduced in Sects.3

and4can be directly used to discover the groups to which a huge amount of classification data belong. That is, the proposed quantile–DEA classifiers can deal with the settings in which each interval DMU-xj; (j = 1,…, n) is associated with solely inputs, solely outputs, and both inputs and outputs.
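As a sketch of how the input–output variant of $(P_\beta)$ can be solved in practice, the following Python function (our illustration using `scipy.optimize.linprog`; the name `theta_hat_io` and the toy data are ours, i.e., hypothetical) implements the program above. Note that, unlike the input-only program of Sect. 3, there is no $\sum_j \lambda_j \ge 1$ constraint, reflecting the ray-unboundedness postulate.

```python
import numpy as np
from scipy.optimize import linprog

def theta_hat_io(beta, a, b, c, d, x_hat, y_hat):
    """Input-output (P_beta): min theta s.t. sum_j x_j^beta l_j <= theta*x_hat,
    sum_j y_j^beta l_j >= y_hat, l >= 0, with x_j^beta = a_j + beta (b_j - a_j)
    and y_j^beta = d_j - beta (d_j - c_j). Variables are (theta, l_1, ..., l_n)."""
    X = a + beta * (b - a)                 # m x n input quantile points
    Y = d - beta * (d - c)                 # s x n output quantile points
    m, n = X.shape
    s = Y.shape[0]
    obj = np.zeros(n + 1); obj[0] = 1.0    # minimize theta
    A_ub = np.vstack([
        np.hstack([-np.asarray(x_hat, float).reshape(m, 1), X]),  # Xl <= theta*x_hat
        np.hstack([np.zeros((s, 1)), -Y]),                        # Yl >= y_hat
    ])
    b_ub = np.concatenate([np.zeros(m), -np.asarray(y_hat, float)])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.fun

# Toy data (hypothetical): one input and one output for two training DMUs.
a = np.array([[1.0, 2.0]]); b = np.array([[3.0, 4.0]])   # input intervals
c = np.array([[2.0, 3.0]]); d = np.array([[4.0, 6.0]])   # output intervals
print(theta_hat_io(0.5, a, b, c, d, x_hat=[2.0], y_hat=[3.0]))  # 1.0
```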

6 Conclusions

This research proposes, to our knowledge, the first DEA-based classifiers, the quantile–DEA classifiers, for dealing with binary classification problems whose classification data are known to have either exact values or values only within bounded intervals. The technique of multiple acceptance domains, which is derived from the ideas and methods of both quantiles in statistics and the intersection-form production possibility set in the DEA framework, enables the quantile–DEA classifiers not only to promptly classify a large volume of data, but also to provide the degrees associated with the patterns. It is noteworthy that the proposed classifier simply classifies an exact piece of data into "acceptance" or "rejection" with a corresponding degree. However, due to the inherent complexity of interval data, the proposed classifier outputs three types of degrees associated with such classification data: the "rejection", "risky acceptance" and "acceptance" degrees.


In addition, it is worth mentioning that the proposed quantile–DEA classifiers assume that the m characteristic values are equally important. However, in practice, decision makers may value them differently. Hence, there are DEA models in the literature that apply the preference cone to reflect the different importance of the characteristic values (see, e.g., Charnes et al. 1989; Yu et al. 1996; Wei and Yu 1997). In actual fact, by applying the ideas in those articles, it is not difficult to construct preference-cone restricted quantile–DEA classifiers that can deal with classification problems with differently weighted characteristic values.

In short, this study sheds some light on extending the function of traditional DEA models from evaluation to classification. The proposed quantile–DEA classifiers are both efficient and quite user-friendly in terms of detailed output information. Therefore, they have great potential in practical applications and can thus serve as effective complementary approaches for data mining.

Acknowledgments This paper has benefited from the suggestions offered by the reviewers, and this assistance is gratefully acknowledged. In addition, the first and the third authors are partially supported by the National Natural Science Foundation of China, NNSF 71271208.

Appendix 1: Proof of Theorem 1(i)

Theorem 1 Let $L < \bar\beta < \hat\beta$, and
$$T_{\bar\beta} = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^{\bar\beta} \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}, \quad T_{\hat\beta} = \left\{ x \;\middle|\; \sum_{j=1}^n x_j^{\hat\beta} \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\}.$$
Then, $T_{\hat\beta} \subset T_{\bar\beta}$.

Proof Since $b_j > a_j$, j = 1, ..., n, if $L < \bar\beta < \hat\beta$, then
$$x_j^{\bar\beta} = a_j + \bar\beta (b_j - a_j) < a_j + \hat\beta (b_j - a_j) = x_j^{\hat\beta}, \quad j = 1, \ldots, n.$$
It follows that if $\sum_{j=1}^n \lambda_j \ge 1$, $\lambda_j \ge 0$, j = 1, ..., n, then
$$\sum_{j=1}^n x_j^{\bar\beta} \lambda_j < \sum_{j=1}^n x_j^{\hat\beta} \lambda_j.$$
Thus, if $x \in T_{\hat\beta}$, then $x \in T_{\bar\beta}$; that is, $T_{\hat\beta} \subset T_{\bar\beta}$. □

Appendix 2: Proof of Theorem 2

Lemma 1 If $L < \bar\beta < \hat\beta$ and $\tilde{x} \in T_{\hat\beta} = \{x \mid \sum_{j=1}^n x_j^{\hat\beta} \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n\}$, then the optimal objective function value of the following linear program is less than one; i.e., $\hat\theta(\bar\beta) < 1$:
$$(P_{\bar\beta}) \qquad \hat\theta(\bar\beta) = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n x_j^{\bar\beta} \lambda_j \le \theta \tilde{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$

Proof Let $T_{\bar\beta} = \{x \mid \sum_{j=1}^n x_j^{\bar\beta} \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n\}$. Since $\tilde{x} \in T_{\hat\beta}$, there exist $\tilde\lambda_1, \tilde\lambda_2, \ldots, \tilde\lambda_n$ that satisfy
$$\sum_{j=1}^n x_j^{\hat\beta} \tilde\lambda_j \le \tilde{x}, \quad \sum_{j=1}^n \tilde\lambda_j \ge 1, \quad \tilde\lambda_j \ge 0,\ j = 1, \ldots, n.$$
Furthermore, since $\sum_{j=1}^n \tilde\lambda_j \ge 1$, $(\tilde\lambda_1, \tilde\lambda_2, \ldots, \tilde\lambda_n) \ne 0$. Moreover, since $a_j < b_j$, $0 < x_j^{\bar\beta} < x_j^{\hat\beta}$, j = 1, ..., n. In summary,
$$\sum_{j=1}^n x_j^{\bar\beta} \tilde\lambda_j < \sum_{j=1}^n x_j^{\hat\beta} \tilde\lambda_j \le \tilde{x}.$$
It follows that there exist solutions to the following system of inequalities:
$$\sum_{j=1}^n x_j^{\bar\beta} \lambda_j < \tilde{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$
As a result, $\hat\theta(\bar\beta) < 1$ (i.e., the optimal objective function value of $(P_{\bar\beta})$ is less than one). □

Lemma 2 If $L < \beta$, $\hat{x} > 0$ and $\hat{x} \notin T_\beta$, then the optimal objective function value of the following linear program is greater than one; i.e., $\hat\theta(\beta) > 1$:
$$(P_\beta) \qquad \hat\theta(\beta) = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n x_j^\beta \lambda_j \le \theta \hat{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$

Proof Let $\hat\lambda_1, \hat\lambda_2, \ldots, \hat\lambda_n$ denote the optimal solution to $(P_\beta)$, with $\hat\theta(\beta) = \hat\theta$. If $\hat\theta \le 1$, then
$$\sum_{j=1}^n x_j^\beta \hat\lambda_j \le \hat\theta \hat{x} \le \hat{x}, \quad \sum_{j=1}^n \hat\lambda_j \ge 1, \quad \hat\lambda_j \ge 0,\ j = 1, \ldots, n.$$
That is, $\hat{x} \in T_\beta$, which is a contradiction. □

In what follows, we give the proof of Theorem 2, first of (i) and then of (ii).

Theorem 2 Let $\hat{x} \in \hat{T} \cap \text{Int}\{x \mid \sum_{j=1}^n x_j^L \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n\}$, and $\hat\theta(\beta)$ be the quantile function of DMU-$\hat{x}$. Then,

(i) $\hat\theta(\beta)$ is a continuous function defined over $(L, +\infty)$.

(ii) $\hat\theta(\beta)$ is a strictly monotonically increasing function over $(L, +\infty)$.

Proof

(i) Consider the linear program $(P_\beta)$:
$$\hat\theta(\beta) = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n x_j^\beta \lambda_j \le \theta \hat{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n,$$
or, equivalently,
$$\hat\theta(\beta) = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n \left( a_j + \beta (b_j - a_j) \right) \lambda_j \le \theta \hat{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$
According to the stability of linear programming (Ying et al. 1975), the optimal objective function value of $(P_\beta)$, $\hat\theta(\beta)$, is a continuous function defined over $(L, +\infty)$.

(ii) Let $L < \bar\beta < \hat\beta$, and consider the problem $(P_{\hat\beta})$:
$$\hat\theta(\hat\beta) = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n x_j^{\hat\beta} \lambda_j \le \theta \hat{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$
It is clear that $\hat\theta(\hat\beta)\hat{x} \in T_{\hat\beta}$. Consider also the problem $(\tilde{P}_{\bar\beta})$:
$$\tilde\theta = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n x_j^{\bar\beta} \lambda_j \le \theta \left( \hat\theta(\hat\beta)\hat{x} \right), \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$
Let $\tilde\theta, \tilde\lambda_1, \tilde\lambda_2, \ldots, \tilde\lambda_n$ denote the optimal solution of $(\tilde{P}_{\bar\beta})$. It is easy to check that $\tilde\theta > 0$. Furthermore, since $\hat\theta(\hat\beta)\hat{x} \in T_{\hat\beta}$ and $L < \bar\beta < \hat\beta$, from Lemma 1, $\tilde\theta < 1$. Moreover, since $\hat\theta(\bar\beta)$ is the optimal objective function value of $(P_{\bar\beta})$ wrt DMU-$\hat{x}$ and $(\tilde\theta\,\hat\theta(\hat\beta), \tilde\lambda_1, \ldots, \tilde\lambda_n)$ is feasible for $(P_{\bar\beta})$, we have $\hat\theta(\bar\beta) \le \tilde\theta\, \hat\theta(\hat\beta) < \hat\theta(\hat\beta)$. □

Appendix 3: Existence of β*

The following Theorem 3 shows the existence of β*.

Theorem 3 Let $\hat{x} \in \hat{T} \cap \text{Int}\{x \mid \sum_{j=1}^n x_j^L \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n\}$, and $\hat\theta(\beta)$ be the quantile function of DMU-$\hat{x}$. Then, there exists $\beta^* \in (L, +\infty)$ such that the optimal objective function value of problem $(P_{\beta^*})$ is equal to one; i.e., $\hat\theta(\beta^*) = 1$, where
$$(P_\beta) \qquad \hat\theta(\beta) = \min \theta \quad \text{s.t.} \quad \sum_{j=1}^n x_j^\beta \lambda_j \le \theta \hat{x}, \quad \sum_{j=1}^n \lambda_j \ge 1, \quad \lambda_j \ge 0,\ j = 1, \ldots, n.$$

Proof

(i) If $\hat{x}$ is located on the frontier of $T_1$, then $\hat\theta(1) = 1$; i.e., β* = 1.

(ii) If $\hat{x}$ is not located on the frontier of $T_1$ and $\hat{x} \in \text{Int}\, T_1$, then there exist $\lambda_j^0 \ge 0$, j = 1, ..., n, $\sum_{j=1}^n \lambda_j^0 \ge 1$ such that
$$\sum_{j=1}^n \left( a_j + 1 \cdot (b_j - a_j) \right) \lambda_j^0 = \sum_{j=1}^n b_j \lambda_j^0 < \hat{x}, \tag{1}$$
and $\hat\theta(1) < 1$. Let
$$\hat\beta > \max\left\{ \max_{1 \le i \le m,\, 1 \le j \le n} \frac{\hat{x}_i - a_{ij}}{b_{ij} - a_{ij}},\ L \right\}.$$
Then, $a_j + \hat\beta (b_j - a_j) > \hat{x}$, j = 1, ..., n. Therefore, for any $\lambda_j \ge 0$, j = 1, ..., n, with $\sum_{j=1}^n \lambda_j \ge 1$, we have
$$\sum_{j=1}^n x_j^{\hat\beta} \lambda_j > \hat{x}. \tag{2}$$
From (2), $\hat{x} \notin T_{\hat\beta}$, and from Lemma 2, $\hat\theta(\hat\beta) > 1$. As a result, since $\hat\theta(1) < 1$, $\hat\theta(\hat\beta) > 1$, $\hat\beta \in (L, +\infty)$, and, from Theorem 2(i), $\hat\theta(\beta)$ is a continuous function defined over $(L, +\infty)$, it follows that there exists $\beta^* \in (L, +\infty)$ such that $\hat\theta(\beta^*) = 1$.

(iii) If $\hat{x} \notin T_1$, from Lemma 2, $\hat\theta(1) > 1$. In addition, since
$$\hat{x} \in \text{Int}\left\{ x \;\middle|\; \sum_{j=1}^n x_j^L \lambda_j \le x,\ \sum_{j=1}^n \lambda_j \ge 1,\ \lambda_j \ge 0,\ j = 1, \ldots, n \right\},$$
there exist $\lambda_j^0 \ge 0$, j = 1, ..., n, $\sum_{j=1}^n \lambda_j^0 \ge 1$ such that
$$\sum_{j=1}^n \left( a_j + L (b_j - a_j) \right) \lambda_j^0 = \sum_{j=1}^n x_j^L \lambda_j^0 < \hat{x}.$$
Therefore, there exists $\hat\beta > L$ such that
$$\sum_{j=1}^n x_j^{\hat\beta} \lambda_j^0 < \hat{x}.$$
That is, $\hat{x} \in \text{Int}\, T_{\hat\beta}$, and thus $\hat\theta(\hat\beta) < 1$. Consequently, since $\hat\theta(1) > 1$, $\hat\theta(\hat\beta) < 1$, $\hat\beta \in (L, +\infty)$, and, from Theorem 2(i), $\hat\theta(\beta)$ is a continuous function defined over $(L, +\infty)$, it follows that there exists $\beta^* \in (L, +\infty)$ such that $\hat\theta(\beta^*) = 1$. □

Appendix 4: Uniqueness of β*

The following Theorem 4 shows the uniqueness of β*.

Theorem 4 Let $b_j > a_j$, j = 1, ..., n, $L < \bar\beta < \hat\beta$, and $\hat{x} \in \hat{T}$. Then,

(i) there is no intersection between the frontiers of $T_{\hat\beta}$ and $T_{\bar\beta}$;

(ii) the quantile of DMU-$\hat{x}$, i.e., β*, is uniquely determined.

Proof The proof of (i) is by contradiction. If there exists $x^0 \in \Re_+^m$ located on the frontiers of both $T_{\hat\beta}$ and $T_{\bar\beta}$, then, from Theorem 2, $1 = \hat\theta(\bar\beta) < \hat\theta(\hat\beta) = 1$, which is a contradiction. That is, there is no intersection between the frontiers of $T_{\hat\beta}$ and $T_{\bar\beta}$.

The proof of (ii) is also by contradiction. Assume that there exist two quantiles of DMU-$\hat{x}$, i.e., $\beta_1^*$ and $\beta_2^*$. Without loss of generality, assume that $L < \beta_1^* < \beta_2^*$. Since both $\beta_1^*$ and $\beta_2^*$ are quantiles of DMU-$\hat{x}$, $\hat\theta(\beta_1^*) = \hat\theta(\beta_2^*) = 1$. However, from Theorem 2, $\hat\theta(\beta_1^*) < \hat\theta(\beta_2^*)$, which is a contradiction. It follows that β* is uniquely determined. □

References

Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. European Journal of Operational Research, 2, 429–444.

Charnes, A., Cooper, W. W., Wei, Q. L., & Huang, Z. M. (1989). Cone ratio data envelopment analysis and multiobjective programming. International Journal of Systems Science, 20(7), 1099–1118.

Cooper, W. W., Park, K. S., & Yu, G. (1999). IDEA and AR-REA: Models for dealing with imprecise data in DEA. Management Science, 45, 597–607.

Cooper, W. W., Seiford, L. M., & Tone, K. (2006). Introduction to data envelopment analysis and its uses: With DEA-solver software and references. New York: Springer.

Corne, D., Dhaenens, C., & Jourdan, L. (2012). Synergies between operations research and data mining: The emerging use of multi-objective approaches. European Journal of Operational Research, 221, 469–479.

Despotis, D. K., & Smirlis, Y. G. (2002). Data envelopment analysis with imprecise data. European Journal of Operational Research, 140, 24–36.

Han, J., & Kamber, M. (2007). Data mining: Concepts and techniques. San Francisco: Morgan Kaufmann Publishers.

Kao, C. (2006). Interval efficiency measures in data envelopment analysis with imprecise data. European Journal of Operational Research, 174, 1087–1099.

Pendharkar, P. C. (2002). A potential use of DEA for inverse classification problem. Omega: An Interna-tional Journal of Management Science, 30, 243–248.

Pendharkar, P. C. (2011). A hybrid radial basis function and data envelopment analysis neural network for classification. Computers and Operations Research, 38, 256–266.

Pendharkar, P. C. (2012). Fuzzy classification using the data envelopment analysis. Knowledge Based Systems, 31, 183–192.

Pendharkar, P. C., Khosrowpour, M., & Rodger, J. A. (2000). Application of Bayesian network classifiers and data envelopment analysis for mining breast cancer patterns. The Journal of Computer Information Systems, 40(4), 127–132.

Pendharkar, P. C., & Troutt, M. D. (2011). DEA based dimensionality reduction for classification problems satisfying strict non-satiety assumption. European Journal of Operational Research, 212, 155–163.

Seifert, J. W. (2004). Data mining: An overview. CRS Report for Congress, The Library of Congress, Order Code RL31798. http://www.fas.org/irp/crs/RL31798.pdf.

Seiford, L. M., & Zhu, J. (1998). An acceptance system decision rule with data envelopment analysis. Computers and Operations Research, 25(4), 329–332.

Sinha, A. P., & Zhao, H. (2008). Incorporating domain knowledge into data mining classifiers: An appli-cation in indirect lending. Decision Support Systems, 46, 287–299.

Troutt, M. D., Rai, A., & Zhang, A. (1996). The potential use of DEA for credit applicant acceptance systems. Computers and Operations Research, 23(4), 405–408.

Wei, Q. L., & Yan, H. (2001). A method of transferring polyhedron between the intersection-form and the sum-form. Computers and Mathematics with Applications, 41, 1327–1342.

Wei, Q. L., & Yu, G. (1997). Analyzing the properties of K-cone in generalized data envelopment analysis model. Journal of Econometrics, 80, 63–84.

Yan, H., & Wei, Q. L. (2000). A method of transferring cones of intersection-form to cones of sum-form and its applications in DEA models. International Journal of Systems Science, 31(5), 629–638.

Yan, H., & Wei, Q. L. (2011). Data envelopment analysis classification machine. Information Sciences, 181, 5029–5041.

Ying, M. Q., Xu, R. E., & Wei, Q. L. (1975). Stability of mathematical programming. Acta Mathematica Sinica, 18(2), 123–175.

Yu, G., Wei, Q. L., & Brockett, P. (1996). A generalized data envelopment analysis model: A unification and extension of existing methods for efficiency analysis of decision making units. Annals of Oper-ations Research, 66, 47–89.

Zhu, J. (2003). Imprecise data envelopment analysis: A review and improvement with an application. European Journal of Operational Research, 144, 513–529.
