Optimal Designs for Treatment Comparisons under Generalized Linear Models - 政大學術集成


National Chengchi University (國立政治大學), Department of Statistics
Doctoral Dissertation

Optimal Designs for Treatment Comparisons under Generalized Linear Models
(廣義線性模式下處理比較之最適設計)

Advisor: Dr. 丁兆平
Graduate student: 何漢葳
June 2013

Abstract

The problem of finding D- and A-optimal designs for the zero- and one-way elimination of heterogeneity under generalized linear models (GLMs) is considered. Since GLM designs depend on the values of the parameters to be estimated, our strategy is to employ locally optimal designs.

For the zero-way elimination model, a theorem-based algorithm is proposed to search for D-optimal exact designs. A formula is derived for constructing the D-optimal approximate design when the values of the unknown parameters are split into two groups, of respective sizes $m$ and $v - m$. Analytic solutions for the exact counterpart, however, are restricted to the cases $m = 1$ and $m = v - 1$. An example is given to explain the problem involved. On the other hand, the upper and lower bounds of the optimal number of replicates per treatment are proved to depend on $m$ rather than on the unknown parameters. These bounds imply that designs with as equal a number of replications for each treatment as possible are efficient in terms of D-optimality. In addition, a D-optimal approximate design is obtained when the values of the unknown parameters are divided into three groups. A closed-form expression for an A-optimal approximate design for comparing arbitrary $v$ treatments is given.

For the one-way elimination model, our focus is on the D-optimal designs for $v = 2$ and $v = 3$ with each block size given. D- and A-optimality for $v = 2$ can be achieved by assigning units to the treatment with smaller variance in proportion to the square root of the ratio of the two variances, which is larger than 1, in each block separately. For $v = 3$, the structure of the determinant is much more complicated even for two blocks, and we can only show that, when the treatment variances are the same within a block, a design having as equal a number of replicates as possible in each block is a D-optimal block design. Numerical evidence leads to the conjecture that a design in which the numbers of replicates are inversely proportional to the treatment variances in each block is better in terms of D-optimality, as long as the ordering of the treatment variances is the same across blocks, which is reasonable for an additive model as we assume.

Keywords: Generalized linear models (GLMs); block designs; D-optimality; A-optimality; approximate and exact designs; robustness.

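Chinese Abstract

This study aims to establish D- and A-optimal designs under generalized linear models, treated in two parts according to the treatment structure: the completely randomized design (CRD) and the randomized block design (RBD). Based on the properties of the determinant derived for the completely randomized design and the associated theoretical results, we first propose an algorithm that quickly and substantially narrows the search region for D-optimal exact designs. For analytic solutions, we start by dividing the variances of the $v$ treatments into two classes and construct the corresponding D-optimal approximate design, from which we find that (1) the upper and lower bounds of the optimal sample allocation for each level depend not on how the variances differ between levels, but on how many treatments share the same variance; and (2) even a treatment with a very large variance must be allocated observations in order to maximize the determinant. This implies that, when $v$ is large, equal allocation should be an efficient design. As for exact designs, we can only obtain the D-optimal design when the variance of one treatment is particularly large or particularly small, and we give an example to explain why a general solution cannot be found. In addition, we obtain the D-optimal approximate design when the variances of the three treatments are all different, and the A-optimal approximate design when the variances of the $v$ treatments are all different.

For the construction of optimal randomized block designs, we focus on the cases $v = 2$ and $v = 3$ and assume that the block sizes are given. When $v = 2$, the determinant corresponding to each block is unaffected by the other blocks, so the D- and A-optimal designs are obtained simply by optimizing the determinant of each block separately, following the results for the completely randomized design. It is worth noting that, if we further assume that the ratio (> 1) of the two treatment variances is the same in every block and that all blocks have the same size, then rounding each treatment's "optimal total under the approximate design" to the nearest integer and dividing it equally among the blocks does not necessarily give an optimal design. When $v = 3$, the determinant is very complicated even with only 2 blocks, and at present we can only prove that, when the treatment variances are equal within each block (they may differ between blocks), dividing the given block size equally is a D-optimal design. When the treatment variances within a block are not all equal, we can only take 2 blocks as an example and, by analogy with the properties of the completely randomized design, conjecture from examples that, when the ordering of the treatment variances is the same in both blocks, the optimal number of observations allocated to each treatment is likewise inversely related to the size of its variance. Since this study assumes that the treatment and block effects are additive, it is reasonable to assume that the ordering of the treatment variances is the same across blocks.

Keywords: generalized linear models; block designs; D-optimality; A-optimality; approximate and exact designs; robustness.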

Contents

1 Introduction
2 Information Matrices of GLMs
3 Completely Randomized Designs
  3.1 D-optimal Designs
  3.2 D-optimal Designs for $c_1 = \cdots = c_m$ and $c_{m+1} = \cdots = c_v$, $v \ge 3$
  3.3 D-optimal Designs for $v = 2$
  3.4 D-optimal Designs for $c_1 = \cdots = c_m$, $c_{m+1} = \cdots = c_{m+k}$, $c_{m+k+1} = \cdots = c_v$
  3.5 Robustness of D-optimal Designs
  3.6 A-optimal Designs
4 Randomized Block Designs
  4.1 D-optimal Designs for $v = 2$
  4.2 D-optimal Designs for $v = 3$
  4.3 A-optimal Designs
5 Conclusion and Future Research

List of Tables

1 Optimal $n_{ij}$ for $k_2 = 13$ and $k_2 = 31$

1 Introduction

Experiments are commonly conducted to compare the results on a response variable from different settings of treatment factors, the major sources of variation that are of particular interest, while controlling unwanted variation. The design, by which the experiment is carried out, is a rule determining which treatments are assigned to which experimental units.

Due to unavoidable random errors in experiments, a statistical model describing the relationship between the outcome and the treatment and block structures is essential and has to be specified in advance for design purposes. This model in turn determines the techniques used in the subsequent analysis of the experiment. The higher the degree of parameter precision attained utilizing data produced by such model-oriented designs, the better an experiment is. The quality of a design is thus measured by its efficiency, indicating how accurate the estimators of treatment effects are in terms of the variances of the estimators.

The theory of optimal design provides a systematic way of finding an optimal design by choosing designs that minimize some statistical criteria-based functionals of the variance-covariance matrix of the estimators. These real-valued functions are summaries capturing different features of simultaneous inferences on correlated estimators, and have statistical interpretations in terms of the information matrix, whose inverse is proportional to the covariance matrix. The optimality of a design can therefore be viewed, from another angle, as maximizing the information carried about the unknown parameters.

The three most familiar design criteria are D-, A-, and E-optimality. A design giving the minimum determinant of the covariance matrix, or equivalently, the maximum determinant of the information matrix, is called a D-optimal design. This criterion serves to minimize the generalized variance of the parameter estimates over all possible designs. A design is said to be A-optimal if it minimizes the trace of the covariance matrix of the estimators among all competing designs. The A-optimality criterion minimizes the total, or the average, variance of the parameter estimates. In E-optimality, the largest eigenvalue of the covariance matrix is minimized, and this is equivalent to minimizing the maximal possible variance of any normalized linear combination of the estimated parameters.

The information about the uncertainty of a set of parameters evaluated by the above criteria can also be expressed by characteristics of a confidence ellipsoid in linear models. In D- and A-optimality, the volume and the size of the ellipsoid are minimized respectively, whereas the E-optimal design minimizes the major axis of the ellipsoid. Since D- and A-optimality are the two most statistically attractive criteria for the estimation of parameters, we shall consider these two criteria for the determination of optimal designs in this paper.

For a given model, an optimal design is fully characterized by a pair of arguments: a finite number of support points to best estimate the parameters, and the associated numbers of observations to be taken, which sum to a fixed sample size. Conventionally, designs are constructed under the structure of linear models in which the errors are assumed to be uncorrelated random variables with mean 0 and common variance $\sigma^2$. In this setting, the information matrix is proportional to the identity matrix, and a design with such an information matrix is D-, A-, and E-optimal.
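The three criteria described above can be made concrete with a small sketch. It assumes only two estimated parameters, so that the largest eigenvalue of the covariance matrix has a closed form; the function name `criteria_2x2` and the two covariance matrices are hypothetical and are not taken from the thesis.

```python
import math

def criteria_2x2(var1, cov12, var2):
    """D-, A- and E-criterion values for the 2 x 2 covariance matrix
    [[var1, cov12], [cov12, var2]]; smaller is better for all three."""
    det = var1 * var2 - cov12 ** 2          # D: generalized variance
    trace = var1 + var2                     # A: total variance
    half_gap = (var1 - var2) / 2.0
    # E: largest eigenvalue of a symmetric 2 x 2 matrix in closed form
    largest_eig = trace / 2.0 + math.hypot(half_gap, cov12)
    return {"D": det, "A": trace, "E": largest_eig}

# Two hypothetical competing designs with made-up covariance matrices.
design_a = criteria_2x2(2.0, 0.0, 0.9)
design_b = criteria_2x2(1.4, 0.0, 1.4)
```

With these made-up numbers the D-criterion prefers the first design while the A- and E-criteria prefer the second, which illustrates that the criteria need not rank designs identically.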

Over the past few decades, however, along with data analysis problems faced in practice involving qualitative and quantitative discrete data, it has been learned that there is more to the world than just standard homoscedastic linear regression. This has encouraged not only the development of models specifically designed for analyzing qualitative and quantitative discrete data, but also the study of how to produce such data through experimental designs. In the context of optimal designs for linear models, the addition of design variables does not significantly change the complexity of the design problem or the nature of its solution. For categorical or count data, however, the optimal design problem is fundamentally different and much more involved.

There is a crucial distinction between optimal designs for classical linear models and for linear predictors with categorical or count outcomes, which are special cases of the famous generalized linear models (GLMs). GLMs provide flexible extensions of linear models by relating the design matrix to the mean of the response variable, which independently follows an exponential-family distribution, via a so-called link function.

In linear models, the least squares estimator has minimum variance among mean-unbiased linear estimators under the conditions of the Gauss-Markov theorem. For GLMs, by contrast, maximum likelihood (ML) estimation remains popular. There is generally no analytical solution for the ML estimates, but asymptotic inference for the unknown parameters follows from the standard weighted least squares formulation. Designs derived for GLMs are thus based on the large-sample covariance matrix. However, this covariance matrix contains, in its weights, the unknown parameters that we are trying to estimate, and so does the resulting design. This phenomenon is referred to as the parameter dependence problem in optimal designs for GLMs.

A review of design issues and the existing literature on GLMs by Khuri, Mukherjee, Sinha and Ghosh (2006) summarized common approaches to dealing with the dependence on unknown parameters in finding optimal designs. These include locally optimal designs, which involve specifying initial values or best guesses of the unknown parameters; the values of the unknown parameters of the fitted GLM can then be updated through a sequence of locally optimal experiments. The Bayesian approach, in which a prior distribution of the unknown parameters is assumed, is also discussed.

In contrast to the very general consequence for linear models, designs for GLMs are often distribution-specific, and different criteria may not always yield the same optimal design. Besides, the growing interest in optimal designs for GLMs has concentrated strongly on multiple linear regression models, in contrast to the popularity of designs for analysis of variance (ANOVA) in linear models. Work on optimal designs for GLMs revolves around logistic and probit models for binary data, and Poisson regression for count data.

Recent research has identified the structure of GLM optimal designs for some specific criteria with single or multiple independent variables included as first-order terms in the predictor. For a typical binary response model with a single design variable, D-optimal designs do exist for an unbounded design space. With two design variables and a bounded design space, a D-optimal design has at most four or six symmetrically arranged support points, from which a further reduction in the number of support points may sometimes be accomplished by using asymmetric weighting. It is also well established that a design in which the number of support points equals the number of parameters, with equal weights, is D-optimal.

Looking for D-optimality, which is invariant under a nonsingular transformation, Ford, Torsney and Wu (1992) derived the canonical form and the resulting geometry of the induced design space to identify the support points using Carathéodory's theorem, and gave an explicit formula for the determination of the optimal weights. Sitter and Wu (1993) showed that, for symmetric distributions and under mild conditions, D- and A-optimal designs have two or three support points, depending on the location of the upper boundary point of the 2-dimensional convex hull generated from the diagonal partitions of the information matrices for binary response experiments. Rodríguez-Torreblanca and Rodríguez-Díaz (2007) investigated a D-optimal design under the heteroscedastic Poisson regression model and the negative binomial model. The two equally weighted support points are derived by the general equivalence theorem for both distributions under D-optimality.

Yang (2008) proved the conjecture of Mathew and Sinha (2001) that the A-optimal design has exactly two symmetric support points with asymmetric weights for binary data under logistic regression, while also extending this result to probit, double exponential, and double reciprocal models by means of algebra and numerical computations. Yang and Stufken (2009) proposed a unified algebraic method to identify the support points for binary and count data that can be applied to common models and optimality criteria based on information matrices with constrained and unconstrained design regions. Russell, Woods, Lewis and Eccleston (2009) found the D-optimal design for Poisson regression models. By linearly transforming the covariates, the information matrix becomes independent of the unknown parameters given the upper and lower bounds of the transformed variables. Using the equally weighted canonical form of the design together with the general equivalence theorem proves the D-optimality. Algebraically, Yang, Zhang and Huang (2011) found explicit formulas for the D-, A- and E-optimal designs with multiple design variables on full or subset parameters for binary responses. The linear transformation together with bounded covariates makes the dominating class of optimal designs a two-point structure.

Following Russell, Woods, Lewis and Eccleston (2009), McGree and Eccleston (2012) defined an average model based on the compromise design proposed by Woods, Lewis, Eccleston and Russell (2006) to analytically optimize across the unknown parameters over the compromise design. They showed that the locally D-optimal saturated designs are D-optimal across all Poisson main-effect models and suggested replicating the saturated design instead of running all the design points found by Russell et al. (2009). The specific levels of each covariate for a locally D-optimal design are independent of the inclusion and/or exclusion of other covariates in a main-effect Poisson regression model. The average model approach makes the previously onerous optimization problem much more easily managed.

The aforementioned literature review implies that Fisher information matrices for regression models contain unknown support points to be determined. Besides, designs are mostly described in terms of coded variables in regression models in order to apply general principles of design and also to aid the interpretability of experimental results. It is reasonable in most situations to scale the numerical variables because typically a bounded design region for such variables is considered.

In addition to continuous variables, the outcome of an experiment may also be affected by different groupings, or qualitative factors, for which the support points are predetermined and fixed. Means can be compared between groups in regression models by means of a dummy coding scheme, which formulates dummy variables to sort the factor levels into mutually exclusive categories indicating the absence or presence of a certain treatment effect. However, as the number of levels grows, the number of dummy variables increases considerably, especially when there are factor interactions. This makes the ANOVA model more appropriate for describing the relationship in such settings, and optimal designs for this model consider only the allocation of treatments to experimental units.

Yang, Mandal and Majumdar (2012) investigated the problem of obtaining the locally D-optimal designs for $2^2$ factorial experiments with a main-effects model for binary responses. Their work provides a design strategy: for experimenters with an approximate idea of the variances, the use of assumed values will lead to a highly efficient design, whereas for an experimenter without prior information on the variances, uniform designs remain robust in terms of efficiency loss.

The goal of this paper is twofold. First, we aim to propose a rigorous treatment to determine locally optimal designs for treatment comparisons under generalized linear models. This brings about our second goal, regarding the issue of robustness against parametric uncertainty, in the sense of finding a design whose efficiency is bounded by an acceptance level regardless of the values of the unknown parameters, or from another perspective, of providing a design whose performance is robust to a range of values for each unknown parameter.

The problem of deriving a design reduces to either a continuous or a discrete optimization problem once the criteria and support points have been determined.

The main difference between the two resulting designs is that exact designs require the number of observations assigned to each design point to be specified as integers, while approximate designs allow real numbers instead. The construction of optimal approximate designs thus makes use of tools from convex analysis, which makes the optimization steps much easier to understand and more mathematically tractable. In particular, the real numbers can equivalently be expressed as proportions, the so-called design weights, so the optimal approximate design becomes applicable for arbitrary sample sizes.

There is no general methodology for solving exact design problems, in which number-theoretic questions are involved and standard calculus techniques for optimization do not apply. The structure of optimal exact designs may even change with the sample size. In practice, all designs must be exact to be carried out. Therefore, approximate designs have to be realized as exact designs by means of rounding to integers. Whether, or under what conditions, an optimal exact design and an exact design based on the approximate counterpart coincide remains a question. Given an optimal approximate design, we claim that its neighboring integers are not necessarily an optimal or near-optimal exact design, because it is the value of the design criterion that is of importance, rather than the closeness between an exact design and its approximate counterpart. In this paper, we concentrate on finding the optimal exact designs, sometimes taking advantage of the optimal approximate designs.

The idea of uniform designs emerges when encountering exact designs without any prior information about the unknown parameters. In this paper, a uniform design is defined as one that allocates the observations as equally as possible to each treatment.
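As a minimal sketch of how design weights might be turned into integer replicates, the following uses largest-remainder rounding. This rounding rule, the function name `round_design` and the weights are assumptions made for illustration; the thesis does not prescribe a rounding rule, and, as argued above, the rounded design is only a candidate whose criterion value still has to be checked.

```python
import math

def round_design(weights, n):
    """Turn design weights (summing to 1) into integer replicates summing to n,
    using largest-remainder rounding (an assumed, generic rounding rule)."""
    quotas = [w * n for w in weights]
    counts = [math.floor(q) for q in quotas]
    leftover = n - sum(counts)
    # Hand the remaining units to the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

# Hypothetical approximate design on three treatments, realized for n = 7.
exact = round_design([0.5, 0.3, 0.2], 7)
```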

Besides serving as benchmarks, uniform designs resemble optimal designs more closely as the number of treatments increases, and at least one observation is required for each treatment. If the variances are more or less equal across all the treatments, a uniform design is expected to be optimal as a result of regarding the treatment variances as identical. Moreover, solving for local optimality based on unknown parameters adds complexity to the optimization, which may not be worth a small increase in efficiency. It suffices to implement a uniform design if the desired efficiency is attained. For all the reasons above, it pays to figure out up to what extent of variation a uniform design is still optimal or near optimal.

The remainder of this paper is organized as follows. In Section 2, we introduce our settings of a completely randomized design for generalized linear models, elaborate on the derivation of the information matrix, and formulate the design problems. In Section 3, we propose an algorithm for finding D-optimal exact designs based on some theoretical results about the general relations between the unknown parameters $c_i$'s and the optimal $n_i$'s. The efficiency of this algorithm is demonstrated in two examples. The main theoretical results about the D-optimal approximate and exact designs are given, with a consideration of separating the unknown values of the $c_i$'s into two groups. The A-optimality of approximate designs with arbitrary $c_i$'s is also established. The constructions of optimal block designs are investigated in Section 4, with a brief discussion on the difficulties of finding such a design. Section 5 gives some concluding remarks and directions for future research, including an interesting comparison between a block design with 2 treatments and 2 blocks and a $2^2$ factorial design with main effects.

2 Information Matrices of GLMs

Two design models are considered: a completely randomized design with $v$ treatments and a randomized block design with $v$ treatments and $b$ blocks. Throughout this paper, $Y_{ih}$ denotes the response obtained after applying treatment $i$ to unit $h$. The $Y$'s are independent and assumed to follow a member of an exponential family, not necessarily the normal distribution.

In a completely randomized design set-up, we let $n_i$ be the number of replicates of treatment $i$, $i = 1, \cdots, v$, with a total number of observations $n = n_1 + \cdots + n_v$. The zero-way elimination of heterogeneity model then postulates the transformed expectation of observation $Y_{ih}$ to be $\mu + \tau_i$, $i = 1, \cdots, v$, with $g(E(Y_i)) = \mu + \tau_i = \eta_i$, where $g$, the so-called link function, is monotonic and differentiable. That is, the GLM does not assume a direct linear relationship between the treatments and the expectation of the corresponding observations, but it does assume a linear relationship between the treatments and the expectation transformed by the link function.

Let the transposed vector $x_i^T$ represent the $i$th row of the design matrix $X$ defined later, and $\beta = (\mu, \tau_1, \cdots, \tau_v)$. For the one-way layout, $\eta_i$ can be expressed as the sum of products of two vectors, $\eta_i = x_i^T \beta$, where here and throughout the sequel, $\mu$ is the overall mean and $\tau_1, \cdots, \tau_v$ are fixed effects of the treatments with $\sum_{i=1}^{v} \tau_i = 0$.

The distribution of $Y_i$, following the notation and terminology of Agresti (2012), belongs to the exponential dispersion family, taking the form

\[
f(y_i; \theta_i, \phi) = \exp\left\{ \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right\}, \tag{1}
\]

where $\theta_i$ and $\phi$ are the natural parameter and the dispersion parameter, respectively, for some known functions $a(\cdot)$, $b(\cdot)$ and $c(\cdot)$. The functions $a(\cdot)$, $b(\cdot)$ and $c(\cdot)$ are specified to correspond to a particular distribution. Although the parameters $\theta_i$ are typically not of direct interest, $E(Y_i)$ is some function of $\theta_i$, and is thus connected with the parameters of interest $\beta$, which are customarily estimated by maximum likelihood (ML). When the link function makes the linear predictor $\eta_i$ identical to the natural parameter $\theta_i$, namely, $\eta_i = g(E(Y_i)) = \theta_i$, we say that the link is canonical. If $\phi$ is known, then (1) agrees with the usual definition of the one-dimensional exponential family in canonical form,

\[
f(y_i; \theta_i) = s(\theta_i)\, t(y_i) \exp(y_i\, q(\theta_i)),
\]

upon writing $q(\theta_i)$, $s(\theta_i)$ and $t(y_i)$ for $\theta_i / a(\phi)$, $\exp(-b(\theta_i)/a(\phi))$ and $\exp(c(y_i, \phi))$, respectively.

The structure of a completely randomized design with $v$ treatments in a GLM is thus established. Designs with various other settings of treatments and blocks are also candidates for the linear predictor of a GLM, and the relationship between the mean of the response and the design matrix is extended beyond the identity link. This generalization, however, comes at the expense of having no analytic solutions for the estimators, except in rare cases where explicit expressions can be found. Numerical methods such as Newton-Raphson are usually required in practice. The expected Hessian matrix is assumed here when applying the Newton-Raphson algorithm, which gives the so-called iteratively reweighted least squares (IRLS) algorithm.

The development of optimal designs relies instead on the asymptotic covariance matrix of the ML estimator $\hat{\beta}$. This matrix is the inverse of the information matrix and is determined by the likelihood function of the GLM.

Before deriving the likelihood equations for $\beta$, we state without proof, for later use, that $E(Y_i) = b'(\theta_i)$ and $\mathrm{Var}(Y_i) = a(\phi)\, b''(\theta_i)$, where primes denote differentiation with respect to $\theta_i$. These identities are established by applying the well-known relations

\[
E\left( \frac{\partial L}{\partial \theta_i} \right) = 0
\quad \text{and} \quad
E\left( \frac{\partial^2 L}{\partial \theta_i^2} \right) + E\left[ \left( \frac{\partial L}{\partial \theta_i} \right)^2 \right] = 0,
\]

which hold under regularity conditions satisfied by (1), with $L$ the log-likelihood function of $n$ observations. The reader is referred to Section 4.4.1 of Agresti (2012), or any other book on GLMs, for complete and detailed proofs of these facts.

Now, the log-likelihood function of $n$ observations, reflecting the dependence of the $\theta_i$'s on the model parameters $\beta$ (Agresti (2012)), is given by

\[
L(\beta) = \sum_{i=1}^{v} \sum_{h=1}^{n_i} \left[ \frac{y_{ih} \theta_i - b(\theta_i)}{a(\phi)} + c(y_{ih}, \phi) \right].
\]

Using the chain rule, it is straightforward to show that the score function is

\[
\begin{aligned}
\frac{\partial L(\beta)}{\partial \beta_k}
&= \frac{\partial}{\partial \beta_k} \sum_{i=1}^{v} \sum_{h=1}^{n_i} \left[ \frac{y_{ih} \theta_i - b(\theta_i)}{a(\phi)} + c(y_{ih}, \phi) \right]
 = \frac{\partial}{\partial \beta_k} \sum_{i=1}^{v} \frac{y_{i\cdot} \theta_i - n_i b(\theta_i)}{a(\phi)} \\
&= \sum_{i=1}^{v} \frac{\partial}{\partial \theta_i} \left[ \frac{y_{i\cdot} \theta_i - n_i b(\theta_i)}{a(\phi)} \right] \cdot \frac{\partial \theta_i}{\partial b'(\theta_i)} \cdot \frac{\partial b'(\theta_i)}{\partial \eta_i} \cdot \frac{\partial \eta_i}{\partial \beta_k} \\
&= \sum_{i=1}^{v} \frac{y_{i\cdot} - n_i b'(\theta_i)}{a(\phi)} \cdot \frac{a(\phi)}{\mathrm{Var}(Y_i)} \cdot \frac{\partial b'(\theta_i)}{\partial \eta_i} \cdot x_{ik}
 = \sum_{i=1}^{v} \frac{y_{i\cdot} - n_i b'(\theta_i)}{\mathrm{Var}(Y_i)} \cdot \frac{\partial b'(\theta_i)}{\partial \eta_i} \cdot x_{ik},
\end{aligned}
\]

where $y_{i\cdot} = \sum_{h=1}^{n_i} y_{ih}$, the second factor is $1/b''(\theta_i) = a(\phi)/\mathrm{Var}(Y_i)$, and the third factor $\partial b'(\theta_i)/\partial \eta_i$ depends on the link function used. Finally, $\partial \eta_i / \partial \beta_k = x_{ik}$, where $x_{ik}$ is the $k$th element of the covariate vector for the $i$th treatment.

It can be seen that the score function depends on $\beta$ implicitly through $E(Y_i)$ and $\mathrm{Var}(Y_i)$, and so does $I(\beta)$, the information matrix of $\beta$, given by the variance-covariance matrix of the score function.

\[
\begin{aligned}
-E\left( \frac{\partial^2 L(\beta)}{\partial \beta_k\, \partial \beta_\ell} \right)
&= E\left( \frac{\partial L(\beta)}{\partial \beta_k} \frac{\partial L(\beta)}{\partial \beta_\ell} \right) \\
&= E\left( \sum_{i=1}^{v} \frac{(y_{i\cdot} - n_i E(Y_i))\, x_{ik}}{\mathrm{Var}(Y_i)} \cdot \frac{\partial b'(\theta_i)}{\partial \eta_i}
   \cdot \sum_{i=1}^{v} \frac{(y_{i\cdot} - n_i E(Y_i))\, x_{i\ell}}{\mathrm{Var}(Y_i)} \cdot \frac{\partial b'(\theta_i)}{\partial \eta_i} \right) \\
&= \sum_{i=1}^{v} \frac{E(y_{i\cdot} - n_i E(Y_i))^2}{(\mathrm{Var}(Y_i))^2} \left( \frac{\partial b'(\theta_i)}{\partial \eta_i} \right)^2 x_{ik} x_{i\ell}
 = \sum_{i=1}^{v} \frac{n_i \mathrm{Var}(Y_i)}{(\mathrm{Var}(Y_i))^2} \left( \frac{\partial b'(\theta_i)}{\partial \eta_i} \right)^2 x_{ik} x_{i\ell}.
\end{aligned}
\]

If the canonical link $\eta_i = \theta_i$ is used, then $\partial b'(\theta_i)/\partial \eta_i = b''(\theta_i)$, so that

\[
\frac{\partial L(\beta)}{\partial \beta_k} = \sum_{i=1}^{v} \frac{y_{i\cdot} - n_i E(Y_i)}{a(\phi)} \cdot x_{ik},
\]

which leads to the Fisher information

\[
-E\left( \frac{\partial^2 L(\beta)}{\partial \beta_k\, \partial \beta_\ell} \right) = \sum_{i=1}^{v} \frac{n_i \mathrm{Var}(Y_i)\, x_{ik} x_{i\ell}}{(a(\phi))^2}.
\]

In this paper, we focus on distributions with $a(\phi) = 1$ and the canonical link function, and therefore the typical element of the Fisher information matrix is $\sum_{i=1}^{v} n_i \mathrm{Var}(Y_i)\, x_{ik} x_{i\ell}$. Generalizing from the typical elements of $I$ to the entire matrix, we have

\[
I = X^T W X =
\begin{pmatrix}
I_{11} & I_{21}^T \\
I_{21} & I_{22}
\end{pmatrix},
\quad \text{where} \quad
X =
\begin{pmatrix}
1_{n_1} & 1_{n_1} & 0 & \cdots & 0 \\
1_{n_2} & 0 & 1_{n_2} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1_{n_v} & 0 & 0 & \cdots & 1_{n_v}
\end{pmatrix},
\]

$W = \mathrm{diag}(w_1 I_{n_1}, \cdots, w_v I_{n_v})$ is a block diagonal matrix with $w_i = \mathrm{Var}(Y_i)$, $I_{11} = \sum_{i=1}^{v} n_i w_i$, $I_{21} = (n_1 w_1, \cdots, n_v w_v)^T$ is a $v \times 1$ vector, $I_{22} = \mathrm{diag}(n_1 w_1, \cdots, n_v w_v)$ is a $v \times v$ diagonal matrix, $1_m$ is an $m \times 1$ vector of 1's, and $I_m$ is the $m \times m$ identity matrix.

Since our focus is on estimating treatment contrasts, the information matrix $M$ for the estimation of $(\tau_1, \cdots, \tau_v)^T$ can be derived as

\[
M = I_{22} - I_{11}^{-1} I_{21} I_{21}^T = ((m_{ij})),
\]

where $m_{ii} = n_i w_i - (\sum_{i=1}^{v} n_i w_i)^{-1} (n_i w_i)^2$ and $m_{ij} = -(\sum_{i=1}^{v} n_i w_i)^{-1} n_i n_j w_i w_j$, $i, j = 1, \cdots, v$, $i \neq j$, by considering

\[
\begin{pmatrix}
1 & 0_{1 \times v} \\
-I_{21} I_{11}^{-} & I_v
\end{pmatrix}
\begin{pmatrix}
I_{11} & I_{21}^T \\
I_{21} & I_{22}
\end{pmatrix}
\begin{pmatrix}
1 & -I_{11}^{-} I_{21}^T \\
0_{v \times 1} & I_v
\end{pmatrix}
=
\begin{pmatrix}
I_{11} & 0_{1 \times v} \\
0_{v \times 1} & I_{22} - I_{21} I_{11}^{-} I_{21}^T
\end{pmatrix}.
\]

Note that the row sums and column sums of $M$ are all zero, so $M$ has rank at most $v - 1$. A design for which $M$ has rank exactly $v - 1$ is said to be connected, and we shall restrict our attention to such designs. The D- and A-optimality criteria for connected designs then reduce, respectively, to maximizing the product of the non-zero eigenvalues of $M$ and minimizing the sum of the inverses of the non-zero eigenvalues of $M$. In other words, letting $\lambda_1, \cdots, \lambda_{v-1}$ be the non-zero eigenvalues of $M$, a design $d$ is said to be D-optimal if its value of $\prod_{i=1}^{v-1} \lambda_i$ is the maximum among all designs, and a design $d$ is said to be A-optimal if its value of $\sum_{i=1}^{v-1} \lambda_i^{-1}$ is the minimum among all designs.

Our theorems are applicable to any exponential family with the canonical link and a given dispersion parameter, such as the Poisson distribution with log link, both binomial and multinomial distributions with logit link, and the exponential response with inverse link.
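The construction of $M$ can be checked numerically with a short sketch. It relies on the standard fact that, when exactly one eigenvalue is zero, the product of the remaining eigenvalues equals the sum of the principal minors of order $v - 1$. The replicates and variances below are hypothetical (for a Poisson response with log link one would have $w_i = \mathrm{Var}(Y_i) = \exp(\mu + \tau_i)$), and the function names are illustrative only.

```python
from fractions import Fraction

def contrast_information(n, w):
    """M = I22 - I21 I21^T / I11, built from a_i = n_i w_i in exact arithmetic."""
    a = [Fraction(ni) * Fraction(wi) for ni, wi in zip(n, w)]
    total = sum(a)
    v = len(a)
    return [[(a[i] if i == j else 0) - a[i] * a[j] / total for j in range(v)]
            for i in range(v)]

def det(mat):
    """Determinant by Gaussian elimination over Fractions."""
    m = [row[:] for row in mat]
    size, value = len(m), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            value = -value
        value *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return value

def product_nonzero_eigenvalues(mat):
    """Sum of the principal minors of order v - 1 of mat."""
    v = len(mat)
    total = Fraction(0)
    for drop in range(v):
        keep = [i for i in range(v) if i != drop]
        total += det([[mat[i][j] for j in keep] for i in keep])
    return total

n = [5, 3, 2]   # hypothetical replicates n_i
w = [1, 2, 4]   # hypothetical variances w_i = Var(Y_i)
M = contrast_information(n, w)
d_value = product_nonzero_eigenvalues(M)
```

For these values $n_i w_i = (5, 6, 8)$, every row of `M` sums to zero, and `d_value` equals $v \prod n_i w_i / \sum n_i w_i = 3 \cdot 240 / 19$, so the D-criterion is a constant multiple of $\prod n_i w_i / \sum n_i w_i$.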

3 Completely Randomized Designs

Without loss of generality, we assume $w_1 \le \cdots \le w_h \le \cdots \le w_k \le \cdots \le w_v$ and write $w_i = c_i w_1$, $1 \le i \le v$, so that $1 = c_1 \le \cdots \le c_h \le \cdots \le c_k \le \cdots \le c_v$.

3.1 D-optimal Designs

Summing all principal minors of order $v - 1$ of $M = I_{22} - I_{11}^{-1} I_{21} I_{21}^T$ yields the product of the $v - 1$ non-zero eigenvalues of $M$, which is proportional to $\prod_{i=1}^{v} n_i w_i / \sum_{i=1}^{v} n_i w_i$, or equivalently, to

\[
\left( w_1^{v-1} \prod_{i=1}^{v} c_i \right) \left( \frac{\prod_{i=1}^{v} n_i}{\sum_{i=1}^{v} c_i n_i} \right)
= \left( w_1^{v-1} \prod_{i=1}^{v} c_i \right) D(n_1, \cdots, n_v), \text{ say.}
\]

Our goal is to find values of $n_i$, $1 \le i \le v$, that maximize $\prod_{i=1}^{v} n_i w_i / \sum_{i=1}^{v} n_i w_i$, subject to the constraints that $\sum_{i=1}^{v} n_i = n$ is fixed and the $w_i$'s are given, or equivalently, to find values of the $n_i$'s that maximize $D(n_1, \cdots, n_v)$ with $\sum_{i=1}^{v} n_i = n$ fixed and the $c_i$'s given.

Although $D(n_1, \cdots, n_v)$ is not a complicated function, the D-optimal approximate and exact designs for arbitrary $c_i$'s are difficult to obtain. We thus begin by studying the case when some of the $c_i$'s are equal.

Theorem 1. Suppose $c_h = \cdots = c_k = c$, and $\sum_{i=h}^{k} n_i = r$ is fixed. A design maximizing $D(n_1, \cdots, n_v)$ has $n_i = [r/(k - h + 1)]$ or $[r/(k - h + 1)] + 1$ for $h \le i \le k$, where $[\cdot]$ is the greatest integer function.

Proof. Substituting $\sum_{i=h}^{k} n_i = r$ into $D(n_1, \cdots, n_v)$ gives

\[
D(n_1, \cdots, n_v) = \frac{\prod_{i=h}^{k} n_i \prod_{i=1}^{h-1} n_i \prod_{i=k+1}^{v} n_i}{c r + \sum_{i=1}^{h-1} c_i n_i + \sum_{i=k+1}^{v} c_i n_i}.
\]

It is easily seen that, for a fixed value of $r$, $D(n_1, \cdots, n_v)$ is maximized when the $n_i$, $h \le i \le k$, are as equal as possible.
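The optimization problem and Theorem 1 can be illustrated by brute force on a small case. This is only a sketch under assumed values ($v = 3$, $n = 12$, $c = (1, 2, 2)$, all hypothetical); exhaustive enumeration is used here purely for checking and is not the search algorithm proposed in the thesis.

```python
from fractions import Fraction
from itertools import product
from math import prod

def d_criterion(n, c):
    """D(n_1, ..., n_v) = prod(n_i) / sum(c_i n_i), kept exact."""
    return Fraction(prod(n), sum(ci * ni for ci, ni in zip(c, n)))

def best_exact_design(n_total, c):
    """Brute force over all allocations with n_i >= 1 and sum n_i = n_total."""
    v = len(c)
    candidates = (
        head + (n_total - sum(head),)
        for head in product(range(1, n_total), repeat=v - 1)
        if n_total - sum(head) >= 1
    )
    return max(candidates, key=lambda n: d_criterion(n, c))

c = (1, 2, 2)                  # hypothetical values with c_2 = c_3
best = best_exact_design(12, c)
```

Because $c_2 = c_3$, Theorem 1 implies that the allocation found this way splits the units given to treatments 2 and 3 as equally as possible.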

A direct consequence of Theorem 1 is the D-optimality of uniform designs when all the $c_i$'s are identical. When the $c_i$'s are not all equal, suppose $n_h \le n_k$. By interchanging $n_h$ and $n_k$, one can show that

\[ D(n_1, \dots, n_h, \dots, n_k, \dots, n_v) - D(n_1, \dots, n_k, \dots, n_h, \dots, n_v) = \Big( \prod_{i=1}^{v} n_i \Big) \Big( \frac{1}{\sum_{i \neq h,k} c_i n_i + c_h n_h + c_k n_k} - \frac{1}{\sum_{i \neq h,k} c_i n_i + c_h n_k + c_k n_h} \Big) \le 0, \]

since $c_h n_h + c_k n_k \ge c_h n_k + c_k n_h$. The result is given in the following theorem.

Theorem 2. A design maximizing $D(n_1, \dots, n_v)$ satisfies $n_1 \ge \cdots \ge n_{v-1} \ge n_v$.

Theorem 2 establishes the general relation between the $c_i$'s and the corresponding "optimal" $n_i$'s, so that the search region for the optimal $(n_1, \dots, n_v)$ reduces from a complete enumeration to a subregion satisfying $n_1 \ge \cdots \ge n_v$. Take $v = 5$ and $n = 100$ as an example. There are in total $\binom{99}{4} = 3,764,376$ combinations of the $n_i$'s with $\sum_{i=1}^{v} n_i = 100$ and $n_i \ge 1$, namely, with at least one unit assigned to each treatment. The number of combinations reduces to 38,225 when the $n_i$'s are in non-increasing order. This subregion can further be narrowed down to a smaller subregion by specifying upper bounds on the ratio of $n_h$ to $n_k$. Now, let $c_k / c_h = \rho > 1$.

Theorem 3. Suppose $n_h / n_k > \rho$ and $n_h - n_k \ge 2$. Let $n_h' = n_h - t$ and $n_k' = n_k + t$, where $t$ is the smallest integer such that $1 \le n_h' / n_k' \le \rho$. Then

\[ D(n_1, \dots, n_h, \dots, n_k, \dots, n_v) \le D(n_1, \dots, n_h', \dots, n_k', \dots, n_v). \]

Proof. By Theorem 2, $t \le (n_h - n_k)/2$. An easy computation yields that

\[ D(n_1, \dots, n_h', \dots, n_k', \dots, n_v) - D(n_1, \dots, n_h, \dots, n_k, \dots, n_v) \]
\[ = D(n_1, \dots, n_h - t, \dots, n_k + t, \dots, n_v) - D(n_1, \dots, n_h, \dots, n_k, \dots, n_v) \]
\[ = \frac{t \big( \prod_{i \neq h,k} n_i \big) \big( (\sum_{i=1}^{v} c_i n_i)(n_h - n_k - t) - (c_k - c_h) n_h n_k \big)}{\big( \sum_{i=1}^{v} c_i n_i + t(c_k - c_h) \big) \big( \sum_{i=1}^{v} c_i n_i \big)}. \]

It suffices to show that $(\sum_{i=1}^{v} c_i n_i)(n_h - n_k - t) - (c_k - c_h) n_h n_k \ge 0$:

\[ \Big( \sum_{i=1}^{v} c_i n_i \Big)(n_h - n_k - t) - (c_k - c_h) n_h n_k \]
\[ = \Big( \sum_{i \neq h,k} c_i n_i + c_h n_h + \rho c_h n_k \Big)(n_h - n_k - t) - (\rho c_h - c_h) n_h n_k \]
\[ = \Big( \sum_{i \neq h,k} c_i n_i \Big)(n_h - n_k - t) + c_h \big( n_h^2 - \rho n_k^2 - t(n_h + \rho n_k) \big) \]
\[ \ge c_h \rho (v - k)(n_h - n_k - t) + c_h \big( n_h^2 - \rho n_k^2 - t(n_h + \rho n_k) \big), \]

since $\sum_{i \neq h,k} c_i n_i \ge \sum_{i \neq h,k} c_i \ge \sum_{i=k+1}^{v} c_i \ge c_k (v - k)$,

\[ \ge c_h \Big( \frac{\rho}{2}(n_h - n_k)(v - k) + (n_h + n_k)(n_h - \rho n_k) + (\rho - 1) n_h n_k - \frac{1}{2}(n_h - n_k)(n_h + \rho n_k) \Big) \]
\[ = \frac{c_h}{2} \big( \rho (n_h - n_k)(v - k) + (n_h + n_k)(n_h - \rho n_k) \big) \ge 0, \]

which proves the desired result.

Theorem 3 shows that if $c_k$ is $\rho$ times larger than $c_h$, the design is better, in terms of D-optimality, when the corresponding $n_h$ is at most $\rho$ times larger than $n_k$. For an application of Theorems 2 and 3, consider the following Example 1.

Example 1. For $v = 5$ and $n = 100$, suppose $c_i = i$, $i = 1, \dots, 5$. Other than $n_i \ge 1$ and $n_1 \ge \cdots \ge n_5$, the reduced subregion also satisfies $n_i / n_5 \le 5 / c_i$, $1 \le i \le 4$; $n_i / n_4 \le 4 / c_i$, $1 \le i \le 3$; $n_i / n_3 \le 3 / c_i$, $1 \le i \le 2$; and $n_1 / n_2 \le 2$. It follows that we may initiate our search by choosing $n_5 = [100 (1 + 5/4 + 5/3 + 5/2 + 5)^{-1}] = 8$,

and then, for a fixed value of $n_5$, we can set up the upper limits of the remaining $n_i$'s in terms of $n_5$. Repeating this procedure for $n_5 = 9, \dots, 20$, the reduced search region is configured as "strata" of the $n_i$'s. Specifically, let $n_0 = 8$ denote the value of $n_5$ in the initial stratum, Stratum 0. By Theorem 3, $n_5 = 8$, $n_4 \le [(5/4) n_5] = 10$, $n_3 \le \min([(5/3) n_5], [(4/3) 10]) = 13$, $n_2 \le \min([(5/2) n_5], [(4/2) 10], [(3/2) 13]) = 19$, and $n_1 \le \min(5 n_5, 40, 39, 38) = 38$ in Stratum 0. That is, when $n_5 = 8$, the search ranges for $n_4$ to $n_1$ are $(8, 10)$, $(n_4, 13)$, $(n_3, 19)$, and $(n_2, 38)$, respectively. This process can be continued to obtain search ranges for $n_5 = 9, \dots, 20$ (Strata 1 to 12), as we list in the following table, where $S_j$ denotes the $j$th stratum.

Stratum   n5    Upper limits
                n4    n3    n2    n1
S0         8    10    13    19    38
S1         9    11    14    21    42
S2        10    12    16    24    48
S3        11    13    17    25    50
S4        12    15    20    30    60
S5        13    16    21    31    62
S6        14    17    22    33    66
S7        15    18    24    36    72
S8        16    20    26    39    78
S9        17    21    28    42    84
S10       18    22    29    43    86
S11       19    23    30    45    90
S12       20    25    33    49    98

In the table above, some strata have infeasible upper limits for $n_i$. For example, in Stratum 6, the theoretical upper limit for $n_2$ is $n_2 \le 33$. However, for $n_5 = 14$, the minimum values for both $n_4$ and $n_3$ are 14, so the maximum possible value of $n_2 + n_1$ is $58 = 100 - 3(14)$. Meanwhile, $n_2 \le n_1$ by Theorem 2. Hence, the maximum feasible value for $n_2$ is 29. Note that when $n_5 = n_4 = n_3 = 14$ and $n_2 = 33$, the value of $n_1$ is 25, which does not satisfy Theorem 2 and thus should not be included. Formally, a restriction $n_i \le [(n - (v - i) n_v)/i]$, $i = 1, \dots, v - 1$, on the scope of

Theorem 3, describing the maximum possible value for $n_i$, must be imposed if a feasible search is to be achieved. The modified search region, with replaced upper limits marked by an asterisk, is shown below.

Stratum   n5    Upper limits
                n4    n3    n2    n1
S0         8    10    13    19    38
S1         9    11    14    21    42
S2        10    12    16    24    48
S3        11    13    17    25    50
S4        12    15    20    30    52*
S5        13    16    21    30*   48*
S6        14    17    22    29*   44*
S7        15    18    23*   27*   40*
S8        16    20    22*   26*   36*
S9        17    20*   22*   24*   32*
S10       18    20*   21*   23*   28*
S11       19    20*   20*   21*   24*
S12       20    20*   20*   20*   20*

Strata 0 and 1 are not qualified candidate search regions because the sum of $n_5$ and the upper limits for $n_4, \dots, n_1$ is less than 100 in both strata (88 and 97, respectively). Thus, the remaining strata constitute the reduced search region containing 1,627 combinations.

The upper bound obtained in this fashion may be sharpened when the condition in the following lemma is met.

Lemma 4. Suppose $c_k / c_h = \rho > 1$, $n_h / n_k \le \rho$ and $n_h - n_k \ge 2$. Then

\[ D(n_1, \dots, n_h, \dots, n_k, \dots, n_v) \le D(n_1, \dots, n_h - 1, \dots, n_k + 1, \dots, n_v), \]

if the relation $n_h (n_h - 1) \ge \rho n_k (n_k + 1)$ holds.

Proof.

\[ D(n_1, \dots, n_h - 1, \dots, n_k + 1, \dots, n_v) - D(n_1, \dots, n_h, \dots, n_k, \dots, n_v) \]
\[ = \frac{\big( \prod_{i \neq h,k} n_i \big) \big( (\sum_{i \neq h,k} c_i n_i + c_h n_h + c_k n_k)(n_h - n_k - 1) - (c_k - c_h) n_h n_k \big)}{\big( \sum_{i=1}^{v} c_i n_i + c_k - c_h \big) \big( \sum_{i=1}^{v} c_i n_i \big)} \]
\[ = \frac{\big( \prod_{i \neq h,k} n_i \big) \big( (n_h - n_k)(\sum_{i \neq h,k} c_i n_i) + c_h n_h^2 - c_k n_k^2 - \sum_{i=1}^{v} c_i n_i \big)}{\big( \sum_{i=1}^{v} c_i n_i + c_k - c_h \big) \big( \sum_{i=1}^{v} c_i n_i \big)} \]
\[ \ge \frac{\big( \prod_{i \neq h,k} n_i \big) \big( \sum_{i \neq h,k} c_i n_i + c_h ( n_h (n_h - 1) - \rho n_k (n_k + 1) ) \big)}{\big( \sum_{i=1}^{v} c_i n_i + c_k - c_h \big) \big( \sum_{i=1}^{v} c_i n_i \big)}, \quad \text{since } n_h - n_k \ge 2, \]
\[ \ge 0, \quad \text{if } n_h (n_h - 1) \ge \rho n_k (n_k + 1). \]

As an illustration of the use of Lemma 4, we note that in Example 1 some of the upper limits in Strata 2 to 8 may be sharpened further. For example, in Stratum 2, where $n_5 = 10$, the largest $n_3$ such that $n_3 (n_3 - 1) < (5/3)(10)(11)$ is 14. The refined search region, with replaced upper limits marked by an asterisk, is given in the next table.

Stratum   n5    Upper limits
                n4    n3    n2    n1
S2        10    12    14*   17*   23*
S3        11    13    15*   18*   26*
S4        12    14*   16*   20*   28*
S5        13    15*   17*   21*   30*
S6        14    16*   19*   23*   32*
S7        15    17*   20*   24*   35*
S8        16    18*   21*   26    36
S9        17    20    22    24    32
S10       18    20    21    23    28
S11       19    20    20    21    24
S12       20    20    20    20    20

The D-optimal $n_i$'s found by a thorough enumeration of all 3,764,376 combinations are $n_1^* = 23$, $n_2^* = 21$, $n_3^* = 20$, $n_4^* = 19$, and $n_5^* = 17$. They lie in the above subregion, which contains only 298 combinations, so the search time is much reduced.

Theorems 1 to 3 and Lemma 4 are useful tools for searching for the optimal $n_i$'s by computer, since they reduce the search region substantially, especially when $n$ and/or $v$ are large, and/or the $c_i$'s are close. In search of a D-optimal

exact design with $v$ treatments, the following algorithm is proposed by combining the preceding results. Define

\[ n_0 = \Big[ \frac{n}{c_v} \Big( \sum_{i=1}^{v} c_i^{-1} \Big)^{-1} \Big]. \]

(i) Determine $n_{v(j)} = n_0 + j$, $1 \le j \le [n/v] - n_0$, where $n_{v(j)}$ is the value of $n_v$ in the $j$th stratum.

(ii) Let $l_{i(j)} = [c_v n_{v(j)} / c_i]$ and $l'_{i(j)} = \mathrm{nint}((c_v n_{v(j)} (n_{v(j)} + 1) / c_i)^{1/2})$, where $\mathrm{nint}(\cdot)$ is the nearest integer function, $1 \le i \le v - 1$, $1 \le j \le [n/v] - n_0$.

(iii) Let $n_{i(j)} = \min(l_{i(j)}, l'_{i(j)})$, $1 \le i \le v - 1$, $1 \le j \le [n/v] - n_0$, where $n_{i(j)}$ is the upper bound, satisfying Theorem 3 and Lemma 4, for $n_i$ in the $j$th stratum.

(iv) Determine $\tilde{n}_{i(j)} = \min([(n - (v - i) n_{v(j)})/i], n_{i(j)})$, $1 \le i \le v - 1$, $j^* \le j \le [n/v] - n_0$, where $\tilde{n}_{i(j)}$ is the feasible upper bound for $n_i$ in the $j$th stratum.

(v) Let $j^*$ be the smallest $j$ such that $\sum_{i=1}^{v} \tilde{n}_{i(j)} \ge n$.

(vi) The subregion to be searched for the optimal $n_i$'s is then the union of the strata $S_j = \{(n_1, \dots, n_v) \mid n_v = n_{v(j)},\ n_{i+1} \le n_i \le \tilde{n}_{i(j)},\ 1 \le i \le v - 1\}$, $j^* \le j \le [n/v] - n_0$.

An illustrative example obtained by the algorithm is presented below.

Example 2. Let $v = 5$, $n = 100$, $c_1 = 1$, $c_2 = 2$, $c_3 = 4$, $c_4 = 8$, $c_5 = 16$. Now $n_0 = 3$, $n_{5(j)} = 3 + j$, $l_{i(j)} = [16(3 + j)/c_i]$, and $l'_{i(j)} = \mathrm{nint}((16(3 + j)(4 + j)/c_i)^{1/2})$, $1 \le i \le 4$, $1 \le j \le 17$. For $j = 1$, we have $n_{5(1)} = 4$, $l_{4(1)} = 8$, $l_{3(1)} = 16$, $l_{2(1)} = 32$, and $l_{1(1)} = 64$; $l'_{4(1)} = 6$, $l'_{3(1)} = 9$, $l'_{2(1)} = 13$, and $l'_{1(1)} = 18$. As a result, we have $n_{4(1)} = 6$, $n_{3(1)} = 9$, $n_{2(1)} = 13$, and $n_{1(1)} = 18$. Through straightforward computation, we obtain $\tilde{n}_{4(1)} = \min(24, 6) = 6$, $\tilde{n}_{3(1)} = \min(30, 9) = 9$, $\tilde{n}_{2(1)} = \min(44, 13) = 13$, $\tilde{n}_{1(1)} = \min(84, 18) = 18$, and $\sum_{i=1}^{5} \tilde{n}_{i(1)} = 50 < 100$.
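Steps (i) to (v) of the algorithm can be sketched in Python as follows. This is a minimal illustration rather than the author's implementation: the function name `strata_bounds`, the return layout, and the choice to compute the feasible bounds for every stratum (not only for $j \ge j^*$) are assumptions of this sketch, and `nint` is taken to round halves upward.

```python
from math import floor, sqrt

def nint(x):
    """Nearest-integer function (halves rounded up; an assumption of this sketch)."""
    return floor(x + 0.5)

def strata_bounds(n, c):
    """Feasible upper bounds per stratum and the first admissible stratum j*.

    `c` holds c_1 <= ... <= c_v with c_1 = 1.
    Returns (n0, j_star, table), where table[j] = (n_v(j), [bounds for n_1..n_{v-1}]).
    """
    v = len(c)
    cv = c[-1]
    n0 = floor((n / cv) / sum(1 / ci for ci in c))
    table = {}
    j_star = None
    for j in range(1, n // v - n0 + 1):
        nv = n0 + j                                        # step (i)
        bounds = []
        for i in range(1, v):                              # treatments 1, ..., v-1
            ci = c[i - 1]
            l = floor(cv * nv / ci)                        # step (ii), Theorem 3
            l_prime = nint(sqrt(cv * nv * (nv + 1) / ci))  # step (ii), Lemma 4
            feasible = (n - (v - i) * nv) // i             # step (iv)
            bounds.append(min(l, l_prime, feasible))       # steps (iii) and (iv)
        table[j] = (nv, bounds)
        if j_star is None and nv + sum(bounds) >= n:       # step (v)
            j_star = j
    return n0, j_star, table

# Setting of Example 2: v = 5, n = 100, c = (1, 2, 4, 8, 16).
n0, j_star, table = strata_bounds(100, [1, 2, 4, 8, 16])
```

For $j = 1$ this gives $n_5 = 4$ with bounds 18, 13, 9, 6 for $n_1, \dots, n_4$, matching the values worked out in Example 2. Step (vi) would then enumerate the non-increasing tuples inside each stratum from $j^*$ onward.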

Proceeding with $j = 2, \dots, 17$, we have $j^* = 6$. The finalized search region for the optimal $n_i$'s is listed below.

Stratum   n5    Upper limits
                n4    n3    n2    n1
S6         9    13    19    27    38
S7        10    15    21    30    42
S8        11    16    23    32    46
S9        12    18    25    32    50
S10       13    19    24    30    48
S11       14    20    24    29    44
S12       15    21    23    27    40
S13       16    21    22    26    36
S14       17    20    22    24    32
S15       18    20    21    23    28
S16       19    20    20    21    24
S17       20    20    20    20    20

The D-optimal $n_i$'s found by searching through all combinations are $n_1^* = 24$, $n_2^* = 23$, $n_3^* = 21$, $n_4^* = 18$, and $n_5^* = 14$. They lie in the above subregion, which contains 2,995 combinations.

In what follows, the search for D-optimal designs focuses on the case in which the values of the $c_i$'s, $1 \le i \le v$, are split into two groups with sizes $m$ and $v - m$, respectively. That is, $c_1 = \cdots = c_m = 1$ and $c_{m+1} = \cdots = c_v = c > 1$, $1 \le m \le v - 1$. We begin by determining the D-optimal approximate designs.

3.2. D-optimal Designs for $c_1 = \cdots = c_m$ and $c_{m+1} = \cdots = c_v$, $v \ge 3$

Let $\mathbb{R}$ denote the set of all real numbers and $r = n_1 + \cdots + n_m$. In the following Theorem 5 and its proof we have $r \in \mathbb{R}$ and $n_i \in \mathbb{R}$, $1 \le i \le v$. From the proof of Theorem 1, for a fixed $r$, $D(n_1, \dots, n_v)$ is maximized by choosing $n_1 = \cdots = n_m = r/m$ and $n_{m+1} = \cdots = n_v = (n - r)/(v - m)$. As a result, finding the D-optimal design reduces to finding the value of $r$ that maximizes $D(n_1, \dots, n_v)$.

Theorem 5. The design having $n_1 = \cdots = n_m = r_{opt,m}/m$, and $n_{m+1} = \cdots = n_v =$

$(n - r_{opt,m})/(v - m)$, where

\[ r_{opt,m} = \frac{n \big( c(v + m - 1) - m + 1 - \sqrt{\delta} \big)}{2(c - 1)(v - 1)}, \]

with $\delta = c^2 (v - m - 1)^2 + (m - 1)^2 + 2c(m(v - m) + v - 1)$, is a D-optimal approximate design.

Proof. Here $D(n_1, \dots, n_v) = \prod_{i=1}^{v} n_i / (r + c(n - r))$, and

\[ \max_{n_i,\, 1 \le i \le v,\ \sum_{i=1}^{v} n_i = n} D(n_1, \dots, n_v) = \max_{n_i,\, 1 \le i \le v,\ \sum_{i=1}^{v} n_i = n} \frac{\prod_{i=1}^{v} n_i}{r + c(n - r)} = \frac{(v - m)^{m - v}}{m^m} \max_{r \in \mathbb{R},\ m \le r \le n + m - v} \frac{r^m (n - r)^{v - m}}{cn - (c - 1) r} = \frac{(v - m)^{m - v}}{m^m} \max_{r \in \mathbb{R},\ m \le r \le n + m - v} f(r), \text{ say.} \]

The solution to $d \ln f(r) / dr = 0$ is $r_{opt,m}$.

If $r_{opt,m}$ is an integer, $m \mid r_{opt,m}$, and $(v - m) \mid (n - r_{opt,m})$, where $m \mid n$ means that $n$ is a multiple of $m$ by convention, the optimal approximate design and the optimal exact design coincide. If not, the optimal approximate design is carried out by assigning $[r_{opt,m}/m]$ or $[r_{opt,m}/m] + 1$ units to each of the first $m$ treatments, and $[(n - r_{opt,m})/(v - m)]$ or $[(n - r_{opt,m})/(v - m)] + 1$ units to each of the remaining $v - m$ treatments. Rounding $r_{opt,m}$ to the nearest integer, however, may not lead to an optimal exact design. For example, when $n = 56$, $v = 10$, $m = 4$ and $c = 38$, we have $r_{opt,4} = 24.7618$, but the optimal exact design has $r^* = 26$. Hence, there is a need to develop theoretical methods for optimal exact designs.

Finding the optimal approximate design is straightforward; finding the optimal exact design is, on the other hand, usually intractable and computationally intensive. In the remaining part of this section, we shall focus on constructing the optimal exact designs for $m = v - 1$ and for $m = 1$, respectively, and defer the reasons to Example 7.
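The gap between the approximate and exact optima in the case $n = 56$, $v = 10$, $m = 4$, $c = 38$ can be reproduced with a short Python sketch. It evaluates the closed form for $r_{opt,m}$ from Theorem 5 and then, for each integer $r$, splits $r$ and $n - r$ as evenly as possible within the two groups (as Theorem 1 prescribes) and compares the resulting values of $D$ exactly. The function names and the search range $m \le r \le n - (v - m)$, which keeps every $n_i \ge 1$, are assumptions of this sketch.

```python
from fractions import Fraction
from math import prod, sqrt

def r_opt(n, v, m, c):
    """Closed-form r_{opt,m} from Theorem 5."""
    delta = c**2 * (v - m - 1)**2 + (m - 1)**2 + 2 * c * (m * (v - m) + v - 1)
    return n * (c * (v + m - 1) - m + 1 - sqrt(delta)) / (2 * (c - 1) * (v - 1))

def even_split(total, k):
    """Split `total` into k integers that are as equal as possible."""
    q, rem = divmod(total, k)
    return [q + 1] * rem + [q] * (k - rem)

def exact_D(n, v, m, c, r):
    """D for the exact design that gives r units in total to the first m treatments."""
    alloc = even_split(r, m) + even_split(n - r, v - m)
    return Fraction(prod(alloc), r + c * (n - r))

n, v, m, c = 56, 10, 4, 38
r_continuous = r_opt(n, v, m, c)
r_star = max(range(m, n - (v - m) + 1), key=lambda r: exact_D(n, v, m, c, r))
```

In this setting the nearest integer to `r_continuous` is 25, while the exhaustive comparison selects `r_star = 26`, which is the discrepancy the text points out.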

Since $v - 1$ of the $n_i$'s should be allotted as equally as possible to attain D-optimality in both cases $m = v - 1$ and $m = 1$, we let $r = \sum_{i=1}^{v-1} n_i$ for $m = v - 1$, and let $r = \sum_{i=2}^{v} n_i$ for $m = 1$, $v - 1 \le r \le n - 1$.

For a fixed value of $r$, it follows from Theorem 1 that the largest possible value of $D(n_1, \dots, n_v)$ occurs when $(v - 1)(1 + [r/(v - 1)]) - r$ of the $n_i$ are $[r/(v - 1)]$, and $r - (v - 1)[r/(v - 1)]$ of the $n_i$ are $[r/(v - 1)] + 1$. More explicitly, we have, for $m = v - 1$,

\[ \max_{n_i,\, 1 \le i \le v,\ \sum_{i=1}^{v} n_i = n} D(n_1, \dots, n_v) = \max_{r \in \mathbb{N},\ v - 1 \le r \le n - 1} \frac{n - r}{cn - (c - 1) r} \Big[ \frac{r}{v - 1} \Big]^{(v - 1)([\frac{r}{v - 1}] + 1) - r} \Big( \Big[ \frac{r}{v - 1} \Big] + 1 \Big)^{r - (v - 1)[\frac{r}{v - 1}]} = \max_{r \in \mathbb{N},\ v - 1 \le r \le n - 1} H(r), \text{ say,} \]

where $\mathbb{N}$ denotes the set of positive integers, and $r^*$ obtained by maximizing $H$ over $v - 1 \le r \le n - 1$ is a global maximum. However, there is an inherent difficulty in deriving $r^*$ due to the discrete nature of the problem.

Now, replace $r$ by $a(v - 1) + b$, where $a = [r/(v - 1)]$ and $b = r - (v - 1)[r/(v - 1)]$. The function $H(r)$ can then be expressed as a function $h(a, b)$,

\[ H(r) = \frac{(n - a(v - 1) - b)\, a^{v - b - 1} (a + 1)^b}{cn - (c - 1)(a(v - 1) + b)} = h(a, b), \text{ say.} \]

Let $h(a^*, b^*) = \max_{(a,b) \in \Theta} h(a, b)$, where $\Theta = \{(a, b) : 0 \le a \le [(n - 1)/(v - 1)],\ 0 \le b \le v - 1, \text{ with } v - 1 \le a(v - 1) + b \le n - 1\}$. To ensure $a^*(v - 1) + b^* = r^*$, it is necessary to prove that $H(r)$ is increasing in $r$ for $r \le r^*$, and is decreasing for $r \ge r^*$. To this end, we study some theoretical properties of $h(a, b)$ in the following five lemmas.

As $r$ varies from $v - 1$ to $n - 1$, for fixed $b$, $h(a, b)$ varies as a step function,

with jumps at the points where $r$ is a multiple of $v - 1$. In Lemma 6, we examine $h(a, b)$ as $a$ remains fixed, showing that $h(a, b)$ is either increasing, or decreasing, or increasing then decreasing on the integer interval $[0, v - 1]$.

Lemma 6. For fixed $a$ and $n \ge (c - 1)(v - 3)$, $h(a, b)$ is either increasing in $b$, or decreasing in $b$, or increasing in $b$ then decreasing in $b$, $0 \le b \le v - 1$.

Proof. For fixed $a \in \Theta$,

\[ h(a, b + 1) - h(a, b) = \frac{a^{v - b - 2} (a + 1)^b g(b)}{\big( cn - (c - 1)(a(v - 1) + b) \big) \big( cn - (c - 1)(a(v - 1) + b + 1) \big)}, \]

where $g(b) = (c - 1) b^2 - \big( cn + (c - 1)(n - 1) - 2a(c - 1)(v - 1) \big) b + \gamma(a)$, with

\[ \gamma(a) = (c - 1)(v - 1)^2 a^2 - \big( (c - 1)(n - 1)(v - 1) + (cv - c + 1) n \big) a + cn(n - 1). \]

The sign of $h(a, b + 1) - h(a, b)$ is the same as the sign of $g(b)$.

Let $\beta_1$ and $\beta_2$ be the two roots of $g(b)$ with $\beta_2 \ge \beta_1$. Then $\beta_1 + \beta_2 > 0$, as we observe that $cn + (n - 1)(c - 1) - 2a(c - 1)(v - 1) \ge n + c - 1$. The evaluation of $\beta_1 \beta_2$, or equivalently, $\gamma(a)$, at $a = 0$ and $a = (n - 1)/(v - 1)$ gives $\beta_1 \beta_2 |_{a = 0} = cn(n - 1)/(c - 1) > 0$, and $\beta_1 \beta_2 |_{a = (n - 1)/(v - 1)} = -n(n - 1)/((c - 1)(v - 1)) < 0$. Consequently, $\beta_1$ and $\beta_2$ satisfy either $0 \le \beta_1 \le \beta_2$ or $\beta_1 \le 0 \le \beta_2$. It can also be verified that $\beta_2$ is decreasing in $a$. Now, we proceed to show that if $n \ge (c - 1)(v - 3)$,

\[ \beta_2 - (v - 2) = \frac{(2c - 1) n - 2a(c - 1)(v - 1) - (c - 1) + \sqrt{(n + c - 1)^2 + 4an(c - 1)}}{2(c - 1)} - (v - 2) \]
\[ > \frac{2cn - 2(c - 1)(n - 1) - 2(c - 1)(v - 2)}{2(c - 1)}, \quad \text{evaluated at } a = \frac{n - 1}{v - 1}, \]
\[ = \frac{cn - (c - 1)(n + v - 3)}{c - 1} = \frac{n - (c - 1)(v - 3)}{c - 1} \ge 0. \]

Combining the above results, we have: for $0 \le v - 2 \le \beta_1 \le \beta_2$, $g(b) \ge 0$ for $b \in \Theta$; for $0 \le \beta_1 \le v - 2 \le \beta_2$, $g(b) \ge 0$ for $0 \le b \le \beta_1$, and $g(b) \le 0$ for $\beta_1 \le b \le v - 2$; for $\beta_1 \le 0 \le \beta_2$, $g(b) \le 0$ for $b \in \Theta$. In summary, for $0 \le v - 2 \le \beta_1 \le \beta_2$, $h(a, b)$ is increasing in $b$; for $0 \le \beta_1 \le v - 2 \le \beta_2$, $h(a, b)$ is increasing in $b$ for $0 \le b \le \beta_1$, and then decreasing in $b$ for $\beta_1 \le b \le v - 2$; for $\beta_1 \le 0 \le \beta_2$, $h(a, b)$ is decreasing in $b$. Lemma 6 is proved.

We next investigate $h(a, b)$ in the neighborhood of $(a, 0)$ in $\Theta$. For two consecutive $a$'s, $h(a, b)$ is shown to be monotonically increasing for $v - 1 \le r \le r^*$, and monotonically decreasing for $r^* \le r \le n - 1$. The proofs are classified into two types depending on whether $b^* = 0$ or not. Lemmas 9 and 10 deal with the cases when $b^* = 0$, whereas Lemmas 7 and 8 cover the remaining cases when $b^* \neq 0$.

For a somewhat neater derivation, we proceed by letting $x_1 = cn - a(c - 1)(v - 1)$, $x_2 = cn - (a + 1)(c - 1)(v - 1)$, $y_1 = n - a(v - 1)$, and $y_2 = n - (a + 1)(v - 1)$. Note that $x_1$, $x_2$, $y_1$, and $y_2 > 0$.

Lemma 7. For fixed $a$, $h(a, 0) \le h(a, 1)$ implies $h(a - 1, v - 2) \le h(a - 1, v - 1)$.

Proof. Observe that $h(a, 0) = h(a - 1, v - 1) = a^{v - 1} y_1 / x_1$, $h(a, 1) = (a + 1) a^{v - 2} (y_1 - 1)/(x_1 - c + 1)$, and $h(a - 1, v - 2) = (a - 1) a^{v - 2} (y_1 + 1)/(x_1 + c - 1)$, from which it follows that

\[ h(a - 1, v - 1) - h(a - 1, v - 2) = \frac{a^{v - 1} (x_1 + c - 1) y_1 - (a - 1) a^{v - 2} x_1 (y_1 + 1)}{x_1 (x_1 + c - 1)} \]

and

\[ h(a, 1) - h(a, 0) = \frac{(a + 1) a^{v - 2} x_1 (y_1 - 1) - a^{v - 1} (x_1 - c + 1) y_1}{x_1 (x_1 - c + 1)}. \]

Now since

\[ x_1 (x_1 + c - 1) \big( h(a - 1, v - 1) - h(a - 1, v - 2) \big) \]
\[ = a^{v - 1} (x_1 + c - 1) y_1 - (a - 1) a^{v - 2} x_1 (y_1 + 1) \]
\[ = (a + 1) a^{v - 2} x_1 (y_1 - 1) - a^{v - 2} \big( (n - va - 1) x_1 - a(c - 1) y_1 \big) - a^{v - 1} (x_1 - c + 1) y_1 + a^{v - 2} \big( (n - va + 1) x_1 - a(c - 1) y_1 \big) \]
\[ = x_1 (x_1 - c + 1) \big( h(a, 1) - h(a, 0) \big) + 2 a^{v - 2} x_1, \]

the following equality can be derived,

\[ h(a - 1, v - 1) - h(a - 1, v - 2) = \frac{\big( h(a, 1) - h(a, 0) \big)(x_1 - c + 1) + 2 a^{v - 2}}{x_1 + c - 1}, \]

and the lemma follows.

Lemma 8. For fixed $a$, $h(a, v - 2) \ge h(a, v - 1)$ implies $h(a + 1, 0) \ge h(a + 1, 1)$.

Proof. Since $h(a, v - 1) = h(a + 1, 0) = (a + 1)^{v - 1} y_2 / x_2$, $h(a, v - 2) = a (a + 1)^{v - 2} (y_2 + 1)/(x_2 + c - 1)$, and $h(a + 1, 1) = (a + 2)(a + 1)^{v - 2} (y_2 - 1)/(x_2 - c + 1)$, we clearly have

\[ h(a + 1, 0) - h(a + 1, 1) = \frac{(a + 1)^{v - 1} (x_2 - c + 1) y_2 - (a + 2)(a + 1)^{v - 2} x_2 (y_2 - 1)}{x_2 (x_2 - c + 1)}, \]

and

\[ h(a, v - 2) - h(a, v - 1) = \frac{a (a + 1)^{v - 2} x_2 (y_2 + 1) - (a + 1)^{v - 1} (x_2 + c - 1) y_2}{x_2 (x_2 + c - 1)}. \]

Now since

\[ x_2 (x_2 - c + 1) \big( h(a + 1, 0) - h(a + 1, 1) \big) \]
\[ = (a + 1)^{v - 1} (x_2 - c + 1) y_2 - (a + 2)(a + 1)^{v - 2} x_2 (y_2 - 1) \]
\[ = a (a + 1)^{v - 2} x_2 (y_2 + 1) - (a + 1)^{v - 2} \big( (va + v - n - 1) x_2 + (a + 1)(c - 1) y_2 \big) - (a + 1)^{v - 1} (x_2 + c - 1) y_2 + (a + 1)^{v - 2} \big( (va + v - n + 1) x_2 + (a + 1)(c - 1) y_2 \big) \]
\[ = x_2 (x_2 + c - 1) \big( h(a, v - 2) - h(a, v - 1) \big) + 2 (a + 1)^{v - 2} x_2, \]

hence,

\[ h(a + 1, 0) - h(a + 1, 1) = \frac{\big( h(a, v - 2) - h(a, v - 1) \big)(x_2 + c - 1) + 2 (a + 1)^{v - 2}}{x_2 - c + 1}, \]

and the result follows.
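The equality derived in the proof of Lemma 7 lends itself to a quick exact check. The sketch below codes $h(a, b)$ directly from its definition and verifies, with rational arithmetic, that $h(a - 1, v - 1) - h(a - 1, v - 2)$ equals $\big( (h(a, 1) - h(a, 0))(x_1 - c + 1) + 2a^{v-2} \big) / (x_1 + c - 1)$ for every admissible $a$. The parameter values $n = 30$, $v = 4$, $c = 9$ and the function names are choices made for this illustration only.

```python
from fractions import Fraction

def h(a, b, n, v, c):
    """h(a, b) = (n - a(v-1) - b) a^(v-b-1) (a+1)^b / (cn - (c-1)(a(v-1) + b))."""
    r = a * (v - 1) + b
    numerator = (n - r) * a ** (v - b - 1) * (a + 1) ** b
    return Fraction(numerator) / (c * n - (c - 1) * r)

def lemma7_identity_holds(a, n, v, c):
    """Compare both sides of the equality at the end of the proof of Lemma 7."""
    x1 = c * n - a * (c - 1) * (v - 1)
    lhs = h(a - 1, v - 1, n, v, c) - h(a - 1, v - 2, n, v, c)
    rhs = ((h(a, 1, n, v, c) - h(a, 0, n, v, c)) * (x1 - c + 1)
           + 2 * a ** (v - 2)) / (x1 + c - 1)
    return lhs == rhs

n, v, c = 30, 4, Fraction(9)
checks = [lemma7_identity_holds(a, n, v, c)
          for a in range(1, (n - 1) // (v - 1) + 1)]
```

Because the second term $2a^{v-2}/(x_1 + c - 1)$ is positive, a non-negative $h(a, 1) - h(a, 0)$ forces the left-hand side to be positive, which is the implication the lemma states.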

Lemma 9. Suppose $a_L$, $0 < a_L \le \frac{n - 1}{v - 1}$, satisfies $h(a_L, v - 2) \le h(a_L, v - 1)$. Then $h(a, v - 2) \le h(a, v - 1)$, $0 \le a \le a_L$.

Proof. Let $\eta_1(a) = (a + 1)(c - 1) y_2 + x_2 (y_2 - a)$. Through some algebraic computations we can show that $h(a, v - 1) \ge h(a, v - 2)$ if $\eta_1(a) \ge 0$. Here $\eta_1(a)$ is a quadratic function with a positive coefficient of $a^2$, with $\eta_1(0) = (n - v + 1)(cn - (c - 1)(v - 2)) > 0$ and $\eta_1(n/(v - 1) - 1) = -n(n - v + 1)/(v - 1) < 0$. Hence, there is a root of $\eta_1(a)$ between 0 and $n/(v - 1) - 1$, so that there exists an integer $a_L$, $0 < a_L < n/(v - 1) - 1$, such that $\eta_1(a_L) \ge 0$, which proves that $\eta_1(a) \ge 0$ for $0 \le a \le a_L$.

Lemma 10. Suppose $a_U$, $0 < a_U \le \frac{n - 1}{v - 1}$, satisfies $h(a_U, 0) \ge h(a_U, 1)$. Then $h(a, 0) \ge h(a, 1)$, $a_U \le a \le \frac{n - 1}{v - 1}$.

Proof. Lemma 10 is proved similarly. It is easy to show that $h(a, 0) - h(a, 1)$ is equal to $-a^{v - 2} (x_1 (x_1 - c + 1))^{-1} \eta_2(a)$, where $\eta_2(a) = a(c - 1) y_1 + x_1 (y_1 - a - 1)$. Observing that $\eta_2(a)$ is a convex function of $a$, with $\eta_2(0) = cn(n - 1) > 0$ and $\eta_2((n - 1)/(v - 1)) = -n(n - 1)/(v - 1) < 0$, there is an integer $a_U \le (n - 1)/(v - 1)$ such that $\eta_2(a_U) \le 0$, and $\eta_2(a)$ has one root in $(0, a_U)$ and one root in $((n - 1)/(v - 1), \infty)$. Hence, $\eta_2(a) \le 0$ for $a_U \le a \le [(n - 1)/(v - 1)]$.

Now we are ready to prove Theorem 11.

Theorem 11. Assume $n \ge (c - 1)(v - 3)$. If $\max_{r \in \mathbb{N},\, v - 1 \le r \le n - 1} H(r) = H(r^*)$, then $H(r)$ is increasing in $r$ for $v - 1 \le r \le r^*$, and is decreasing in $r$ for $r^* \le r \le n - 1$.

Proof. Let $r^* = a^*(v - 1) + b^*$. For cases when $r^*$ is divisible by $v - 1$, i.e., $b^* = 0$, applying Lemmas 6 and 9, the following chain of inequalities can be derived

\[ h(a^*, 0) = h(a^* - 1, v - 1) \ge h(a^* - 1, v - 2) \ge \cdots \ge h(a^* - 1, 0) = h(a^* - 2, v - 1) \ge h(a^* - 2, v - 2) \ge \cdots \ge h(a^* - 2, 0). \]

Proceeding in this manner, it can be shown that $H(r)$ is increasing in $r$ for $v - 1 \le r \le r^*$. Applying Lemmas 6 and 10 and following a similar procedure, the following chain of inequalities can be derived.

\[ h(a^*, 0) \ge h(a^*, 1) \ge \cdots \ge h(a^*, v - 1) = h(a^* + 1, 0) \ge h(a^* + 1, 1) \ge \cdots \ge h(a^* + 1, v - 1) = h(a^* + 2, 0). \]

Continuing this chain shows that $H(r)$ is decreasing in $r$ for $r^* \le r \le n - 1$.

For $b^* \neq 0$, by Lemmas 6 and 7, we can derive the following chain of inequalities,

\[ h(a^*, b^*) \ge h(a^*, b^* - 1) \ge \cdots \ge h(a^*, 1) \ge h(a^*, 0) = h(a^* - 1, v - 1) \ge h(a^* - 1, v - 2) \ge \cdots \ge h(a^* - 1, 1) \ge h(a^* - 1, 0). \]

Continuing this chain, it can be shown that $H(r)$ is increasing in $r$ for $v - 1 \le r \le r^*$. By Lemmas 6 and 8 and following a similar procedure, the following chain of inequalities can be derived.

\[ h(a^*, b^*) \ge h(a^*, b^* + 1) \ge \cdots \ge h(a^*, v - 2) \ge h(a^*, v - 1) = h(a^* + 1, 0) \ge h(a^* + 1, 1) \ge \cdots \ge h(a^* + 1, v - 2) \ge h(a^* + 1, v - 1). \]

Continuing this chain shows that $H(r)$ is decreasing in $r$ for $r^* \le r \le n - 1$. The proof of Theorem 11 is now complete.

Theorem 11 proves that $H(r)$ is unimodal, that is, $r_1 \le r_2 < r^*$ implies $H(r_2) \ge H(r_1)$, and $r^* < r_1 \le r_2$ implies $H(r_2) \le H(r_1)$, so that the difference between $r^*$ and $r_{opt,v-1}$ is less than 1. The optimal value $r^*$ can thus be found by choosing between $[r_{opt,v-1}]$ and $[r_{opt,v-1}] + 1$, depending on which of the two gives a larger value

for $H(r)$. More explicitly, $H(r^*) = \max \{ H([r_{opt,v-1}]), H([r_{opt,v-1}] + 1) \}$. The subsequent theorem further enables us to pinpoint $r^*$ directly without comparing $H([r_{opt,v-1}])$ with $H([r_{opt,v-1}] + 1)$.

Theorem 12.

(i) If $\gamma([r_{opt,v-1}/(v - 1)]) \le 0$, the design having $r^* = a^*(v - 1)$, where $a^* = [r_{opt,v-1}/(v - 1)]$, is a D-optimal exact design.

(ii) If $\gamma([r_{opt,v-1}/(v - 1)]) > 0$, the design having $r^* = a^*(v - 1) + [\lambda^*] + 1$, where

\[ \lambda^* = \frac{cn + (c - 1)(n - 1) - 2a^*(c - 1)(v - 1) - \sqrt{(n + c - 1)^2 + 4n(c - 1) a^*}}{2(c - 1)}, \]

is a D-optimal exact design.

Proof.

(i) $h(a, 0) \ge h(a, 1)$ if and only if $\gamma(a) \le 0$. Hence, if $\gamma([r_{opt,v-1}/(v - 1)]) \le 0$, then the maximum occurs at $b^* = 0$ and hence $r^* = a^*(v - 1)$.

(ii) If $\gamma(a) > 0$, then $r^* = a^*(v - 1) + b^*$ due to $h(a, 0) < h(a, 1)$, in which $b^*$ can be found by examining $h(a^*, b + 1) - h(a^*, b)$, i.e., $g(b)|_{a = a^*}$. Now let $\lambda^*$ be the smaller root of $g(b)|_{a = a^*}$; then $b^*$ is $[\lambda^*] + 1$.

Let us illustrate Theorem 12 by the following two examples. One is with $\gamma([r_{opt,v-1}/(v - 1)]) < 0$ and the other is with $\gamma([r_{opt,v-1}/(v - 1)]) > 0$. For $m = v - 1$, we obtain

\[ r_{opt,v-1} = \frac{n \big( 2c(v - 1) - v + 2 - \sqrt{(v - 2)^2 + 4c(v - 1)} \big)}{2(c - 1)(v - 1)}. \]

Example 3. Let $v = 4$, $n = 20$, and $c = 38.5$. By straightforward computation, $r_{opt,3} = 18.4367$, $a^* = 6$, and $\gamma([r_{opt,3}/3]) = \gamma(6) = -25 < 0$. A design having $r^* = 18$ is a D-optimal exact design.
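The rule in Theorem 12 is easy to compare against a direct search over $r$. The following Python sketch is an illustration under stated assumptions, not the author's code: `r_star_theorem12` follows parts (i) and (ii) of the theorem in floating point, while `r_star_brute_force` maximizes $H(r)$ exactly over $v - 1 \le r \le n - 1$. The two parameter sets used are $v = 4$, $n = 20$, $c = 38.5$ and $v = 4$, $n = 30$, $c = 9$; all function names are hypothetical.

```python
from fractions import Fraction
from math import floor, sqrt

def H(r, n, v, c):
    """H(r) for m = v - 1, computed exactly through h(a, b) with r = a(v-1) + b."""
    a, b = divmod(r, v - 1)
    c = Fraction(c)
    return Fraction((n - r) * a ** (v - b - 1) * (a + 1) ** b) / (c * n - (c - 1) * r)

def gamma(a, n, v, c):
    """gamma(a) as defined in the proof of Lemma 6."""
    return ((c - 1) * (v - 1) ** 2 * a ** 2
            - ((c - 1) * (n - 1) * (v - 1) + (c * v - c + 1) * n) * a
            + c * n * (n - 1))

def r_star_theorem12(n, v, c):
    """Locate r* through parts (i) and (ii) of Theorem 12."""
    r_opt = (n * (2 * c * (v - 1) - v + 2 - sqrt((v - 2) ** 2 + 4 * c * (v - 1)))
             / (2 * (c - 1) * (v - 1)))
    a = floor(r_opt / (v - 1))
    if gamma(a, n, v, c) <= 0:                       # part (i): b* = 0
        return a * (v - 1)
    lam = (c * n + (c - 1) * (n - 1) - 2 * a * (c - 1) * (v - 1)
           - sqrt((n + c - 1) ** 2 + 4 * n * (c - 1) * a)) / (2 * (c - 1))
    return a * (v - 1) + floor(lam) + 1              # part (ii): b* = [lambda*] + 1

def r_star_brute_force(n, v, c):
    """Exhaustive maximization of H(r) over v - 1 <= r <= n - 1."""
    return max(range(v - 1, n), key=lambda r: H(r, n, v, c))
```

For $v = 4$, $n = 20$, $c = 38.5$ both routines return 18, and for $v = 4$, $n = 30$, $c = 9$ both return 26.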

Example 4. For $v = 4$, $n = 30$, and $c = 9$, we easily calculate $r_{opt,3} = 25.8856$, $a^* = 8$, and $\gamma([r_{opt,3}/3]) = \gamma(8) = 150 > 0$. Now $\lambda^* = 1.405$, hence $r^* = 8(3) + 1 + 1 = 26$ gives a D-optimal exact design.

A development analogous to that of Lemma 6 to Theorem 12 can be carried out for the case $m = 1$, that is, $c_1 = 1$ and $c_2 = \cdots = c_v = c$. Now let $r = \sum_{i=2}^{v} n_i$, and write $x_3 = n + a(c - 1)(v - 1)$ and $x_4 = n + (a + 1)(c - 1)(v - 1)$.

As mentioned earlier in the case $m = v - 1$, it follows from Theorem 1 that, for a fixed value of $r$, the maximum value of $D(n_1, \dots, n_v)$ occurs when $(v - 1)(1 + [r/(v - 1)]) - r$ of the $n_i$ are $[r/(v - 1)]$, and $r - (v - 1)[r/(v - 1)]$ of the $n_i$ are $[r/(v - 1)] + 1$. Thus for $m = 1$,

\[ \max_{n_i,\, 1 \le i \le v,\ \sum_{i=1}^{v} n_i = n} D(n_1, \dots, n_v) = \max_{r \in \mathbb{N},\ v - 1 \le r \le n - 1} \frac{n - r}{n + (c - 1) r} \Big[ \frac{r}{v - 1} \Big]^{(v - 1)([\frac{r}{v - 1}] + 1) - r} \Big( \Big[ \frac{r}{v - 1} \Big] + 1 \Big)^{r - (v - 1)[\frac{r}{v - 1}]} = \max_{r \in \mathbb{N},\ v - 1 \le r \le n - 1} H_1(r), \text{ say.} \]

Writing $r = a(v - 1) + b$, we have

\[ \max_{r \in \mathbb{N},\ v - 1 \le r \le n - 1} H_1(r) = \max_{(a,b) \in \Theta} \frac{(n - a(v - 1) - b)\, a^{v - b - 1} (a + 1)^b}{n + (c - 1)(a(v - 1) + b)} = \max_{(a,b) \in \Theta} h_1(a, b), \text{ say.} \]

To prove the unimodality of $H_1(r)$, we begin, in the next lemma, by examining the function $h_1(a, b)$ as $a$ remains fixed but $b$ varies over all integers in the interval $[0, v - 1]$.

Lemma 13. For fixed $a$ and $n \ge c - 1$, $h_1(a, b)$ is either increasing in $b$, or decreasing in $b$, or increasing in $b$ then decreasing in $b$, $0 \le b \le v - 1$.

Proof. Fix $a$, $a \in \Theta$,

\[ h_1(a, b + 1) - h_1(a, b) = \frac{a^{v - b - 2} (a + 1)^b g_1(a, b)}{\big( n + (c - 1)(a(v - 1) + b) \big) \big( n + (c - 1)(a(v - 1) + b + 1) \big)}, \]

where $g_1(a, b) = -(c - 1) b^2 + \big( (c - 1)(n - 1) - n - 2a(c - 1)(v - 1) \big) b + \gamma_1(a)$ with

\[ \gamma_1(a) = -(c - 1)(v - 1)^2 a^2 + \big( (c - 1)(v - 1)(n - 1) - (v + c - 1) n \big) a + n(n - 1). \]

Accordingly, $g_1(a, b)$ is concave in $b$, and the sign of $g_1(a, b)$ determines where $h_1(a, b)$ is increasing and decreasing.

Let $\alpha_1$ and $\alpha_2$ be the two roots of $g_1(a, b)$, $\alpha_2 \ge \alpha_1$. It is not hard to show that $\alpha_1 + \alpha_2 < 0$ for $1 < c \le 2$. For $c > 2$, $\alpha_1 + \alpha_2 \le 0$ if $a \ge a_0$, and $\alpha_1 + \alpha_2 \ge 0$ if $a \le a_0$, where $a_0 = ((c - 1)(n - 1) - n)/(2(c - 1)(v - 1))$. Moreover, $\alpha_1 \alpha_2$ is a convex function of $a$, and the evaluation of $\alpha_1 \alpha_2$ at the points $a = 0$, $a = (n - 1)/(v - 1)$ and $a = (n - 1)/(2(v - 1))$ yields

\[ \alpha_1 \alpha_2 |_{a = 0} = -\frac{n(n - 1)}{c - 1} < 0, \qquad \alpha_1 \alpha_2 |_{a = \frac{n - 1}{v - 1}} = \frac{cn(n - 1)}{(c - 1)(v - 1)} > 0, \]

and

\[ \alpha_1 \alpha_2 |_{a = \frac{n - 1}{2(v - 1)}} = -\frac{(n - 1) \big( cn(v - 3) + (v - 1)(n - c + 1) \big)}{4(c - 1)(v - 1)} < 0. \]

Let $\lambda_1$ be the larger root of $\gamma_1(a)$. Through direct computation we have $a_0 \le (n - 1)/(2(v - 1)) \le \lambda_1 \le (n - 1)/(v - 1)$. Therefore, the sign of $\alpha_1 \alpha_2$ determines whether $\alpha_1 \le 0 \le \alpha_2$ or $\alpha_1 \le \alpha_2 \le 0$. It follows that if $0 \le a \le \lambda_1$ and $\alpha_2 \ge v - 2$, then $g_1(a, b) > 0$ for $b \in \Theta$. If $\lambda_1 \le a \le \frac{n - 1}{v - 1}$, then $g_1(a, b) < 0$ for $b \in \Theta$. If $0 \le a \le \lambda_1$ and $\alpha_2 \le v - 2$, then $g_1(a, b) \ge 0$ for $0 \le b \le \alpha_2$, and $g_1(a, b) \le 0$ for $\alpha_2 \le b \le v - 2$.

Summarizing our results, $h_1(a, b)$ is increasing in $b$ if $0 \le a \le \lambda_1$ and $\alpha_2 \ge v - 2$; if $0 \le a \le \lambda_1$ and $\alpha_2 \le v - 2$, then $h_1(a, b)$ is increasing in $b$ for $0 \le b \le \alpha_2$, and is decreasing in $b$ for $\alpha_2 \le b \le v - 2$; if $\lambda_1 \le a \le (n - 1)/(v - 1)$, $h_1(a, b)$ is decreasing in $b$. This completes the proof of Lemma 13.

Lemma 14. $h_1(a, 0) \le h_1(a, 1)$ implies $h_1(a - 1, v - 2) \le h_1(a - 1, v - 1)$, for

$1 \le a \le [\frac{n - 1}{v - 1}]$.

Proof. It is not hard to show that

\[ h_1(a - 1, v - 1) - h_1(a - 1, v - 2) = \frac{a^{v - 1} (x_3 - c + 1) y_1 - (a - 1) a^{v - 2} x_3 (y_1 + 1)}{x_3 (x_3 - c + 1)}, \]
\[ h_1(a, 1) - h_1(a, 0) = \frac{(a + 1) a^{v - 2} x_3 (y_1 - 1) - a^{v - 1} (x_3 + c - 1) y_1}{x_3 (x_3 + c - 1)}, \]

in which $h_1(a, 0) = h_1(a - 1, v - 1) = a^{v - 1} y_1 / x_3$, $h_1(a, 1) = (a + 1) a^{v - 2} (y_1 - 1)/(x_3 + c - 1)$, and $h_1(a - 1, v - 2) = (a - 1) a^{v - 2} (y_1 + 1)/(x_3 - c + 1)$. Now since

\[ x_3 (x_3 - c + 1) \big( h_1(a - 1, v - 1) - h_1(a - 1, v - 2) \big) \]
\[ = a^{v - 1} (x_3 - c + 1) y_1 - (a - 1) a^{v - 2} x_3 (y_1 + 1) \]
\[ = (a + 1) a^{v - 2} x_3 (y_1 - 1) + a^{v - 2} \big( (n - va + 1) x_3 - a(c - 1) y_1 \big) - a^{v - 1} (x_3 + c - 1) y_1 - a^{v - 2} \big( (n - va - 1) x_3 - a(c - 1) y_1 \big) \]
\[ = x_3 (x_3 + c - 1) \big( h_1(a, 1) - h_1(a, 0) \big) + 2 a^{v - 2} x_3. \]

Hence,

\[ h_1(a - 1, v - 1) - h_1(a - 1, v - 2) = \frac{\big( h_1(a, 1) - h_1(a, 0) \big)(x_3 + c - 1) + 2 a^{v - 2}}{x_3 - c + 1}, \]

from which the lemma follows.

Lemma 15. $h_1(a, v - 2) \ge h_1(a, v - 1)$ implies $h_1(a + 1, 0) \ge h_1(a + 1, 1)$, for $0 \le a \le [\frac{n - 1}{v - 1}] - 1$.

Proof. A simple evaluation gives us $h_1(a + 1, 0) = h_1(a, v - 1) = (a + 1)^{v - 1} y_2 / x_4$, $h_1(a, v - 2) = a (a + 1)^{v - 2} (y_2 + 1)/(x_4 - c + 1)$, and $h_1(a + 1, 1) = (a + 2)(a + 1)^{v - 2} (y_2 - 1)/(x_4 + c - 1)$. A direct computation yields

\[ h_1(a + 1, 0) - h_1(a + 1, 1) = \frac{(a + 1)^{v - 1} y_2 (x_4 + c - 1) - (a + 2)(a + 1)^{v - 2} x_4 (y_2 - 1)}{x_4 (x_4 + c - 1)}, \]
\[ h_1(a, v - 2) - h_1(a, v - 1) = \frac{a (a + 1)^{v - 2} x_4 (y_2 + 1) - (a + 1)^{v - 1} (x_4 - c + 1) y_2}{x_4 (x_4 - c + 1)}. \]

Now since

\[ x_4 (x_4 + c - 1) \big( h_1(a + 1, 0) - h_1(a + 1, 1) \big) \]
\[ = (a + 1)^{v - 1} y_2 (x_4 + c - 1) - (a + 2)(a + 1)^{v - 2} x_4 (y_2 - 1) \]
\[ = a (a + 1)^{v - 2} x_4 (y_2 + 1) + (a + 1)^{v - 2} \big( (a + 1)(c - 1) y_2 + (n - va - v + 1) x_4 \big) - (a + 1)^{v - 1} (x_4 - c + 1) y_2 - (a + 1)^{v - 2} \big( (a + 1)(c - 1) y_2 + (n - va - v - 1) x_4 \big) \]
\[ = x_4 (x_4 - c + 1) \big( h_1(a, v - 2) - h_1(a, v - 1) \big) + 2 (a + 1)^{v - 2} x_4, \]

we have

\[ h_1(a + 1, 0) - h_1(a + 1, 1) = \frac{(x_4 - c + 1) \big( h_1(a, v - 2) - h_1(a, v - 1) \big) + 2 (a + 1)^{v - 2}}{x_4 + c - 1}, \]

which gives Lemma 15.

Lemma 16. Suppose $a_L$, $0 < a_L \le \frac{n - 1}{v - 1}$, satisfies $h_1(a_L, v - 2) \le h_1(a_L, v - 1)$. Then $h_1(a, v - 2) \le h_1(a, v - 1)$, $0 \le a \le a_L$.

Proof. Let $\eta_3(a) = (y_2 - a) x_4 - (a + 1)(c - 1) y_2 = y_2 (x_4 - (a + 1)(c - 1)) - a x_4$. It is easy to show that $h_1(a, v - 1) - h_1(a, v - 2) = (a + 1)^{v - 2} (x_4 (x_4 - c + 1))^{-1} \eta_3(a)$, so that $h_1(a, v - 1) \ge h_1(a, v - 2)$ if $\eta_3(a) \ge 0$. Since $\eta_3(a)$ is concave in $a$, and evaluations of $\eta_3(a)$ at $a = 0$ and $a = n/(v - 1) - 1$ yield $\eta_3(0) = (n - v + 1)(n + (c - 1)(v - 2)) > 0$ and $\eta_3(n/(v - 1) - 1) = -cn(n - v + 1)/(v - 1) < 0$, respectively, $\eta_3(a)$ has a root between 0 and $n/(v - 1) - 1$, so that there exists an integer $a_L$, $0 \le a_L \le n/(v - 1) - 1$, such that $\eta_3(a_L) \ge 0$, and the lemma follows.

Lemma 17. Suppose $a_U$, $0 < a_U \le \frac{n - 1}{v - 1}$, satisfies $h_1(a_U, 0) \ge h_1(a_U, 1)$. Then $h_1(a, 0) \ge h_1(a, 1)$, $a_U \le a \le \frac{n - 1}{v - 1}$.

Proof. From an argument similar to that used in the proof of Lemma 10, we have $h_1(a, 0) - h_1(a, 1) = -a^{v - 2} (x_3 (x_3 + c - 1))^{-1} \eta_4(a)$, in which $\eta_4(a) = a(c - 1) y_1 - (y_1 - a - 1) x_3$, a convex function of $a$. By direct computation, we see that $\eta_4(0) = n(n - 1) > 0$