### Statistics (Venue: M211, Mathematics Building)

TMS Annual Meeting

## 2018 Mathematics Annual Meeting

### Dec. 8 / 09:30 - 21:00

### Dec. 9 / 09:30 - 15:50

### Speech Abstracts


### Dec. 8 / 09:30 - 21:00

11:20 - 12:05

### Yi-Ching Yao

A model bias problem arising from image analysis in cryogenic electron microscopy

13:30 - 14:15

### Wei-Ching Wang

Statistical models for evaluating different university admission channels

14:20 - 14:45

### Chun-Shu Chen

Measuring stabilization in model selection

### Shih-Hao Huang

Optimal designs for binary response models with multiple nonnegative variables

### A model bias problem arising from image analysis in cryogenic electron microscopy

### Yi-Ching Yao

### Institute of Statistical Science, Academia Sinica

### E-mail: yao@stat.sinica.edu

Cryogenic electron microscopy (cryo-EM) is an imaging technique used to construct the 3D structures of biological samples such as membrane proteins. In some cases, the extremely low signal-to-noise ratio of cryo-EM images causes the processing to be dictated by the reference model, a phenomenon known as model bias. A well-known example showed that a blurred Einstein face emerged from 1000 aligned images of pure noise (often referred to as “Einstein from noise”). To investigate this model bias phenomenon quantitatively, we consider a simplified model consisting of *n* i.i.d. *p*-dimensional images of pure Gaussian noise and a specified reference image (of Einstein). The *n* images of pure noise are sorted by their cross-correlation values with the reference image, and the top *m* images (of pure noise) are selected and averaged. We derive asymptotic distributions for the cross-correlation between the averaged image and the reference image as *n*, *p*, *m* → ∞ at suitable rates. (This is joint work with Shao-Hsuan Wang, Wei-Hau Chang and I-Ping Tu.)

**Keywords:** digital image, noise, correlation, asymptotic distribution
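The selection-and-averaging procedure described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a toy 64-pixel sine wave as a stand-in for the reference image and uses the Pearson correlation as the cross-correlation; the function names `correlation` and `einstein_from_noise` and the values of n, p and m are hypothetical choices made only for illustration.

```python
import math
import random


def correlation(x, y):
    """Pearson cross-correlation between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def einstein_from_noise(reference, n, m, rng):
    """Average the m pure-noise images that correlate best with the reference.

    Returns the cross-correlation between that average and the reference.
    """
    p = len(reference)
    # n i.i.d. p-dimensional images of pure Gaussian noise (no signal at all).
    noise = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Sort by cross-correlation with the reference, best matches first.
    noise.sort(key=lambda img: correlation(img, reference), reverse=True)
    top = noise[:m]
    average = [sum(img[j] for img in top) / m for j in range(p)]
    return correlation(average, reference)


rng = random.Random(0)
# Toy reference "image" (an assumption; the abstract uses a picture of Einstein).
reference = [math.sin(2 * math.pi * j / 64) for j in range(64)]

selected = einstein_from_noise(reference, n=2000, m=100, rng=rng)
unselected = einstein_from_noise(reference, n=2000, m=2000, rng=rng)
print(f"top-m average vs reference:    {selected:.3f}")
print(f"all-image average vs reference: {unselected:.3f}")
```

Averaging only the m images that best match the reference should give a much higher correlation with the reference than averaging all n images, even though every image is pure noise. This is the model bias that the abstract studies asymptotically.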


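### Statistical models for evaluating different university admission channels

### 陳佩珊, 王維菁 (Wei-Ching Wang), 洪慧念, Institute of Statistics, National Chiao Tung University

### E-mail: wjwang@stat.nctu.edu.tw

Most studies that evaluate multiple university admission channels focus on analyzing data from specific schools or departments. This study builds a probabilistic statistical model that describes the selection process of the application-based and examination-based admission channels. The model can adjust a variety of factors, including the overall number of applicants and the admission ratio, the discriminating power of the examinations, and applicants' behavior in ranking their preferences and in accepting or declining their placement results. Through simulation, we vary the model parameters to design a range of scenarios and compare the students admitted through application with those admitted through examination-based placement. Education authorities can use this statistical model (or a modified version) to predict the likely outcomes of different policies.

**Keywords:** admission channels, application-based admission, examination-based placement, statistical model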


### Measuring stabilization in model selection

### Chun-Shu Chen

### Institute of Statistics and Information Science, National Changhua University of Education

### E-mail: cschen@cc.ncue.edu.tw

Model selection and model averaging are essential to regression analysis, but determining which of the two approaches is the more appropriate and under what circumstances remains an active research topic. In this paper, we focus on geostatistical regression models for spatially referenced environmental data. For a general information criterion, we develop a new perturbation-based criterion that measures the uncertainty of spatial model selection, as well as an empirical rule for choosing between model selection and model averaging.

Statistical inference based on the proposed model selection instability measure is justified both in theory and via a simulation study. The predictive performance of model selection and model averaging can be quite different when the uncertainty in model selection is relatively large, but the performance becomes more comparable as this uncertainty decreases. For illustration, a precipitation data set in the state of Colorado is analysed. This is joint work with Jun Zhu and Tingjin Chu.

**Keywords:** Information criterion, model complexity, spatial prediction


### Optimal designs for binary response models with multiple nonnegative variables

### Shih-Hao Huang, Mong-Na Lo Huang, and Cheng-Wei Lin

### Department of Mathematics and Department of Applied Mathematics, National Central University and National Sun Yat-sen University

### E-mail: shhuang@math.ncu.edu.tw

In this work, we consider optimal approximate designs for binary response models with nonnegative explanatory variables. With respect to the Schur ordering, we construct an essentially complete class consisting of designs with a simple structure. In particular, we explicitly identify locally D-optimal designs within the class for logit and probit models. When the nonnegative explanatory variables have more restrictions, such as in factorial and mixture experiments, we also provide an informative iteration algorithm to search for an optimal design.

**Keywords:** $\phi_p$-optimality, D-optimality, essentially complete class, logit model, probit model, Schur ordering


### Functional data classification using covariate-adjusted subspace projection

### Pai-Ling Li

### Department of Statistics, Tamkang University

### E-mail: plli@gms.tku.edu.tw

We propose a covariate-adjusted subspace projection method for classifying functional data, where the covariate effects on the response functions influence the classification outcome. The proposed method is a subspace classifier based on functional projection, and the covariates affect the response function through the mean of a functional regression model. We assume that the response functions in each class are embedded in a class-specific subspace spanned by a covariate-adjusted mean function and a set of eigenfunctions of the covariance kernel through the covariate-adjusted Karhunen-Loève expansion. A newly observed response function is classified into the optimally predicted class, that is, the class with the minimal distance between the observation and its projection onto the subspaces among all classes. The covariate adjustment is useful for functional classification, especially when the covariate effects on the mean functions are significantly different among the classes. Numerical performance of the proposed method is demonstrated by simulation studies, with an application to a data example. This is joint work with Jeng-Min Chiou and Yu Shyr.

**Keywords:** classification, discriminant analysis, functional data analysis,
functional principal component analysis


### Statistical inference for the accelerated failure time model under multivariate outcome-dependent sampling design

### Tsui-Shan Lu

### Department of Mathematics, National Taiwan Normal University

### E-mail: tslu@ntnu.edu.tw

Researchers are always seeking cost-effective designs due to limited budgets, especially for large biomedical or epidemiological studies. An outcome-dependent sampling (ODS) design, a retrospective sampling scheme, has been shown to improve study efficiency while effectively reducing the monetary burden. Under the ODS design, one observes the covariates with a probability depending on the outcome and selects several supplemental samples from the most informative and appealing segments. Lu, Longnecker, and Zhou (2017) extended the ODS design to incorporate the multivariate data that often appear in recent studies and proposed a further generalization of biased sampling.

In this talk, we consider a multivariate ODS (MODS) design for time-to-different-events data under the framework of a semiparametric accelerated failure time (AFT) model, allowing multiple disease outcomes with clustered failure times. We establish an estimating equation approach to estimate parameters based on induced smoothing. The asymptotic properties of the proposed estimators are developed. Simulation results show that the proposed design is more efficient and powerful than other existing approaches. The proposed method is illustrated with a real data set.

**Keywords:** outcome-dependent sampling, multivariate, AFT model, semiparametric


### Test Statistics of Pearson-Fisher’s Type with Some Remarks on the Degrees of Freedom

### Wei-Hsiung Chao

### Department of Applied Mathematics, National Dong Hwa University

### E-mail: whchao@gms.ndhu.edu.tw

Pearson-Fisher’s tests have been widely used for assessing the fit of a model for the categorical response in settings of a single multinomial or product multinomials. The statistic used in these tests can be viewed as a quadratic form in the differences between the observed totals and fitted totals which uses as a weighting matrix a particular nonsingular generalized inverse for the singular variance-covariance matrix of the differences. Using properties of inner product spaces and the rank condition, we demonstrate an alternative way to determine the degrees of freedom of the asymptotic null distribution of these Pearson-Fisher statistics.
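The quadratic-form view can be written out in the simplest setting. The lines below are only an illustrative sketch under a simplifying assumption that is not in the abstract: a single multinomial with $n$ trials and known cell probabilities $\pi = (\pi_1, \dots, \pi_J)^\top$, so that no parameters are fitted. The symbols $N$, $d$, $\Sigma$, $G$ and $D_\pi$ are notation introduced here.

$$
\begin{aligned}
d &= N - n\pi, \qquad \Sigma = \operatorname{Var}(d) = n\,(D_\pi - \pi\pi^\top), \qquad D_\pi = \operatorname{diag}(\pi_1, \dots, \pi_J), \\
G &= (n D_\pi)^{-1}, \qquad \Sigma\, G\, \Sigma = n\,(I - \pi \mathbf{1}^\top)(D_\pi - \pi\pi^\top) = n\,(D_\pi - \pi\pi^\top) = \Sigma, \\
X^2 &= d^\top G\, d = \sum_{j=1}^{J} \frac{(N_j - n\pi_j)^2}{n\pi_j}.
\end{aligned}
$$

Here $\Sigma$ is singular because $\Sigma \mathbf{1} = 0$, while $G$ is a nonsingular generalized inverse of $\Sigma$. In this known-probability case the asymptotic null distribution is chi-square with $\operatorname{rank}(\Sigma) = J - 1$ degrees of freedom; with fitted totals the count changes, which is the subject of the talk.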

To assess the fit of polytomous regression models with only categorical covariates, it is also appropriate to use the Pearson-Fisher’s test for product multinomials, since the response observations within each covariate pattern are homogeneous, so that their total can be viewed as a single non-sparse multinomial. In the presence of continuous covariates, direct use of this method is not appropriate, since the response observations within each categorical covariate pattern can be quite heterogeneous. To overcome this limitation, many ad hoc extensions of Pearson-Fisher’s chi-squared statistics have been proposed for binary and ordinal logistic regression models using some sort of grouping strategy. For example, Hosmer and Lemeshow (1980) suggested partitioning the observations into g groups of equal size based on the fitted probabilities. Their statistic is then formed as a sum of Pearson’s statistics over all groups. With a small number of groups, these statistics are not close to a chi-square distribution, since the within-group observations are more heterogeneous, so that the observed totals within a group are actually underdispersed relative to the multinomial distribution.

Through extensive simulations, these authors showed that their statistic has a chi-square null distribution for a certain range of numbers of groups and for the covariate distributions considered. We will also discuss the degrees of freedom of their statistic from the viewpoint of the rank condition.

**Keywords:** goodness of fit, Pearson’s chi-square test, rank condition
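The grouping strategy attributed to Hosmer and Lemeshow in this abstract can be sketched for a binary response as follows. This is a minimal illustration, not the speaker's implementation: the function name `hosmer_lemeshow` and the tiny data set are hypothetical, and the sketch assumes that all fitted probabilities lie strictly between 0 and 1 and that there are at least g observations.

```python
def hosmer_lemeshow(y, fitted, g=10):
    """Sum of per-group Pearson statistics for binary outcomes.

    y      : list of 0/1 responses
    fitted : list of fitted event probabilities, each strictly in (0, 1)
    g      : number of (nearly) equal-size groups
    """
    # Order the observations by fitted probability (sorted() is stable for ties).
    order = sorted(range(len(y)), key=lambda i: fitted[i])
    n = len(order)
    statistic = 0.0
    for k in range(g):
        group = order[k * n // g:(k + 1) * n // g]
        size = len(group)
        observed = sum(y[i] for i in group)        # observed events
        expected = sum(fitted[i] for i in group)   # fitted events
        # Pearson statistic over the two cells (events, non-events) of the group.
        statistic += (observed - expected) ** 2 / expected
        statistic += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return statistic


# Hypothetical toy data: four observations, constant fitted probability 0.5.
y = [1, 1, 0, 0]
fitted = [0.5, 0.5, 0.5, 0.5]
print(hosmer_lemeshow(y, fitted, g=2))
```

The two-cell form used inside the loop is algebraically the same as dividing the squared difference between observed and fitted events by the group size times the mean fitted probability times one minus that mean.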


### On Fixed Effects Estimation for Spatial Regression Under the Presence of Spatial Confounding

### Yung-Huei Chiou

### Department of Mathematics, National Changhua University of Education

### E-mail: S0322011@gm.ncue.edu.tw

Spatial regression models are often used to analyze ecological and environmental data sets over a continuous spatial support. Collinearity among covariates is often considered in modeling, but the relationship between covariates and unobserved spatial random processes is only rarely discussed. Past research has shown that ignoring this relationship (that is, spatial confounding) can significantly influence the estimation of regression parameters. To address this problem, restricted spatial regression is used to ensure that the unobserved spatial random process is orthogonal to the covariates, but the related inferences are mainly based on Bayesian frameworks. In this thesis, an adjusted generalized least squares estimation method is proposed to estimate the regression coefficients, resulting in estimators that perform better than the conventional methods. Under the frequentist framework, statistical inferences for the proposed methodology are justified both in theory and via simulation studies. Finally, a water acidity data set from the Blue Ridge region of the eastern U.S. is analyzed for illustration. This is joint work with Hong-Ding Yang and Chun-Shu Chen.
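The orthogonality idea behind restricted spatial regression can be written down in one common form. The display below is a hedged sketch rather than the speaker's formulation: the symbols $y$, $X$, $\beta$, $w$, $\varepsilon$ and $P_X$ are notation assumed here, with $w$ the unobserved spatial random process and $X$ of full column rank.

$$
\begin{aligned}
\text{unrestricted:} \quad & y = X\beta + w + \varepsilon, \\
\text{restricted:} \quad & y = X\beta + (I - P_X)\,w + \varepsilon, \qquad P_X = X (X^\top X)^{-1} X^\top .
\end{aligned}
$$

Because $X^\top (I - P_X) = 0$, the restricted random component $(I - P_X)\,w$ is orthogonal to every column of $X$, so it cannot compete with the covariates in explaining the response.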

**Keywords:** Bias, Generalized least squares, Maximum likelihood estimate,
Random effects, Restricted spatial regression


### Estimation and selection for spatial regression under the presence of spatial confounding

### Hong-Ding Yang

### Institute of Statistics and Information Science, National Changhua University of Education

### E-mail: hdyang@cc.ncue.edu.tw

The spatial random effects model is popular in analyzing spatially referenced data. The model includes spatially observed covariates and unobserved spatial random effects; if the confounding between these two components is not dealt with properly, parameter estimation and spatial prediction have been shown to be unreliable. In this research, we focus on the estimation of regression coefficients and the selection of covariates for spatial regression under the presence of spatial confounding. We first introduce an adjusted estimation method for the regression coefficients and the consequent spatial predictor when spatial confounding exists. From a prediction point of view, we then propose a generalized conditional Akaike information criterion to select a subset of covariates, resulting in satisfactory variable selection and spatial prediction. Statistical inferences for the proposed methodology are justified theoretically and numerically. This is joint work with Yung-Huei Chiou and Chun-Shu Chen.

**Keywords:** conditional information criterion, mean squared prediction error, restricted spatial regression, spatial prediction, variable selection


### Popular Mathematics (Venue: H301, General Building)
