國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Section 3 the Meaning of Zero Skewness and Zero Kurtosis of the Discrete Probability Distributions
Some may think the normal approximated discrete probability distribution can be
characterized as having both skewness and kurtosis values of 0. However, the above
characterization may not be reasonable. It can be show as following example. If we
symmetrically divided a normal distribution between -3 and 3 into equal-interval
categories probability distributions with number of categories from 3 to 20, the
skewness are all 0 but none of the kurtosis is 0. The values of kurtosis are presented in
Table 6. It can be seen that all the values of kurtosis are smaller than 0 except in the
case that the number of categories is 3. The more categories of the discrete probability
distribution, the closer the value of kurtosis gets to 0. The range of kurtosis is -.26 in
the 4 categories probability distribution and -.17 in the 20 categories probability
distribution. Therefore, the discrete probability distribution may not be normal
approximated when the discrete probability distribution with both the values of
skewness and kurtosis are 0.
In addition, the probability distribution might not be symmetric even with zero
skewness. While relatively large value of kurtosis is specified for a discrete
probability distribution, in order to satisfying the constraint, the probability might be
highly concentrated in 1 category. This makes the even categories probability
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
distributions might be asymmetric when skewness is 0. When the value of skewness is
set at 0 and the value of kurtosis is set at a relatively high value (such as 0 in the 4
categories probability distributions), large probability could appear in only 1 category
near center, and thus the discrete probability distribution would be asymmetric.
Although the MEP-2 and MEP-4 don’t tend to generate the probability distributions
with peaks since the maximum entropy distributions are the smoothest one, they still
could not guarantee the symmetric with unsupported information.
Section 4 the Uniqueness of the Solutions of the MEP-4 and MEP-2
The uniqueness of the solutions with the maximum entropy with the first 4
moments has been discussed and proved (Kapur & Kesavan, 1992; Kesavan & Kapur,
1989; Zellner & Highfield, 1988). However, the uniqueness of the solutions of MEP-2
is still in question. In this research, it could be seen through empirically that in some
conditions that the solutions of MEP-2 are not unique.
First, it has been proven that the solutions on the boundary of skewness and
kurtosis would be solved if and only if the number of nonempty categories are 2
(Wilkins, 1944). Therefore, in a k categories probability distribution, there are
2 k
solutions with the same value of information entropy would be generated when the
values of skewness and kurtosis are on the boundary. It’s obvious there’s no unique
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
solution in this situation. A more efficient R package for this situation is proposed in
Appendix C. Since this package is used for the boundary, the only information needed
is the value of skewness. The solutions obtained through this package are quite
accurate. It’s suggested to use the package when the set value of skewness and
kurtosis are on the boundary.
Second, there are some discrete maximum entropy distributions with zero
skewness are not symmetric as mentioned. These probability distributions could be
flipped over and the skewness, kurtosis and entropy would be still the same. It shows
that if the estimated discrete maximum entropy distributions are not symmetric with 0
skewness, there are at least 2 different maximum entropy distributions exist. Through
above discussion, it can be seen that the uniqueness of the solutions in this situation
does not hold.
Through discussion above, it can be seen that although the uniqueness of the
MEP-4 is proved, the uniqueness of the MEP-2 is still in question. The uniqueness of
solutions of MEP-2 would not be held in at least 2 situations. The situations are 1. the
values of skewness and kurtosis are on the boundary. 2. The maximum entropy
distributions are not symmetric when skewness is 0.
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Section 5 the Maximum Entropy Distributions with Prior Probability Distributions
The MEP presented in this research is restricted the known information as the
parameters of the target probability distribution. With more information of the shape
other than the parameters of the target probability distribution, another procedure of
maximum entropy family might be more useful. The probability of each category of
the probability distribution need to be set and the probability distribution is defined as
prior probability distribution. With probability of each category of the prior discrete
probability distribution are set, the measure Kullback–Leibler divergence introduced
by Kullback and Leiber (1951) could be used to choosing the maximum entropy
distribution satisfying the parameters and with the smallest divergence to the prior
probability distribution. The measure Kullback–Leibler divergence for the discrete
probability distributions is defined as follows,
k
i i
i
i q
p p
1
ln (24)
where pi and qi are the probability of ith category of the estimated probability
distribution A and the prior probability distribution B respectively. The minimization
of formula (24) entails the maximization of the entropy of distribution A. The
Kullback–Leibler divergence is also defined as cross entropy. It seeks to determine the
probability distribution A that satisfies all the constraints and with the closest
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
probability values to the given probability distribution B. It means that the
probabilities of A would be modified to the closest values to the corresponding
probabilities of B and fulfilled the constraints. Therefore, with a clear picture of the
shape of probability distribution in mind, the cross entropy could be used and a proper
probability distribution could be suggested.
Section 6 Conclusion
The discrete variables are quite common in psychological researches, but the
robustness researches on this topic are relatively scant. When conducting the Monte
Carlo researches for understanding the effect of the parameters, there are 2 difficulties
would be encountered. The first difficulty is to estimate the discrete probability
distribution with the specified parameters precisely and the second difficulty is to
choose the discrete probability distribution from infinite discrete probability
distribution with the same parameters when the number of constraints is smaller than
the number of categories. Therefore a procedure to estimate the discrete probability
distribution is necessary and practical. The Maximum Entropy Procedure (MEP)
suggests to estimating the discrete maximum entropy distribution with specified
parameters as constraints. The maximum entropy distribution is the greatest number
distribution from all discrete probability distributions satisfying the same constraints.
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Therefore, it’s a rationale choice if only the values of the parameters are known
information. In addition, the discrete maximum entropy distribution chosen through
MEP are smooth and have no empty categories unless necessary. These properties of
maximum entropy distributions make MEP a suitable procedure to follow in
simulating the discrete probability distribution of psychological research with the
specified parameters as the only information. The Maximum Entropy Procedure with
4 Parameters (MEP-4) and Maximum Entropy Procedure with 2 Parameters (MEP-2)
are presented, evaluated and discussed. Moreover, the R packages of these procedures
are also presented in this research for the implementation of these procedures.
Furthermore, the accuracy of these procedures also the R packages is confirmed by
three simulation studies in this research and the possible generated parameter spaces
of these procedures are consistent with our knowledge. The R packages could
estimate the discrete probability distribution with specified information and also
sample specified number of samples from this estimated distribution with specified
sample size. Therefore, theses programs are convenient and practical for the Monte
Carlo researches of the robustness of the statistics with the univarate non-normal
approximated discrete data.
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Reference
Allen, R. C., Bottcher, C., Bording, P., Burns, P., Conery, J., Davies, T. R., et al.
(1996). Computational Science Education Project. In M. Baranger, J. Dongarra,
G. Fox & D. Schneider (Eds.) Available from http://www.phy.ornl.gov/csep
Anderson, M. J. (2001). A new method for non-parametric multivariate analysis of
variance. Austral Ecology, 26, 32-46.
Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H., & Tukey, J.
W. (1972). Robust estimates of location: Survey and Advances. NJ: Princeton
University Press.
Ansell, M. J. G. (1973). Robustness of location estimators to asymmetry. Applied
Statistics, 22, 249-254.
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: John
Wiley & Sons, Inc.
Bollen, K. A., & Barb, K. H. (1981). Pearson's r and coarsely categorized measures.
American Sociological Review, 46, 232-239.
Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to
nonnormality and specification error in confirmatory factor analysis.
Psychological Methods, 1, 16-29.
David, H. A., & Shu, V. S. (1978). Robustness of location estimators in the presence
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
of an outlier. In H. A. David (Ed.), Contributions to survey sampling and applied
statistics. New York: Academic Press.
Erceg-Hurn, D. M., & Mirosevich, V. M. (2008). Modern robust statistical methods:
An easy way to maximize the accuracy and power of your research. American
Psychologist, 63, 591-601.
Fleishman, A. I. (1978). A method for simulating non-normal distributions.
Psychometrika, 43, 521-532.
Golan, A., Judge, G., & Miller, D. (1996). Maximum Entropy Econometrics: Robust
Estimation with Limited Data. New York: John Wiley & Sons, Inc.
Gorsuch, R., L. (1983). Factor analysis. NJ: Lawrence Erlbaum Associates, Inc.
Guttman, L. (1948). An inequality for kurtosis. The Annals of Mathematical Statistics,
2, 277-278.
Hipp, J. R., & Bollen, K. A. (2003). Model fit in structural equation models with
censored, ordinal, and dichotomous variables: Testing vanishing tetrads.
Sociological Methodology, 33, 267-305.
Jaynes, E. T. (1957a). Information theory and statistical mechanics. Physics Review,
106, 620-630.
Jaynes, E. T. (1957b). Information theory and statistical mechanics II. Physics Review,
108, 171-190.
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Jaynes, E. T. (1982). On the rationale of maximum-entropy methods Proceedings of
the IEEE, 70, 939-952.
Kapur, J. N., & Kesavan, H. K. (1992). Entropy Optimization Principles with
Applications. Boston: Academic Press.
Kesavan, H. K., & Kapur, J. N. (1989). The generalized maximum entropy principle.
IEEE Transactions on Systems, Man, and Cybernetics, 19(5), 1042-1052.
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annuals of
Mathematical Statistics, 22, 79-86.
Lei, M., & Lomax, R. G. (2005). The effect of varying degrees of nonnormality in
structural equation modeling. Structural Equation Modeling: A Multidisciplinary
Journal, 12, 1-27.
Martin, W. S. (1973). The effects of scaling on the correlation coefficient: A test of
validity. Journal of Marketing Research, 10, 316-318.
Martin, W. S. (1978). Effects of scaling on the correlation coefficient: additional
considerations. Journal of Marketing Research, 15, 304-308.
Mattson, S. (1997). How to generate non-normal data for simulation of structural
equation models. Multivariate Behavioral Research, 32, 355-373.
Micceri, T. (1989). The Unicorn, The normal curve, and other improbable creatures.
Psychological Bulletin, 105, 156-166.
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Mohammad-Djafari, A., & d'Electricite, E. S. (1992). A MATLAB program to
calculate the maximum entropy distributions. In C. R. Smith, G. J. Erickson & P.
O. Neudofer (Eds.), Maximum Entropy and Bayesian Methods Seattle, 1991
(Vol. 50, pp. 221-233). Dordrecht, Netherland: Kluwer Academic Publisher.
Muthén, B., & Kaplan, D. (1985). A comparison of some methodologies for the factor
analysis of non-normal Likert variables. British Journal of Mathematical and
Statistical Psychology, 38, 171-189.
Muthén, B., & Kaplan, D. (1992). A comparison of some methodologies for the factor
analysis of non-normal Likert variables: A note on the size of the model. British
Journal of Mathematical and Statistical Psychology, 45, 19-30.
Nelder, J. A., & Mead, R. (1965). A simplex algorithm for function minimization.
Computer Journal, 7, 308-313.
Olsson, U. (1979). On the robustness of factor analysis against crude classification of
the observations. Multivariate Behavioral Research, 14, 485-500.
Ory, D. T., & Mokhtarian, P. L. (2010). The impact of non-normality, sample size and
estimation technique on goodness-of-fit measures in structural equation
modeling: Evidence from the empirical models of travel behavior. Quality and
Quantity, 44, 427-445.
Pearson, E. S., & Please, N. W. (1975). Relation between the shape of population
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
distribution and the robustness of four simple test statistics Biometrika, 62,
223-241.
Pearson, K. (1916). Mathematical contributions to the theory of evolution. XIX.
Second supplement to a memoir on skew variation. Philosophical Transactions
of the Royal Society of London. A, 216, 429-457.
Reinartz, W. J., Echambadi, R., & Chin, W. W. (2002). Generating non-normal data
for simulation of structural equation models using Mattson's method.
Multivariate Behavioral Research, 37, 227-244.
Tadikamalla, P. R. (1980). On simulating non-normal distributions. Psychometrika, 45,
273-279.
Theil, H., & Fiebig, D. G. (1984). Exploiting Continuity: Maximum Entropy
Distribution of Continuous Distribution. Cambridge, MA: Ballinger Publishing
Company.
Wilcox, R. R. (1998). How many discoveries have been lost by ignoring modern
statistical methods. American Psychologist, 53, 300-314.
Wilkins, J. E. (1944). A note on skewness and kurtosis. The Annals of Mathematical
Statistics, 15, 333-335.
Wu, N. (1997). Rationale of the maximum entropy method In T. S. Huang, T.
Kohonen & M. Schroeder (Eds.), The Maximum Entropy Method. New York:
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Springer.
Zellner, A., & Highfield, R. A. (1988). Calculation of maximum entropy distributions
and approximation of marginal posterior distributions. Journal of Ecnometrics,
37, 195-209.
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Table 1 the Descriptive Statistics and Root Mean Square Errors of Difference between the Specified and Obtained 4 Parameters of 6 and 7 (in brackets) Categories Discrete Measures of the Study 1
N=11590
(N=30292) Descriptive Statistics Root Mean Squared Difference
Difference Min Max Mean Variance Mean Variance
mean -.000(-.000) .000(.000) -.000(-.000) .000(.000) .000(.000) .000(.000) variance -.000(-.000) .000(.000) -.000(-.000) .000(.000) .000(.000) .000(.000) skewness -.001(-.001) .001(.001) -.000( .000) .000(.000) .000(.000) .000(.000) kurtosis -.001(-.001) .000(.001) -.000(-.000) .000(.000) .000(.000) .000(.000)
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Table 2 the Pearson Correlation Coefficients of the 4 Parameters of 6 (lower triangle) and 7 (upper triangle) Categories Discrete Measures of the Study 1
Mean Variance Skewness Kurtosis
Mean 1 0 -.75 0
Variance 0 1 0 -.42
Skewness -.77 0 1 0
Kurtosis 0 -.54 0 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Table 3 the Descriptive Statistics and Root Mean Square Errors of Difference between the Specified and Obtained 4 Parameters of 6 and 7 (in brackets) Categories Discrete Measures of the Study 2
N=53,124
(N=230,223) Descriptive Statistics Root Mean Squared Difference
Difference Min Max Mean Variance Mean Variance
mean -.000(-.000) .000(.000) -.000(-.000) .000(.000) .000(.000) .000(.000) variance -.000(-.000) .000(.000) -.000(-.000) .000(.000) .000(.000) .000(.000) skewness -.000(-.000) .000(.000) -.000( .000) .000(.000) .000(.000) .000(.000) kurtosis -.000(-.000) .000(.000) -.000(-.000) .000(.000) .000(.000) .000(.000)
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Table 4 the Pearson Correlation Coefficients of the 4 Parameters of 6 (lower triangle) and 7 (upper triangle) Categories Discrete Measures of the Study 2
Mean Variance Skewness Kurtosis
Mean 1 0 -.77 0
Variance 0 1 0 -.52
Skewness -.77 0 1 0
Kurtosis 0 -.52 0 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Table 5 the Descriptive Statistics and Root Mean Square Errors of Difference between the Specified and Obtained 2 Parameters of 4 and 5 (in brackets) Categories Discrete Measures of the Study 3
N=925
(N=925) Descriptive Statistics Root Mean Squared Difference
Difference Min Max Mean Variance Mean Variance
skewness -.001(-.001) .001(.001) -.000(-.000) .000(.000) .000(.000) .000(.000) kurtosis -.001(-.001) .001(.001) -.000(-.000) .000(.000) .000(.000) .000(.000)
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Table 6 the Values of Kurtosis of 3 to 20 Equal-Interval Categories Discrete Measures from the Normal Distribution
Number of Categories Values of Kurtosis
3 .170
4 -.264
5 -.217
6 -.203
7 -.194
8 -.189
9 -.185
10 -.182
11 -.180
12 -.179
13 -.178
14 -.177
15 -.176
16 -.175
17 -.175
18 -.174
19 -.174
20 -.174
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 1 The Maximum Entropy Distribution and the First Probability Distribution of
Muthén’s Research
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 2 The Maximum Entropy Distribution and the Second Probability Distribution
of Muthén’s Research
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 3 The Maximum Entropy Distribution and the Third Probability Distribution of
Muthén’s Research
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 4 The Maximum Entropy Distribution and the Forth Probability Distribution of
Muthén’s Research
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 5 The Maximum Entropy Distribution and the Fifth Probability Distribution of
Muthén’s Research
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 6 The Modified Maximum Entropy Distribution and the Fifth Probability
Distribution of Muthén’s Research
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 7 Generated Parameter Space of Mean and Variance in Study 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 8 Generated Parameter Space of Mean and Skewness in Study 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 9 Generated Parameter Space of Mean and Kurtosis in Study 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 10 Generated Parameter Space of Variance and Skewness in Study 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 11 Generated Parameter Space of Variance and Kurtosis in Study 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 12 Generated Parameter Space of Skewness and Kurtosis in Study 1
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 13 Generated Parameter Space of Mean and Variance in Study 2
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 14 Generated Parameter Space of Mean and Skewness in Study 2
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 15 Generated Parameter Space of Mean and Kurtosis in Study 2
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 16 Generated Parameter Space of Variance and Skewness in Study 2
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 17 Generated Parameter Space of Variance and Kurtosis in Study 2
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Figure 18 Generated Parameter Space of Skewness and Kurtosis in Study 2
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Appendix the Code of the R Package for the MEP-4
MEP4=function(points,mean,variance,skewness,kurtosis,size=0,number=0){
center=points/2+0.5 scale=-(1-center)
X=matrix(0,5,points) for (i in 1:nrow(X)){
X[i,]=rep(1,points)*((seq(1:points)-(points/2+0.5))/scale)^(i-1) }
mea=(mean-center)/scale var=variance/scale^2 skew=skewness kurt=kurtosis+3
x1=mea
x2=(var)+(mea)^2
x3=((var)^(3/2))*(skew)+(3*x2*x1-2*(x1^3))
x4=((var)^2)*(kurt)+((4*x3*x1)-(6*x2*(x1^2))+3*(x1^4)) moment=rbind(x1,x2,x3,x4)
M=c(1,moment)
lambda=matrix(c(-log(1/points),0,0,0,0),1,5)
for (t in 1:1000){
P=exp(-1*lambda%*%X)
gnk=matrix(0,5,5) for (i in 1:5){
for (j in 1:5){
gnk[i,j]=P%*%as.matrix(X[2,]^(i+j-2)) }}
v=X%*%t(P)-M
z=try(solve(gnk),silent=T)
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
if(is(z,"try-error")) break else { delta=solve(gnk)%*%v
if (sum(is.na(abs(delta)))>0) break else { if (sum(abs(delta))<1e-20) break else { lambda=lambda+t(delta)}
}}
}
mean.est=(X[2,]%*%t(P))*scale+center
var.est=(X[3,]%*%t(P)-(X[2,]%*%t(P))^2)*scale^2
skew.est=(X[4,]%*%t(P)-3*X[3,]%*%t(P)*X[2,]%*%t(P)+2*(X[2,]%*%t(P))^3)/(X[
3,]%*%t(P)-(X[2,]%*%t(P))^2)^(3/2)
kurt.est=(X[5,]%*%t(P)-4*X[4,]%*%t(P)*X[2,]%*%t(P)+6*(X[3,]%*%t(P))*(X[2,]
%*%t(P))^2-3*(X[2,]%*%t(P))^4)/((X[3,]%*%t(P)-(X[2,]%*%t(P))^2)^2)-3
P1=P
for (c in 1:ncol(P1)){
if (P1[1,c]==0) P1[1,c]=1 }
entropy=sum(log(P1)*P1)*-1
diff=c(mean-mean.est,variance-var.est,skewness-skew.est,kurtosis-kurt.est) abs=abs(diff)
output=c(mean,variance,skewness,kurtosis,P,mean.est,var.est,skew.est,kurt.est,entropy ,diff,abs)
names(output)=c("set.mean","set.var","set.skew","set.kurt",paste("p",1:points,sep="") ,"est.mean","est.var","est.skew","est.kurt","entropy","diff.mean","diff.var","diff.skew
","diff.kurt","abs.mean","abs.var","abs.skew","abs.kurt")
write.table(t(as.matrix(output)),file=paste("c:/output",paste(mean,",",variance,",",ske wness,",",kurtosis,sep=""),".txt",sep="",collapse=""),row.names=F)
error=0
if (is.na(mean.est)==T|is.na(var.est)==T|is.na(skew.est)==T|is.na(kurt.est)==T){
error=T} else {
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
if (abs(mean.est-mean)>0.001 & abs(var.est-variance)>0.001 &
abs(skew.est-skewness)>0.001 & abs(kurt.est-3-kurtosis)>0.001 & is.na(entropy)==F) { error=T }}
n=size m=number
if (n==0){
m=0}
if (n>0){
if (m==0){
m=1 }}
if (error==T) list(error="improper solution") else {
if (n>0){
if (m>0){
data=matrix(0,n,m) for (i in 1:m){
pro=runif(n,min=0,max=sum(output[5:(points+4)])) for (j in 1:n){
for (k in 1:points){
if (k==1){
if (pro[j]<=output[5]) data[j,i]=1}
else
if (pro[j]>sum(output[5:(k+3)]) & pro[j]<=sum(output[5:(k+4)])) data[j,i]=k }}}
colnames(data)=paste("sample",1:m,sep="") write.table(data,file="C:/data.dat")
# save the data generated
list(output=output,data=data) }} else list(output=output) }}
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Appendix B the Code of the R Package for the MEP-2
MEP2=function(points,skewness,kurtosis,maxit=500,st=10, size=0,number=0){
par=rep(1/points,points-1) center=points/2+0.5 scale=-(1-center)
X=matrix(0,5,points) for (i in 1:nrow(X)){
X[i,]=rep(1,points)*((seq(1:points)-(points/2+0.5))/scale)^(i-1) }
MEP.P=function(par,points,skew,kurt){
p=1-sum(par)
if (all.equal(0,p)==T) p=0
P=c(as.matrix(par),p)
skew.est=(X[4,]%*%P-3*X[3,]%*%P*X[2,]%*%P+2*(X[2,]%*%P)^3)/(X[3,]%*%P -(X[2,]%*%P)^2)^(3/2)
kurt=kurt+3
kurt.est=(X[5,]%*%P-4*(X[4,]%*%P)*(X[2,]%*%P)+6*(X[3,]%*%P)*((X[2,]%*%
P)^2)-3*(X[2,]%*%P)^4)/((X[3,]%*%P-(X[2,]%*%P)^2)^2)
for (i in 1:points){
if (P[i]==0){P[i]=1}}
if (is.na(skew.est)==T) skew.est=100 if (is.na(kurt.est)==T) kurt.est=100
return(sum(log(P)*P)+abs(skew.est-skew)+abs(kurt.est-kurt)) }
for (j in 1:maxit){
p.est=optim(par,MEP.P,points=points,skew=skewness,kurt=kurtosis,control=list(ndep s=1e-4,maxit=10000,reltol=1e-20))$par
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
P=c(p.est,1-sum(p.est))
skew.est=(X[4,]%*%P-3*X[3,]%*%P*X[2,]%*%P+2*(X[2,]%*%P)^3)/(X[3,]%*%P -(X[2,]%*%P)^2)^(3/2)
kurt.est=(X[5,]%*%P-4*X[4,]%*%P*X[2,]%*%P+6*(X[3,]%*%P)*(X[2,]%*%P)^2 -3*(X[2,]%*%P)^4)/((X[3,]%*%P-(X[2,]%*%P)^2)^2)
for (i in 1:points){
if (P[i]==0){P[i]=1}}
entropy=sum(log(P)*P)
p=1-sum(p.est) if (p<0) p=0
if (abs(skew.est-skewness)<0.001 & abs(kurt.est-3-kurtosis)<0.001 &
is.na(entropy)==F) break else { if (j<=11){
par=p.est[1:points-1]+rnorm((points-1))/1000}
if (j>11 & j<=20){
par=p.est[1:points-1]+rnorm((points-1))/100}
if (j>20){
par=abs(rnorm(points-1)/3)}
for (b in 1:(points-1)){
if(par[b]<0) par[b]=0 if(par[b]>1) par[b]=1 }
par=par/sum(c(par,p)) }}
st_en=matrix(0,(st+1),(points+5)) st_en[1,1]=skewness
st_en[1,2]=kurtosis st_en[1,3:(points+2)]=P st_en[1,(points+3)]=skew.est st_en[1,(points+4)]=kurt.est-3
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
st_en[1,(points+5)]=entropy
for (k in 2:(st+1)){
for (l in 1:500){
par=abs(rnorm(points-1)/3)
for (b in 1:(points-1)){
if(par[b]>1) par[b]=1 }
par=par/sum(c(par,p))
p.est=optim(par,MEP.P,points=points,skew=skewness,kurt=kurtosis,control=list(ndep s=1e-4,maxit=10000,reltol=1e-20))$par
P=c(p.est,1-sum(p.est))
skew.est=(X[4,]%*%P-3*X[3,]%*%P*X[2,]%*%P+2*(X[2,]%*%P)^3)/(X[3,]%*%P -(X[2,]%*%P)^2)^(3/2)
kurt.est=(X[5,]%*%P-4*X[4,]%*%P*X[2,]%*%P+6*(X[3,]%*%P)*(X[2,]%*%P)^2 -3*(X[2,]%*%P)^4)/((X[3,]%*%P-(X[2,]%*%P)^2)^2)
for (i in 1:points){
if (P[i]==0){P[i]=1}}
entropy=sum(log(P)*P)
if (abs(skew.est-skewness)<0.001 & abs(kurt.est-3-kurtosis)<0.001 &
is.na(entropy)==F) break p=1-sum(p.est)
if (p<0) p=0 }
st_en[k,1]=skewness st_en[k,2]=kurtosis st_en[k,3:(points+2)]=P st_en[k,(points+3)]=skew.est
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
st_en[k,(points+4)]=kurt.est-3 st_en[k,(points+5)]=entropy }
colnames(st_en)=c("set.skew","set.kurt",paste("p",1:points,sep=""),"est.skew","est.ku rt","entropy")
diff=cbind((skewness-st_en[,(points+3)]),(kurtosis-st_en[,(points+4)])) group=abs(diff[,1])<0.001&abs(diff[,2])<0.001
st_en[,(points+5)]=st_en[,(points+5)]*-1
output=st_en[(order(group,st_en[,points+5],decreasing=T)[1]),]
names(output)=c("set.skew","set.kurt",paste("p",1:points,sep=""),"est.skew","est.kurt"
,"entropy")
write.table(t(as.matrix(output)),file=paste("c:/mep2/validation/output4/MEP2_v4/outp ut",paste(skewness,",",kurtosis,sep=""),".txt",sep="",collapse=""),row.names=F) write.table(st_en,file=paste("c:/mep2/validation/output4/MEP2_v4/diss_st",paste(ske wness,",",kurtosis,sep=""),".txt",sep="",collapse=""),row.names=F)
n=size m=number
if (n==0){
m=0 }
if (n>0){
if (m==0){
m=1 }}
if (n>0){
if (m>0){
data=matrix(0,n,m) for (i in 1:m){
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
pro=runif(n,min=0,max=sum(output[3:(points+4)])) for (j in 1:n){
for (k in 1:points){
if (k==1){
if (pro[j]<=output[3]) data[j,i]=1}
else
if (pro[j]>sum(output[3:(k+1)]) & pro[j]<=sum(output[3:(k+2)])) data[j,i]=k }}}
# generate variables of different sample size and number of samples with specified parameters
list(output=output,data=data) }} else list(output=output) }
‧ 國
立 政 治 大 學
‧
Na tiona
l Ch engchi University
Appendix C the Code of the R Package for the Boundary of Skewness and Kurtosis MEP.bon=function(skew){
MEP.P=function(par,skew){
P=c(par,1-par)
X=matrix(0,5,2) for (i in 1:nrow(X)){
X[i,]=rep(1,2)*(seq(1:2)^(i-1)) }
skew.est=(X[4,]%*%P-3*X[3,]%*%P*X[2,]%*%P+2*(X[2,]%*%P)^3)/(X[3,]%*%P -(X[2,]%*%P)^2)^(3/2)
for (i in 1:points){
if (P[i]==0){P[i]=1}}
return(sum(log(P)*P)+100*abs(skew.est-skew)) }
p1=optimize(MEP.P,interval=c(0,1),skew=skew)$minimum probability=c(p1,1-p1)
list(probability=probability) }