
The data used in the present study comprise 65 SNP markers genotyped in 890 individuals (514 cases and 376 controls). Table 1 summarizes the information for each marker. Its columns are:

# is the marker number.

Name is the marker ID.

Position is the marker position (in base pairs).

ObsHET is the marker’s observed heterozygosity.

PredHET is the marker’s predicted heterozygosity (i.e. 2*MAF*(1-MAF)).

HWpval is the Hardy-Weinberg equilibrium p value.

%Geno is the percentage of non-missing genotypes for this marker.

MAF is the minor allele frequency for this marker.

Alleles are the major and minor alleles for this marker.
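The column definitions above can be illustrated with a minimal sketch that derives them from the genotype counts of one biallelic marker. The function name, argument names, and the chi-square approximation for the Hardy-Weinberg p value are assumptions made for illustration; the software that produced Table 1 may use a different test (for example, an exact test).

```python
import math

def marker_summary(n_aa, n_ab, n_bb, n_missing=0):
    """Summarise one biallelic SNP from its genotype counts (hypothetical helper)."""
    n = n_aa + n_ab + n_bb                      # individuals with a called genotype
    q = (2 * n_bb + n_ab) / (2 * n)             # frequency of allele b
    maf = min(q, 1 - q)                         # MAF: frequency of the rarer allele
    obs_het = n_ab / n                          # ObsHET
    pred_het = 2 * maf * (1 - maf)              # PredHET = 2*MAF*(1-MAF)
    pct_geno = 100 * n / (n + n_missing)        # %Geno

    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = [n * (1 - q) ** 2, n * 2 * q * (1 - q), n * q ** 2]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Assumption: 1-df chi-square approximation; P(X > x) = erfc(sqrt(x / 2)).
    hw_pval = math.erfc(math.sqrt(chi2 / 2))

    return {"MAF": maf, "ObsHET": obs_het, "PredHET": pred_het,
            "HWpval": hw_pval, "%Geno": pct_geno}
```

A marker whose observed heterozygosity is close to its predicted heterozygosity yields a large HWpval, while a marked excess or deficit of heterozygotes yields a small one.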

Applying the criteria described in the data quality control section, we excluded 10 SNPs: rsNRG1_E_1, rsG72_8, rsG72_E_4, rsG72_E_3, rsDAO_3, rsDAO_E_1, rsDAO_E_2, rsDISC1_E_3, rsDISC1_34, and rsDISC1_5. All were excluded because their MAF was less than 0.01.

As described in the study design section, we used the Haploview software to define haplotype blocks. Figure 3 shows the pairwise LD plots and the defined blocks in the five genes; a deeper color indicates stronger LD. There are five blocks (each containing 2 SNPs) in DISC1, no block in NRG1, one block (containing 7 SNPs) in DAO, two blocks (one containing 3 SNPs and one containing 2 SNPs) in G72, and two blocks (each containing 2 SNPs) in CACNG2. Each block can be treated as a single variable. Of the 55 SNPs retained after quality control, 26 fall within the 10 blocks, so the haplotype-based data have 39 variables (10 blocks and 29 individual SNPs).
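As a rough illustration of what a pairwise LD plot encodes, the sketch below computes the standard measures D, D', and r² for two biallelic loci from haplotype and allele frequencies. The function and parameter names are hypothetical, and the block-definition rule applied by Haploview is not reproduced here.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b                        # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0   # |D| scaled to [0, 1]
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0         # squared allelic correlation
    return d, d_prime, r2
```

Two loci whose alleles always travel together give D' = 1 and r² = 1; independent loci give values of 0 for both.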

Our goal is to detect single-marker effects and two-way and three-way interactions. We applied the five methods to the genotype-based and haplotype-based data and ranked the associations. The five best models for single-marker effects, two-way interactions, and three-way interactions are shown in Tables 2 to 4.

Single marker effects. In the genotype-based single-marker analysis, the chi-square test, LRM, and BEAM identified SNP rsDAO_13 as the most significant marker. CART and MDR identified SNP rsDAO_7, which was ranked second by the chi-square test, LRM, and BEAM. In the haplotype-based data, all methods selected DAO_block1, which contains rsDAO_13 and rsDAO_7, as the best model. This suggests that DAO may be a gene significantly associated with schizophrenia.
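For readers unfamiliar with the single-marker chi-square test, the following is a minimal sketch of a Pearson chi-square test on a genotype-by-status table. The function name and the handling of empty genotype columns are assumptions; the sketch is not taken from the analysis code of this study.

```python
import math

def genotype_chi_square(case_counts, control_counts):
    """Pearson chi-square test on a 2 x k genotype-by-status table (k = 2 or 3)."""
    # Drop genotype columns that no individual carries.
    cols = [(a, b) for a, b in zip(case_counts, control_counts) if a + b > 0]
    n_case = sum(a for a, _ in cols)
    n_ctrl = sum(b for _, b in cols)
    total = n_case + n_ctrl

    chi2 = 0.0
    for a, b in cols:
        col_total = a + b
        for observed, row_total in ((a, n_case), (b, n_ctrl)):
            expected = row_total * col_total / total
            chi2 += (observed - expected) ** 2 / expected

    df = len(cols) - 1
    # Closed-form chi-square survival functions for 1 and 2 degrees of freedom.
    if df == 1:
        p_value = math.erfc(math.sqrt(chi2 / 2))
    elif df == 2:
        p_value = math.exp(-chi2 / 2)
    else:
        raise ValueError("expected two or three non-empty genotype columns")
    return chi2, df, p_value
```

Markers can then be ranked by p value, smallest first, which is one way to obtain a "most significant marker" ordering of the kind reported above.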

Two-way interaction. In the genotype-based two-way interaction analysis, the chi-square test, LRM, and CART again pointed to two-way interactions among SNPs in the DAO gene (rsDAO_6, rsDAO_7, and rsDAO_8), whereas BEAM and MDR did not detect them.

BEAM identified rsDISC1_E_7*rsDISC1_E_4 as the best two-way model, and MDR identified rsNRG1_14*rsG72_16. This difference may arise because the chi-square test, LRM, and CART require a significant main effect to be detected before interaction effects between factors are included. This is a major methodological limitation in situations where each marker has a relatively small main effect but a more substantial interactive effect.

In these situations, a haplotype-based analysis might provide more information. In the haplotype-based analysis, the chi-square test and LRM detected that G72_block2 (which contains rsG72_16) has interaction effects with other SNPs.

Three-way interaction. The markers detected in the three-way interaction analysis are shown in Table 4. Most of them, for example rsDAO_6, rsDAO_7, and rsG72_16, were also detected in the two-way interaction analysis. In the haplotype-based three-way interaction analysis, LRM ran into numerical difficulties in estimating the model parameters because the block variables have too many categories. Therefore, we do not report LRM three-way interaction results for the haplotype-based data.

Odds ratio. To understand the relationship between genotype (haplotype) and disease, we further calculated the odds ratio and its confidence interval for selected candidate models (rsDAO_13, rsDAO_7, DAO_block1, and rsDAO_6*rsDAO_7). The results are shown in Tables 5 to 8. The genotype (haplotype) with the minimum odds is taken as the reference group. If a genotype had zero cases or zero controls, we did not calculate its odds ratio. Genotype CC of rsDAO_13 shows a significant result, that is, the confidence interval of its odds ratio does not cover 1.

Likewise, genotype GA of rsDAO_7 shows a significant result. In the DAO_block1 model, some haplotypes also have significant odds ratios, and many haplotypes occur only in affected individuals. A similar pattern appears in rsDAO_6*rsDAO_7.
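The odds-ratio calculation can be sketched as follows. The text does not state which interval method was used, so the Wald (Woolf) interval on the log odds ratio is an assumption here, as are the function and parameter names.

```python
import math

def odds_ratio_ci(cases, controls, ref_cases, ref_controls, z=1.96):
    """Odds ratio of one genotype versus the reference genotype.

    Returns (odds_ratio, lower, upper), or None when any cell is empty,
    mirroring the rule of skipping genotypes with zero cases or controls.
    """
    if 0 in (cases, controls, ref_cases, ref_controls):
        return None
    odds_ratio = (cases * ref_controls) / (controls * ref_cases)
    # Assumption: Wald interval, i.e. log(OR) +/- z * SE with Woolf's standard error.
    se = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

def is_significant(result):
    """A result is significant when its confidence interval does not cover 1."""
    if result is None:
        return False
    _, lower, upper = result
    return lower > 1 or upper < 1
```

Because the reference group is the genotype with the minimum odds, every odds ratio computed this way is at least 1, and significance reduces to checking whether the lower bound exceeds 1.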

Cross validation. The cross-validation procedure yields 100 best models for each of the one-way, two-way, and three-way analyses under each method.

Using the prediction rule described earlier, we calculated the prediction error of each best model in each CV run and averaged the prediction errors across the 100 CV runs; the averages are shown in Table 9. Box plots of the prediction errors are displayed in Figures 4 to 6. For one-way models, BEAM shows the best predictive ability, although the box plots show no marked differences among the methods. For two-way models, the traditional approach LRM has the minimum prediction error, which is much smaller than those of the other methods, whereas BEAM appears to have the worst prediction and the largest variation. For three-way models, CART has the smallest prediction error, but none of the three methods performs well: their prediction errors are too close to 0.5, the error expected when predicting case-control status by flipping a coin. This may be because our data do not contain a three-way interaction. For CART and MDR, the prediction error goes up at the two-way level and goes down at the three-way level. Over-fitting might be the reason: adding false positives decreases predictive ability. It is worth noting that MDR has the smallest variation, which means that MDR is more stable than the other methods.
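The averaging of prediction errors over repeated cross-validation runs can be sketched as below. The prediction rule shown (label a genotype combination high-risk when its case-to-control ratio in the training data exceeds the overall ratio) is a hypothetical stand-in for the rule described earlier in the paper, and the random-split scheme, the test fraction, and all names are likewise assumptions.

```python
import random
from collections import Counter

def fit_rule(train):
    """Return the set of genotype keys labelled high-risk (hypothetical rule)."""
    cases = Counter(g for g, status in train if status == 1)
    controls = Counter(g for g, status in train if status == 0)
    n_case, n_ctrl = sum(cases.values()), sum(controls.values())
    # Cross-multiply instead of dividing so empty cells cause no division by zero.
    return {g for g in set(cases) | set(controls)
            if cases[g] * n_ctrl > controls[g] * n_case}

def prediction_error(high_risk, test):
    """Fraction of test individuals whose status is predicted incorrectly."""
    wrong = sum((g in high_risk) != (status == 1) for g, status in test)
    return wrong / len(test)

def mean_cv_error(data, n_splits=100, test_fraction=0.1, seed=0):
    """Average prediction error over repeated random train/test splits.

    data: list of (genotype_key, status) pairs, with status 1 = case, 0 = control.
    """
    rng = random.Random(seed)
    n_test = max(1, int(len(data) * test_fraction))
    errors = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        errors.append(prediction_error(fit_rule(train), test))
    return sum(errors) / len(errors), errors
```

The list of per-split errors returned alongside the mean is what a box plot of prediction error would summarize; a mean near 0.5 indicates predictions no better than chance.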
