
When confronted with high-throughput data, researchers often turn to dimension-reduction methods of one sort or another to ease the severe penalty associated with testing myriad variables.17-21 For our p-based method, however, dimensionality is not a curse but a blessing.

In this paper, we see that the power of the multiple perturbation test actually increases as the number of auxiliary variables increases. This ‘the-more-the-better’ principle also applies when one knows which variables may be perturbative. In Figure 4, the initial segment of the power curve (solid line) emulates a situation in which a researcher incorporates into the multiple perturbation test all 100 informative variables (I = 0.02) known to him/her. Since the power is only 0.59, should the researcher add more variables to the test? As expected, adding more variables unselectively (dotted lines from left to right, for I = 0.001, 0.00025 and 0.0001, respectively) at first only dilutes the power. However, as more and more of these low-informativity variables are added, the power rises again and eventually surpasses the original power.

However, it should be emphasized that the above p-based approach works only when the auxiliary variables have non-zero informativeness (I > 0, irrespective of how small it may be). A computer can easily generate millions or billions of random variables for us, but all these artificial data amount to nothing (I = 0, exactly). The more such variables are added, the more the power is curtailed. Another caveat is that there is no use in replicating the data at hand just to make the total number of auxiliary variables appear larger; the power simply will not budge with this maneuver.

Age-related macular degeneration is a progressive disease of the macula of the retina in which the pigment epithelium cells and the photoreceptor cells degenerate, causing gradual loss of central vision.22,23 With FDR controlled at 0.05, in this study we are able to identify two novel SNPs that are significantly associated with age-related macular degeneration. The first SNP (rs2618034) is located in an intron of the KCND3 gene (potassium voltage-gated channel, Shal-related subfamily, member 3) on chromosome 1p13.2, and the second (rs2014029) in an intron of the DTL gene (denticleless E3 ubiquitin protein ligase homolog (Drosophila)) on 1q32.3. The KCND3 gene encodes Kv4.3, which regulates neuronal excitability.24 Mutations in KCND3 have been identified as a cause of cerebellar neurodegeneration.25,26 In this regard, it is worth noting that retinal photoreceptor cells are a specialized type of neuron that may also degenerate with aging. Meanwhile, the DTL gene regulates p53 polyubiquitination and protein stability,27 and the evidence to date suggests that p53 is a key regulator of apoptosis in retinal pigment epithelium cells.28 Taken together, these findings support the possibility that KCND3 and DTL are causally related to the development of age-related macular degeneration.
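To make "FDR controlled at 0.05" concrete, the following is a minimal sketch of one common FDR procedure, the Benjamini-Hochberg step-up rule. This section does not state which FDR procedure was actually used, so Benjamini-Hochberg is shown only as a stand-in; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q.

    Stand-in illustration only; the study's own FDR procedure may differ.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * q.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    # Step-up rule: reject every hypothesis ranked at or below largest_k.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four markers (illustration only, not study data).
example_p_values = [0.5, 0.001, 0.03, 0.02]
print(benjamini_hochberg(example_p_values, q=0.05))
```

Because the rule is step-up, a marker whose own p-value exceeds its rank-specific threshold can still be declared significant if a larger p-value further down the ordered list meets its threshold.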

It is worth noting that the proposed p-based multiple perturbation test is indeed a very powerful test. The two significant SNPs (rs2618034 and rs2014029) that we identified in this study are only very weakly associated with age-related macular degeneration (odds ratios = 0.53 and 2.10, respectively), and the traditional n-based method (Pearson chi-square test) comes nowhere near detecting them (P-values = 0.201 and 0.166, respectively) (Table 3).

Even if we increase the total number of subjects from the present n = 146 (Klein et al’s data6) to n ≈ 25000 and n ≈ 77000 (Holliday et al’s7 and Fritsche et al’s8 meta-analysis data), the n-based method still cannot detect them. But this is not to say that the n-based method is useless. In fact, Klein et al6 themselves reported one SNP (rs380390) with an n-based P-value of 4.1 × 10⁻⁸ (significant after Bonferroni correction) that is undetectable with our method. It is important to note that the p-based test proposed in this paper is not meant to replace the traditional n-based test; rather, the two should work side by side, complementing each other. Finally, we wish to point out that by incorporating a priori knowledge into the analysis of Klein et al’s data,6 Lin and Lee29 were previously able to identify four more significant SNPs on chromosome 1 (rs800292, rs2019727, rs1329428 and rs1853882) using the traditional n-based test. The same principle can also be applied to the p-based multiple perturbation test in this paper to facilitate the detection of even more genes.
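For readers less familiar with the n-based quantities discussed above, the sketch below computes an odds ratio, a Pearson chi-square P-value, and a Bonferroni decision for a single marker. It assumes a 2 × 2 table (for example, allele by case/control status); the paper's actual coding of the SNPs is not given in this section. The function names, the counts, and the number of tests are all hypothetical and chosen only for illustration.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]] (rows: cases, controls)."""
    return (a * d) / (b * c)


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction) and its P-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_significant(p_value, n_tests, alpha=0.05):
    """True if p_value passes the Bonferroni-corrected threshold alpha / n_tests."""
    return p_value < alpha / n_tests


# Hypothetical counts (not study data): cases 10 vs 20, controls 20 vs 10.
a, b, c, d = 10, 20, 20, 10
chi2, p = pearson_chi_square_2x2(a, b, c, d)
print("odds ratio:", odds_ratio(a, b, c, d))
print("chi-square:", chi2, "P-value:", p)
# n_tests=1000 is an arbitrary illustrative number of markers.
print("Bonferroni significant:", bonferroni_significant(p, n_tests=1000))
```

The example illustrates why the n-based approach struggles in genome-wide settings: a P-value that looks convincing for a single marker can fall far short of the corrected threshold once the number of tests is taken into account.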

In this paper, we have successfully applied the multiple perturbation test to a genome-wide association study in which thousands of genomic markers serve as the auxiliary/perturbation variables. The method should have broad applications to other high-dimensional (large p) -omics studies, such as epigenomic, transcriptomic, proteomic, metabolomic, and exposomic studies. It would be even better to have a cross-omics study, and/or one with all its study subjects further linked to existing government or private-sector databases, such as data on health insurance, traffic violations, and internet usage. A researcher conducting such a data-mining study has the potential to push p (the number of auxiliary/perturbation variables) to the millions, billions or even trillions, and be rewarded with very high power for detecting a weak association. Such a p-based method may set the stage for a new paradigm of statistical hypothesis testing.


Figure 4. Power curve when a researcher includes the 100 informative variables (I = 0.02) known to him/her (solid line) and then adds other low-informativity variables (dotted lines from left to right, for I = 0.001, 0.00025 and 0.0001, respectively) unselectively into the multiple perturbation test.


REFERENCES

1. Siontis, G.C., and Ioannidis, J.P. (2011). Risk factors and interventions with statistically significant tiny effects. Int J Epidemiol 40, 1292-1307.

2. Grontved, A., and Hu, F.B. (2011). Television viewing and risk of type 2 diabetes, cardiovascular disease, and all-cause mortality: a meta-analysis. JAMA 305, 2448-2455.

3. Hemila, H., and Chalker, E. (2013). Vitamin C for preventing and treating the common cold. Cochrane Database Syst Rev 1, CD000980.

4. Ioannidis, J.P., Trikalinos, T.A., and Khoury, M.J. (2006). Implications of small effect sizes of individual genetic variants on the design and interpretation of genetic association studies of complex diseases. Am J Epidemiol 164, 609-614.

5. Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P., Collins, F.S., and Manolio, T.A. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A 106, 9362-9367.

6. Klein, R.J., Zeiss, C., Chew, E.Y., Tsai, J.Y., Sackler, R.S., Haynes, C., Henning, A.K., SanGiovanni, J.P., Mane, S.M., Mayne, S.T., et al. (2005). Complement factor H polymorphism in age-related macular degeneration. Science 308, 385-389.

7. Holliday, E.G., Smith, A.V., Cornes, B.K., Buitendijk, G.H., Jensen, R.A., Sim, X., Aspelund, T., Aung, T., Baird, P.N., Boerwinkle, E., et al. (2013). Insights into the genetic architecture of early stage age-related macular degeneration: a genome-wide association study meta-analysis. PLoS One 8, e53830.

8. Fritsche, L.G., Chen, W., Schu, M., Yaspan, B.L., Yu, Y., Thorleifsson, G., Zack, D.J., Arakawa, S., Cipriani, V., Ripke, S., et al. (2013). Seven new loci associated with age-related macular degeneration. Nat Genet 45, 433-439, 439e431-432.


9. Wellcome Trust Case Control Consortium (2007). Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661-678.

10. Ollier, W., Sprosen, T., and Peakman, T. (2005). UK Biobank: from concept to reality. Pharmacogenomics 6, 639-646.

11. Chen, Z., Chen, J., Collins, R., Guo, Y., Peto, R., Wu, F., Li, L., and China Kadoorie Biobank collaborative group (2011). China Kadoorie Biobank of 0.5 million people: survey methods, baseline characteristics and long-term follow-up. Int J Epidemiol 40, 1652-1666.

12. Chapman, K., Ferreira, T., Morris, A., Asimit, J., and Zeggini, E. (2011). Defining the power limits of genome-wide association scan meta-analyses. Genet Epidemiol 35, 781-789.

13. Buzkova, P., Lumley, T., and Rice, K. (2011). Permutation and parametric bootstrap tests for gene-gene and gene-environment interactions. Ann Hum Genet 75, 36-45.

14. Lim, L.S., Mitchell, P., Seddon, J.M., Holz, F.G., and Wong, T.Y. (2012). Age-related macular degeneration. Lancet 379, 1728-1738.

15. Gorin, M.B. (2012). Genetic insights into age-related macular degeneration: controversies addressing risk, causality, and therapeutics. Mol Aspects Med 33, 467-486.

16. Storey, J.D., and Tibshirani, R. (2003). Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100, 9440-9445.

17. Chatterjee, N., Kalaylioglu, Z., Moslehi, R., Peters, U., and Wacholder, S. (2006). Powerful multilocus tests of genetic association in the presence of gene-gene and gene-environment interactions. Am J Hum Genet 79, 1002-1016.

18. Gauderman, W.J., Murcray, C., Gilliland, F., and Conti, D.V. (2007). Testing association between disease and multiple SNPs in a candidate gene. Genet Epidemiol 31, 383-395.

19. Wang, T., Ho, G., Ye, K., Strickler, H., and Elston, R.C. (2009). A partial least-square approach for modeling gene-gene and gene-environment interactions when multiple markers are genotyped. Genet Epidemiol 33, 6-15.

20. Pan, W. (2009). Asymptotic tests of association with multiple SNPs in linkage disequilibrium. Genet Epidemiol 33, 497-507.

21. Pan, W. (2010). Statistical tests of genetic association in the presence of gene-gene and gene-environment interactions. Hum Hered 69, 131-142.

22. Bhutto, I., and Lutty, G. (2012). Understanding age-related macular degeneration (AMD): relationships between the photoreceptor/retinal pigment epithelium/Bruch's membrane/choriocapillaris complex. Mol Aspects Med 33, 295-317.

23. Ambati, J., and Fowler, B.J. (2012). Mechanisms of age-related macular degeneration. Neuron 75, 26-39.

24. Tsaur, M.L., Chou, C.C., Shih, Y.H., and Wang, H.L. (1997). Cloning, expression and CNS distribution of Kv4.3, an A-type K+ channel alpha subunit. FEBS Lett 400, 215-220.

25. Lee, Y.C., Durr, A., Majczenko, K., Huang, Y.H., Liu, Y.C., Lien, C.C., Tsai, P.C., Ichikawa, Y., Goto, J., Monin, M.L., et al. (2012). Mutations in KCND3 cause spinocerebellar ataxia type 22. Ann Neurol 72, 859-869.

26. Duarri, A., Jezierska, J., Fokkens, M., Meijer, M., Schelhaas, H.J., den Dunnen, W.F., van Dijk, F., Verschuuren-Bemelmans, C., Hageman, G., van de Vlies, P., et al. (2012). Mutations in potassium channel KCND3 cause spinocerebellar ataxia type 19. Ann Neurol 72, 870-880.

27. Banks, D., Wu, M., Higa, L.A., Gavrilova, N., Quan, J., Ye, T., Kobayashi, R., Sun, H., and Zhang, H. (2006). L2DTL/CDT2 and PCNA interact with p53 and regulate p53 polyubiquitination and protein stability through MDM2 and CUL4A/DDB1 complexes. Cell Cycle 5, 1719-1729.


28. Bhattacharya, S., Chaum, E., Johnson, D.A., and Johnson, L.R. (2012). Age-related susceptibility to apoptosis in human retinal pigment epithelial cells is triggered by disruption of p53-Mdm2 association. Invest Ophthalmol Vis Sci 53, 8350-8366.

29. Lin, W.Y., and Lee, W.C. (2010). Incorporating prior knowledge to facilitate discoveries in a genome-wide association study on age-related macular degeneration. BMC Res Notes 3, 26.

