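Rice microarray data analysis - Application of the partial coefficient of intrinsic dependence (pCID) to gene selection and gene regulatory network construction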

The second dataset was used to study the bHLH (basic helix–loop–helix) pathway in rice (Oryza sativa). The expression data were downloaded from the NCBI-GEO database [http://www.ncbi.nlm.nih.gov/gds] (accession numbers GSE6901 and GSE14275). The GSE6901 dataset contains gene expression profiles of 7-day-old light-grown rice seedlings under drought, salt and cold stresses (9 samples, three biological replicates per stress) together with 3 samples from the corresponding control condition. The GSE14275 dataset contains gene expression profiles of 14-day-old light-grown rice seedlings under heat shock stress (3 samples) together with 3 samples from the corresponding control condition. In both datasets the RNA samples were hybridized to Affymetrix microarrays (NCBI-GEO accession number GPL2025). The raw expression data of 51,279 probes from the 18 samples were likewise pre-processed with the RMA method and log2-transformed. In this study, we were interested in the 167 genes previously reported to be involved in the bHLH pathway (Li et al., 2006). By matching the annotations of the Affymetrix probe IDs, we identified 128 bHLH-related probes on the microarray (Table B.1). Among them, 72 probes (61 genes) were classified as G-box binders, meaning that they recognize and bind to the G-box sequence (5'-CACGTG-3'), according to Li et al. (2006). We also downloaded the sequences of the bHLH-related genes on the microarray from RAP-DB (version 7.0) and found 104 probes (80 genes) containing G-box sequences in their promoter regions. The 72 probes that recognize the G-box sequence and the 104 probes that contain G-box sequences were designated as the source genes and the candidate target genes, respectively, for constructing the bHLH gene network. In addition, matching the 72 probe IDs against the 104 probe IDs showed that 54 probes (45 genes) served as both source and candidate target genes.
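The designation of source and candidate target probes described above can be sketched in a few lines. This is only an illustration under stated assumptions: the probe IDs and promoter sequences are made up, and the helper names `has_g_box` and `classify_probes` are hypothetical rather than part of the original analysis. Because CACGTG is its own reverse complement, scanning one strand is sufficient.

```python
# Toy illustration: flag probes whose promoter contains the G-box motif.
G_BOX = "CACGTG"  # palindromic, so scanning one strand is enough


def has_g_box(promoter: str) -> bool:
    """Return True if the promoter sequence contains at least one G-box."""
    return G_BOX in promoter.upper()


def classify_probes(binders, promoters):
    """Split probes into source, candidate-target and dual-role sets.

    binders   -- set of probe IDs annotated as G-box binders
    promoters -- dict mapping probe ID -> promoter sequence
    """
    sources = set(binders)
    targets = {pid for pid, seq in promoters.items() if has_g_box(seq)}
    both = sources & targets  # probes acting as source and candidate target
    return sources, targets, both


# Hypothetical probe IDs and made-up sequences, for illustration only.
binders = {"probe_a", "probe_b"}
promoters = {
    "probe_a": "ttgacCACGTGaat",  # binder with a G-box in its promoter
    "probe_b": "ttgacaaaaaaaat",  # binder without a G-box
    "probe_c": "ggCACGTGccatta",  # non-binder with a G-box
}
sources, targets, both = classify_probes(binders, promoters)
print(sorted(both))
```

In the real data the same intersection step is what yields the 54 probes that appear in both the 72-probe and the 104-probe lists.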

The bHLH family of transcription factors plays a principal role in plant developmental processes (Buck and Atchley, 2003). Abiotic stresses affect the growth of crops.

To date, the functions of the OsbHLH (Oryza sativa bHLH) transcription factors have not been fully characterized. In this study, we explored the relationships among OsbHLH gene expression levels under abiotic stresses using CID/pCID; the resulting bHLH gene network is shown in Figure 4.10. The arrows indicate associations between two OsbHLH probes identified by CID/pCID. Rectangle nodes are OsbHLH probes that are G-box binders but do not contain G-box sequences. Ellipse nodes are OsbHLH probes that contain G-box sequences but are not G-box binders. Octagon nodes are G-box binders that also contain G-box sequences. The gray nodes represent genes whose responses to particular stresses have been verified in rice studies. OsbHLH001 (OsICE2) and OsbHLH002 (OsICE1) are induced at the protein level in response to cold and salt stresses, but are not affected by cold stress at the mRNA level (Nakamura et al., 2011). OsbHLH006 (RERJ1) was shown to be up-regulated under drought stress (Kiribuchi et al., 2005; Miyamoto et al., 2013); OsbHLH009 (OsMYC) corresponds to Arabidopsis AtMYC2 (Zhu et al., 2005), and AtMYC2 can induce gene expression under drought stress (Abe et al., 1997); OsbHLH062 (OsbHLH1) may enhance cold tolerance (Wang et al., 2003); OsbHLH148 was induced by salt stress and activated under cold stress (Seo et al., 2011); OsbHLH152 (OsPIL1) can reduce internode elongation under drought stress (Todaka et al., 2012). In addition, OsbHLH001, OsbHLH002 and OsbHLH003 are annotated with the GO term "response to stress" (GO:0006950) in agriGO (GO Analysis Toolkit and Database for Agricultural Community). In Figure 4.10, OsbHLH009 and OsbHLH148 are each connected to the downstream gene OsbHLH006. Furthermore, OsbHLH006, OsbHLH009 and OsbHLH148 are all important in drought stress.

In addition, OsbHLH010, OsbHLH024-1 (Os.10316.1.S1_at), OsbHLH024-2 (Os.26054.1.S1_s_at), OsbHLH025-1 (Os.32770.1.S1_x_at), OsbHLH031, OsbHLH032, OsbHLH033-2 (Os.8796.2.S1_a_at), OsbHLH044, OsbHLH058, OsbHLH060, OsbHLH061, OsbHLH088, OsbHLH093, OsbHLH104-1 (Os.15089.1.S1_at) and OsbHLH104-2 (Os.44516.1.S1_x_at) might play key roles in abiotic stress responses because they have many connections among themselves and with the other OsbHLH probes.

Figure 4.10: The gene regulatory network of the OsbHLH probes that are G-box binders or contain G-box sequences in rice seedlings under abiotic stresses, constructed by the CID/pCID method from the NCBI-GEO data. Each node is labeled with its OsbHLH number; for example, 152 denotes OsbHLH152. An arrow between two nodes indicates a connection determined by CID/pCID. Gray nodes are genes whose relation to abiotic stresses has been confirmed in the literature or by a GO term. Rectangle nodes are OsbHLH probes that are G-box binders but do not contain G-box sequences. Ellipse nodes are OsbHLH probes that contain G-box sequences but are not G-box binders. Octagon nodes are G-box binders that also contain G-box sequences.

4.5 Discussion

To reduce the computational burden, some irrelevant candidate target genes were eliminated in the first step of our proposed heuristic approach and were not carried into the subsequent steps. For comparison, we also applied the same approach without eliminating the irrelevant genes when selecting the subsequent genes for constructing the network.

To compare the results of these two procedures, we used the same 100 simulations of the pseudo network for sample sizes N = 25, 50 and 100. Consider a particular simulation with N = 50, the same one used in Table 4.1; the CID and pCID values and their p-values are shown in Table 4.3. Starting from the source node A11, the first selected node is A22, and the direction is set from A11 to A22. Proceeding through the remaining steps, the resulting connections are A11 → A21, A21 → A31, A21 → A32 and A21 → B. Starting next from the other source node, B, all CID values are insignificant at the first step of GRN inference, so B is isolated from the other nodes. Hence, the resulting network differs from the pseudo network in Figure 4.4: we obtain an additional connection, A21 → B, which is not expected.
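The selection rule quoted in the caption of Figure 4.12 (smallest significant p-value, ties broken by the largest CID/pCID value) can be written as a short sketch. The function name `pick_next_node`, the strict `p < alpha` comparison and the dictionary layout are assumptions made for illustration; only the numbers are taken from Table 4.3.

```python
ALPHA = 0.05  # significance level (the text reports tests at alpha = 0.05)


def pick_next_node(candidates, alpha=ALPHA):
    """Pick the next node from {name: (estimate, p_value)}.

    Rule: among significant candidates, take the smallest p-value;
    break ties by the largest CID/pCID estimate. Returns None if no
    candidate is significant.
    """
    # Assumption: "significant" means p-value strictly below alpha.
    significant = {k: v for k, v in candidates.items() if v[1] < alpha}
    if not significant:
        return None
    # Sort key: p-value ascending, then estimate descending.
    return min(significant, key=lambda k: (significant[k][1], -significant[k][0]))


# First step from source A11, estimates and p-values copied from Table 4.3.
first_step = {
    "A21": (0.1936, 0.0010),
    "A22": (0.2028, 0.0010),
    "A31": (0.1612, 0.0010),
    "A32": (0.1281, 0.0010),
    "B": (0.0129, 0.4136),
}
print(pick_next_node(first_step))
```

With these inputs four candidates share the smallest p-value, so the tie-break on the estimate decides the choice, which agrees with A22 being the first selected node in the text.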

We also collected all networks reconstructed with A11 as the source node in the simulations for N = 25, 50 and 100; networks consisting of the same set of nodes were grouped together, and the groups that occur at least 5 times are shown in Figure 4.11. Among the one hundred simulations, fifteen resulting networks match the correct network structure for N = 25, thirty-eight for N = 50 and forty-seven for N = 100. However, these proportions of correct networks are mostly lower than those obtained with our proposed heuristic approach in Figure 4.5, because the new approach may add connections beyond the complete network. For N = 50 and 100, 23% and 39% of the simulations, respectively, have additional connections with the negative-control node B. Partial networks also occur. For N = 25, 47% of the simulations reveal only a partial network; with a larger sample (N = 50), only 8 simulations yield a partial network; and no partial network occurs for N = 100.
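The grouping step used for Figure 4.11 (networks with the same node set are pooled, and only groups seen at least 5 times are reported) might be tallied as in the following sketch. The function name `group_networks` and the list of reconstructed networks are hypothetical; the counts below are toy values, not results from the simulations.

```python
from collections import Counter


def group_networks(networks, min_count=5):
    """Count reconstructed networks by their node set.

    networks  -- list of edge lists, each edge a (source, target) tuple
    min_count -- keep only groups seen at least this many times
    """
    counts = Counter(
        frozenset(node for edge in edges for node in edge) for edges in networks
    )
    return {nodes: n for nodes, n in counts.items() if n >= min_count}


# Made-up reconstruction results, for illustration only.
full = [("A11", "A21"), ("A11", "A22"), ("A21", "A31"), ("A21", "A32")]
partial = [("A11", "A21"), ("A21", "A31")]
runs = [full] * 6 + [partial] * 5 + [full + [("A21", "B")]] * 2

for nodes, n in group_networks(runs).items():
    print(sorted(nodes), n)
```

Grouping by node set rather than by edge list means that two networks with the same genes but different edge directions fall into the same group, which is consistent with the reversed-direction counts being reported separately in brackets in Figure 4.11.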

Table 4.3: The estimated CID and pCID values in one of the 100 simulations with sample size N = 50.

CID/pCID                    Estimate (p-value)    CID/pCID                    Estimate (p-value)
CID(A11|A21)                0.1936 (0.0010)
CID(A11|A22)                0.2028 (0.0010)       CID(A22|A11)                0.1791 (0.0010)
CID(A11|A31)                0.1612 (0.0010)
CID(A11|A32)                0.1281 (0.0010)
CID(A11|B)                  0.0129 (0.4136)
pCID(A11|A21;A22)           0.1013 (0.0010)       pCID(A21|A11;A22)           0.0934 (0.0010)
pCID(A11|A31;A22)           0.0639 (0.0020)
pCID(A21|A31;A11,A22)       0.1131 (0.0010)       pCID(A31|A21;A11,A22)       0.1123 (0.0010)
pCID(A21|A32;A11,A22)       0.0929 (0.0010)
pCID(A21|A32;A11,A22,A31)   0.0553 (0.0020)       pCID(A32|A21;A11,A22,A31)   0.0576 (0.0350)
pCID(A21|B;A11,A22,A31)     0.0073 (0.6424)

Figure 4.11: The networks reconstructed from 100 simulations of the pseudo network for N = 25, 50 and 100, respectively. The number next to each arrow is the number of connections from the source node to the target node; the number in brackets is the number of connections in the inverse direction.

Figure 4.12: Pseudo network for the simulation study based on the following procedure: select the connected node with the minimum significant CID/pCID p-value; if at least two nodes meet this criterion, choose the node with the maximum CID/pCID value. (A) The numbers next to the arrows are the proportions of samples in which the expression of the target node is actually determined by the expression of the source node. (B), (C) and (D) combine all connections from the 100 simulations when the source node T₀ is A11, for N = 25, 50 and 100, respectively.

In Figure 4.12, we combine all the correct connections between two nodes from the 100 simulations for N = 25, 50 and 100. When N = 25 and the source node is A11, 88% of the networks connect (A11, A21), 92% connect (A21, A31), 57% connect (A11, A22) and 44% connect (A21, A32); 14% of the networks include the negative-control node B (Figure 4.12 B). When N = 50, 97%, 98%, 82% and 85% of the networks contain the edges (A11, A21), (A21, A31), (A11, A22) and (A21, A32), respectively, while 46% of them include node B (Figure 4.12 C). When N = 100, 99%, 100%, 97% and 94% of the networks contain the edges (A11, A21), (A21, A31), (A11, A22) and (A21, A32), respectively, while 48% of them include node B (Figure 4.12 D). The proportions of networks containing each correct edge are similar to those in Figure 4.6. However, the proportions of networks that include node B are larger than those of our proposed approach and increase with the sample size. With B as the source node, 16% (Figure 4.12 B), 21% (Figure 4.12 C) and 26% (Figure 4.12 D) of the networks are significantly built at α = 0.05. Every group of false networks starting from B that shares the same combination of nodes appears at most five times in the 100 simulations for N = 25, 50 and 100. Therefore, our proposed heuristic approach, which eliminates some irrelevant nodes in the first step based on CID, is more accurate.

Chapter 5 Conclusions

We have proposed a strategy that uses the CID together with the pCID to select explanatory variables relevant to the target variable without interference from other essential variables. The proposed method is more sensitive to curvilinearity and more specific to linearity than the PCC/pPCC method. The simulations also demonstrate that the proposed procedure can quantify various types of association in a stepwise manner, and that it has the potential to index different levels of curvilinearity. When applied to real microarray data, the CID/pCID procedure not only identified cold-responsive genes but also captured sample-specific gene-gene interactions. Biologists may find the proposed strategy useful for extracting meaningful relationships among genes from noisy data, now that meta-analysis is of great interest in the post-genomic era.

In addition, we have extended the CID/pCID method to construct gene regulatory networks. In the simulation study, the proposed heuristic approach reconstructs the network more accurately as the sample size increases. When inferring a known gene regulatory network from gene expression data, we observed that the CID/pCID procedure recovers a more consistent pathway if the source gene is an upstream gene supported by biological evidence. When inferring an unknown gene regulatory network, the procedure supplies not only some notable genes but also a new network. Biologists can then verify the gene-gene interactions experimentally and explore their biological properties.

References

Abe, H., Yamaguchi-Shinozaki, K., Urao, T., Iwasaki, T., Hosokawa, D., and Shinozaki, K. (1997), ”Role of Arabidopsis MYC and MYB homologs in drought- and abscisic acid-regulated gene expression.” The Plant Cell, 9: 1859-1868.

Akhtar, M., Jaiswal, A., Taj, G., Jaiswal, J. P., Qureshi, M. I., and Singh, N. K. (2012), ”DREB1/CBF transcription factors: their structure, function and role in abiotic stress tolerance in plants.” Journal of Genetics, 91: 385-395.

Baba, K., Shibata, R., and Sibuya, M. (2004), ”Partial correlation and conditional correlation as measures of conditional independence.” Australian and New Zealand Journal of Statistics, 46(4): 657-664.

Benjamini, Y., and Hochberg, Y. (1995), ”Controlling the false discovery rate: a practical and powerful approach to multiple testing.” Journal of the Royal Statistical Society. Series B. Methodological, 57: 289-300.

Buck, M.J., and Atchley, W.R. (2003), ”Phylogenetic analysis of plant basic helix-loop-helix proteins.” Journal of Molecular Evolution, 56: 742-750.

Du, Z., Zhou, X., Ling, Y., Zhang, Z., and Su, Z. (2010), ”agriGO: a GO analysis toolkit for the agricultural community.” Nucleic Acids Research, 38: W64-W70.

Faith, J.J., Hayete, B., Thaden, J.T., Mogno, I., Wierzbowski, J., Cottarel, G., Kasif, S., Collins, J.J., and Gardner, T.S. (2007), ”Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles.” PLOS Biology, 5(1): 54-66.

Fowler, S., and Thomashow, M. F. (2002), ”Arabidopsis transcriptome profiling indicates that multiple regulatory pathways are activated during cold acclimation in addition to the CBF cold response pathway.” The Plant Cell, 14: 1675-1690.

Friedman, J.H. (1991), ”Multivariate adaptive regression splines.” The Annals of Statistics, 19: 1-67.

Fuente, A. de la, Bing, N., Hoeschele, I., and Mendes, P. (2004), ”Discovery of meaningful associations in genomic data using partial correlation coefficients.” Bioinformatics, 20(18): 3565-3574.

Gentleman, R. C., Carey, V. J., Bates, D. M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., Hornik, K., Hothorn, T., Huber, W., Iacus, S., Irizarry, R., Leisch, F., Li, C., Maechler, M., Rossini, A. J., Sawitzki, G., Smith, C., Smyth, G., Tierney, L., Yang, J. Y., and Zhang, J. (2004), ”Bioconductor: open software development for computational biology and bioinformatics.” Genome Biology, 5: R80.

Gilmour, S.J., Fowler, S.G., and Thomashow, M.F. (2004), ”Arabidopsis transcriptional activators CBF1, CBF2, and CBF3 have matching functional activities.” Plant Molecular Biology, 54(5): 767-781.

Hayfield, T., and Racine, J. S. (2008), ”Nonparametric econometrics: the np package.” Journal of Statistical Software, 27(5).

Hsing, T., Liu, L.Y., Brun, M., and Dougherty, E.R. (2005), ”The coefficient of intrinsic dependence (feature selection using el CID).” Pattern Recognition, 38(5): 623-636.

Huala, E., Dickerman, A. W., Garcia-Hernandez, M., Weems, D., Reiser, L., LaFond, F., Hanley, D., Kiphart, D., Zhuang, M., Huang, W., Mueller, L. A., Bhattacharyya, D., Bhaya, D., Sobral, B.W., Beavis, W., Meinke, D.W., Town, C. D., Somerville, C., and Rhee, S. Y. (2001), ”The Arabidopsis information resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant.” Nucleic Acids Research, 29: 102-105.

Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U., and Speed, T. P. (2003), ”Exploration, normalization, and summaries of high density oligonucleotide array probe level data.” Biostatistics, 4: 249-264.

Jain, M. (2012), ”Next-generation sequencing technologies for gene expression profiling in plants.” Briefings in Functional Genomics, 11(1): 63-70.

Kim, S. (2012), ”ppcor: partial and semi-partial (part) correlation.” http://CRAN.R-project.org/package=ppcor.

Kiribuchi, K., Jikumaru, Y., Kaku, H., Minami, E., Hasegawa, M., Kodama, O., Seto, H., Okada, K., Nojiri, H., and Yamane, H. (2005), ”Involvement of the basic helix-loop-helix transcription factor RERJ1 in wounding and drought stress responses in rice plants.” Biosci. Biotechnol. Biochem., 69(5): 1042-1044.

Krouk, G., Lingeman, J., Colon, A.M., Coruzzi, G., and Shasha, D. (2013), ”Gene regulatory networks in plants: learning causality from time and perturbation.” Genome Biology, 14: 123.

Lee, B.-h., Henderson, D. A., and Zhu, J.-K. (2005), ”The Arabidopsis cold-responsive transcriptome and its regulation by ICE1.” The Plant Cell, 17: 3155-3175.

Li, X., Duan, X., Jiang, H., Sun, Y., Tang, Y., Yuan, Z., Guo, J., Liang, W., Chen, L., Yin, J., Ma, H., Wang, J., and Zhang, D. (2006), ”Genome-wide analysis of basic/helix-loop-helix transcription factor family in rice and Arabidopsis.” Plant Physiology, 141(4): 1167-1184.

Liu, L.Y.D. (2005), Coefficient of intrinsic dependence: a new measure of association, Ph.D. Dissertation, Texas A&M University: College Station, Texas, USA.

Liu, L.Y.D., Chang, L.Y., Kuo, W.H., Hwa, H.L., Shyu, M.K., Chang, K.J., and Hsieh, F.J. (2012), ”In silico prediction for regulation of transcription factors on their shared target genes indicates relevant clinical implications in a breast cancer population.” Cancer Informatics, 11: 113-137.

Liu, L.Y.D., Chen, C.Y., Chen, M.J.M., Tsai, M.S., Lee, C.H.S., Phang, T.L., Chang, L.Y., Kuo, W.H., Hwa, H.L., Lien, H.C., Jung, S.M., Lin, Y.S., Chang, J.K., and Hsieh, F.J. (2009), ”Statistical identification of gene association by CID in application of constructing ER regulatory network.” BMC Bioinformatics, 10: 85.

Liu, Q., Kasuga, M., Sakuma, Y., Abe, H., Miura, S., Yamaguchi-Shinozaki, K., and Shinozaki, K. (1998), ”Two transcription factors, DREB1 and DREB2, with an EREBP/AP2 DNA binding domain separate two cellular signal transduction pathways in drought- and low-temperature-responsive gene expression, respectively, in Arabidopsis.” The Plant Cell, 10(8): 1391-1406.

Mardis, E.R. (2008), ”Next-generation DNA sequencing methods.” Annual Review of Genomics and Human Genetics, 9: 387-402.

McKhann, H.I., Gery, C., Bérard, A., Lévêque, S., Zuther, E., Hincha, D.K., Mita, S.D., Brunel, D., and Téoulé, E. (2008), ”Natural variation in CBF gene sequence, gene expression and freezing tolerance in the Versailles core collection of Arabidopsis thaliana.” BMC Plant Biology, 8(1): 105.

Miyamoto, K., Shimizu, T., Mochizuki, S., Nishizawa, Y., Minami, E., Nojiri, H., Yamane, H., and Okada, K. (2013), ”Stress-induced expression of the transcription factor RERJ1 is tightly regulated in response to jasmonic acid accumulation in rice.” Protoplasma, 250(1): 241-249.

Nakamura, J., Yuasa, T., Huong, T.T., Harano, K., Tanaka, S., Iwata, T., Phan, T., and Iwaya-Inoue, M. (2011), ”Rice homologs of inducer of CBF expression (OsICE) are involved in cold acclimation.” Plant Biotechnology, 28(3): 303-309.

Priness, I., Maimon, O., and Ben-Gal, I. (2007), ”Evaluation of gene-expression clustering via mutual information distance measure.” BMC Bioinformatics, 8:111.

Sakuma, Y., Maruyama, K., Osakabe, Y., Qin, F., Seki, M., Shinozaki, K., and Yamaguchi-Shinozaki, K. (2006), ”Functional analysis of an Arabidopsis transcription factor, DREB2A, involved in drought-responsive gene expression.” Plant Cell, 18: 1292-1309.

Schadt, E.E., Lamb, J., Yang, X., Zhu, J., Edwards, S., Guhathakurta, D., Sieberts, S.K., Monks, S., Reitman, M., Zhang, C., Lum, P.Y., Leonardson, A., Thieringer, R., Metzger, J.M., Yang, L., Castle, J., Zhu, H., Kash, S.F., Drake, T.A., Sachs, A., and Lusis, A.J. (2005), ”An integrative genomics approach to infer causal associations between gene expression and disease.” Nature Genetics, 37: 710-717.

Seo, J.S., Joo, J., Kim, M.J., Kim, Y.K., Nahm, B.H., Song, S.I., Cheong, J.J., Lee, J.S., Kim, J.K., and Choi, Y.D. (2011), ”OsbHLH148, a basic helix-loop-helix protein, interacts with OsJAZ proteins in a jasmonate signaling pathway leading to drought tolerance in rice.” Plant J., 65(6): 907-921.

Shrinet, J., Jain, S., Jain, J., Bhatnagar, R.K., and Sunil, S. (2014), ”Next generation sequencing reveals regulation of distinct Aedes microRNAs during chikungunya virus development.” PLoS Neglected Tropical Diseases, 8(1): e2616.

Suh, E.B., Dougherty, E.R., Kim, S., Bittner, M.L., Chen, Y., Russ, D.E., and Martino, R.L. (2003), ”Parallel computation and visualization tools for codetermination analysis of multivariate gene expression relations.” Computational and Statistical Approaches to Genomics, 227-240.

Thomashow, M.F., Gilmour, S.J., Stockinger, E.J., Jaglo-Ottosen, K.R., and Zarka, D.G. (2001), ”Role of the Arabidopsis CBF transcriptional activators in cold acclimation.” Physiologia Plantarum, 112(2): 171-175.

Tittarelli, A., Santiago, M., Morales, A., Meisel, L. A., and Silva, H. (2009), ”Isolation and functional characterization of cold-regulated promoters, by digitally identifying peach fruit cold-induced genes from a large EST dataset.” BMC Plant Biology, 9: 121.

Todaka, D., Nakashima, K., Maruyama, K., Kidokoro, S., Osakabe, Y., Ito, Y., Matsukura, S., Fujita, Y., Yoshiwara, K., Ohme-Takagi, M., Kojima, M., Sakakibara, H., Shinozaki, K., and Yamaguchi-Shinozaki, K. (2012), ”Rice phytochrome-interacting factor-like protein OsPIL1 functions as a key regulator of internode elongation and induces a morphological response to drought stress.” Proceedings of the National Academy of Sciences, 109(39): 15947-15952.

Tsai, C.A., and Liu, L.Y.D. (2013), ”Identifying gene set association enrichment using the coefficient of intrinsic dependence.” PLoS One, 8(3): e58851.

Wang, Y., Zhang, Z., He, X., Zhou, H., Wen, Y., Dai, J., Zhang, J., and Chen, S. (2003), ”A rice transcription factor OsbHLH1 is involved in cold stress response.” Theoretical and Applied Genetics, 107: 1402-1409.

Wettenhall, J. M., Simpson, K. M., Satterley, K., and Smyth, G. K. (2006), ”affylmGUI: a graphical user interface for linear modeling of single channel microarray data.” Bioinformatics, 22: 897-899.

Zhang, Q., Jiang, N., Wang, G-L., Hong, Y., and Wang, Z. (2013), ”Advances in understanding cold sensing and the cold-responsive network in rice.” Advances in Crop Science and Technology, 1: 104.

Zhu, Z.F., Sun, C.Q., Fu, Y.C., Qian, X.Y., Yang, J.S., and Wang, X.K. (2005), ”Isolation and analysis of a novel MYC gene from rice.” Journal of Genetics and Genomics, 32(4): 393-398.

Appendix A

The inference of pseudo network

Suppose A11 and B were randomly generated from N(1, 1). For a gene pair (S, T), if S was expressed, the expression level of T was distributed as N(1, 0.25); otherwise, the expression level of T was distributed as N(−1, 0.25). The critical value separating these two distributions was set at the mean minus two standard deviations, which equals zero. The binding efficiencies (b) for {A11, A21}, {A11, A22}, {A21, A31} and {A21, A32} were 0.9, 0.7, 0.9 and 0.8, respectively.
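The generating scheme just described can be sketched as a small simulation. Several points are assumptions rather than statements from the text: N(·, 0.25) is read as variance 0.25 (standard deviation 0.5, which makes the mean minus two standard deviations equal to zero), binding is modeled as a Bernoulli(b) event that only matters when the source is expressed (above 0), and the function name `simulate_sample` and the seed are arbitrary.

```python
import random

SD_T = 0.5  # N(., 0.25) read as variance 0.25, i.e. standard deviation 0.5
BINDING = {  # binding efficiencies b from the text
    ("A11", "A21"): 0.9,
    ("A11", "A22"): 0.7,
    ("A21", "A31"): 0.9,
    ("A21", "A32"): 0.8,
}


def simulate_sample(rng):
    """Draw one sample (one array) of the pseudo network."""
    x = {"A11": rng.gauss(1.0, 1.0), "B": rng.gauss(1.0, 1.0)}
    for (s, t), b in BINDING.items():
        # Assumption: S drives T only if S is expressed (> 0) and binds,
        # where binding is a Bernoulli(b) event.
        driven = x[s] > 0 and rng.random() < b
        x[t] = rng.gauss(1.0 if driven else -1.0, SD_T)
    return x


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [simulate_sample(rng) for _ in range(20000)]
frac_a21_on = sum(s["A21"] > 0 for s in samples) / len(samples)
print(round(frac_a21_on, 2))
```

Node B is generated independently of every other node, which is what makes it a negative control: any edge to or from B in a reconstructed network is a false positive.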

The approximate proportion of samples in which the expression of the target gene is actually determined by the expression level of the source gene is denoted by P(S → T); the derivations are as follows.

• $P(A11 > 0) \simeq 0.84$ and $b_{A11,A21} = 0.9$.

Therefore $P(A11 \to A21) \simeq 0.84 \times 0.9 \simeq 0.76$.

• $P(A11 > 0) \simeq 0.84$ and $b_{A11,A22} = 0.7$.

Then $P(A11 \to A22) \simeq 0.84 \times 0.7 \simeq 0.59$.

• $P(A11 > 0) \simeq 0.84$ and $b_{A21,A31} = 0.9$.

$$
\begin{aligned}
P(A21 > 0) &= P\left[I_{(A11 \to A21)}\, N(A11, 0.25) > 0\right] + P\left[I_{(A11 \nrightarrow A21)}\, N(-1, 0.25) > 0\right] \\
&= b_{A11,A21}\left[P(0 < A11 < 1)\, P(N(0, 0.25) > -0.5) + P(A11 > 1)\right] \\
&\quad + \left[1 - P(A11 \to A21)\right] P(N(-1, 0.25) > 0) \\
&\simeq 0.9 \times (0.34 \times 0.84 + 0.5) + 0.24 \times 0.025 \simeq 0.713
\end{aligned}
$$

$P(A11 \to A31) \simeq 0.713 \times 0.9 \simeq 0.64$.

Thus $P(A21 \to A31) \simeq \frac{0.64}{0.76} \simeq 0.84$.

• $P(A21 > 0) \simeq 0.713$ and $b_{A21,A32} = 0.8$.

$P(A11 \to A32) \simeq 0.713 \times 0.8 \simeq 0.57$.

Thus $P(A21 \to A32) \simeq \frac{0.57}{0.76} \simeq 0.75$.
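The arithmetic in the derivations above can be re-traced with a few lines of code. This is a check of the rounded numbers only, not a new derivation: the helper `phi` and the variable names are introduced here for illustration, the tail probability 0.025 is the value used in the text (not recomputed), and the weight 0.24 is taken as 1 − 0.76, as the numerical line of the derivation implies.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Inputs, rounded to two decimals as in the text.
p_on = round(1.0 - phi(-1.0), 2)        # P(A11 > 0) for A11 ~ N(1, 1)
p_mid = round(phi(0.0) - phi(-1.0), 2)  # P(0 < A11 < 1)
p_tail = 0.025                          # text's value for P(N(-1, 0.25) > 0)

p_a11_a21 = round(p_on * 0.9, 2)
p_a11_a22 = round(p_on * 0.7, 2)
# P(N(0, 0.25) > -0.5) is also one standard deviation below the mean,
# so it takes the same rounded value as p_on.
p_a21_on = round(0.9 * (p_mid * p_on + 0.5) + (1 - p_a11_a21) * p_tail, 3)
p_a11_a31 = round(p_a21_on * 0.9, 2)
p_a21_a31 = round(p_a11_a31 / p_a11_a21, 2)
p_a11_a32 = round(p_a21_on * 0.8, 2)
p_a21_a32 = round(p_a11_a32 / p_a11_a21, 2)

print(p_a11_a21, p_a11_a22, p_a21_on, p_a21_a31, p_a21_a32)
```

Rounding at each step mirrors how the appendix carries two-decimal values forward, so small differences from an unrounded calculation are expected.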

Appendix B

Supplement table

Table B.1: GenBank accession numbers, Affymetrix probe IDs, MSU IDs and RAP IDs of the OsbHLH members used in this study.

OsbHLH number         GenBank accession number  Affymetrix probe ID  MSU ID          RAP ID
OsbHLH001-1 (OsICE2)  AK102594.1                Os.13595.1.S1_at     LOC_Os01g70310  Os01g0928000
OsbHLH001-2 (OsICE2)  BI796438                  Os.13595.2.S1_x_at   LOC_Os01g70310  Os01g0928000
OsbHLH002 (OsICE1)    AK109915.1                Os.56356.1.S1_at     LOC_Os11g32100  Os11g0523700
OsbHLH003 (RAI1)      AK103779.1                Os.5860.1.S1_at      LOC_Os03g04310  Os03g0135700
OsbHLH004-1           AK063669.1                Os.46563.1.S1_at     LOC_Os10g39750  Os10g0544200
OsbHLH004-2           AK063669.1                Os.46563.1.S1_a_at   LOC_Os10g39750  Os10g0544200
OsbHLH005 (TDR)       AK106761.1                Os.50000.1.S1_at     LOC_Os02g02820  Os02g0120500
OsbHLH006 (RERJ1)     AB040744.1                Os.6043.1.S1_at      LOC_Os04g23550  Os04g0301500
OsbHLH008             AK064943.1                Os.3825.1.S1_at      LOC_Os01g13460  Os01g0235700
OsbHLH009 (OsMYC)     AY536428.1                Os.46443.1.S1_at     LOC_Os10g42430  Os10g0575000
OsbHLH010             AK064946.1                Os.46956.1.S1_at     LOC_Os01g50940  Os01g0705700
OsbHLH013 (OSB1/Ra)   AB021079.1                Os.2233.1.S1_at      LOC_Os04g47080  Os04g0557800
OsbHLH015             AK111704.1                Os.49810.1.S1_at     LOC_Os04g47040  Os04g0557200
OsbHLH016 (OSB2)      AB021080.1                Os.57542.1.S1_at     LOC_Os04g47059  Os04g0557500
OsbHLH018             AK120539.1                Os.7441.1.S1_at      LOC_Os03g51580  Os03g0725800
OsbHLH020             AK107190.1                Os.54959.1.S1_at     LOC_Os03g46860  Os03g0671800
OsbHLH024-1           AK106333.1                Os.10316.1.S1_at     LOC_Os01g39330  Os01g0575200
OsbHLH024-2           BM038927                  Os.26054.1.S1_s_at   LOC_Os01g39330  Os01g0575200
OsbHLH024-3           BM038927                  Os.26054.1.S1_at     LOC_Os01g39330  Os01g0575200
OsbHLH025-1           AK102964.1                Os.32770.1.S1_x_at   LOC_Os01g09990  Os01g0196300
OsbHLH025-2           AK102964.1                Os.32770.1.S1_at     LOC_Os01g09990  Os01g0196300
OsbHLH028             AK107675.1                Os.55212.1.S1_at     LOC_Os05g11070  Os05g0199800
OsbHLH031             AK100183.1                Os.5093.1.S1_at      LOC_Os08g38210  Os08g0490000
OsbHLH032             AK071315.1                Os.16741.1.S1_a_at   LOC_Os09g29930  Os09g0475400
OsbHLH033-1           AK072417.1                Os.8796.1.S2_s_at    LOC_Os01g65080  Os01g0871200
