

Chapter 5. Conclusion

5.2 Future works

5.2.2 Combination of sequence-based and structure-based interolog mapping

Our laboratory has developed the concept of “3D-domain interologs” [4, 54]. We will combine sequence-based homologous PPIs with this structure-based method of interolog mapping. The concept of 3D-domain interologs is described below.

To study the mechanisms of PPIs across multiple species, domain-domain interactions, which are regarded as the key to PPIs, should be identified. As the number of known protein structures increases rapidly, interacting domains can be identified from three-dimensional (3D) structural complexes and used to study domain-domain interactions. Known 3D structures of interacting proteins provide the interacting domains and atomic details for thousands of direct physical interactions.

By considering the interacting domains of an interacting protein pair, we have proposed the concept of “3D-domain interolog mapping” to improve generalized interolog mapping. The two physically interacting domain sequences of a 3D-dimer protein structure are used as queries to identify 3D-domain interolog candidates by searching the protein sequences of genomes with PSI-BLAST. Proteins with both significant sequence similarity and the same interacting domains are considered 3D-domain homologs, which form a homolog family. Here, we define 3D-domain homologs as candidates that satisfy three criteria: (1) both alignments have a significant PSI-BLAST E-value (< 10⁻⁸); (2) both sequences have at least 25% domain sequence identity in the PSI-BLAST alignment; (3) both sequences have at least 25% sequence identity on the contacting residues in the PSI-BLAST alignment. The 3D-domain interolog candidates are defined as all protein pairs between the two homolog families derived from the two sequences of a structural 3D-dimer (Figure 20). We believe that 3D-domain interolog mapping can be used to study the evolution of the interacting domain through 3D-domain homologous families from multiple species.

[Figure 20 panel labels: 3D-domain interolog mapping; Generalized interolog mapping. Domain labels: ecTbetaR2 (interacting domain); TGF beta (interacting domain); Pfam-B 73018; PKinase; Pfam-B 211911; Activin_recp; TGF_beta_GS]

Figure 20. Architecture of 3D-domain interolog mapping. Human TGF-B3 and TGFBR-2 are co-crystallized in a PDB structure [55]. Four zebrafish homologs of human TGF-B3 are found by PSI-BLAST. Likewise, five zebrafish proteins are homologous to human TGFBR-2. Through generalized interolog mapping, all possible pairs between the two families are considered generalized interologs (shown as black and green lines with arrows). Moreover, by exploring the co-crystal structure we can find the interacting domains of the TGF-B3-TGFBR-2 complex (the TGF beta domain is shown in gray and the ecTbetaR2 domain in light green). The protein pairs that contain these interacting domains are considered 3D-domain interologs (shown as green lines with arrows).
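The homolog criteria and the pairing step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the laboratory's implementation: the `Hit` record, its field names and the example values are hypothetical, and the 25% identity criteria are read here as lower bounds.

```python
from dataclasses import dataclass
from itertools import product

E_VALUE_CUTOFF = 1e-8   # criterion (1): PSI-BLAST E-value must be below this
MIN_IDENTITY = 25.0     # criteria (2) and (3), assumed to mean "at least 25%"


@dataclass(frozen=True)
class Hit:
    """One PSI-BLAST hit against an interacting-domain query (hypothetical record)."""
    protein_id: str
    e_value: float
    domain_identity: float    # % identity over the interacting domain
    contact_identity: float   # % identity over the contacting residues


def is_3d_domain_homolog(hit: Hit) -> bool:
    """Apply the three criteria to a single hit."""
    return (hit.e_value < E_VALUE_CUTOFF
            and hit.domain_identity >= MIN_IDENTITY
            and hit.contact_identity >= MIN_IDENTITY)


def homolog_family(hits):
    """Keep the identifiers of the hits that pass all three criteria."""
    return [h.protein_id for h in hits if is_3d_domain_homolog(h)]


def interolog_candidates(hits_a, hits_b):
    """All protein pairs between the two homolog families."""
    return list(product(homolog_family(hits_a), homolog_family(hits_b)))


# Made-up hits for the two chains of a 3D-dimer template.
hits_a = [Hit("A1", 1e-30, 60.0, 70.0),
          Hit("A2", 1e-12, 30.0, 20.0),   # fails the contact-residue criterion
          Hit("A3", 1e-5, 40.0, 40.0)]    # fails the E-value criterion
hits_b = [Hit("B1", 1e-20, 45.0, 50.0),
          Hit("B2", 1e-9, 26.0, 25.0)]
candidates = interolog_candidates(hits_a, hits_b)
```

Filtering each family first and pairing afterwards keeps the number of candidate pairs at the product of the two family sizes, which matches the "all protein pairs between two homolog families" definition.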

Based on the combination of homologous PPIs and 3D-domain interologs, we will develop a new scoring function to model protein interfaces. The scoring function is

E = E_interacting + E_consensus + E_similarity

It is composed of the interacting force (E_interacting), the consensus of residues (E_consensus) and the template similarity (E_similarity). We have applied this function and 3D-domain interologs to measure interaction changes during evolution and the effect of residue substitutions on the binding interface.
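To make the additive form concrete, the following sketch stores the three terms separately and reports the term-by-term change between a reference interface and a variant with a residue substitution. The numeric values are placeholders and the names `InterfaceScore` and `score_change` are hypothetical; the text does not specify how each term is computed.

```python
from typing import NamedTuple


class InterfaceScore(NamedTuple):
    """The three terms of E, kept separate so that changes can be attributed."""
    interacting: float
    consensus: float
    similarity: float

    @property
    def total(self) -> float:
        # E = E_interacting + E_consensus + E_similarity
        return self.interacting + self.consensus + self.similarity


def score_change(reference: InterfaceScore, variant: InterfaceScore) -> InterfaceScore:
    """Term-by-term difference (variant minus reference)."""
    return InterfaceScore(*(v - r for r, v in zip(reference, variant)))


# Placeholder values: the substitution weakens only the interacting-force term.
reference = InterfaceScore(interacting=2.0, consensus=1.0, similarity=0.5)
variant = InterfaceScore(interacting=1.5, consensus=1.0, similarity=0.5)
delta = score_change(reference, variant)
```

Keeping the terms apart, rather than storing only the sum, shows which component of the interface model a substitution affects.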

[Figure 21 labels: Human, Mouse, Cow; Type 1 and Type 2 mutations; interacting score ≥ threshold; homologs; interacting protein pair]

Figure 21. Overview of mutation analysis in protein-protein interactions.

When biochemical networks are compared across species, protein-protein interactions may be conserved or non-conserved. Figure 21 shows an overview of how we analyze why a protein-protein interaction is kept or lost in different organisms. In our study, we have acquired a reliable threshold of E (= E_interacting + E_consensus + E_similarity) to estimate whether two proteins will interact with each other. The contacting residues of homologous proteins among organisms are colored yellow and green.

In Figure 21, the protein-protein interaction exists in human and mouse (≥ threshold) but not in cow (< threshold). This observation suggests that the mutation in human (colored red) may not disrupt the interaction (a Type 1 mutation), whereas the mutation in cow (colored blue) may cause the loss of this interaction (a Type 2 mutation). This model could help us perform large-scale analyses of changes in interacting modes and residues among multiple organisms. These analyses will help us understand the causes of conservation and diversity in protein-protein interaction networks.
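The Type 1 and Type 2 distinction can be written as a simple rule over per-species scores. The sketch below is only an illustration of that rule: the scores, the threshold value, the `has_substitution` flag and the names `SpeciesInterface` and `classify` are assumptions, and a real analysis would obtain E from the interface model.

```python
from typing import NamedTuple, Optional, Tuple


class SpeciesInterface(NamedTuple):
    score: float             # E for the homologous protein pair in this species
    has_substitution: bool   # True if a contacting residue differs from the template


def classify(entry: SpeciesInterface, threshold: float) -> Tuple[bool, Optional[str]]:
    """Return (interaction predicted, mutation type or None)."""
    conserved = entry.score >= threshold
    if not entry.has_substitution:
        return conserved, None
    # A substitution that keeps E at or above the threshold is Type 1;
    # one that drops E below the threshold is Type 2.
    return conserved, ("Type 1" if conserved else "Type 2")


# Hypothetical scores mirroring the human/mouse/cow example.
THRESHOLD = 2.5
interfaces = {
    "human": SpeciesInterface(score=3.2, has_substitution=True),
    "mouse": SpeciesInterface(score=3.0, has_substitution=False),
    "cow": SpeciesInterface(score=1.1, has_substitution=True),
}
profile = {species: classify(entry, THRESHOLD) for species, entry in interfaces.items()}
```

Running the same rule over every homologous pair of a family gives a per-species conservation profile that can be aggregated for large-scale analysis.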

References

1. Watson, J. D., Laskowski, R. A. & Thornton, J. M. Predicting protein function from sequence and structural data. Current Opinion in Structural Biology 15, 275-284 (2005).

2. Yang, J.-M. & Tung, C.-H. Protein structure database search and evolutionary classification. Nucleic Acids Research 34, 3646-3659 (2006).

3. Pellegrini, M., Marcotte, E. M., Thompson, M. J., Eisenberg, D. & Yeates, T. O. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl. Acad. Sci. U. S. A. 96, 4285-4288 (1999).

4. Chen, Y.-C., Lo, Y.-S., Hsu, W.-C. & Yang, J.-M. 3D-partner: a web server to infer interacting partners and binding models. Nucleic Acids Research 35, W561-567 (2007).

5. Yu, H. Y. et al. Annotation transfer between genomes: Protein-protein interologs and protein-DNA regulogs. Genome Research 14, 1107-1118 (2004).

6. Shoemaker, B. A. & Panchenko, A. R. Deciphering protein-protein interactions. Part I. Experimental techniques and databases. PLoS Computational Biology 3, 337-344 (2007).

7. Kerrien, S. et al. IntAct - open source resource for molecular interaction data. Nucleic Acids Research 35, D561-D565 (2007).

8. Mewes, H. W. et al. MIPS: analysis and annotation of genome information in 2007. Nucleic Acids Research 36, D196-D201 (2008).

9. Salwinski, L. et al. The Database of Interacting Proteins: 2004 update. Nucleic Acids Research 32, D449-D451 (2004).

10. Chatr-Aryamontri, A. et al. MINT: the molecular INTeraction database. Nucleic Acids Research 35, D572-D574 (2007).

11. Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Research 34, D535-D539 (2006).

12. Patil, A. & Nakamura, H. Filtering high-throughput protein-protein interaction data using a combination of genomic features. BMC Bioinformatics 6, 100-112 (2005).

13. Saeed, R. & Deane, C. An assessment of the uses of homologous interactions. Bioinformatics 24, 689-695 (2008).

14. Scott, M. S. & Barton, G. J. Probabilistic prediction and ranking of human protein-protein interactions. BMC Bioinformatics 8, 239-259 (2007).

15. Michaut, M. et al. InteroPORC: automated inference of highly conserved protein interaction networks. Bioinformatics 24, 1625-1631 (2008).

16. Kelley, B. et al. Conserved pathways within bacteria and yeast as revealed by global

17. Sharan, R. et al. Conserved patterns of protein interaction in multiple species. Proc. Natl. Acad. Sci. U. S. A. 102, 1974-1979 (2005).

18. Matthews, L. R. et al. Identification of potential interaction networks using sequence-based searches for conserved protein-protein interactions or "interologs". Genome Research 11, 2120-2126 (2001).

19. Shoemaker, B. A. & Panchenko, A. R. Deciphering protein-protein interactions. Part II. Computational methods to predict protein and domain interaction partners. PLoS Computational Biology 3, e43 (2007).

20. Finn, R. D. et al. The Pfam protein families database. Nucleic Acids Research 36, D281-D288 (2008).

21. Hunter, S. et al. InterPro: the integrative protein signature database. Nucleic Acids Research 37, D211-D215 (2009).

22. Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nature Genetics 25, 25-29 (2000).

23. Kersey, P. et al. Integr8 and Genome Reviews: integrated views of complete genomes and proteomes. Nucleic Acids Research 33, D297-D302 (2005).

24. Andreeva, A. et al. SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Research 32, D226-D229 (2004).

25. Kriventseva, E. V., Fleischmann, W., Zdobnov, E. M. & Apweiler, R. CluSTr: a database of clusters of SWISS-PROT+TrEMBL proteins. Nucleic Acids Research 29, 33-36 (2001).

26. Bonifacino, J. S. & Traub, L. M. Signals for sorting of transmembrane proteins to endosomes and lysosomes. Annual Review of Biochemistry 72, 395-447 (2003).

27. Heldwein, E. E. et al. Crystal structure of the clathrin adaptor protein 1 core. Proc. Natl. Acad. Sci. U. S. A. 101, 14108-14113 (2004).

28. Lieb, J. D., Albrecht, M. R., Chuang, P. T. & Meyer, B. J. MIX-1: an essential component of the C. elegans mitotic machinery executes X chromosome dosage compensation. Cell 92, 265-277 (1998).

29. Hagstrom, K. A., Holmes, V. F., Cozzarelli, N. R. & Meyer, B. J. C. elegans condensin promotes mitotic chromosome architecture, centromere organization, and sister chromatid segregation during mitosis and meiosis. Genes & Development 16, 729-742 (2002).

30. Hirano, M. & Hirano, T. Hinge-mediated dimerization of SMC protein is essential for its dynamic interaction with DNA. EMBO Journal 21, 5733-5744 (2002).

31. Massague, J., Blain, S. W. & Lo, R. S. TGF-beta signaling in growth control, cancer, and heritable disorders. Cell 103, 295-309 (2000).

32. Laping, N. J. et al. Inhibition of transforming growth factor (TGF)-beta1-induced extracellular matrix with a novel inhibitor of the TGF-beta type I receptor kinase

33. Groppe, J. et al. Cooperative assembly of TGF-beta superfamily signaling complexes is mediated by two disparate mechanisms and distinct modes of receptor binding. Molecular Cell 29, 1-13 (2008).

34. Kanehisa, M. et al. KEGG for linking genomes to life and the environment. Nucleic Acids Research 36, D480-D484 (2008).

35. Keseler, I. M. et al. EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Research 33, D334-D337 (2005).

36. Juo, Z. S., Kassavetis, G. A., Wang, J., Geiduschek, E. P. & Sigler, P. B. Crystal structure of a transcription factor IIIB core interface ternary complex. Nature 422, 534-539 (2003).

37. Walhout, A. et al. Protein interaction mapping in C. elegans using proteins involved in vulval development. Science 287, 116-122 (2000).

38. Gadal, O. et al. A nuclear AAA-type ATPase (Rix7p) is required for biogenesis and nuclear export of 60S ribosomal subunits. EMBO Journal 20, 3695-3704 (2001).

39. Tirosh, I. & Barkai, N. Computational verification of protein-protein interactions by orthologous co-expression. BMC Bioinformatics 6, 40 (2005).

40. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25-29 (2000).

41. Wu, X., Zhu, L., Guo, J., Zhang, D. & Lin, K. Prediction of yeast protein-protein interaction network: insights from the Gene Ontology and annotations. Nucleic Acids Research 34, 2137-2150 (2006).

42. Davy, A. et al. A protein-protein interaction map of the Caenorhabditis elegans 26S proteasome. EMBO Reports 2, 821-828 (2001).

43. The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012-2018 (1998).

44. Altschul, S., Gish, W., Miller, W., Myers, E. & Lipman, D. Basic local alignment search tool. Journal of Molecular Biology 215, 403-410 (1990).

45. Hermjakob, H. et al. IntAct: an open source molecular interaction database. Nucleic Acids Research 32, D452-D455 (2004).

46. Weng, S. et al. Saccharomyces Genome Database (SGD) provides biochemical and structural information for budding yeast proteins. Nucleic Acids Research 31, 216-218 (2003).

47. Edwards, A. et al. Bridging structural biology and genomics: assessing protein interaction data with known complexes. Trends in Genetics 18, 529-536 (2002).

48. von Mering, C. et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature 417, 399-403 (2002).

49. Jansen, R. et al. A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 302, 449-453 (2003).

50. Xenarios, I. et al. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Research 30, 303-305 (2002).

51. Birney, E. et al. An Overview of Ensembl. Genome Research 14, 925-928 (2004).

52. Kolch, W. Meaningful relationships: the regulation of the Ras/Raf/MEK/ERK pathway by protein interactions. Biochemical Journal 351, 289-305 (2000).

53. Wang, C. C., Cirit, M. & Haugh, J. M. PI3K-dependent cross-talk interactions converge with Ras as quantifiable inputs integrated by Erk. Molecular Systems Biology 5, 246 (2009).

54. Chen, Y.-C., Chen, H.-C. & Yang, J.-M. DAPID: a 3D-domain annotated protein-protein interaction database. Genome Informatics 17, 206-215 (2006).

55. Hart, P. J. et al. Crystal structure of the human TbetaR2 ectodomain -- TGF-beta3 complex. Nature Structural Biology 9, 203-208 (2002).

Appendix A

List of publications

Journal papers

1. Yang, J.-M. and Chen, C.-C. GEMDOCK: a generic evolutionary method for molecular docking. Proteins 55, 288-304 (2004). (Impact factor: 4.429)

2. Yao, Y.-Y., Shrestha K.L., Wu, Y.-J., Tasi H.-J., Chen, C.-C., Yang, J.-M., Ando, A., Cheng C.-Y. and Li, Y.-K. Structural simulation and protein engineering to convert an endo-chitosanase to an exo-chitosanase. Protein Engineering Design & Selection 21, 561-566 (2008). (Impact factor: 2.662)

3. Chen, C.-C., Lin, C.-Y., Lo, Y.-S. and Yang, J.-M. PPISearch: a web server for searching homologous protein-protein interactions across multiple species. Nucleic Acids Research 37:W376-W383 (2009). (Impact factor: 6.954)

Conference Papers

1. Huang J.-W., Chen, C.-C., Yang, J.-M. (2008) Identifying critical positions and rules of antigenic drift for influenza A/H3N2 viruses. The 2nd International Conference on Bioinformatics and Biomedical Engineering, pp. 249-252

Appendix B

Journal papers

Yang, J.-M. and Chen, C.-C. GEMDOCK: a generic evolutionary method for molecular docking. Proteins 55, 288-304 (2004).

Yao, Y.-Y., Shrestha K.L., Wu, Y.-J., Tasi H.-J., Chen, C.-C., Yang, J.-M., Ando, A., Cheng C.-Y. and Li, Y.-K. Structural simulation and protein engineering to convert an endo-chitosanase to an exo-chitosanase. Protein Engineering Design & Selection 21, 561-566 (2008).
