

VII. Conclusions and Discussions

7.3 Signal Peptide Cleavage Sites Prediction

Using an SVM to recognize human signal peptide cleavage sites, we obtain an overall performance of 96.8%, which improves on the results of SignalP and ACN. The accuracy of cleavage site location is 84.1%, which also improves on the results of SignalP and ACN.

The correlation coefficient is 0.81, which indicates that the SVM overpredicts less than SignalP and ACN.
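The link between the correlation coefficient and overprediction can be illustrated with a short sketch. The formula is not given in this section, so the code assumes the coefficient is the Matthews correlation coefficient commonly used for two-class predictors; the function names and the confusion-matrix counts are hypothetical and are not results from this work.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Assumption: this is the 'correlation coefficient' meant in the text.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column sum is zero
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all examples that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Illustrative counts only (not taken from the experiments).
# Both predictors find the same 90 true cleavage sites, but the second
# one reports twice as many false positives, i.e. it overpredicts more.
careful = dict(tp=90, tn=880, fp=20, fn=10)
eager = dict(tp=90, tn=860, fp=40, fn=10)

for name, counts in (("careful", careful), ("eager", eager)):
    print(name, round(accuracy(**counts), 3), round(matthews_cc(**counts), 3))
```

Because false positives enter both the numerator and the denominator, the coefficient drops faster than plain accuracy when a predictor overpredicts, which is why a higher coefficient is read as less overprediction.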

The experimental results show that an SVM can readily achieve accuracy comparable to that of other predictors. The SVM can therefore be a powerful computational tool for predicting signal peptide cleavage sites, and applying it to other problems is a promising direction for further study.

APPENDICES

APPENDIX A

Optimal Hyperparameters for Protein Fold Recognition

Table A.1: Optimal hyperparameters for the training set by 10-fold cross validation.

fold       Comp.         S           H           V           P           Z          CS
index     C     γ     C     γ     C     γ     C     γ     C     γ     C     γ     C     γ
1        50    50    50    50    10  1000    10    50    10  0.01    10   100    10    10
3         1  1000     1    10    10    50    10    10    10   100    10    10    50     1
4        10    10    10    10    10    50    10    10    10    50     1   100  1000   0.1
7        10   100    10    50    10    10    10    50    10    10    10   100    10    10
9        10    50    10    50    10   100    10   100    10    50    10    50   500   0.1
11       10    50     1     1     1     1     1     1    10    50     1     1     1     1
20       10    10     1    10    10   100     1   100    10  1000    10   100   100   0.1
23       10   100     1    50    10    10    10    10     1     1    10    10     1    10
26        1   100   100    50    10   100    10   100     1     1     1     1    10    50
30       10   100    10   100    10    50    10  1000    10   100    10    10     1    50
31       10    10     1     1   100    50     1    50    10    50     1   100  1000  0.01
32       10  1000    10    50    10   100    10    50    10    50    10    50    10    50
33       10    10    10    50    10    10     1   100     1   100    10    50     1    50
35       10   100     1    50     1    50     1    50     1    50    10    50     1    10
39       10    50    10    10    10   100    10   100     1     1    10   100    10    10
46       10  1000    10    50     1  1000    10  1000     1  1000    50  1000    50    10
47       10  1000    10    10    10  1000    10  1000    10  1000    10  1000    10    10
48       10  1000    10   100    10  1000    10    50    10  1000     1   100     1    50
51       10  1000    10     1     1  1000    10   100     1  1000    10   100    50    10
54       10   100    10    50  1000   0.1    10  1000    10    10    10   100    10    50
57       10   100    10    10    10    10     1  1000    10    50    10    50    10     1
59       10  1000    10    10    10   100     1  1000    10   100    10   100    10    10
62        1  1000     1   100     1   100    10    10     1  1000    10    10     1    50
69       10    50   100    50     1     1     1     1     1     1     1     1    10    10
72       10   100    10    50    10    50    10    50    10    10    10    50    10    50
87      500     1     1    50    10    50    10    50    10   100    10    50    10    10
110      10    10   500   0.1   500   0.1    10     1     1    50  1000     1   500   0.1

Table A.2: (Cont’d) Optimal hyperparameters for the training set by 10-fold cross validation.

fold        HZ          SV          CSH         VPZ         HVP        CSHV       CSHVPZ
index     C     γ     C     γ     C     γ     C     γ     C     γ     C     γ     C     γ
1        10    10    10    10    10    10    10    10    10    10    10     1    10    50
3        50   0.1    10     1     1    50   100   0.1    10     1    10     1    50    10
4        50   0.1    10     1    50   0.1    50   0.1    50   0.1    10     1  1000    10
7         1    50    10     1    50    10     1    50    10     1    10     1    10    50
9        10    10   500   0.1    10    50   100   0.1   500  0.01   500  0.01   500    50
11        1     1     1     1     1     1     1     1     1     1     1     1     1     1
20       50    10   100   0.1    10    50    10    10    10   0.1    10     1   100    50
23        1    10     1    10    10    10    10    10     1    10    10    10     1    10
26        1     1     1     1    10    10    10    50    10    10    10    10    10    10
30       10    10    10    10    10    50    10    50    10    10     1    10     1    10
31        1    10     1    50     1    50    10    10     1    10     1    10  1000    10
32       10    10    10    10    10    10    10    10    10    10    10    10    10    10
33        1    50     1    50     1    50     1    50  1000  0.01     1    10     1    10
35        1    10     1    10    10    10     1    10    10     1    10     1     1    10
39       10     1    10     1    10    50    10    10    10     1    10     1    10    50
46       10    10    10    10     1    50     1    50    10    10    10    10    50    50
47       50     1    10     1     1   100     1   100    50     1    50     1    10    50
48       10    10    10    10     1    50     1    50    10    10     1    10     1    50
51       10    10    10    10    10    50    10    10    10     1    10    10    50    50
54       10    50    10    10    10    50    10    50    10    10    10    10    10    50
57       50     1    10     1     1    50    10    10    50   0.1    50   0.1    10    10
59       10    10    10    10    10    50    10    50    10    10    10    10    10    50
62        1    50     1    50     1    50     1    50     1    10    10    10     1    50
69       10    10    10    10    10    50     1    50    10    10    10    10    10    50
72       10    10    10    10    10     1    10    10    10    10    10     1    10    10
87       10    10    10    10    10    10    10    10    10    10    10    10    10    10
110      50   0.1   500  0.01   100   0.1   500  0.01    50   0.1    50   0.1   500     1
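The selection procedure behind Tables A.1 and A.2 can be sketched as a grid search over (C, γ) scored by 10-fold cross validation. This is a minimal illustration under stated assumptions, not the procedure actually used: the grid lists only the values that happen to appear in the tables (the grid that was searched is not stated here), `fit_and_score` is a hypothetical callback that the reader would implement with an SVM trainer, and `toy_score` is a stand-in so that the sketch runs on its own.

```python
import itertools
import math
import random

# Values that appear in Tables A.1 and A.2 (assumed grid, for illustration).
C_GRID = [1, 10, 50, 100, 500, 1000]
GAMMA_GRID = [0.01, 0.1, 1, 10, 50, 100, 1000]


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k disjoint, near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit_and_score, n, k, C, gamma, seed=0):
    """Average held-out score of one (C, gamma) pair over k folds."""
    scores = []
    for held_out in k_fold_indices(n, k, seed):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        # Hypothetical callback: train on `train`, return accuracy on `held_out`.
        scores.append(fit_and_score(train, held_out, C, gamma))
    return sum(scores) / k


def grid_search(fit_and_score, n, Cs, gammas, k=10):
    """Return (best_score, best_C, best_gamma) over the whole grid."""
    best = None
    for C, gamma in itertools.product(Cs, gammas):
        score = cross_val_accuracy(fit_and_score, n, k, C, gamma)
        if best is None or score > best[0]:
            best = (score, C, gamma)
    return best


def toy_score(train_idx, test_idx, C, gamma):
    """Stand-in for a real SVM: ignores the data, peaks at C=10, gamma=50."""
    return -abs(math.log10(C) - 1.0) - abs(math.log10(gamma) - math.log10(50))


best = grid_search(toy_score, n=27, Cs=C_GRID, gammas=GAMMA_GRID, k=10)
print(best)
```

In a real run, one such search would be carried out for each fold index and each feature set, which yields one (C, γ) pair per table cell.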

APPENDIX B

Data Set for Human Signal Peptide Cleavage Sites Predictions

Table B.1: Data set for human signal peptide cleavage sites prediction

10KS-HUMAN 1B05-HUMAN 5NTD-HUMAN 7B2-HUMAN A1AH-HUMAN A1AT-HUMAN A2AP-HUMAN A2HS-HUMAN A4-HUMAN AACT-HUMAN ABP-HUMAN ACET-HUMAN ACE-HUMAN ACHA-HUMAN ACHB-HUMAN ACHE-HUMAN ACHG-HUMAN ACHN-HUMAN ACRO-HUMAN ALBU-HUMAN ALK1-HUMAN ALS-HUMAN AMYP-HUMAN ANF-HUMAN ANGI-HUMAN ANGT-HUMAN ANPA-HUMAN ANPC-HUMAN ANT3-HUMAN APA1-HUMAN APA2-HUMAN APA4-HUMAN APC1-HUMAN APC2-HUMAN APC3-HUMAN APD-HUMAN APE-HUMAN APOA-HUMAN APOH-HUMAN ARSA-HUMAN ASM-HUMAN ASPG-HUMAN AXO1-HUMAN B2MG-HUMAN B61-HUMAN B71-HUMAN BAL-HUMAN BFR2-HUMAN BGAM-HUMAN BGLR-HUMAN BLSA-HUMAN C1QA-HUMAN C1QC-HUMAN C1R-HUMAN C1S-HUMAN C4BB-HUMAN C4BP-HUMAN CA11-HUMAN CA13-HUMAN CA14-HUMAN CA18-HUMAN CA19-HUMAN CA21-HUMAN CA24-HUMAN CA25-HUMAN CAH4-HUMAN CAH6-HUMAN CAMA-HUMAN CAML-HUMAN CAP7-HUMAN CASB-HUMAN CASK-HUMAN CATD-HUMAN CATE-HUMAN CATH-HUMAN CATL-HUMAN CBG-HUMAN CBP1-HUMAN CBPB-HUMAN CBPC-HUMAN CBPN-HUMAN CCKN-HUMAN CD14-HUMAN CD1A-HUMAN CD1D-HUMAN CD1E-HUMAN CD28-HUMAN CD2-HUMAN CD30-HUMAN CD3D-HUMAN CD3E-HUMAN CD3G-HUMAN CD3Z-HUMAN CD45-HUMAN CD4X-HUMAN CD4-HUMAN CD52-HUMAN CD59-HUMAN CD5-HUMAN CD7-HUMAN CD82-HUMAN CD8A-HUMAN CERU-HUMAN CETP-HUMAN CFAI-HUMAN CHLE-HUMAN CLUS-HUMAN CMGA-HUMAN CO2-HUMAN CO3-HUMAN CO4-HUMAN CO6-HUMAN CO7-HUMAN CO8G-HUMAN COG1-HUMAN COG7-HUMAN COG8-HUMAN COG9-HUMAN COL-HUMAN CR1-HUMAN CR2-HUMAN CRFB-HUMAN CRTC-HUMAN CSF1-HUMAN CSF2-HUMAN CSF3-HUMAN CTRB-HUMAN CYPB-HUMAN CYRG-HUMAN CYTC-HUMAN CYTS-HUMAN DAF2-HUMAN DEFN-HUMAN DOPO-HUMAN DRN1-HUMAN E2-HUMAN EGFR-HUMAN EL2B-HUMAN ELS-HUMAN EMBP-HUMAN ENPL-HUMAN EPOR-HUMAN EPO-HUMAN F13B-HUMAN FA12-HUMAN FA5-HUMAN FA8-HUMAN FBLB-HUMAN FCEA-HUMAN FCG1-HUMAN FETA-HUMAN FGF7-HUMAN FGR3-HUMAN FIBA-HUMAN FIBB-HUMAN FIBH-HUMAN FINC-HUMAN FKB3-HUMAN FOL2-HUMAN FSA-HUMAN FSHB-HUMAN GA6S-HUMAN GELS-HUMAN GL6S-HUMAN GLCM-HUMAN GLHA-HUMAN GLPE-HUMAN GLUC-HUMAN GLYP-HUMAN GMCR-HUMAN GONL-HUMAN GP1A-HUMAN GP1B-HUMAN GP39-HUMAN GPIX-HUMAN GR78-HUMAN GRA1-HUMAN GRAA-HUMAN GRP2-HUMAN GUAN-HUMAN HA25-HUMAN HA2R-HUMAN HA2Z-HUMAN HB23-HUMAN HB2A-HUMAN 
HB2Q-HUMAN HC-HUMAN HEP2-HUMAN HEXA-HUMAN HGFA-HUMAN HGF-HUMAN HIS3-HUMAN HPT2-HUMAN HRG-HUMAN HV1B-HUMAN HV2H-HUMAN HV2I-HUMAN HV3C-HUMAN I12A-HUMAN I12B-HUMAN I309-HUMAN IAC2-HUMAN IBP1-HUMAN IBP2-HUMAN IBP3-HUMAN IBP4-HUMAN IC1-HUMAN ICA1-HUMAN ICA2-HUMAN IGF2-HUMAN

Table B.2: (Cont’d) Data set for human signal peptide cleavage sites prediction

IHA-HUMAN IHBA-HUMAN IL11-HUMAN IL1R-HUMAN IL1X-HUMAN IL2A-HUMAN IL2B-HUMAN IL2-HUMAN IL3-HUMAN IL4-HUMAN IL5R-HUMAN IL5-HUMAN IL6R-HUMAN IL6-HUMAN IL7R-HUMAN IL7-HUMAN IL8-HUMAN IL9-HUMAN INA7-HUMAN INB-HUMAN INGR-HUMAN ING-HUMAN INIG-HUMAN INIP-HUMAN INSR-HUMAN INS-HUMAN IPSP-HUMAN IPST-HUMAN IRBP-HUMAN ITA2-HUMAN ITA4-HUMAN ITA6-HUMAN ITAB-HUMAN ITAL-HUMAN ITAV-HUMAN ITAX-HUMAN ITB1-HUMAN ITB2-HUMAN ITB4-HUMAN ITB7-HUMAN KAL-HUMAN KFMS-HUMAN KHEK-HUMAN KKIT-HUMAN KMET-HUMAN KNL-HUMAN KV4B-HUMAN KV5A-HUMAN LAG3-HUMAN LBP-HUMAN LCAT-HUMAN LCA-HUMAN LDLR-HUMAN LEM1-HUMAN LEM3-HUMAN LEUK-HUMAN LFA3-HUMAN LIF-HUMAN LIPG-HUMAN LIPH-HUMAN LIPL-HUMAN LIPP-HUMAN LITH-HUMAN LMB1-HUMAN LMB2-HUMAN LMP1-HUMAN LMP2-HUMAN LPH-HUMAN LSHB-HUMAN LV0A-HUMAN LV6E-HUMAN LYC-HUMAN LYSH-HUMAN MABC-HUMAN MAG-HUMAN MCPI-HUMAN MCP-HUMAN MDP1-HUMAN MG24-HUMAN MGP-HUMAN MI1B-HUMAN MI2B-HUMAN MK-HUMAN MLCH-HUMAN MOTI-HUMAN MPRD-HUMAN MPRI-HUMAN MYP0-HUMAN NAGA-HUMAN NCA2-HUMAN NDDB-HUMAN NEC2-HUMAN NEU2-HUMAN NEUB-HUMAN NGFR-HUMAN NIDO-HUMAN NMZ2-HUMAN OMGP-HUMAN ONCM-HUMAN P4HA-HUMAN PA21-HUMAN PA2M-HUMAN PAHO-HUMAN PAI1-HUMAN PBGD-HUMAN PDGB-HUMAN PEC1-HUMAN PENK-HUMAN PEPA-HUMAN PEPC-HUMAN PERF-HUMAN PF4L-HUMAN PGDR-HUMAN PGDS-HUMAN PGH1-HUMAN PGSG-HUMAN PLFV-HUMAN PLMN-HUMAN PLR2-HUMAN PP11-HUMAN PP14-HUMAN PPA5-HUMAN PPAL-HUMAN PPAP-HUMAN PPB3-HUMAN PPBT-HUMAN PRIO-HUMAN PRL-HUMAN PRN3-HUMAN PROP-HUMAN PRPC-HUMAN PRTC-HUMAN PRTP-HUMAN PRTS-HUMAN PRTZ-HUMAN PS2-HUMAN PSPA-HUMAN PSSP-HUMAN PTHY-HUMAN PTN-HUMAN PTPG-HUMAN PZP-HUMAN REL2-HUMAN RENI-HUMAN RETB-HUMAN RIB1-HUMAN RIB2-HUMAN RNKD-HUMAN SAA-HUMAN SABP-HUMAN SAMP-HUMAN SAP3-HUMAN SAP-HUMAN SCF-HUMAN SEM2-HUMAN SG1-HUMAN SIAL-HUMAN SLIB-HUMAN SMS1-HUMAN SODE-HUMAN SOMW-HUMAN SPRC-HUMAN SRCH-HUMAN SSBP-HUMAN STAT-HUMAN STS-HUMAN TCO1-HUMAN TCO2-HUMAN TENA-HUMAN TETN-HUMAN TFPI-HUMAN TF-HUMAN TGR3-HUMAN THBG-HUMAN THY1-HUMAN THYG-HUMAN TIM2-HUMAN TNFB-HUMAN TNR1-HUMAN TNR2-HUMAN TRFE-HUMAN TRFL-HUMAN TRFM-HUMAN TRKA-HUMAN TRY2-HUMAN 
TRYA-HUMAN TSHB-HUMAN TSHR-HUMAN TSP1-HUMAN TTHY-HUMAN TVA2-HUMAN TVA3-HUMAN TVB2-HUMAN TVC-HUMAN TYRR-HUMAN UPAR-HUMAN UROK-HUMAN UROM-HUMAN VEGF-HUMAN VIP-HUMAN VTDB-HUMAN VTNC-HUMAN VWF-HUMAN WNT1-HUMAN ZA2G-HUMAN ZP2-HUMAN

