Abstract
Peptide identification by tandem mass spectrometry (MS/MS) and database searching is becoming the standard high-throughput technology in many areas of the life sciences. The analysis of post-translational modifications (PTMs) is a major source of complications in this area and calls for efficient computational approaches. In this paper we describe PTMSearch, a novel algorithm in which the PTM search space is represented by a tree structure, and a greedy traversal algorithm is used to identify the path within the tree that corresponds to the PTMs that best fit the input data. Tests on simulated and real (experimental) PTMs show that the algorithm performs well in terms of speed and accuracy. Estimates are given for the error caused by the greedy heuristics and for the size of the search space, and a scheme is presented for calculating statistical significance.
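The idea of a greedy walk through a PTM tree can be illustrated with a minimal Python sketch. It is not the PTMSearch implementation: the abstract does not specify the scoring function, so the sketch assumes a very simple one (a candidate is preferred if its prefix mass lands on an observed peak). The names `RESIDUE_MASS`, `PTM_SHIFTS`, `matches`, and `greedy_ptm_path` are hypothetical, the PTM list contains only phosphorylation, and the peaks are treated as neutral prefix masses, ignoring the proton offset of real fragment ions.

```python
# Illustrative sketch only: the scoring rule, the PTM list, and all names
# here are assumptions, not taken from the PTMSearch paper.

# Monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768}

# Hypothetical candidate mass shifts per residue (0.0 means unmodified).
# 79.96633 Da is the monoisotopic shift of phosphorylation.
PTM_SHIFTS = {"S": [0.0, 79.96633], "T": [0.0, 79.96633]}


def matches(mass, peaks, tol=0.01):
    """Return True if any observed peak lies within tol of mass."""
    return any(abs(mass - p) <= tol for p in peaks)


def greedy_ptm_path(peptide, peaks, tol=0.01):
    """Walk the PTM tree one residue (one tree level) at a time.

    The children of a node are the candidate shifts for the next residue.
    The child whose prefix mass matches an observed peak is taken, and the
    walk never backtracks; that is the greedy step.
    """
    prefix = 0.0
    path = []
    for residue in peptide:
        base = prefix + RESIDUE_MASS[residue]
        candidates = PTM_SHIFTS.get(residue, [0.0])
        # max() keeps the first candidate on ties, so the unmodified
        # residue wins when no candidate matches a peak.
        best = max(candidates, key=lambda s: matches(base + s, peaks, tol))
        prefix = base + best
        path.append(best)
    return path


# Simulated prefix masses for G-A-S-A with a phosphorylated serine.
peaks = [57.021, 128.059, 295.057, 366.094]
print(greedy_ptm_path("GASA", peaks))
```

Because the walk commits to one child per level, it inspects only a single root-to-leaf path instead of every combination of modifications. The price is that a locally attractive choice can lead away from the globally best assignment, which is the kind of error the abstract says is estimated for the greedy heuristics.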
Keywords
- Experimental Spectrum
- Collision Induced Dissociation
- Tandem Mass Spectrum
- Theoretical Spectrum
- Random Match
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kertész-Farkas, A., Reiz, B., Myers, M.P., Pongor, S. (2011). PTMSearch: A Greedy Tree Traversal Algorithm for Finding Protein Post-Translational Modifications in Tandem Mass Spectra. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23783-6_11
DOI: https://doi.org/10.1007/978-3-642-23783-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23782-9
Online ISBN: 978-3-642-23783-6
eBook Packages: Computer Science, Computer Science (R0)