Abstract
Hidden Markov models (HMMs) are often used for biological sequence annotation. Each sequence element is represented by states with the same label. A sequence should be annotated with the labeling of highest probability. Computing this most probable labeling was shown NP-hard by Lyngsø and Pedersen [9]. We improve this result by proving the problem NP-hard for a fixed HMM. High probability labelings are often found by heuristics, such as taking the labeling corresponding to the most probable state path. We introduce an efficient algorithm that computes the most probable labeling for a wide class of HMMs, including models previously used for transmembrane protein topology prediction and coding region detection.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Burge, C., Karlin, S.: Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology 268(1), 78–94 (1997)
Casacuberta, F., de la Higuera, C.: Computational Complexity of Problems on Probabilistic Grammars and Transducers. In: Oliveira, A.L. (ed.) ICGI 2000. LNCS (LNAI), vol. 1891, pp. 15–24. Springer, Heidelberg (2000)
Durbin, R., Eddy, S., Krogh, A., Mitchison, G.: Biological sequence analysis. Cambridge University Press, Cambridge (1998)
Eppstein, D.: Finding the k shortest paths. SIAM Journal on Computing 28(2), 652–673 (1998)
Iseli, C., Jongeneel, C.V., Bucher, P.: ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. In: Seventh International Conference on Intelligent Systems for Molecular Biology (ISMB), pp. 138–148 (1999)
Krogh, A.: Two methods for improving performance of an HMM and their application for gene finding. In: Fifth International Conference on Intelligent Systems for Molecular Biology (ISMB), pp. 179–186. AAAI Press, Menlo Park (1997)
Krogh, A., Larsson, B., von Heijne, G., Sonnhammer, E.L.: Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. Journal of Molecular Biology 305(3), 567–570 (2001)
Lio, P., Goldman, N., Thorne, J.L., Jones, D.T.: PASSML: combining evolutionary inference and protein secondary structure prediction. Bioinformatics 14(8), 726–733 (1998)
Lyngsø, R.B., Pedersen, C.N.S.: The consensus string problem and the complexity of comparing hidden Markov models. Journal of Computer and System Sciences 65(3), 545–569 (2002)
Martelli, P.L., Fariselli, P., Krogh, A., Casadio, R.: A sequence-profile-based HMM for predicting and discriminating beta barrel membrane proteins. Bioinformatics 18(1), S46–53 (2002)
Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2), 257–285 (1989)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brejová, B., Brown, D.G., Vinař, T. (2004). The Most Probable Labeling Problem in HMMs and Its Application to Bioinformatics. In: Jonassen, I., Kim, J. (eds) Algorithms in Bioinformatics. WABI 2004. Lecture Notes in Computer Science(), vol 3240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30219-3_36
Download citation
DOI: https://doi.org/10.1007/978-3-540-30219-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23018-2
Online ISBN: 978-3-540-30219-3
eBook Packages: Springer Book Archive