Abstract
Haplotype assembly is the problem of reconstructing the two parental chromosomes of an individual from a set of sampled DNA-sequences. A combinatorial optimization problem that models haplotype assembly is the Minimum Error Correction problem (MEC). This problem has been intensively studied in the computational biology literature and is also known in the clustering literature: essentially we are required to find two cluster centres such that the sum of distances to the nearest centre, is minimized. We introduce here the problem Fixed haplotype-Minimum Error Correction (FH-MEC), a new variant of MEC which corresponds to instances where one of the haplotypes/centres is already given. We provide hardness results for the problem on various restricted instances. We also propose a new and very simple 2-approximation algorithm for MEC on binary input matrices.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
A binary matrix M is conflict-free if the rows of M can be bipartitioned into two sets such that the corresponding entries on each set are identical.
References
Alimonti, P., Kann, V.: Hardness of approximating problems on cubic graphs. In: Bongiovanni, G., Bovet, D.P., Di Battista, G. (eds.) CIAC 1997. LNCS, vol. 1203, pp. 288–298. Springer, Heidelberg (1997). https://doi.org/10.1007/3-540-62592-5_80
Bansal, V., Bafna, V.: HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. Bioinformatics 24(16), i153–i159 (2008)
Bonizzoni, P., Dondi, R., Klau, G.W., Pirola, Y., Pisanti, N., Zaccaria, S.: On the minimum error correction problem for haplotype assembly in diploid and polyploid genomes. J. Comput. Biol. 23(9), 718–736 (2016)
Cilibrasi, R., Van Iersel, L., Kelk, S., Tromp, J.: The complexity of the single individual SNP haplotyping problem. Algorithmica 49(1), 13–36 (2007)
International HapMap Consortium, et al.: A haplotype map of the human genome. Nature 437(7063), 1299 (2005)
Downey, R.G., Fellows, M.R.: Fundamentals of Parameterized Complexity, vol. 201. Springer, London (2016). https://doi.org/10.1007/978-1-4471-5559-1
Etemadi, M., Bagherian, M., Chen, Z.-Z., Wang, L.: Better ILP models for haplotype assembly. BMC Bioinform. 19(1), 52 (2018)
Feige, U.: NP-hardness of hypercube 2-segmentation (2014). arXiv preprint arXiv:1411.0821
Jiao, Y., Xu, J., Li, M.: On the k-closest substring and k-consensus pattern problems. In: Sahinalp, S.C., Muthukrishnan, S., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 130–144. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27801-6_10
Lancia, G., Bafna, V., Istrail, S., Lippert, R., Schwartz, R.: SNPs problems, complexity, and algorithms. In: auf der Heide, F.M. (ed.) ESA 2001. LNCS, vol. 2161, pp. 182–193. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44676-1_15
Lippert, R., Schwartz, R., Lancia, G., Istrail, S.: Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Brief. Bioinform. 3(1), 23–31 (2002)
Ostrovsky, R., Rabani, Y.: Polynomial-time approximation schemes for geometric min-sum median clustering. J. ACM 49(2), 139–156 (2002)
Papadimitriou, C.H., Yannakakis, M.: Optimization, approximation, and complexity classes. J. Comput. Syst. Sci. 43(3), 425–440 (1991)
Phelps, K.T., Rifa, J., Villanueva, M.: Rank and kernel of binary hadamard codes. IEEE Trans. Inf. Theory 51(11), 3931–3937 (2005)
Acknowledgements
The last author acknowledges the support of an NWO TOP 2 grant.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Goblet, A., Kelk, S., Mihalák, M., Stamoulis, G. (2018). On a Fixed Haplotype Variant of the Minimum Error Correction Problem. In: Wang, L., Zhu, D. (eds) Computing and Combinatorics. COCOON 2018. Lecture Notes in Computer Science(), vol 10976. Springer, Cham. https://doi.org/10.1007/978-3-319-94776-1_46
Download citation
DOI: https://doi.org/10.1007/978-3-319-94776-1_46
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94775-4
Online ISBN: 978-3-319-94776-1
eBook Packages: Computer ScienceComputer Science (R0)