Abstract
The genomic scaffold filling problem has attracted a lot of attention since 2010. The general problem is on filling an incomplete sequence (sequence scaffold) I into \(I'\), with respect to a complete reference genome G, such that the number of adjacencies between G and \(I'\) is maximized. The problem is NP-complete and APX-hard, and admits a 1.2-approximation. In this survey paper, we will first review the progress being made for this setting.
However, the sequence input I is not quite practical and does not fit most of the real datasets (where a scaffold is more often given as a list of contigs). Then, we will review the most recent progress on this new version of the genomic scaffold filling problem where, (1) a scaffold S is given, the missing genes \(X=c(G)-c(S)\) can only be inserted in between the contigs, and the objective is to maximize the number of adjacencies between G and the filled \(S'\), and (2) a scaffold S is given, a subset of the missing genes \(X'\subset X=c(G)-c(S)\) can only be inserted in between the contigs, and the objective is still to maximize the number of adjacencies between G and the filled \(S''\). Some open problems will be posed for further research.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Angibaud, S., Fertin, G., Rusu, I., Thevenin, A., Vialette, S.: On the approximability of comparing genomes with duplicates. J. Graph Algorithms Appl. 13(1), 19–53 (2009)
Alon, N., Yuster, R., Zwick, U.: Color-coding. J. ACM 42(4), 844–856 (1995)
Blin, G., Fertin, G., Sikora, F., Vialette, S.: The exemplar breakpoint distance for non-trivial genomes cannot be approximated. In: Das, S., Uehara, R. (eds.) WALCOM 2009. LNCS, vol. 5431, pp. 357–368. Springer, Heidelberg (2009)
Bulteau, L., Carrieri, A.P., Dondi, R.: Fixed-parameter algorithms for scaffold filling. Theoret. Comput. Sci. 568, 72–83 (2015)
Chain, P.S., Grafham, D.V., Fulton, R.S., et al.: Genome project standards in a new era of sequencing. Science 326, 236–237 (2009)
Chen, Z., Fu, B., Zhu, B.: The approximability of the exemplar breakpoint distance problem. In: Cheng, S.-W., Poon, C.K. (eds.) AAIM 2006. LNCS, vol. 4041, pp. 291–302. Springer, Heidelberg (2006)
Chen, Z., Fu, B., Xu, J., Yang, B., Zhao, Z., Zhu, B.: Non-breaking similarity of genomes with gene repetitions. In: Ma, B., Zhang, K. (eds.) CPM 2007. LNCS, vol. 4580, pp. 119–130. Springer, Heidelberg (2007)
Chen, Z., Fu, B., Fowler, R., Zhu, B.: On the inapproximability of the exemplar conserved interval distance problem of genomes. J. Comb. Optim. 15(2), 201–221 (2008)
Chen, Z., Fu, B., Goebel, R., Lin, G., Tong, W., Xu, J., Yang, B., Zhao, Z., Zhu, B.: On the approximability of the exemplar adjacency number problem of genomes with gene repetitions. Theoret. Comput. Sci. 550, 59–65 (2014)
Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. In: Proceedings of the 13th ACM-SIAM Symposium on Discrete Algorithms (SODA 2002), pp. 667–676 (2002)
Downey, R., Fellows, M.: Parameterized Complexity. Springer, New York (1999)
Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer, Heidelberg (2006)
Guo, J., Niedermeier, R., Wernicke, S.: Parameterized complexity of vertex cover variants. Theor. Comput. Syst. 41(3), 501–520 (2007)
Jiang, H., Zhong, F., Zhu, B.: Filling scaffolds with gene repetitions: maximizing the number of adjacencies. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 55–64. Springer, Heidelberg (2011)
Jiang, H., Zheng, C., Sankoff, D., Zhu, B.: Scaffold filling under the breakpoint, related distances. IEEE/ACM Trans. Comput. Biol. Bioinform. 9(4), 1220–1229 (2012)
Jiang, H., Ma, J., Luan, J., Zhu, D.: Approximation and nonapproximability for the one-sided scaffold filling problem. In: Xu, D., Du, D., Du, D. (eds.) COCOON 2015. LNCS, vol. 9198, pp. 251–263. Springer, Heidelberg (2015)
Jiang, H., Fan, C., Yang, B., Zhong, F., Zhu, D., Zhu, B.: Genomic scaffold filling revisited. In: Proceedings of the 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016), Tel Aviv, Israel, 27–29 June 2016 (2016)
Jiang, M.: The zero exemplar distance problem. In: Tannier, E. (ed.) RECOMB-CG 2010. LNCS, vol. 6398, pp. 74–82. Springer, Heidelberg (2010)
Khanna, S., Motwani, R., Sudan, M., Vazirani, U.: On syntactic versus computational views of approximability. SIAM J. Comput. 28(1), 164–191 (1998)
Liu, N., Jiang, H., Zhu, D., Zhu, B.: An improved approximation algorithm for scaffold filling to maximize the common adjacencies. IEEE/ACM Trans. Comput. Biol. Bioinform. 10(4), 905–913 (2013)
Liu, N., Zhu, D., Jiang, H., Zhu, B.: A 1.5-approximation algorithm for two-sided scaffold filling. Algorithmica 74(1), 91–116 (2016)
Liu, N., Zou, P., Zhu, B.: A polynomial time solution for permutation scaffold filling. Technical report, Department of Computer Science, Montana State University (2016)
Muñoz, A., Zheng, C., Zhu, Q., Albert, V., Rounsley, S., Sankoff, D.: Scaffold filling, contig fusion and gene order comparison. BMC Bioinform. 11, 304 (2010)
Niedermeier, R.: Invitation to Fixed-Parameter Algorithms. Oxford University Press, New York (2006)
Yancopoulos, S., Attie, O., Friedberg, R.: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21, 3340–3346 (2005)
Zhu, B.: A retrospective on genomic preprocessing for comparative genomics. In: Chauve, C., et al. (eds.) Models and Algorithms for Genome Evolution, pp. 183–206. Springer, London (2013)
Acknowledgments
I would like to thank my collaborators for this series of research: Chenglin Fan, Haitao Jiang, Nan Liu, David Sankoff, Boting Yang, Chunfang Zheng, Farong Zhong, Daming Zhu and Peng Zou.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhu, B. (2016). Genomic Scaffold Filling: A Progress Report. In: Zhu, D., Bereg, S. (eds) Frontiers in Algorithmics. FAW 2016. Lecture Notes in Computer Science(), vol 9711. Springer, Cham. https://doi.org/10.1007/978-3-319-39817-4_2
Download citation
DOI: https://doi.org/10.1007/978-3-319-39817-4_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39816-7
Online ISBN: 978-3-319-39817-4
eBook Packages: Computer ScienceComputer Science (R0)