Abstract
Recently, Pereira Zanetti et al. [9] have proposed a new definition of a distance for which the computational complexity of the median problem is currently unknown. In this formulation, each genome is represented as a permutation on n elements that is the product of disjoint cycles of length 1 (telomeres) and length 2 (adjacencies). The permutations are converted into their matrix representation, and the rank distance d is used to define the median.
In their paper, the authors provide an \(O(n^3)\) algorithm for determining three candidate medians, prove the tight approximation ratio \(\frac{4}{3}\), and provide a sufficient condition for their candidates to be true medians. They also conduct some experiments that suggest that their method is accurate on simulated and real data.
In this paper, we extend their results and provide the following:
-
2 invariants characterizing the problem of finding the median of 3 matrices
-
a strengthening of one of the results in the work of Pereira Zanetti et al.
-
a sufficient condition for optimality that can be checked in O(n) time
-
a faster, \(O(n^2)\) algorithm for determining the median under this condition
-
a new heuristic algorithm for this problem based on compressed sensing.
J. Meidanis is on Leave from University of Campinas, Campinas, Brazil.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Sankoff, D., Blanchette, M.: Multiple genome rearrangement and breakpoint phylogeny. J. Comput. Biol. 5(3), 555–570 (1998)
Moret, B.M., Wang, L.S., Warnow, T., Wyman, S.K.: New approaches for reconstructing phylogenies from gene order data. Bioinformatics 17, 165–173 (2001)
Bourque, G., Pevzner, P.A.: Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Res. 12(1), 26–36 (2002)
Caprara, A.: The reversal median problem. INFORMS J. Comput. 15(1), 93–113 (2003)
Fertin, G., Labarre, A., Rusu, I., Tannier, E., Vialette, S.: Combinatorics of Genome Rearrangements. MIT Press, Cambridge (2009)
Tannier, E., Zheng, C., Sankoff, D.: Multichromosomal median and halving problems under different genomic distances. BMC Bioinform. 10, 120 (2009)
Feijao, P., Meidanis, J.: SCJ: a breakpoint-like distance that simplifies several rearrangement problems. Trans. Comput. Biol. Bioinform. 8, 1318–1329 (2011)
Pe’er, I., Shamir, R.: Approximation algorithms for the median problem in the breakpoint model. In: Sankoff, D., Nadeau, J.H. (eds.) Comparative Genomics, pp. 225–241. Springer, Berlin (2000)
Pereira Zanetti, J.P., Biller, P., Meidanis, J.: Median approximations for genomes modeled as matrices. Bull. Math. Biol. 78(4), 786–814 (2016)
Delsarte, P.: Bilinear forms over a finite field, with applications to coding theory. J. Comb. Theory A 25(3), 226–241 (1978)
Yancopoulos, S., Attie, O., Friedberg, R.: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21, 3340–3346 (2005)
Feijao, P., Meidanis, J.: Extending the algebraic formalism for genome rearrangements to include linear chromosomes. Trans. Comput. Biol. Bioinform. 10, 819–831 (2012)
Roman, S.: Advanced Linear Algebra. Graduate Texts in Mathematics. Springer, New York (2008)
Arvind, V., Joglekar, P.S.: Algorithmic problems for metrics on permutation groups. In: Geffert, V., Karhumäki, J., Bertoni, A., Preneel, B., Návrat, P., Bieliková, M. (eds.) SOFSEM 2008. LNCS, vol. 4910, pp. 136–147. Springer, Heidelberg (2008). doi:10.1007/978-3-540-77566-9_12
Aspvall, B., Shiloach, Y.: A fast algorithm for solving systems of linear equatlons with two variables per equation. Linear Algebra Appl. 34, 117–124 (1980)
Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006)
LPsolve Team: lp_solve 5.5. http://lpsolve.sourceforge.net/. Accessed 22 July 2017
IBM: CPLEX Optimizer. http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/. Accessed 22 July 2017
Lugo, M.: The cycle structure of compositions of random involutions (2009). https://arxiv.org/abs/0911.3604
R Core Team: R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Csardi, G., Nepusz, T.: The igraph software package for complex network research. InterJournal Complex Syst., 1695 (2006). http://igraph.org
Bates, D., Maechler, M.: Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.2-10. http://CRAN.R-project.org/package=Matrix
Trefethen, L.N., Bau, D.: Numerical Linear Algebra, 1st edn. SIAM: Society for Industrial and Applied Mathematics, Philadelphia (1997)
Biller, P., Guéguen, L., Tannier, E.: Moments of genome evolution by double cut-and-join. BMC Bioinform. 16(Suppl 14), S7 (2015)
Acknowledgments
The authors would like to thank Cedric Chauve and Pedro Feijão for helpful discussions. LC would like to acknowledge financial support from NSERC, CIHR, Genome Canada and the Sloan Foundation. JM would like to acknowledge financial support from NSERC.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Chindelevitch, L., Meidanis, J. (2017). On the Rank-Distance Median of 3 Permutations. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science(), vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_14
Download citation
DOI: https://doi.org/10.1007/978-3-319-67979-2_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67978-5
Online ISBN: 978-3-319-67979-2
eBook Packages: Computer ScienceComputer Science (R0)