Essays on the Binary Representations of the DNA Data
<p>(<b>A</b>) Results of maximum parsimony (MP hereinafter) analyses of the conventional plastid genomic DNA matrix of the bamboos (Arundinarieae, Poaceae, flowering plants) from [<a href="#B28-dna-05-00010" class="html-bibr">28</a>]. Final trees were rooted relative to <span class="html-italic">Dendrocalamus latiflorus</span> Munro [<a href="#B28-dna-05-00010" class="html-bibr">28</a>]. The cladogram represents the median consensus tree based on Robinson–Foulds (RF) distance (with the best score found = 8837) of 184 shortest output trees of length = 5019 (CI = 0.89, RI = 0.91). The number of taxa = 157. All constant characters from the original alignment are excluded from the analysis. The number of variable characters = 4304, number of parsimony-informative characters = 2003. * nodes received MP Jackknife (JK) support > 50% after 20,000 fast JK replicates; ! nodes recovered MP Bootstrap support in the analysis from [<a href="#B28-dna-05-00010" class="html-bibr">28</a>] (200 full heuristic replicates). (<b>B</b>) Results of MP of the binary representation of the conventional DNA matrix from A., re-coded following the proposed <span class="html-italic">1001</span> Method 1. Initial binary data were polarized before analysis relative to <span class="html-italic">D. latiflorus</span>, assumed as an outgroup [<a href="#B28-dna-05-00010" class="html-bibr">28</a>]. The cladogram represents the majority-rule consensus of 191 shortest output trees of length = 10,014 (CI = 0.88, RI = 0.89). The number of taxa = 157. The number of binary characters = 8783, number of parsimony-informative characters = 4088. * nodes received MP Jackknife (JK) support > 50% after 20,000 fast JK replicates. (<b>C</b>) Results of MP analyses of the binary representation of the conventional DNA matrix from A., re-coded following the proposed <span class="html-italic">1001</span> Method 2. Data polarized before analysis relative to <span class="html-italic">D. 
latiflorus</span>, assumed as an outgroup based on the previous results of [<a href="#B28-dna-05-00010" class="html-bibr">28</a>]. The cladogram represents the majority-rule consensus of 139 shortest output trees of length = 4993 (CI = 0.89, RI = 0.91). The number of taxa = 157. The number of binary characters = 4993, number of parsimony-informative characters = 2027. * nodes received MP Jackknife (JK) support > 50% after 20,000 fast JK replicates. All MP analyses were conducted using the program PAUPrat [<a href="#B29-dna-05-00010" class="html-bibr">29</a>,<a href="#B30-dna-05-00010" class="html-bibr">30</a>,<a href="#B31-dna-05-00010" class="html-bibr">31</a>] as implemented in CIPRES [<a href="#B32-dna-05-00010" class="html-bibr">32</a>] following 200 ratchet replicates with no more than 10 trees of length greater than or equal to 1 saved in each replicate, and the TBR branch swapping/MulTrees option in effect; -pct = 20%, all characters were weighted uniformly, and gaps were treated as “missing”. MP jackknifing [<a href="#B33-dna-05-00010" class="html-bibr">33</a>] was conducted using PAUP* version 4.a168 [<a href="#B31-dna-05-00010" class="html-bibr">31</a>] (PAUP* hereinafter) as implemented in CIPRES [<a href="#B32-dna-05-00010" class="html-bibr">32</a>]. The Robinson–Foulds consensus [<a href="#B14-dna-05-00010" class="html-bibr">14</a>,<a href="#B34-dna-05-00010" class="html-bibr">34</a>] was calculated using RFS version 2.0 [<a href="#B34-dna-05-00010" class="html-bibr">34</a>]. The majority-rule consensus was calculated in PAUP* [<a href="#B31-dna-05-00010" class="html-bibr">31</a>]. Branches with a minimum length of zero were collapsed. All gaps and ambiguities of the conventional DNA matrix (<b>A</b>) were recoded as missing data (“?”) before binary permutations. Roman numerals correspond to the “major lineages” of Arundinarieae [<a href="#B28-dna-05-00010" class="html-bibr">28</a>].</p>
Figure 2
<p>The results of two three-taxon statement analyses (3TA hereinafter) of Clades 1 and 2 (<a href="#dna-05-00010-f001" class="html-fig">Figure 1</a>). The DNA alignments have been polarized following <span class="html-italic">1001</span> Method 2 and subsequently established as binary three-taxon matrices using TAXODIUM version 1.2 [<a href="#B18-dna-05-00010" class="html-bibr">18</a>] (TAXODIUM hereinafter). Following the results of the previous analyses (<a href="#dna-05-00010-f001" class="html-fig">Figure 1</a>), <span class="html-italic">Indocalamus wilsonii</span> (Rendle) C.S.Chao and C.D.Chu (Clade 1) and <span class="html-italic">Bergbambos tessellata</span> (Nees) Stapleton (Clade 2) were assumed to be outgroup taxa before Method 2 was applied to the DNA characters. (<b>A</b>) The results of the first 3TA (Clade 1). Majority-rule consensus of 193 shortest output trees of length = 527,046 (CI = 0.92, RI = 0.91). The number of taxa in the 487,168 character–3TA matrix is 72. All 487,168 three-taxon statements (3TSs) are parsimony-informative and weighted uniformly. (<b>B</b>) The results of the second 3TA (Clade 2). Majority-rule consensus of 201 shortest output trees of length = 187,857 (CI = 0.86, RI = 0.83). The number of taxa in the 161,027 character–3TA matrix is 80. All 161,027 3TSs are parsimony-informative and weighted uniformly. For the meaning of Roman numerals and the details of the MP analyses, see the legend of <a href="#dna-05-00010-f001" class="html-fig">Figure 1</a>.</p>
Figure 3
<p>(<b>A</b>) The simplified phylogeny of flowering plants and outgroups resulting from the MP analysis of the 38,553 bp cpDNA alignment from [<a href="#B35-dna-05-00010" class="html-bibr">35</a>]. The general strategy of the analysis is described in [<a href="#B18-dna-05-00010" class="html-bibr">18</a>]. The heuristic search for the most parsimonious tree was performed with the implied weights [<a href="#B36-dna-05-00010" class="html-bibr">36</a>] included in the search procedure, and the value of the <span class="html-italic">k</span>-function was set to three. The phylogeny is established as a single phylogram. Goloboff fit = −10,023.39940, with the actual length of the tree equal to 48,186, CI = 0.55, RI = 0.61. The number of informative characters is equal to 13,328. (<b>B</b>) The most parsimonious hierarchy of patterns obtained from the MP analysis following the same strategy as in (<b>A</b>). The analysis was based on the polarized binary matrix recoded from the conventional cpDNA alignment (<b>A</b>) following <span class="html-italic">1001</span> Method 1, with <span class="html-italic">Cryptomeria</span> (Cupressaceae Bartlett, gymnosperms) assumed as the best outgroup. The hierarchy of patterns is established as a single cladogram. Goloboff fit = −24,165.80162, with the actual length of the tree equal to 102,724, CI = 0.49, RI = 0.60. The number of informative characters equals 32,141. (<b>C</b>) The most parsimonious hierarchy of patterns resulted from the MP analysis, which followed the same strategy as in (<b>A</b>) (see above) but without implied weights [<a href="#B36-dna-05-00010" class="html-bibr">36</a>] included in the search procedure. The analysis was based on the polarized binary matrix recoded from the conventional cpDNA alignment (<b>A</b>) following <span class="html-italic">1001</span> Method 2, assuming <span class="html-italic">Cryptomeria</span> as the best outgroup. 
The hierarchy of patterns is established as a single cladogram of length 48,552, CI = 0.56, RI = 0.62. The number of informative characters equals 15,653. (<b>D</b>) The single most parsimonious hierarchy of patterns resulted from the MP analysis, which followed the same strategy as in (<b>A</b>) (see above) but without implied weights [<a href="#B36-dna-05-00010" class="html-bibr">36</a>] included in the search procedure. The analysis was based on the three-taxon statement matrix with 1,652,888 fractionally weighted [<a href="#B4-dna-05-00010" class="html-bibr">4</a>,<a href="#B12-dna-05-00010" class="html-bibr">12</a>,<a href="#B14-dna-05-00010" class="html-bibr">14</a>,<a href="#B18-dna-05-00010" class="html-bibr">18</a>] three-taxon statements calculated by TAXODIUM [<a href="#B18-dna-05-00010" class="html-bibr">18</a>]. This matrix is derived from the polarized binary representation (<span class="html-italic">1001</span> Method 2) of the 28,196 bp largest clique, estimated by PHYLIP version 3.695 [<a href="#B19-dna-05-00010" class="html-bibr">19</a>] based on a 38,553 bp cpDNA alignment (<b>A</b>). <span class="html-italic">Cryptomeria</span> is assumed to be the best outgroup. The hierarchy of patterns is established as a cladogram of length 230,181.7318, CI = 0.99, RI = 0.99. The number of informative characters (three-taxon statements) equals 1,652,888.</p>
Figure 4
<p>(<b>A</b>) An unrooted simplified molecular phylogeny of <span class="html-italic">Ceratophyllum</span> (Ceratophyllaceae A. Gray, flowering plants) [<a href="#B37-dna-05-00010" class="html-bibr">37</a>], showing the ambiguous placement of <span class="html-italic">C. echinatum</span> [<a href="#B37-dna-05-00010" class="html-bibr">37</a>]. (<b>B</b>) A summary of the cladistic analyses [<a href="#B38-dna-05-00010" class="html-bibr">38</a>], demonstrating that <span class="html-italic">C. echinatum</span> is a sister group to the narrowly defined genus <span class="html-italic">Ceratophyllum</span>. All analyses (<b>B</b>) were based on the binary ‘presence–absence’ representation of the molecular data from [<a href="#B37-dna-05-00010" class="html-bibr">37</a>], with an artificial all-zero outgroup added. As a result of the cladistic analyses of the binary recoded DNA sequence data [<a href="#B37-dna-05-00010" class="html-bibr">37</a>,<a href="#B38-dna-05-00010" class="html-bibr">38</a>], <span class="html-italic">C. echinatum</span> was defined as a sister group of the narrowly circumscribed genus <span class="html-italic">Ceratophyllum</span> [<a href="#B38-dna-05-00010" class="html-bibr">38</a>] and transferred to the newly established genus <span class="html-italic">Fassettia</span> based on the obtained phylogenetic placement [<a href="#B38-dna-05-00010" class="html-bibr">38</a>]. See [<a href="#B38-dna-05-00010" class="html-bibr">38</a>] for details of the cladistic analyses and taxonomic treatment. Clade “<span class="html-italic">Ceratophyllum</span>” is marked with an asterisk (*). This figure also shows the ‘presence–absence’ binary coding (<b>B</b>) of the DNA sequence data (<b>A</b>), as implemented in <span class="html-italic">1001</span>.</p>
Figure 5
<p>Leibniz’s original four-digit binary representation of Arabic numbers one, two, four, and eight (indicated by exclamation marks, added by us). In the third column of this table, Leibniz himself linked this representation with the combination of solid and dotted lines, each corresponding to one of the four <span class="html-italic">T’ai Hsüan Ching</span> tetragrams (indicated by exclamation marks, added by us), namely the tetragrams <span class="html-italic">Penetration</span>, <span class="html-italic">Legion</span>, <span class="html-italic">Fullness</span>, and <span class="html-italic">Law</span> (<span class="html-italic">Model</span>) [<a href="#B73-dna-05-00010" class="html-bibr">73</a>]. Reproduced from Leibniz’s manuscript <span class="html-italic">De Dyadicis</span>, as interpreted and translated by Yakovlev [<a href="#B72-dna-05-00010" class="html-bibr">72</a>]; see pp. 195, 201, and 202.</p>
Abstract
1. Introduction
2. Binary Representation of DNA Sequence Data and Molecular Phylogenetics
- A = 0010
- T = 0001
- C = 1000
- G = 0100
- (a) Non-polarized or ‘presence–absence’ binary matrix with no artificial all-zero outgroup added, with and without invariant characters (both PHYLIP (phy) and comma-separated values (CSV) files). Invariant (and non-informative) characters are, strictly speaking, not characters from the cladistic standpoint [2]. However, retaining them in the output may be necessary for later statistical analyses of the recoded DNA alignments; the same applies to the second method of 1001 (see below). Before conducting phylogenetic analyses, we recommend removing all non-informative characters from the 1001 output files using available software [13,20,31].
- (b) Binary matrices that result from the polarization of the ‘presence–absence’ binary matrix relative to a real taxon (assumed outgroup), with and without invariant characters (both phy and CSV files).
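The coding listed above and the two kinds of output matrices can be illustrated with a short Python sketch. It is not the 1001 program itself: the function names `encode_alignment`, `polarize`, and `drop_invariant` and the toy alignment are hypothetical, and the polarization rule used here (flip every binary column in which the assumed outgroup scores 1, so that the outgroup row becomes all-zero) is an assumption made for the example, since the exact rules of Methods 1 and 2 are not spelled out in this section. Gaps and ambiguity codes are written as "?" in all four positions, in line with their treatment as missing data in the figure legends.

```python
# Four-digit codes as listed in this section.
CODES = {"A": "0010", "T": "0001", "C": "1000", "G": "0100"}
MISSING = "????"  # gaps and ambiguity codes are treated as missing data


def encode_alignment(alignment):
    """Return the non-polarized 'presence-absence' matrix (taxon -> binary string)."""
    return {
        taxon: "".join(CODES.get(base, MISSING) for base in seq.upper())
        for taxon, seq in alignment.items()
    }


def polarize(matrix, outgroup):
    """Flip each column where the outgroup scores 1 (assumed rule, see lead-in)."""
    ref = matrix[outgroup]
    flip = {"0": "1", "1": "0", "?": "?"}
    return {
        taxon: "".join(
            flip[state] if ref[i] == "1" else state
            for i, state in enumerate(row)
        )
        for taxon, row in matrix.items()
    }


def drop_invariant(matrix):
    """Remove columns in which all scored (non-'?') states are identical."""
    taxa = list(matrix)
    n_cols = len(matrix[taxa[0]])
    keep = [
        i for i in range(n_cols)
        if len({matrix[t][i] for t in taxa} - {"?"}) > 1
    ]
    return {t: "".join(matrix[t][i] for i in keep) for t in taxa}


# Hypothetical three-taxon, two-site alignment; "out" plays the assumed outgroup.
toy = {"out": "AC", "t1": "AG", "t2": "TG"}
binary = encode_alignment(toy)        # output (a), invariant columns retained
polarized = polarize(binary, "out")   # output (b), invariant columns retained
reduced = drop_invariant(polarized)   # output (b), invariant columns removed
```

Keeping `drop_invariant` as a separate step mirrors the distinction drawn above between files saved with and without invariant characters.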
‘Application of absence/presence coding has yet to be considered in molecular systematics, and no body of opinion considers base substitution as anything other than a special form of character state transformation.’ ([14], p. 36).
3. Binary Representation of the DNA Sequence Data, Leibniz, and Religion
- A = 0001 = 1
- T = 0010 = 2
- C = 0100 = 4
- G = 1000 = 8
- N1 = 0001 = 2⁰
- N2 = 0010 = 2¹
- N3 = 0100 = 2²
- N4 = 1000 = 2³
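The correspondence between each four-digit code and a power of two can be checked directly. The snippet below is a small sketch; the dictionary name `LEIBNIZ_CODES` is a hypothetical label chosen for the example, and the codes themselves are those listed in this section.

```python
# Codes as listed in this section; each contains a single 1.
LEIBNIZ_CODES = {"A": "0001", "T": "0010", "C": "0100", "G": "1000"}

# int(code, 2) reads the string as a base-2 numeral.
values = {base: int(code, 2) for base, code in LEIBNIZ_CODES.items()}

# The position of the single 1, counted from the right, is the exponent of 2.
exponents = {base: code[::-1].index("1") for base, code in LEIBNIZ_CODES.items()}

for base in LEIBNIZ_CODES:
    assert values[base] == 2 ** exponents[base]
```

Because every code has exactly one non-zero digit, the four nucleotides map onto the first four powers of two, which is the reading given in the list above.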
[Leibniz] “… enduring mathematical contributions was his invention of the binary number system the basis for today’s world of digital computing and communications. In this system, which uses only two digits—0 and 1—every position starting from the right represents a successive power of 2…”.
“Leibniz’s faith in the illuminatory power of binary is nowhere more evident than in his use of it in philosophical theology. In 1694/5, when sketching notes on one of Weigel’s books, Leibniz devised an analogy between the representation of all numbers by 1 and 0 and the theologically orthodox idea of creation ex nihilo, that is, the creation of all things out of nothing by God. Treating God as the analogue of 1 and nothingness as the analogue of 0, Leibniz took the origination of all numbers from 1 and 0 in binary as a reflection of the doctrine that all created things originate from God and nothingness. This analogy would play a pivotal role in Leibniz’s willingness to inform correspondents about binary from 1696 onwards (and … played a pivotal role in his decision to publish an essay on binary in 1705)”. ([70], p. 1).
4. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Zuntini, A.R.; Carruthers, T.; Maurin, O.; Bailey, P.C.; Leempoel, K.; Brewer, G.E.; Epitawalage, N.; Françoso, E.; Gallego-Paramo, B.; McGinnie, C.; et al. Phylogenomics and the rise of the Angiosperms. Nature 2024, 629, 843–850.
- Mavrodiev, E.V.; Madorsky, A. On pattern–cladistic analyses based on complete plastid genome sequences. Acta Biotheor. 2023, 71, 22.
- Hennig, W. Phylogenetic Systematics; University of Illinois Press: Urbana, IL, USA, 1966.
- Williams, D.M.; Ebach, M.C. Foundations of Systematics and Biogeography; Springer: New York, NY, USA, 2008.
- Farris, J.S. The logical basis of phylogenetic analysis. In Advances in Cladistics, 2. Proceedings of the 2nd Meeting of the Willi Hennig Society; Platnick, N.I., Funk, V., Eds.; Columbia University Press: New York, NY, USA, 1983; pp. 7–36.
- Felsenstein, J. Inferring Phylogenies; Sinauer Associates Inc.: Sunderland, MA, USA, 2004.
- Rannala, B.; Yang, Z.H. Probability distribution of molecular evolutionary trees: A new method of phylogenetic inference. J. Mol. Evol. 1996, 43, 304–311.
- Nelson, G.J.; Platnick, N. Systematics and Biogeography: Cladistics and Vicariance; Columbia University Press: New York, NY, USA, 1981.
- Platnick, N.I. Philosophy and the transformation of cladistics revisited. Cladistics 1985, 1, 87–94.
- Wägele, J.W. Hennig’s phylogenetic systematics brought up to date. In Milestones in Systematics; Williams, D.M., Forey, P.L., Eds.; CRC Press: Boca Raton, FL, USA, 2004; pp. 101–125.
- Wägele, J.W. Foundations of Phylogenetic Systematics; Pfeil Verlag: München, Germany, 2005.
- Williams, D.M.; Ebach, M.C. Cladistics. A Guide to Biological Classification, 3rd ed.; Systematics Association Special Volume Series 88; Cambridge University Press: Cambridge, UK, 2020.
- Swofford, D.L.; Begle, D.P. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 3.1. User’s Manual; Laboratory of Molecular Systematics, MRC 534, MSC, Smithsonian Institution: Washington, DC, USA, 1993.
- Kitching, I.J.; Forey, P.L.; Humphries, C.; Williams, D. Cladistics: The Theory and Practice of Parsimony Analysis; Oxford University Press: Oxford, UK, 1998.
- Ebach, M.C.; Williams, D.M.; Vanderlaan, T.A. Implementation as theory, hierarchy as transformation, homology as synapomorphy. Zootaxa 2013, 3641, 587–594.
- Wiley, E.O.; Lieberman, B.S. Phylogenetics: The Theory and Practice of Phylogenetic Systematics, 2nd ed.; Wiley-Blackwell: Hoboken, NJ, USA, 2011.
- Williams, D.M.; Siebert, D.J. Characters, homology and three–item analysis. In Homology and Systematics: Coding Characters for Phylogenetic Analysis; Scotland, R.W., Pennington, R.T., Eds.; Systematics Association Special Volumes; Chapman and Hall: London, UK; New York, NY, USA, 2000; Volume 58, pp. 183–208.
- Mavrodiev, E.V.; Madorsky, A. TAXODIUM Version 1.0: A simple way to generate uniform and fractionally weighted three–item matrices from various kinds of biological data. PLoS ONE 2012, 7, e48813.
- Felsenstein, J. PHYLIP—Phylogeny Inference Package (Version 3.2). Cladistics 1989, 5, 164–166.
- Maddison, W.P.; Maddison, D.R. Mesquite: A Modular System for Evolutionary Analysis Version 3.81. Available online: https://www.mesquiteproject.org/ (accessed on 12 October 2024).
- Nelson, G.J. Ontogeny, phylogeny, paleontology, and Biogenetic Law. Syst. Zool. 1978, 27, 324–345.
- Nixon, K.C.; Carpenter, J.M. On outgroups. Cladistics 1993, 9, 413–426.
- de Pinna, M.C.C. Ontogeny, rooting, and polarity. In Models in Phylogeny Reconstruction; Scotland, R.W., Siebert, D.J., Williams, D.M., Eds.; Systematics Association Special Volume Series; Clarendon Press: Oxford, UK, 1994; Volume 52, pp. 157–172.
- Bryant, H.N. Character polarity and the rooting of cladograms. In The Character Concept in Evolutionary Biology; Wagner, G.P., Ed.; Academic Press: San Diego, CA, USA, 2001; pp. 319–342.
- Wiley, E.O. The phylogeny and biogeography of fossil and recent gars (Actinopterygii: Lepisosteidae); Miscellaneous Publication; University of Kansas, Museum of Natural History: Lawrence, KS, USA, 1976; Volume 64, pp. 1–111.
- Platnick, N.I.; Gertsch, W.J. The Suborders of Spiders: A Cladistic Analysis (Arachnida, Araneae); American Museum of Natural History: New York, NY, USA, 1976; Number 2607; pp. 1–15.
- Mavrodiev, E.V.; Dell, C.; Schroder, L. A laid-back trip through the Hennigian Forests. PeerJ 2017, 5, e3578.
- Ma, P.-F.; Zhang, Y.-X.; Zeng, C.-X.; Guo, Z.-H.; Li, D.-Z. Chloroplast phylogenomic analyses resolve deep–level relationships of an intractable bamboo tribe Arundinarieae (Poaceae). Syst. Biol. 2014, 63, 933–950.
- Nixon, K.C. The Parsimony Ratchet, a new method for rapid parsimony analysis. Cladistics 1999, 15, 407–414.
- Sikes, D.S.; Lewis, P.O. PAUPRat: PAUP* Implementation of the Parsimony Ratchet. Beta Software, Version 1; Distributed by the Authors; Department of Ecology and Evolutionary Biology, University of Connecticut: Storrs, CT, USA, 2001.
- Swofford, D.L. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods) Version 4.0b10; Sinauer Associates: Sunderland, MA, USA, 2002.
- Miller, M.A.; Pfeiffer, W.; Schwartz, T. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In Proceedings of the Gateway Computing Environments Workshop (GCE), New Orleans, LA, USA, 14 November 2010; Saltz, J., Ed.; IEEE: New Orleans, LA, USA, 2010.
- Farris, J.S.; Albert, V.A.; Källersjö, M.; Lipscomb, D.; Kluge, A.G. Parsimony jackknifing outperforms neighbor-joining. Cladistics 1996, 12, 99–124.
- Bansal, M.S.; Burleigh, J.G.; Eulenstein, O.; Fernandez-Baca, D. Robinson–Foulds supertrees. Algorithms Mol. Biol. 2010, 5, 18.
- Goremykin, V.V.; Nikiforova, S.V.; Biggs, P.J.; Zhong, B.; Delange, P.; Martin, W.; Woetzel, S.; Atherton, R.A.; Mclenachan, P.A.; Lockhart, P.J. The evolutionary root of flowering plants. Syst. Biol. 2013, 62, 50–61.
- Goloboff, P.A. Estimating character weights during tree search. Cladistics 1993, 9, 83–91.
- Szalontai, B.; Stranczinger, S.; Mesterhazy, A.; Scribailo, R.W.; Les, D.H.; Efremov, A.N.; Jacono, C.C.; Kipriyanova, L.M.; Kaushik, K.; Laktionov, A.P.; et al. Molecular phylogenetic analysis of Ceratophyllum L. taxa: A new perspective. Bot. J. Linn. Soc. 2018, 188, 161–172.
- Mavrodiev, E.V.; Williams, D.M.; Ebach, M.C.; Mavrodieva, A.E. Fassettia, a new North American genus of family Ceratophyllaceae: Evidence based on cladistic analyses of current molecular data of Ceratophyllum. Aust. Syst. Bot. 2021, 34, 431–437.
- Bernaola-Galvan, P.; Carpena, P.; Roman-Roldan, R.; Oliver, J.L. Study of statistical correlations in DNA sequences. Gene 2002, 300, 105–115.
- Mendizabal-Ruiz, G.; Román-Godínez, I.; Torres-Ramos, S.; Salido-Ruiz, R.A.; Morales, J.A. On DNA numerical representations for genomic similarity computation. PLoS ONE 2017, 12, e0173288.
- Zhou, Y.; Zeng, P.; Li, Y.H.; Zhang, Z.; Cui, Q. SRAMP: Prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Res. 2016, 44, e91.
- Chen, Z.; Zhao, P.; Li, F.; Leier, A.; Marquez-Lago, T.T.; Wang, Y.; Webb, G.I.; Smith, A.I.; Daly, R.J.; Chou, K.C.; et al. iFeature: A python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics 2018, 34, 2499–2502.
- Felsenstein, J.; Sawyer, S.; Kochin, R. An efficient method for matching nucleic acid sequences. Nucleic Acids Res. 1982, 10, 133–139.
- Demeler, B.; Zhou, G.W. Neural network optimization for Escherichia coli promoter prediction. Nucleic Acids Res. 1991, 19, 1593–1599.
- Pleijel, F. On character coding for phylogeny reconstruction. Cladistics 1995, 11, 309–315.
- Williams, D.M.; Ebach, M.C. The data matrix. Geodiversitas 2006, 28, 409–420.
- Nelson, G.J.; Platnick, N.I. Three–taxon statements—A more precise use of parsimony? Cladistics 1991, 7, 351–366.
- Goloboff, P.A.; Farris, J.S.; Nixon, K.C. TNT, a free program for phylogenetic analysis. Cladistics 2008, 24, 774–786.
- Nixon, K.C.; Carpenter, J.M. On homology. Cladistics 2012, 28, 160–169.
- Farris, J.S. Methods for computing Wagner trees. Syst. Zool. 1970, 19, 83–92.
- Farris, J.S. Estimating phylogenetic trees from distance matrices. Am. Nat. 1972, 106, 645–668.
- Farris, J.S. Outgroups and parsimony. Syst. Zool. 1982, 31, 328–334.
- Kluge, A.G. Phylogenetic relationships in the lizard family Pygopodidae: An evaluation of theory, methods and data. Miscellaneous Publs. Mus. Zool. Univ. Mich. 1976, 152, 1–72.
- Meacham, C.A. The role of hypothesized direction of characters in the estimation of evolutionary history. Taxon 1984, 33, 26–38.
- Meacham, C.A. Polarity assessment in phylogenetic systematics—More about directed characters—A reply. Taxon 1986, 35, 538–540.
- Maddison, W.P.; Donoghue, M.J.; Maddison, D.R. Outgroup analysis and parsimony. Syst. Zool. 1984, 33, 83–103.
- Lyons-Weiler, J.; Hoelzer, G.A.; Tausch, R.J. Optimal outgroup analysis. Biol. J. Linn. Soc. 1998, 64, 493–511.
- Arnold, E.N. Systematics and adaptive radiation of equatorial African lizards assigned to the genera Adolfus, Bedriagaia, Gastropholis, Holaspis, and Lacerta (Reptilia, Lacertidae). J. Nat. Hist. 1989, 23, 525–555.
- Watrous, L.E.; Wheeler, Q.D. The out-group comparison method of character analysis. Syst. Zool. 1981, 30, 1–11.
- Donoghue, M.J.; Maddison, W.P. Polarity assessment in phylogenetic systematics: A response to Meacham. Taxon 1986, 35, 534–538.
- Platnick, N.I.; Humphries, C.J.; Nelson, G.; Williams, D.M. Is Farris optimization perfect?: Three–taxon statements and multiple branching. Cladistics 1996, 12, 243–252.
- Wilkinson, M. Common cladistic information and its consensus representation: Reduced Adams and reduced cladistic consensus trees and profiles. Syst. Biol. 1994, 43, 343–368.
- Platnick, N.I. Character optimization and weighting—Differences between the standard and three–taxon approaches to phylogenetic inference. Cladistics 1993, 9, 267–272.
- Rieppel, O.; Williams, D.M.; Ebach, M.C. Adolf Naef (1883–1949): On foundational concepts and principles of systematic morphology. J. Hist. Biol. 2013, 46, 445–510.
- Nelson, G.J. Outline of a theory of comparative biology. Syst. Zool. 1970, 19, 373–384.
- Chen, Z.; Zhao, P.; Li, F.; Marquez-Lago, T.T.; Leier, A.; Revote, J.; Zhu, Y.; Powell, D.R.; Akutsu, T.; Webb, G.I.; et al. iLearn: An integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data. Brief. Bioinform. 2020, 21, 1047–1057.
- Mavrodiev, E.V.; Williams, D.M.; Ebach, M.C. On the typology of relations. Evol. Biol. 2019, 46, 71–89.
- Baum, B.R. Combining trees as a way of combining data sets for phylogenetic inference and the desirability of combining gene trees. Taxon 1992, 41, 3–10.
- Ragan, M.A. Matrix representation in reconstructing phylogenetic relationships among the eukaryotes. BioSystems 1992, 28, 47–55.
- Strickland, L. Leibniz on number systems. In Handbook of the History and Philosophy of Mathematical Practice; Sriraman, B., Ed.; Springer: Cham, Switzerland, 2024; pp. 167–197.
- Strickland, L.; Lewis, H.R. Leibniz on Binary: The Invention of Computer Arithmetic; MIT Press: Boston, MA, USA, 2022.
- Yakovlev, V.M. Leibniz G.W.: Letters and Essays on Chinese Philosophy and the Binary System of Calculation (Preface, Translations, and Notes); Russian Academy of Sciences, Institute of Philosophy: Moscow, Russia, 2005.
- Nylan, M. The Canon of Supreme Mystery, by Yang Hsiung. A Translation with Commentary of the T’ai Hsuan Ching; State University of New York Press: Albany, NY, USA, 1993.
- Chen, W.; Liao, B.; Xiang, X.; Zhu, W. An improved binary representation of DNA sequences and its applications. MATCH Commun. Math. Comput. Chem. 2009, 61, 767–780.
- Li, T.; Li, M.; Wu, Y.; Li, Y. Visualization methods for DNA sequences: A review and prospects. Biomolecules 2024, 14, 1447.
- Swetz, F.J. Leibniz, the Yijing, and the religious conversion of the Chinese. Math. Mag. 2003, 76, 276–291.
- Pauli, W. The influence of archetypal ideas on the scientific theories of Kepler. In The Interpretation of Nature and the Psyche; Jung, C.G., Pauli, W., Eds.; Pantheon Books: New York, NY, USA, 1955; pp. 147–240.
- Ryan, J.A. Leibniz’ Binary system and Shao Yong’s “Yijing”. Philos. East West 1996, 46, 59–90.
- Felsenstein, J. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 1981, 17, 368–376.
- Katoh, K.; Misawa, K.; Kuma, K.I.; Miyata, T. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002, 30, 3059–3066.
- Katoh, K.; Rozewicki, J.; Yamada, K.D. MAFFT online service: Multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 2019, 20, 1160–1166.
- Jung, C.G. Approaching the unconscious. In Man and His Symbols; Jung, C.G., von Franz, M.L., Eds.; Aldus: London, UK, 1964; pp. 18–103.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mavrodiev, E.V.; Mavrodiev, N.E. Essays on the Binary Representations of the DNA Data. DNA 2025, 5, 10. https://doi.org/10.3390/dna5010010