Given a text string $S = \sb{s\sb1s\sb2s\sb3\... s\sb{n}},$ we want to preprocess S such that given a pattern $P = p\sb1p\sb2p\sb3\... p\sb{m},$ we can find $\{i\vert s\sb{i\... s\sb{i+m-1}} = P\}$ as efficiently as possible. Suffix trees are a data structure solution to this problem. Unfortunately, when n is large, the storage required by a suffix tree can be prohibitive. This thesis presents several related new representations for a close relative of the suffix tree, the PAT tree, that retain the functionality of suffix trees while requiring a fraction of the storage used by current methods.
Cited By
- Arroyuelo D, Campos D, Gómez-Brandón A, Navarro G, Rojas C and Vrgoč D Space & Time Efficient Leapfrog Triejoin Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), (1-9)
- Arroyuelo D, Bustos B, Gómez-Brandón A, Hogan A, Navarro G and Reutter J (2024). Worst-Case-Optimal Similarity Joins on Graph Databases, Proceedings of the ACM on Management of Data, 2:1, (1-26), Online publication date: 12-Mar-2024.
- Jiang S, Feng J, Wang C, Liu J, Xiong Z, Sha C, Zheng W, Liang J and Xiao Y (2023). EASC, Knowledge-Based Systems, 278:C, Online publication date: 25-Oct-2023.
- de Bernardo G, Gagie T, Ladra S, Navarro G and Seco D (2023). Faster compressed quadtrees, Journal of Computer and System Sciences, 131:C, (86-104), Online publication date: 1-Feb-2023.
- Chakraborty S, Grossi R, Sadakane K and Satti S (2023). Succinct representation for (non)deterministic finite automata, Journal of Computer and System Sciences, 131:C, (1-12), Online publication date: 1-Feb-2023.
- Fuentes-Sepúlveda J, Navarro G and Seco D (2023). Navigating planar topologies in near-optimal space and time, Computational Geometry: Theory and Applications, 109:C, Online publication date: 1-Feb-2023.
- Boffa A, Ferragina P and Vinciguerra G (2022). A Learned Approach to Design Compressed Rank/Select Data Structures, ACM Transactions on Algorithms, 18:3, (1-28), Online publication date: 31-Jul-2022.
- Li Z, Li J and Huo H (2022). Optimal in-place suffix sorting, Information and Computation, 285:PB, Online publication date: 1-May-2022.
- Fariña A, Gagie T, Grabowski S, Manzini G, Navarro G and Ordóñez A (2022). Efficient and compact representations of some non-canonical prefix-free codes, Theoretical Computer Science, 907:C, (11-25), Online publication date: 12-Mar-2022.
- Hoffmann M, Monaghan M and Reinert K PriSeT Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, (1-12)
- Arroyuelo D, Hogan A, Navarro G, Reutter J, Rojas-Ledesma J and Soto A Worst-Case Optimal Graph Joins in Almost No Space Proceedings of the 2021 International Conference on Management of Data, (102-114)
- Kärkkäinen J and Kempa D (2019). Better External Memory LCP Array Construction, ACM Journal of Experimental Algorithmics, 24, (1-27), Online publication date: 17-Dec-2019.
- Lasch R, Oukid I, Dementiev R, May N, Demirsoy S and Sattler K Fast & Strong Proceedings of the 15th International Workshop on Data Management on New Hardware, (1-10)
- Navarro G (2019). Document listing on repetitive collections with guaranteed performance, Theoretical Computer Science, 772:C, (58-72), Online publication date: 7-Jun-2019.
- Gagie T, He M and Navarro G (2019). Path queries on functions, Theoretical Computer Science, 770:C, (34-50), Online publication date: 24-May-2019.
- Kammer F, Kratsch D and Laudahn M (2019). Space-Efficient Biconnected Components and Recognition of Outerplanar Graphs, Algorithmica, 81:3, (1180-1204), Online publication date: 1-Mar-2019.
- Banerjee N, Chakraborty S, Raman V and Satti S (2018). Space Efficient Linear Time Algorithms for BFS, DFS and Applications, Theory of Computing Systems, 62:8, (1736-1762), Online publication date: 1-Nov-2018.
- Arz J and Fischer J (2018). Lempel---Ziv-78 Compressed String Dictionaries, Algorithmica, 80:7, (2012-2047), Online publication date: 1-Jul-2018.
- Fischer J, I T, Köppl D and Sadakane K (2018). Lempel---Ziv Factorization Powered by Space Efficient Suffix Trees, Algorithmica, 80:7, (2048-2081), Online publication date: 1-Jul-2018.
- Gog S, Konow R and Navarro G (2017). Practical Compact Indexes for Top-k Document Retrieval, ACM Journal of Experimental Algorithmics, 22, (1-37), Online publication date: 15-Dec-2017.
- Belazzougui D and Cunial F (2017). A Framework for Space-Efficient String Kernels, Algorithmica, 79:3, (857-883), Online publication date: 1-Nov-2017.
- Pibiri G and Venturini R (2017). Clustered Elias-Fano Indexes, ACM Transactions on Information Systems, 36:1, (1-33), Online publication date: 30-Aug-2017.
- Grossi R, Iacono J, Navarro G, Raman R and Satti S (2017). Asymptotically Optimal Encodings of Range Data Structures for Selection and Top-k Queries, ACM Transactions on Algorithms, 13:2, (1-31), Online publication date: 29-May-2017.
- (2017). On prefix normal words and prefix normal forms, Theoretical Computer Science, 659:C, (1-13), Online publication date: 10-Jan-2017.
- R. Brisaboa N, De Bernardo G, Konow R, Navarro G and Seco D (2016). Aggregated 2D range queries on clustered points, Information Systems, 60:C, (34-49), Online publication date: 1-Aug-2016.
- Gog S and Venturini R Fast and Compact Hamming Distance Index Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, (285-294)
- Cordova J and Navarro G Practical Dynamic Entropy-Compressed Bitvectors with Applications Proceedings of the 15th International Symposium on Experimental Algorithms - Volume 9685, (105-117)
- Brisaboa N, Cerdeira-Pena A, Fariña A and Navarro G A Compact RDF Store Using Suffix Arrays Proceedings of the 22nd International Symposium on String Processing and Information Retrieval - Volume 9309, (103-115)
- Belazzougui D and Navarro G (2015). Optimal Lower and Upper Bounds for Representing Sequences, ACM Transactions on Algorithms, 11:4, (1-21), Online publication date: 23-Jun-2015.
- Navarro G, Puglisi S and Valenzuela D (2015). General Document Retrieval in Compact Space, ACM Journal of Experimental Algorithmics, 19, (1-46), Online publication date: 3-Feb-2015.
- González R, Navarro G and Ferrada H (2015). Locally Compressed Suffix Arrays, ACM Journal of Experimental Algorithmics, 19, (1.1-1.30), Online publication date: 3-Feb-2015.
- Claude F, Navarro G and Ordóñez A (2015). The wavelet matrix, Information Systems, 47:C, (15-32), Online publication date: 1-Jan-2015.
- Claude F, Konow R and Navarro G Efficient Indexing and Representation of Web Access Logs Proceedings of the 21st International Symposium on String Processing and Information Retrieval - Volume 8799, (65-76)
- Navarro G and Ordóñez A Grammar Compressed Sequences with Rank/Select Support Proceedings of the 21st International Symposium on String Processing and Information Retrieval - Volume 8799, (31-44)
- Belazzougui D and Cunial F Indexed Matching Statistics and Shortest Unique Substrings Proceedings of the 21st International Symposium on String Processing and Information Retrieval - Volume 8799, (179-190)
- Belazzougui D and Navarro G (2014). Alphabet-Independent Compressed Text Indexing, ACM Transactions on Algorithms, 10:4, (1-19), Online publication date: 1-Aug-2014.
- Belazzougui D Linear time construction of compressed text indices in compact space Proceedings of the forty-sixth annual ACM symposium on Theory of computing, (148-193)
- Hon W, Shah R, Thankachan S and Vitter J (2014). Space-Efficient Frameworks for Top-k String Retrieval, Journal of the ACM, 61:2, (1-36), Online publication date: 1-Apr-2014.
- Navarro G (2014). Spaces, Trees, and Colors, ACM Computing Surveys, 46:4, (1-47), Online publication date: 1-Apr-2014.
- Gog S and Ohlebusch E (2013). Compressed suffix trees, ACM Journal of Experimental Algorithmics, 18, (2.1-2.31), Online publication date: 1-Dec-2013.
- Hsu B and Ottaviano G Space-efficient data structures for Top-k completion Proceedings of the 22nd international conference on World Wide Web, (583-594)
- Vigna S Quasi-succinct indices Proceedings of the sixth ACM international conference on Web search and data mining, (83-92)
- Brisaboa N, Ladra S and Navarro G (2013). DACs, Information Processing and Management: an International Journal, 49:1, (392-404), Online publication date: 1-Jan-2013.
- Barbay J, Fischer J and Navarro G (2012). LRM-Trees, Theoretical Computer Science, 459, (26-41), Online publication date: 1-Nov-2012.
- Takabatake Y, Tabei Y and Sakamoto H Variable-Length codes for space-efficient grammar-based compression Proceedings of the 19th international conference on String Processing and Information Retrieval, (398-410)
- Hernández C and Navarro G Compressed representation of web and social networks via dense subgraphs Proceedings of the 19th international conference on String Processing and Information Retrieval, (264-276)
- Claude F and Navarro G The wavelet matrix Proceedings of the 19th international conference on String Processing and Information Retrieval, (167-179)
- Belazzougui D and Navarro G New lower and upper bounds for representing sequences Proceedings of the 20th Annual European conference on Algorithms, (181-192)
- Navarro G and Providel E Fast, small, simple rank/select on bitmaps Proceedings of the 11th international conference on Experimental Algorithms, (295-306)
- Claude F, Navarro G, Peltola H, Salmela L and Tarhio J (2012). String matching with alphabet sampling, Journal of Discrete Algorithms, 11, (37-50), Online publication date: 1-Feb-2012.
- Lim H, Fan B, Andersen D and Kaminsky M SILT Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, (1-13)
- Claude F, Nicholson P and Seco D Space efficient wavelet tree construction Proceedings of the 18th international conference on String processing and information retrieval, (185-196)
- Kimura K, Koike A and Nakai K Seed-set construction by equi-entropy partitioning for efficient and sensitive short-read mapping Proceedings of the 11th international conference on Algorithms in bioinformatics, (151-162)
- Claude F and Navarro G (2011). Self-Indexed Grammar-Based Compression, Fundamenta Informaticae, 111:3, (313-337), Online publication date: 1-Aug-2011.
- Brodal G, Gfeller B, Jørgensen A and Sanders P (2011). Towards optimal range medians, Theoretical Computer Science, 412:24, (2588-2601), Online publication date: 1-May-2011.
- Culpepper J and Moffat A (2010). Efficient set intersection for inverted indexing, ACM Transactions on Information Systems, 29:1, (1-25), Online publication date: 1-Dec-2010.
- Fernández J, Martínez-Prieto M and Gutierrez C Compact representation of large RDF data sets for publishing and exchange Proceedings of the 9th international semantic web conference on The semantic web - Volume Part I, (193-208)
- Claude F and Navarro G (2010). Fast and Compact Web Graph Representations, ACM Transactions on the Web, 4:4, (1-31), Online publication date: 1-Sep-2010.
- Burcsi P, Cicalese F, Fici G and Lipták Z On table arrangements, scrabble freaks, and jumbled pattern matching Proceedings of the 5th international conference on Fun with algorithms, (89-101)
- Barbay J, Claude F and Navarro G Compact rich-functional binary relation representations Proceedings of the 9th Latin American conference on Theoretical Informatics, (170-183)
- Claude F and Navarro G Extended compact web graph representations Algorithms and Applications, (77-91)
- Välimäki N, Mäkinen V, Gerlach W and Dixit K (2010). Engineering a compressed suffix tree implementation, ACM Journal of Experimental Algorithmics, 14, (4.2-4.23), Online publication date: 1-Dec-2009.
- Fariòa A, Ladra S, Pedreira O and Places Á (2009). Rank and Select for Succinct Data Structures, Electronic Notes in Theoretical Computer Science (ENTCS), 236, (131-145), Online publication date: 1-Apr-2009.
- Farzan A and Munro J A Uniform Approach Towards Succinct Representation of Trees Proceedings of the 11th Scandinavian workshop on Algorithm Theory, (173-184)
- Lu H and Yeh C (2008). Balanced parentheses strike back, ACM Transactions on Algorithms, 4:3, (1-13), Online publication date: 1-Jun-2008.
- Vigna S Broadword implementation of rank/select queries Proceedings of the 7th international conference on Experimental algorithms, (154-168)
- Yamanaka K and Nakano S A compact encoding of plane triangulations with efficient query supports Proceedings of the 2nd international conference on Algorithms and computation, (120-131)
- Golynski A (2007). Optimal lower bounds for rank and select indexes, Theoretical Computer Science, 387:3, (348-359), Online publication date: 20-Nov-2007.
- Mäkinen V and Navarro G (2007). Rank and select revisited and extended, Theoretical Computer Science, 387:3, (332-347), Online publication date: 20-Nov-2007.
- Raman R, Raman V and Satti S (2007). Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets, ACM Transactions on Algorithms, 3:4, (43-es), Online publication date: 1-Nov-2007.
- Culpepper J and Moffat A Compact set representation for information retrieval Proceedings of the 14th international conference on String processing and information retrieval, (137-148)
- Na J and Park K (2007). Alphabet-independent linear-time construction of compressed suffix arrays using o(nlogn)-bit working space, Theoretical Computer Science, 385:1-3, (127-136), Online publication date: 1-Oct-2007.
- Lee S and Park K Dynamic rank-select structures with applications to run-length encoded texts Proceedings of the 18th annual conference on Combinatorial Pattern Matching, (95-106)
- Fredriksson K and Nikitin F Simple compression code supporting random access and fast string matching Proceedings of the 6th international conference on Experimental algorithms, (203-216)
- Yamanaka K and Nakano S A Compact Encoding of Rectangular Drawings with Efficient Query Supports Proceedings of the 3rd international conference on Algorithmic Aspects in Information and Management, (68-81)
- Ferragina P, Manzini G, Mäkinen V and Navarro G (2007). Compressed representations of sequences and full-text indexes, ACM Transactions on Algorithms, 3:2, (20-es), Online publication date: 1-May-2007.
- Navarro G and Mäkinen V (2007). Compressed full-text indexes, ACM Computing Surveys, 39:1, (2-es), Online publication date: 12-Apr-2007.
- Golynski A Optimal lower bounds for rank and select indexes Proceedings of the 33rd international conference on Automata, Languages and Programming - Volume Part I, (370-381)
- Mäkinen V and Navarro G Position-Restricted substring searching Proceedings of the 7th Latin American conference on Theoretical Informatics, (703-714)
- Na J Linear-Time construction of compressed suffix arrays using o(n log n)-bit working space for large alphabets Proceedings of the 16th annual conference on Combinatorial Pattern Matching, (57-67)
- Mäkinen V and Navarro G Succinct suffix arrays based on run-length encoding Proceedings of the 16th annual conference on Combinatorial Pattern Matching, (45-56)
- Kim D, Na J, Kim J and Park K Efficient implementation of rank and select functions for succinct representation Proceedings of the 4th international conference on Experimental and Efficient Algorithms, (315-327)
- Miltersen P Lower bounds on the size of selection and rank indexes Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, (11-12)
- Raman R, Raman V and Rao S Succinct indexable dictionaries with applications to encoding k-ary trees and multisets Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, (233-242)
- Chiang Y, Lin C and Lu H Orderly spanning trees with applications to graph encoding and graph drawing Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, (506-515)
- Grossi R and Vitter J Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract) Proceedings of the thirty-second annual ACM symposium on Theory of computing, (397-406)
Index Terms
- Compact pat trees
Recommendations
Integral Equations on Certain Compact Homogeneous Spaces
Let ${\bf T}_n = {{{\bf R}_n } / {{\bf Z}_n }}$, where ${\bf R}_n$ is n-dimensional Euclidean space and ${\bf Z}_n$ is the n-dimensional lattice group, be the n-dimensional torus. We denote elements of ${\bf T}_n$ by ${\bf \theta }$, ${\bf \varphi }$, ...
On the existence of solutions to compact dynamic optimization problems
We consider a system where, at each stagen ź {1, 2, ...}=N, an elementxn* is to be chosen out of a nonempty, closed, feasible regionAn in a compact semigroupX, so as to achieve a sequence $$\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} $$...
Majorization and the spectral radius of starlike trees
A starlike tree is a tree with exactly one vertex of degree greater than two. The spectral radius of a graph G, that is denoted by $$\lambda (G)$$?(G), is the largest eigenvalue of G. Let k and $$n_1,\ldots ,n_k$$n1,?,nk be some positive integers. Let $$...