Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Suffix arrays: a new method for on-line string searches

Published: 01 October 1993 Publication History

Abstract

No abstract available.

Cited By

View all
  • (2024)KaMPIng: Flexible and (Near) Zero-Overhead C++ Bindings for MPIProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.1109/SC41406.2024.00050(1-21)Online publication date: 17-Nov-2024
  • (2024)Linear time online algorithms for constructing linear-size suffix trieTheoretical Computer Science10.1016/j.tcs.2024.1147651015:COnline publication date: 1-Nov-2024
  • (2024)Elastic founder graphs improved and enhancedTheoretical Computer Science10.1016/j.tcs.2023.114269982:COnline publication date: 8-Jan-2024
  • Show More Cited By

Recommendations

Reviews

John Sadowsky

The authors present algorithms for constructing and querying a data structure, the suffix array, which was introduced in their earlier paper [1], for use in online searches for strings in a large block of text. The suffix array is closely related to the suffix tree data structure. In both cases, the suffixes of the text are organized into a data structure, either as a search tree or as an array with auxiliary arrays to facilitate searching and comparison tasks. (If the text is A=a 0 a 1 a 2 &ldots;a N-1 , then the suffixes of A are of the form s k =a k a k+1 &ldots;a N-1 .) The primary result of this paper is that the suffix array uses less space for storage than does a suffix tree, and has complexity, in search and construction time and in storage required, that is independent of the size of the underlying alphabet, although these improvements are at the cost of an increase in initial data structure construction time. In particular, a suffix tree for a text A <__?__Pub Caret1>of length N over an alphabet of length S can be built in time O N log S using O N space and can answer queries of the form “Is W (of length P ) a substring of A __?__” in time O P log S . With more complicated auxiliary data, these complexities can be reduced to O N for the construction time and O P for the query, but at an increase in storage complexity to O NS . On the other hand, the suffix array can achieve query complexity of O P+ log N (a potential improvement for large alphabets) at the cost of O N log N for construction. The storage required is 3N integers which, although the same order of magnitude as the suffix tree, in practice amounts to three to five times less storage. The paper is well written, with several innovative algorithms presented so they can be implemented easily. With the aid of a survey paper, such as Apostolico [2] (cited in the bibliography of the paper), a researcher or practitioner could understand and make good use of this work.

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image SIAM Journal on Computing
SIAM Journal on Computing  Volume 22, Issue 5
Oct. 1993
228 pages
ISSN:0097-5397
  • Editor:
  • Z. Galil
Issue’s Table of Contents

Publisher

Society for Industrial and Applied Mathematics

United States

Publication History

Published: 01 October 1993

Author Tags

  1. inverted indices
  2. pattern matching
  3. string matching
  4. string searching
  5. suffix trees
  6. text indexing

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 30 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)KaMPIng: Flexible and (Near) Zero-Overhead C++ Bindings for MPIProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.1109/SC41406.2024.00050(1-21)Online publication date: 17-Nov-2024
  • (2024)Linear time online algorithms for constructing linear-size suffix trieTheoretical Computer Science10.1016/j.tcs.2024.1147651015:COnline publication date: 1-Nov-2024
  • (2024)Elastic founder graphs improved and enhancedTheoretical Computer Science10.1016/j.tcs.2023.114269982:COnline publication date: 8-Jan-2024
  • (2024) r-indexing the eBWTInformation and Computation10.1016/j.ic.2024.105155298:COnline publication date: 1-Jun-2024
  • (2024)Constructing and indexing the bijective and extended Burrows–Wheeler transformInformation and Computation10.1016/j.ic.2024.105153297:COnline publication date: 1-Mar-2024
  • (2024)Accelerating spliced alignment of long RNA sequencing reads using parallel maximal exact match retrievalComputers in Biology and Medicine10.1016/j.compbiomed.2024.108542175:COnline publication date: 18-Jul-2024
  • (2024)Compressed and queryable self-indexes for RDF archivesKnowledge and Information Systems10.1007/s10115-023-01967-766:1(381-417)Online publication date: 1-Jan-2024
  • (2024)LZ78 Substring Compression with CDAWGsString Processing and Information Retrieval10.1007/978-3-031-72200-4_22(289-305)Online publication date: 23-Sep-2024
  • (2024)All-Pairs Suffix-Prefix on Dynamic Set of StringsString Processing and Information Retrieval10.1007/978-3-031-72200-4_15(192-203)Online publication date: 23-Sep-2024
  • (2023)The RefinedWeb dataset for falcon LLMProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669586(79155-79172)Online publication date: 10-Dec-2023
  • Show More Cited By

View Options

View options

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media