Pizza&Chili Corpus
Compressed Indexes and their Testbeds

The Italian mirror | The Chilean mirror

Succinct Suffix Array

Description

The Succinct Suffix Array (SSA) is a wavelet tree FM-index that gives to the wavelet tree the shape of the Huffman tree of the text.

The size of the index built on a string $ T[1,n]$ is bounded by $ n(H_0(T)+1)(1+o(1))$ bits, where $ H_0(T)$ is the 0-th order empirical entropy of $ T$.

Using the SSA, the counting of the occurrences of an arbitrary pattern $ P[1,p]$ as a substring of $ T$ takes on average $ O(p H_0)$ time.

Implementation

Veli Mäkinen, R. González

Build_options

free_text: The text will be freed immediately after using it.

samplerate=<number>: The index marks one text position every <number> entries. The default value is 64.

Paper

Mäkinen, V. and Navarro, G. New search algorithms and time/space tradeoffs for succinct suffix arrays. Technical Report C-2004-20 (April), University of Helsinki, Finland.

V. Mäkinen and G. Navarro. Succinct suffix arrays based on run-length encoding. Nordic Journal of Computing 12(1):40-66, 2005.

Downloads



Send Mail to Us | © P. Ferragina and G. Navarro, Last update: September, 2005.