Compressed Compact Suffix Array |
Description The Compressed Compact Suffix Array (CCSA) is a simple self-index using essentially the structures of the Compact Suffix Array (CSA). The CSA was aimed at explicitly exploiting the self-repetitions of the suffix array in order to store it in a more compact form. The size of the index built on a string is bounded by bits, where is the -th order empirical entropy of . The above bound holds simultaneously for all with . And is an arbitrary constant. Using the CCSA, the counting of the occurrences of an arbitrary pattern as a substring of takes time. Locating each pattern occurrence takes time. Reporting a text substring of length takes time. Implementation Veli Mäkinen, Rodrigo GonzálezBuild_options free_text: The text will be freed immediately after using it. samplerate=<number>: The index marks one text position every <number> entries. The default value is 64. Paper V. Mäkinen and G. Navarro. Compressed compact suffix arrays. In Proc. 15th Annual Symposium on Combinatorial Pattern Matching (CPM), LNCS v. 3109 (2004), pp. 420-433. Downloads |