The Run-Length FM-Index (RLFM) is an improvement over the Wavelet Tree FM-index, which exploits the equal-letter runs of the BWT to achieve bits of space, where is the -th order empirical entropy of . The above bound holds simultaneously for all with .
Using the RLFM, the counting of the occurrences of an arbitrary pattern as a substring of takes time. Locating each pattern occurrence takes time. Reporting a text substring of length takes time.
ImplementationVeli Mäkinen, Rodrigo González
free_text: The text will be freed immediately after using it.
samplerate=<number>: The index marks one text position every <number> entries. The default value is 64.
V. Mäkinen and G. Navarro. New search algorithms and time/space tradeoffs for succinct suffix arrays. Technical Report C-2004-20 (April), University of Helsinki, Finland.
V. Mäkinen and G. Navarro. Run-length FM-index. In Proc. DIMACS Workshop: The Burrows-Wheeler Transform: Ten Years Later (Aug. 2004), pp. 17-19.
V. Mäkinen and G. Navarro. Succinct suffix arrays based on run-length encoding. In Proc. 16th Annual Symposium on Combinatorial Pattern Matching (CPM), LNCS v. 3537 (2005), pp. 45-56.
V. Mäkinen and G. Navarro. Succinct suffix arrays based on run-length encoding. Nordic Journal of Computing 12(1):40-66, 2005.