Pizza&Chili Corpus
Compressed Indexes and their Testbeds

The Italian mirror | The Chilean mirror

LZ-Index

Description

The LZ-index is a compressed index based on LZ78 compression. It requires 4 n H_k(T) + o(n log s) bits of space for a text T[1,n] over an alphabet of size s. Its worst case locate time is O(m^3 log s + (m + occ) log n), where m is the pattern length and occ the number of occurrences. In practice, its time to locate occurrences is very competitive.

Implementation

Gonzalo Navarro and Diego Arroyuelo, University of Chile (Chile).

email us for any problem, bug, or comment: ({gnavarro, darroyue} at dcc[dot]uchile[dot]cl)

Papers about the LZ-index data structure

Gonzalo Navarro. Indexing Text using the Ziv-Lempel Trie. Journal of Discrete Algorithms (JDA) 2(1):87-114, 2004.

Downloads

  • LZ-Index: The original LZ-index data structure.
  • LZ-index-1: Reduced version of LZ-index, requiring 3 n H_k(T) + o(n log s) bits of storage. [Last Update: May 29, 2007].
  • LZ-index-7: Another reduced version of LZ-index, also requiring 3 n H_k(T) + o(n log s) bits of storage. [Last Update: May 29, 2007].
  • LZ-index-4: Reduced version of LZ-index, requiring (2+e) n H_k(T) + o(n log s). [Last Update: May 29, 2007].

The reduced versions are based on similar ideas to that presented in the paper:

D. Arroyuelo, G. Navarro, and K. Sadakane. Reducing the space requirement of LZ-index. In Proc. CPM'06, pages 319-330. Lecture Notes in Computer Science 4009.

Soon we will make available a technical report explaining the details of these implementations.



Send Mail to Us | © P. Ferragina and G. Navarro, Last update: September, 2005.