Pizza&Chili Corpus
Compressed Indexes and their Testbeds

The Italian mirror | The Chilean mirror

The Initiative

Every submitted software package must come with a COPYRIGHT file that makes the software free for research and teaching purposes. Any other restriction should be stated clearly in this COPYRIGHT file. We strongly suggest following the LGPL license. A package may come with only its executables, but we strongly suggest adding the sources. Submitted indexes must implement the whole API we propose. If some functions are missing, we suggest adding placeholder functions that report that the function is not implemented and then exit. The documentation should also state clearly which features the index does not implement and which restrictions it has. We prefer sources written in C or C++.
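As an illustration of the placeholder functions suggested above, here is a minimal sketch in C. The names `not_implemented`, `not_implemented_message`, `example_extract` and the exit code `ERR_NOT_IMPLEMENTED` are hypothetical, chosen only for this example; they are not taken from the corpus API, so the real function names and signatures should come from the interface description.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical exit code; the corpus API may define its own error values. */
#define ERR_NOT_IMPLEMENTED 99

/* Format the notice into buf. Returns what snprintf returns: the number of
   characters the full message needs, excluding the terminating '\0'. */
int not_implemented_message(char *buf, size_t size, const char *function_name)
{
    return snprintf(buf, size, "%s: not implemented in this index",
                    function_name);
}

/* Print the notice to stderr and terminate the program. */
void not_implemented(const char *function_name)
{
    char msg[256];
    not_implemented_message(msg, sizeof msg, function_name);
    fprintf(stderr, "%s\n", msg);
    exit(ERR_NOT_IMPLEMENTED);
}

/* Placeholder for an operation this index does not support.
   The name and signature are illustrative assumptions only. */
int example_extract(void *index, unsigned long from, unsigned long to,
                    unsigned char **snippet, unsigned long *snippet_length)
{
    /* Silence unused-parameter warnings; the stub ignores its arguments. */
    (void)index; (void)from; (void)to; (void)snippet; (void)snippet_length;
    not_implemented("example_extract");
    return ERR_NOT_IMPLEMENTED; /* not reached */
}
```

Routing every missing function through one helper keeps the message format uniform, so a test shell that links against the library fails loudly instead of silently reporting wrong results.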

If you are providing a new index (or updating one), you must send us a directory with the software. The top-level directory must contain a README file stating the index name, authors, date of creation and updates, limitations (including unimplemented functionality), known bugs, construction parameters, an explanation of the versions, etc., as well as a COPYRIGHT file stating the usage permissions. The directory may contain the files directly, or it may be split into subdirectories for different versions (e.g. different implementations yielding different space/time tradeoffs). Always keep old versions under their original names, since someone may want to compare an index against a published result obtained with an older version of your index. Ideally, each version should have its own makefile that builds the libraries implementing the interface we describe in this document. The rest should follow automatically when our shells are compiled against the library of functions you provide.
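Before sending a directory, it may help to check that the two required top-level files are in place. The following is a minimal sketch in C under that assumption; the function names `file_readable` and `check_submission` and the `MISSING_*` flags are hypothetical and not part of the corpus tools. It only tests that README and COPYRIGHT can be opened for reading, and does not inspect their contents.

```c
#include <stdio.h>

/* Bit flags for the files required at the top level of a submission. */
#define MISSING_README    1
#define MISSING_COPYRIGHT 2

/* Return 1 if dir/name can be opened for reading, 0 otherwise. */
static int file_readable(const char *dir, const char *name)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return 0;               /* path too long: treat as missing */
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Return 0 when both files are present, otherwise the OR of the
   MISSING_* flags for the files that could not be opened. */
int check_submission(const char *dir)
{
    int missing = 0;
    if (!file_readable(dir, "README"))
        missing |= MISSING_README;
    if (!file_readable(dir, "COPYRIGHT"))
        missing |= MISSING_COPYRIGHT;
    return missing;
}
```

A caller could run `check_submission` on the top-level directory and on each version subdirectory it wants to verify, and report which flag is set.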

 

Several people have already contributed to the site in one way or another.

  • Rodrigo González, University of Chile, Chile. General work on all the corpus contents, including adapting most implementations to the common interface and developing the code for AF-FMI.
  • Rossano Venturini, University of Pisa, Italy. General work on all the corpus contents, including developing the testing shells and the new version of the FM-index.
  • Diego Arroyuelo, University of Chile, Chile. LZ-index implementation.
  • Rodrigo Cánovas, University of Chile, Chile. Refactoring of the CSA code, including the save/load methods.
  • Veli Mäkinen, University of Helsinki, Finland. WT-FMI and RL-FM index implementation.
  • Joaquín Adiego, University of Valladolid, Spain. PPMDI compressor for Unix/Linux, extracted in turn from James Cheney's XMLPPM v.0.98.2.
  • Sebastian Kreft, University of Chile, Chile. LZ77-index and Repetitive corpus.
  • Jouni Sirén, University of Helsinki, Finland. RLCSA.

 



© P. Ferragina and G. Navarro. Last update: September 2005.