Pizza&Chili Corpus
Compressed Indexes and their Testbeds


Statistics for Logs Collections

Traces

We instrumented program executions with AspectJ and obtained execution traces from them. These traces record function calls and function returns; arguments are omitted. The execution traces provided are:

  • Horspool: Execution trace of searching for the pattern ACGTACGT in the DNA sequence dna from Pizza&Chili, using a Boyer-Moore-Horspool implementation.
  • NQueens: Execution trace of a solver for the N-Queens problem run with N = 11.
  • Quicksort: Execution trace of Quicksort sorting an array of 100000 random integers between 0 and 999999.
The Java implementations of these algorithms were obtained from The Algorithms - Java.
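
As a rough illustration of what such a trace looks like, the following Python sketch runs a small Horspool search and records only function calls and returns, without arguments. It is not the AspectJ instrumentation or the Java implementation behind the collection: Python's `sys.setprofile` hook stands in for AspectJ, and the helper names (`build_shift_table`, `window_matches`, `trace_calls`) and the event format are hypothetical.

```python
import sys


def build_shift_table(pattern):
    """Bad-character shifts: distance from the last occurrence of each
    character (ignoring the final position) to the end of the pattern."""
    m = len(pattern)
    table = {}
    for i in range(m - 1):
        table[pattern[i]] = m - 1 - i
    return table


def window_matches(text, pattern, start):
    """Compare the pattern against the text window beginning at start."""
    return text[start:start + len(pattern)] == pattern


def horspool_search(text, pattern):
    """Return the start positions of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    shift = build_shift_table(pattern)
    hits = []
    pos = 0
    while pos + m <= n:
        if window_matches(text, pattern, pos):
            hits.append(pos)
        # Shift by the table entry of the last character in the window;
        # characters absent from the table allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def trace_calls(func, *args):
    """Run func(*args) and collect 'call NAME' / 'return NAME' events
    for Python-level functions only (arguments are not recorded)."""
    events = []

    def profiler(frame, event, arg):
        if event in ("call", "return"):
            events.append(f"{event} {frame.f_code.co_name}")

    sys.setprofile(profiler)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)
    return result, events


if __name__ == "__main__":
    result, events = trace_calls(horspool_search, "TTACGTACGTACGTAA", "ACGTACGT")
    print(result)
    print("\n".join(events))
```

On this toy input the trace opens with `call horspool_search`, `call build_shift_table`, `return build_shift_table`, followed by one call/return pair of `window_matches` per inspected window. A long-running program produces a very long but highly repetitive sequence of such events.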

Logs

We retrieved several logs from the Loghub dataset and truncated each to its first 100 MiB:

  • HDFS 1: Logs from the Hadoop Distributed File System.
  • BGL: Logs from the Blue Gene/L supercomputer.
  • Thunderbird: Logs from the Thunderbird supercomputer.
  • Windows: Logs from the Windows operating system.
  • Android: Logs from the Android operating system.

More detailed information about them can be found on the Loghub site.

We also took the first 100 MiB of the file log20170630 from the EDGAR Log File Data Set.
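
Truncating a file to a fixed prefix can be sketched as below. This is a minimal example, assuming that 100 MiB means 100 × 1024 × 1024 bytes and that the cut is made at a byte boundary regardless of line ends; the function name `truncate_copy` and the file names in the usage line are hypothetical, and the tool actually used to prepare the collection is not stated.

```python
def truncate_copy(src_path, dst_path, limit=100 * 1024 * 1024, chunk_size=1 << 20):
    """Copy at most `limit` bytes from src_path to dst_path.

    Returns the number of bytes written, which is smaller than `limit`
    when the source file is shorter than the limit.
    """
    remaining = limit
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:  # reached the end of the source file
                break
            dst.write(chunk)
            remaining -= len(chunk)
    return limit - remaining


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    written = truncate_copy("full.log", "first_100MiB.log")
    print(written, "bytes written")
```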

Collection   Size (MiB)  Alphabet size  Inverse match probability
Horspool     546         39             18.37
NQueens      289         38             19.43
Quicksort    364         37             19.92
HDFS 1       100         73             28.09
BGL          100         78             21.02
Thunderbird  100         92             23.24
Windows      100         91             23.92
Android      100         176            29.98
EDGAR        100         71             8.59
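
The alphabet size and inverse match probability of a text can be computed as in the sketch below. The page does not define the second statistic; the code assumes the usual Pizza&Chili reading, namely the inverse of the probability that two characters drawn independently at random from the text are equal. The function names are hypothetical.

```python
from collections import Counter


def alphabet_size(data: bytes) -> int:
    """Number of distinct byte values that occur in the text."""
    return len(set(data))


def inverse_match_probability(data: bytes) -> float:
    """1 / sum_c p_c^2, where p_c is the relative frequency of byte c.

    Assumption: this is the definition behind the table column; it can be
    read as an effective alphabet size under the text's distribution.
    """
    n = len(data)
    counts = Counter(data)
    match_probability = sum((c / n) ** 2 for c in counts.values())
    return 1.0 / match_probability


if __name__ == "__main__":
    sample = b"call f\nreturn f\ncall g\nreturn g\n"
    print(alphabet_size(sample), inverse_match_probability(sample))
```

Under this reading the statistic never exceeds the alphabet size and equals it only when all symbols are equally frequent, which is consistent with every row of the table.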

Collection   p7zip   bzip2   gzip    Re-Pair
Horspool     0.38%   0.10%   0.61%   0.07%
NQueens      0.40%   0.11%   0.59%   0.08%
Quicksort    0.47%   0.12%   0.68%   0.09%
HDFS 1       7.82%   6.67%   10.10%  9.93%
BGL          5.53%   6.29%   8.10%   10.47%
Thunderbird  4.48%   4.77%   9.34%   6.96%
Windows      0.54%   1.53%   5.82%   0.73%
Android      5.34%   7.72%   13.54%  6.99%
EDGAR        8.49%   8.39%   13.35%  11.05%
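
Percentages of this kind can be approximated with the compressors in the Python standard library, as sketched below. The sketch assumes each figure is the compressed size as a percentage of the original size. It uses `lzma`, `bz2` and `zlib` as stand-ins for the p7zip, bzip2 and gzip command-line tools, whose versions and options are not stated here, so the results will not match the table exactly. Re-Pair has no standard-library counterpart and is left out.

```python
import bz2
import lzma
import zlib


def compression_ratios(data: bytes) -> dict:
    """Compressed size as a percentage of the original size, per codec."""
    original = len(data)
    compressed_sizes = {
        "lzma": len(lzma.compress(data, preset=9)),   # LZMA, the family used by 7-Zip
        "bzip2": len(bz2.compress(data, 9)),          # same algorithm as bzip2 -9
        "deflate": len(zlib.compress(data, 9)),       # DEFLATE, the algorithm inside gzip
    }
    return {name: 100.0 * size / original for name, size in compressed_sizes.items()}


if __name__ == "__main__":
    # A synthetic, highly repetitive "trace" used only for illustration.
    sample = b"call f\nreturn f\n" * 20000
    for name, ratio in compression_ratios(sample).items():
        print(f"{name}: {ratio:.2f}%")
```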

Collection   H0          H1            H2              H3               H4                H5                H6                 H7                 H8
Horspool     56.53% (1)  16.67% (39)   4.21% (124)     2.20% (166)      1.75% (192)       1.15% (214)       0.94% (236)        0.70% (256)        0.46% (276)
NQueens      57.38% (1)  15.36% (38)   2.81% (99)      1.50% (124)      1.09% (140)       0.99% (158)       0.58% (173)        0.49% (187)        0.49% (200)
Quicksort    57.39% (1)  16.20% (37)   4.15% (110)     1.52% (152)      1.41% (178)       1.18% (204)       1.05% (228)        1.05% (250)        1.05% (271)
HDFS 1       65.21% (1)  34.45% (73)   19.14% (620)    13.66% (3,208)   11.88% (22,873)   9.29% (190,244)   5.99% (953,541)    5.09% (1,860,388)  4.94% (2,708,524)
BGL          61.67% (1)  33.64% (78)   18.21% (1,334)  11.34% (7,350)   9.83% (27,427)    8.95% (172,826)   7.93% (1,024,318)  7.22% (2,560,842)  6.49% (4,568,486)
Thunderbird  64.49% (1)  43.89% (92)   25.03% (3,392)  13.83% (20,367)  9.63% (75,138)    8.02% (279,635)   6.42% (602,013)    5.76% (1,013,062)  5.41% (1,516,157)
Windows      64.63% (1)  37.00% (91)   13.50% (1,877)  6.25% (11,845)   3.71% (63,690)    2.78% (131,816)   2.45% (162,875)    2.36% (182,799)    2.19% (201,376)
Android      68.85% (1)  49.16% (176)  30.00% (8,432)  16.41% (63,027)  10.77% (240,984)  7.90% (642,596)   6.37% (1,263,015)  5.38% (1,836,198)  4.57% (2,587,796)
EDGAR        47.82% (1)  35.18% (71)   23.76% (2,157)  17.29% (29,224)  13.21% (122,042)  10.37% (417,774)  9.11% (1,116,130)  8.49% (1,759,021)  7.95% (2,369,053)
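
The k-th order empirical entropy Hk and its number of contexts can be computed as in the sketch below. Two readings are assumed rather than stated on the page: that each percentage is Hk in bits per symbol divided by 8 (the cost of a plain byte), and that the parenthesized number counts the distinct length-k contexts. Both fit the figures, since the count is 1 for H0 and equals the alphabet size for H1. The function names are hypothetical, and this dictionary-based version is meant for small inputs, not for 100 MiB files.

```python
import math
from collections import Counter, defaultdict


def empirical_entropy(data: bytes, k: int):
    """Return (H_k in bits per symbol, number of distinct length-k contexts).

    H_k = (1/n) * sum over contexts w of |T_w| * H_0(T_w), where T_w is the
    sequence of symbols that follow an occurrence of w in the text.
    """
    n = len(data)
    followers = defaultdict(Counter)
    for i in range(k, n):
        followers[data[i - k:i]][data[i]] += 1

    total_bits = 0.0
    for counts in followers.values():
        context_total = sum(counts.values())
        for c in counts.values():
            total_bits -= c * math.log2(c / context_total)
    return total_bits / n, len(followers)


def as_percentage_of_byte(bits_per_symbol: float) -> float:
    """Express an entropy in bits per symbol relative to 8 bits (assumed)."""
    return 100.0 * bits_per_symbol / 8.0


if __name__ == "__main__":
    sample = b"call f\nreturn f\ncall g\nreturn g\n" * 50
    for k in range(4):
        h, contexts = empirical_entropy(sample, k)
        print(f"H{k}: {as_percentage_of_byte(h):.2f}% ({contexts})")
```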

Collection   delta      z          v          r           g
Horspool     68,686     102,970    98,799     988,423     215,730
NQueens      35,744     56,660     55,148     317,130     118,810
Quicksort    56,779     90,308     90,594     698,998     178,250
HDFS 1       1,173,611  2,513,337  2,536,755  10,606,913  3,994,531
BGL          1,301,165  2,988,284  2,954,491  14,587,097  4,589,796
Thunderbird  1,048,593  2,032,339  2,033,192  7,226,224   2,990,075
Windows      61,711     176,432    174,407    529,211     270,756
Android      881,571    1,979,226  1,982,376  7,298,023   2,975,004
EDGAR        1,116,985  2,775,673  2,747,012  11,442,035  4,473,998
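
The page does not define these columns. Assuming they follow the usual notation for repetitiveness measures, delta would be the substring complexity and r the number of equal-letter runs in the Burrows-Wheeler transform; z, v and g would then be the sizes of the Lempel-Ziv parse, the lex-parse and a grammar, which are not covered here. Under that assumption, the sketch below computes delta and r naively for short strings. The function names are hypothetical and the quadratic-or-worse running time makes it unsuitable for the actual collections.

```python
def bwt_runs(text: str) -> int:
    """r: number of equal-letter runs in the BWT of text plus a sentinel.

    The sentinel '\\0' is assumed not to occur in text and sorts before
    every other character. Suffixes are sorted naively.
    """
    s = text + "\0"
    order = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT symbol of a suffix is the character that precedes it
    # (index -1 wraps around to the sentinel for the whole string).
    bwt = [s[i - 1] for i in order]
    return 1 + sum(1 for a, b in zip(bwt, bwt[1:]) if a != b)


def substring_complexity(text: str) -> float:
    """delta: maximum over k of (number of distinct length-k substrings) / k."""
    n = len(text)
    best = 0.0
    for k in range(1, n + 1):
        distinct = len({text[i:i + k] for i in range(n - k + 1)})
        best = max(best, distinct / k)
    return best


if __name__ == "__main__":
    sample = "banana"
    print("r =", bwt_runs(sample))
    print("delta =", substring_complexity(sample))
```

For "banana" the BWT with sentinel is "annb" followed by the sentinel and "aa", which has 5 runs, and the three distinct symbols give delta = 3.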


© P. Ferragina and G. Navarro, Last update: October, 2010.