Giuseppe Ottaviano presents "Better Index Compression with Partitioned Elias-Fano"

Giuseppe Ottaviano
Title: "Better Index Compression with Partitioned Elias-Fano"   Giuseppe Ottaviano   ABSTRACT            Inverted indexes are the core of every modern information retrieval system; to cope with steadily growing document corpora, such as the Web, it is crucial to represent them as efficiently as possible with respect to space and decompression speed. Recently a data structure traditionally used in the field of succinct data structures, namely the Elias-Fano representation of monotone sequences, has been applied to the compression of indexes, showing surprisingly excellent performance and good compression. This talk will start by briefly reviewing the Elias-Fano representation and its application to inverted indexes. We will then move on to a new representation based on a two-level Elias-Fano data structure, which significantly improves compression efficiency by (almost-)optimally partitioning the posting lists, with only a negligible slow-down in decompression. BIOGRAPHICAL NOTE Giuseppe Ottaviano is a postdoctoral researcher at ISTI-CNR in Pisa. He received his Ph.D. from the University of Pisa with a thesis on succinct data structures. He is mainly interested in data structures and data compression, in particular with application to the storage of large quantities of data for information retrieval and search engines. He is also interested in machine learning and computer vision.