Inverted index compression based on term and document identifier reassignment
buir.advisor | Aykanat, Cevdet | |
dc.contributor.author | Baykan, İzzet Çağrı | |
dc.date.accessioned | 2016-01-08T18:07:54Z | |
dc.date.available | 2016-01-08T18:07:54Z | |
dc.date.issued | 2008 | |
dc.description | Ankara : The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2008. | en_US |
dc.description | Thesis (Master's) -- Bilkent University, 2008. | en_US |
dc.description | Includes bibliographical references (leaves 43-46). | en_US |
dc.description.abstract | Compression of inverted indexes has received great attention in recent years. An inverted index consists of lists of document identifiers, also referred to as posting lists, for each term. Compressing an inverted index reduces the size of the index, which also improves query performance by reducing disk access times. Recent studies have shown that reassigning document identifiers has a great effect on the compression of an inverted index. In this work, we propose a novel technique that reassigns both term and document identifiers of an inverted index by transforming the matrix representation of the index into a block-diagonal form, which improves the compression ratio dramatically. We adapted the row-net hypergraph-partitioning model for the transformation into block-diagonal form, which improves the compression ratio by as much as 50%. To the best of our knowledge, this method performs more effectively than previous inverted index compression techniques. | en_US |
dc.description.provenance | Made available in DSpace on 2016-01-08T18:07:54Z (GMT). No. of bitstreams: 1 0003644.pdf: 462685 bytes, checksum: 7e18c1b31752682fffd8fb679539e7de (MD5) | en |
dc.description.statementofresponsibility | Baykan, İzzet Çağrı | en_US |
dc.format.extent | ix, 46 leaves, graphs | en_US |
dc.identifier.itemid | BILKUTUPB109724 | |
dc.identifier.uri | http://hdl.handle.net/11693/14779 | |
dc.language.iso | English | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.subject | Inverted index | en_US |
dc.subject | Inverted index compression | en_US |
dc.subject | Block-diagonal form | en_US |
dc.subject | Document identifier reassignment | en_US |
dc.subject | Hypergraph partitioning | en_US |
dc.subject.lcc | QA76.9.T48 B39 2008 | en_US |
dc.subject.lcsh | Text processing (Computer science) | en_US |
dc.subject.lcsh | Information storage and retrieval systems. | en_US |
dc.subject.lcsh | Information retrieval. | en_US |
dc.title | Inverted index compression based on term and document identifier reassignment | en_US |
dc.type | Thesis | en_US |
thesis.degree.discipline | Computer Engineering | |
thesis.degree.grantor | Bilkent University | |
thesis.degree.level | Master's | |
thesis.degree.name | MS (Master of Science) | |
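
The abstract's central claim, that reassigning document identifiers shrinks a compressed inverted index, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy index, the hand-picked identifier mapping, the helper names (`d_gaps`, `vbyte_len`, `index_size`, `reassign`), and the choice of gap encoding with variable-byte coding as the compression scheme. The thesis derives its reassignment from hypergraph partitioning, and this sketch does not attempt that step. It only shows why clustering the documents that share a term into nearby identifiers produces smaller gaps and therefore fewer bytes.

```python
from typing import Dict, List


def d_gaps(postings: List[int]) -> List[int]:
    """Turn a sorted posting list into gaps between consecutive document IDs."""
    gaps = []
    prev = 0
    for doc_id in postings:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def vbyte_len(n: int) -> int:
    """Bytes needed to store a positive integer with 7 payload bits per byte."""
    length = 1
    while n >= 128:
        n >>= 7
        length += 1
    return length


def index_size(index: Dict[str, List[int]]) -> int:
    """Total bytes for all posting lists under gap + variable-byte coding."""
    return sum(
        vbyte_len(gap)
        for postings in index.values()
        for gap in d_gaps(sorted(postings))
    )


def reassign(index: Dict[str, List[int]], mapping: Dict[int, int]) -> Dict[str, List[int]]:
    """Apply an old-ID -> new-ID mapping to every posting list."""
    return {term: sorted(mapping[d] for d in postings) for term, postings in index.items()}


# Hypothetical toy index: each term's documents are scattered across the ID space.
original = {
    "a": [1, 300, 600, 900],
    "b": [2, 301, 601, 901],
}

# Hand-picked mapping (an assumption, not computed by partitioning) that gives
# the documents of each term consecutive identifiers.
mapping = {1: 1, 300: 2, 600: 3, 900: 4, 2: 5, 301: 6, 601: 7, 901: 8}

print(index_size(original))                     # bytes before reassignment
print(index_size(reassign(original, mapping)))  # bytes after reassignment
```

In this toy case the gaps of a few hundred each need two bytes, while after reassignment almost every gap is 1 and fits in a single byte. Finding a good mapping automatically for a real collection is the hard part, and it is the problem the thesis addresses with its block-diagonal transformation.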