Browsing by Subject "Inverted index"

Open Access
Cluster searching strategies for collaborative recommendation systems
(2013) Altingovde, I. S.; Subakan, Ö. N.; Ulusoy, Özgür
In-memory nearest neighbor computation is a typical collaborative filtering approach for high recommendation accuracy. However, this approach is not scalable given the huge number of customers and items in typical commercial applications. Cluster-based collaborative filtering techniques can be a remedy for the efficiency problem, but they usually provide relatively lower accuracy figures, since they may become over-generalized and produce less-personalized recommendations. Our research explores an individualistic strategy that first clusters the users and then exploits the members within clusters, not just the cluster representatives, during the recommendation generation stage. We provide an efficient implementation of this strategy by adapting a specifically tailored cluster-skipping inverted index structure. Experimental results reveal that the individualistic strategy with the cluster-skipping index is a good compromise that yields high accuracy and reasonable scalability figures. © 2012 Elsevier Ltd. All rights reserved.
Open Access
Inverted index compression based on term and document identifier reassignment
(2008) Baykan, İzzet Çağrı
Compression of inverted indexes has received great attention in recent years. An inverted index consists of lists of document identifiers, also referred to as posting lists, for each term. Compressing an inverted index reduces the size of the index, which also improves query performance due to the reduction in disk access times. Recent studies have shown that reassigning document identifiers has a great effect on the compression of an inverted index. In this work, we propose a novel technique that reassigns both term and document identifiers of an inverted index by transforming the matrix representation of the index into a block-diagonal form, which improves the compression ratio dramatically. We adapt the row-net hypergraph-partitioning model for the transformation into block-diagonal form, which improves the compression ratio by as much as 50%. To the best of our knowledge, this method performs more effectively than previous inverted index compression techniques.
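The effect of identifier reassignment on posting-list size can be illustrated with a minimal sketch. It assumes d-gap encoding followed by variable-byte coding, a common scheme chosen here for illustration only; the abstract does not state which codec the thesis uses, and the hypergraph-partitioning step is not reproduced. The function names and the sample identifiers are hypothetical.

```python
from itertools import accumulate


def to_gaps(postings):
    """Turn a sorted posting list into d-gaps (the first ID is kept as is)."""
    if not postings:
        return []
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]


def from_gaps(gaps):
    """Invert to_gaps by taking running sums."""
    return list(accumulate(gaps))


def vbyte_encode(numbers):
    """Variable-byte code: 7 payload bits per byte, high bit marks the last byte."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:  # last byte of the current number
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


# Hypothetical posting list for one term before and after reassignment:
# the same four documents, first with scattered IDs, then with adjacent IDs.
scattered = [3, 1000, 70000, 2000000]
clustered = [3, 4, 5, 6]

size_before = len(vbyte_encode(to_gaps(scattered)))
size_after = len(vbyte_encode(to_gaps(clustered)))
print(size_before, size_after)
```

Documents that share terms end up with nearby identifiers after reassignment, so the gaps shrink and each gap fits in fewer bytes. The sizes printed here concern only this toy list and say nothing about the compression ratios reported in the thesis.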
Open Access
Large-scale cluster-based retrieval experiments on Turkish texts
(ACM, 2007) Altıngövde, İsmail Şengör; Özcan, Rıfat; Öcalan, Hüseyin C.; Can, Fazlı; Ulusoy, Özgür
We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare CBR effectiveness with full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness figures comparable to those of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR in comparison to other traditional CBR techniques and even FS.
Open Access
Performance of query processing implementations in ranking-based text retrieval systems using inverted indices
(Elsevier Ltd, 2006-07) Cambazoglu, B. B.; Aykanat, Cevdet
Similarity calculations and document ranking form the computationally expensive parts of query processing in ranking-based text retrieval. In this work, for these calculations, 11 alternative implementation techniques are presented under four different categories, and their asymptotic time and space complexities are investigated. To our knowledge, six of these techniques have not been discussed in any previous publication. Furthermore, analytical experiments are carried out on a 30 GB document collection to evaluate the practical performance of different implementations in terms of query processing time and space consumption. Advantages and disadvantages of each technique are illustrated under different querying scenarios, and several experiments that investigate the scalability of the implementations are presented. © 2005 Elsevier Ltd. All rights reserved.
Open Access
A practitioner's guide for static index pruning
(Springer, 2009-04) Altıngövde, İsmail Şengör; Özcan, Rıfat; Ulusoy, Özgür
We compare the term- and document-centric static index pruning approaches as described in the literature and investigate their sensitivity to the scoring functions employed during the pruning and actual retrieval stages. © Springer-Verlag Berlin Heidelberg 2009.
Open Access
Site-based dynamic pruning for query processing in search engines
(ACM, 2008-07) Altıngövde, İsmail Şengör; Demir, Engin; Can, Fazlı; Ulusoy, Özgür
Web search engines typically index and retrieve at the page level. In this study, we investigate a dynamic pruning strategy that allows the query processor to first determine the most promising websites and then proceed with the similarity computations only for the pages within these sites.