Browsing by Subject "Inverted indices"

Now showing 1 - 4 of 4

Open Access
Cluster searching strategies for collaborative recommendation systems
(2013) Altingovde, I. S.; Subakan, Ö. N.; Ulusoy, Özgür
In-memory nearest neighbor computation is a typical collaborative filtering approach for high recommendation accuracy. However, this approach is not scalable given the huge number of customers and items in typical commercial applications. Cluster-based collaborative filtering techniques can be a remedy for the efficiency problem, but they usually provide relatively lower accuracy figures, since they may become over-generalized and produce less-personalized recommendations. Our research explores an individualistic strategy which initially clusters the users and then exploits the members within clusters, but not just the cluster representatives, during the recommendation generation stage. We provide an efficient implementation of this strategy by adapting a specifically tailored cluster- skipping inverted index structure. Experimental results reveal that the individualistic strategy with the cluster-skipping index is a good compromise that yields high accuracy and reasonable scalability figures. © 2012 Elsevier Ltd. All rights reserved.
Open Access
Exploiting query views for static index pruning in web search engines
(ACM, 2009-11) Altıngövde, İsmail Şengör; Özcan, Rıfat; Ulusoy, Özgür
We propose incorporating query views in a number of static pruning strategies, namely term-centric, document-centric and access-based approaches. These query-view based strategies considerably outperform their counterparts for both disjunctive and conjunctive query processing in Web search engines. Copyright 2009 ACM.
Open Access
Memory resident parallel inverted index construction
(Springer, London, 2012) Küçükyılmaz, Tayfun; Türk, Ata; Aykanat, Cevdet
Advances in cloud computing, 64-bit architectures and huge RAMs enable performing many search related tasks in memory.We argue that term-based partitioned parallel inverted index construction is among such tasks, and provide an efficient parallel framework that achieves this task. We show that by utilizing an efficient bucketing scheme we can eliminate the need for the generation of a global index and reduce the communication overhead without disturbing balancing constraint. We also propose and investigate assignment schemes that can further reduce communication overheads without disturbing balancing constraints. The conducted experiments indicate promising results. © 2012 Springer-Verlag London Limited.
Open Access
SONIC: streaming overlapping community detection
(Springer, 2016) Sarıyüce, A. E.; Gedik, B.; Jacques-Silva, G.; Wu, Kun-Lung; Catalyurek, U.V.
A community within a graph can be broadly defined as a set of vertices that exhibit high cohesiveness (relatively high number of edges within the set) and low conductance (relatively low number of edges leaving the set). Community detection is a fundamental graph processing analytic that can be applied to several application domains, including social networks. In this context, communities are often overlapping, as a person can be involved in more than one community (e.g., friends, and family); and evolving, since the structure of the network changes. We address the problem of streaming overlapping community detection, where the goal is to maintain communities in the presence of streaming updates. This way, the communities can be updated more efficiently. To this end, we introduce SONIC—a find-and-merge type of community detection algorithm that can efficiently handle streaming updates. SONIC first detects when graph updates yield significant community changes. Upon the detection, it updates the communities via an incremental merge procedure. The SONIC algorithm incorporates two additional techniques to speed-up the incremental merge; min-hashing and inverted indexes. Results show that SONIC can provide high quality overlapping communities, while handling streaming updates several orders of magnitude faster than the alternatives performing from-scratch computation.