A new approach to search result clustering and labeling
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
283 - 292
Search engines present query results as a long ordered list of web snippets divided into several pages. Post-processing retrieval results to ease access to the desired information is an important research problem. In this paper, we present a novel search result clustering approach that splits the long list of documents returned by search engines into meaningfully grouped and labeled clusters. Our method emphasizes clustering quality by using cover coefficient-based and sequential k-means clustering algorithms. A cluster labeling method based on term weighting is also introduced to reflect cluster contents. In addition, we present a new metric that employs precision and recall to assess the success of cluster labeling. We adopt a comparative strategy to derive the relative performance of the proposed method with respect to two prominent search result clustering methods: Suffix Tree Clustering and Lingo. Experimental results on the publicly available AMBIENT and ODP-239 datasets show that our method performs both the clustering and labeling tasks successfully. © 2011 Springer-Verlag Berlin Heidelberg.
search result clustering
web information retrieval
K-Means clustering algorithm
Precision and recall
Content based retrieval
World Wide Web
Permalink (please cite this version): http://hdl.handle.net/11693/28246