A new approach to search result clustering and labeling

Türel, Anıl

buir.advisor	Can, Fazlı
dc.contributor.author	Türel, Anıl
dc.date.accessioned	2016-01-08T18:15:40Z
dc.date.available	2016-01-08T18:15:40Z
dc.date.issued	2011
dc.description	Cataloged from PDF version of article.	en_US
dc.description	Includes bibliographical references (leaves 58-62).	en_US
dc.description.abstract	Search engines present query results as a long ordered list of web snippets divided into several pages. Post-processing information retrieval results to ease access to the desired information is an important research problem. One such post-processing technique is to cluster search results by topic and label each group to reflect the topic of its cluster. In this thesis, we present a novel search result clustering approach that splits the long list of documents returned by search engines into meaningfully grouped and labeled clusters. Our method emphasizes clustering quality by using the cover coefficient and sequential k-means clustering algorithms. Cluster labeling is crucial because meaningless or confusing labels may mislead users into checking the wrong clusters for the query and wasting time. Additionally, labels should accurately reflect the contents of the documents within the cluster. To label clusters effectively, we introduce a new cluster labeling method based on term weighting. We also present a new metric that employs precision and recall to assess the success of cluster labeling. We adopt a comparative evaluation strategy to derive the performance of the proposed method relative to two prominent search result clustering methods: Suffix Tree Clustering and Lingo. We perform the experiments using the publicly available Ambient and ODP-239 datasets. Experimental results show that the proposed method successfully performs both the clustering and labeling tasks.	en_US
dc.description.statementofresponsibility	Türel, Anıl	en_US
dc.format.extent	xiv, 67 leaves	en_US
dc.identifier.uri	http://hdl.handle.net/11693/15255
dc.language.iso	English	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Search result clustering	en_US
dc.subject	cluster labeling	en_US
dc.subject	web information retrieval	en_US
dc.subject	clustering evaluation	en_US
dc.subject	labeling evaluation	en_US
dc.subject.lcc	TK5105.884 .T87 2011	en_US
dc.subject.lcsh	Search engines--Programming.	en_US
dc.subject.lcsh	Web search engines--Mathematical models.	en_US
dc.subject.lcsh	Information storage and retrieval systems.	en_US
dc.subject.lcsh	Information retrieval.	en_US
dc.subject.lcsh	Internet searching.	en_US
dc.title	A new approach to search result clustering and labeling	en_US
dc.type	Thesis	en_US
thesis.degree.discipline	Computer Engineering
thesis.degree.grantor	Bilkent University
thesis.degree.level	Master's
thesis.degree.name	MS (Master of Science)

Files

Original bundle

Name: 0006009.pdf
Size: 805.7 KB
Format: Adobe Portable Document Format

Collections

Graduate School of Engineering and Science