Cascaded cross entropy-based search result diversification
buir.advisor | Can, Fazlı | |
dc.contributor.author | Köroğlu, Bilge | |
dc.date.accessioned | 2016-01-08T18:24:49Z | |
dc.date.available | 2016-01-08T18:24:49Z | |
dc.date.issued | 2012 | |
dc.description | Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2012. | en_US |
dc.description | Thesis (Master's) -- Bilkent University, 2012. | en_US |
dc.description | Includes bibliographical references (leaves 82-86). | en_US |
dc.description.abstract | Search engines are used to find information on the web. Retrieving documents for ambiguous queries based only on query-document similarity does not satisfy users, because such queries have more than one meaning. In this study, a new method, cascaded cross entropy-based search result diversification (CCED), is proposed to list the web pages corresponding to different meanings of the query in higher rank positions. It combines modified reciprocal rank and cross entropy measures to balance the trade-off between query-document relevancy and diversity among the retrieved documents. We use the Latent Dirichlet Allocation (LDA) algorithm to compute query-document relevancy scores. The number of different meanings of an ambiguous query is estimated by complete-link clustering. We construct the first Turkish test collection for result diversification, BILDIV-2012. The performance of CCED is compared with that of the Maximum Marginal Relevance (MMR) and IA-Select algorithms on the Ambient, TREC Diversity Track, and BILDIV-2012 test collections. We also compare the performance of these algorithms with that of Bing and Google. The results indicate that CCED is the most successful method at placing results for the different meanings of a query in the higher rank positions of the result list, and so at satisfying users interested in those meanings. | en_US |
dc.description.provenance | Made available in DSpace on 2016-01-08T18:24:49Z (GMT). No. of bitstreams: 1 0006504.pdf: 1991735 bytes, checksum: 6703e4fba051ba997a072e0b395b6198 (MD5) | en |
dc.description.statementofresponsibility | Köroğlu, Bilge | en_US |
dc.format.extent | xi, 89 leaves | en_US |
dc.identifier.itemid | B133868 | |
dc.identifier.uri | http://hdl.handle.net/11693/15799 | |
dc.language.iso | English | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.subject | Ambiguous Query | en_US |
dc.subject | Cross Entropy | en_US |
dc.subject | IA-Select | en_US |
dc.subject | Latent Dirichlet Allocation (LDA) | en_US |
dc.subject.lcc | TK5105.884 .K67 2012 | en_US |
dc.subject.lcsh | Search engines--Programming. | en_US |
dc.subject.lcsh | Web search engines--Mathematical models. | en_US |
dc.subject.lcsh | Information storage and retrieval systems. | en_US |
dc.subject.lcsh | Information retrieval. | en_US |
dc.subject.lcsh | Internet searching. | en_US |
dc.title | Cascaded cross entropy-based search result diversification | en_US |
dc.type | Thesis | en_US |
thesis.degree.discipline | Computer Engineering | |
thesis.degree.grantor | Bilkent University | |
thesis.degree.level | Master's | |
thesis.degree.name | MS (Master of Science) | |