
      Cascaded cross entropy-based search result diversification

      Author
      Köroğlu, Bilge
      Advisor
      Can, Fazlı
      Date
      2012
      Publisher
      Bilkent University
      Language
      English
      Type
      Thesis
      Item Usage Stats
      75 views, 98 downloads
      Abstract
      Search engines are used to find information on the web. For ambiguous queries, retrieving documents solely by query-document similarity does not satisfy users, because such queries have more than one meaning. In this study, a new method, cascaded cross entropy-based search result diversification (CCED), is proposed to place the web pages corresponding to the different meanings of a query at higher rank positions. It combines modified reciprocal rank and cross entropy measures to balance the trade-off between query-document relevance and diversity among the retrieved documents. We use the Latent Dirichlet Allocation (LDA) algorithm to compute query-document relevance scores. The number of different meanings of an ambiguous query is estimated by complete-link clustering. We construct the first Turkish test collection for result diversification, BILDIV-2012. The performance of CCED is compared with that of the Maximum Marginal Relevance (MMR) and IA-Select algorithms on the Ambient, TREC Diversity Track, and BILDIV-2012 test collections. We also compare the performance of these algorithms with that of Bing and Google. The results indicate that CCED is the most successful method at satisfying users interested in different meanings of the query within the higher rank positions of the result list.
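The relevance/diversity trade-off described in the abstract can be illustrated with the MMR baseline it names. The following is only a minimal sketch of standard greedy MMR re-ranking together with the textbook definition of cross entropy between two discrete distributions; it is not the CCED method, whose exact combination of modified reciprocal rank and cross entropy is not given in the abstract. The function names, the `lam` parameter, and the toy scores are assumptions made for illustration.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """Textbook cross entropy H(p, q) = -sum_i p_i * log(q_i).

    p and q are discrete distributions of equal length (for example,
    topic distributions). eps guards against log(0). How CCED applies
    this measure is not specified in the abstract.
    """
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))


def mmr_rerank(relevance, similarity, k, lam=0.5):
    """Greedy Maximal Marginal Relevance re-ranking (hypothetical helper).

    relevance[d]     : query-document relevance score of document d
    similarity[d][s] : similarity between documents d and s
    lam              : 1.0 ranks by relevance only; lower values favor diversity
    Returns the indices of the k selected documents in ranked order.
    """
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(d):
            # Redundancy is the highest similarity to any already selected document.
            redundancy = max((similarity[d][s] for s in selected), default=0.0)
            return lam * relevance[d] - (1 - lam) * redundancy

        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (assumed): documents 0 and 1 are near-duplicates covering one
# meaning of the query; document 2 covers a different meaning.
relevance = [0.9, 0.85, 0.4]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
ranking = mmr_rerank(relevance, similarity, k=3, lam=0.5)
print(ranking)  # [0, 2, 1]
```

With `lam=0.5`, the less relevant but dissimilar document 2 is ranked above the near-duplicate document 1, which is the behavior a diversification method aims for on an ambiguous query. Setting `lam=1.0` reduces the procedure to a plain relevance ranking.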
      Keywords
      Ambiguous Query
      Cross Entropy
      IA-Select
      Latent Dirichlet Allocation (LDA)
      Permalink
      http://hdl.handle.net/11693/15799
      Collections
      • Dept. of Computer Engineering - Master's degree 508