Optimization of an educational search engine using learning to rank algorithms

Date
2015-09
Advisor
Ulusoy, Özgür
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract

Web search is one of the most popular online activities. Because search engines are so heavily used, large amounts of data on users' past searches are available. Using query logs as a source of implicit feedback, researchers can learn useful patterns about general search behavior. We perform a detailed analysis of query logs provided by a commercial educational vertical search engine and compare the results with the characteristics of general web search. Because the search behavior of students differs from that of general web users, we propose an educational ranking model, built with learning to rank algorithms, that better reflects the search habits of students in the educational domain and thereby improves search engine performance. We introduce novel features suited to the educational domain. We show that our model with educational features outperforms two baselines, the original ranking of the commercial educational vertical search engine and a model constructed using state-of-the-art ranking functions, by up to 14% and 11%, respectively. We also train different learning to rank models for different clusters of queries, and the results indicate that having a separate model for each query cluster further improves the performance of our proposed model. Specifically, the course of the query and the grade of the user issuing the query are good sources of feedback for building a better model in the educational domain. We propose a novel Propagation Algorithm for low-frequency queries, for which the information derived from query logs is insufficient. We report that the model constructed using the features generated by our proposed algorithm performs better on singleton queries than both the educational learning to rank model we introduce and models learned with common features from the literature.

Keywords
Information retrieval, Web search, Vertical search engine, Learning to rank algorithms, Educational domain