Optimization of an educational search engine using learning to rank algorithms
Please cite this item using this persistent URL: http://hdl.handle.net/11693/29079
Web search is one of the most popular internet activities. Because search engines are so heavily used, a large amount of data is available about users' search histories. Using query logs as a source of implicit feedback, researchers can learn useful patterns about general search behavior. We perform a detailed analysis of the query logs provided by a commercial educational vertical search engine and compare the results with general web search characteristics. Because students search differently from general web users, we propose an educational ranking model built with learning to rank algorithms that better reflects the search habits of students in the educational domain and thereby improves search engine performance. We introduce novel features suited to the educational domain. We show that our model with educational features outperforms two baselines, the original ranking of the commercial educational vertical search engine and a model constructed using state-of-the-art ranking functions, by up to 14% and 11%, respectively. We also train different learning to rank models for different clusters of queries, and the results indicate that having a separate model for each query cluster further improves the performance of our proposed model. Specifically, the course of the query and the grade of the user issuing the query are good sources of feedback for building a better model in the educational domain. We propose a novel Propagation Algorithm for low-frequency queries, for which the information derived from query logs is too sparse to exploit. We report that the model constructed using the features generated by our proposed algorithm performs better on singleton queries than both the educational learning to rank model we introduce and models learned with common features from the literature.
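The idea of training a separate ranking model for each cluster of queries can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not name the learning to rank algorithm or the feature set, so the pairwise perceptron-style update, the two toy features (text match and grade match), the cluster labels, and every function name below are assumptions chosen only for illustration.

```python
from collections import defaultdict

def train_pairwise(examples, n_features, epochs=20, lr=0.1):
    """Fit a linear scorer with a pairwise perceptron-style update (illustrative).

    examples: list of (query_id, feature_vector, relevance_label).
    """
    by_query = defaultdict(list)
    for qid, feats, label in examples:
        by_query[qid].append((feats, label))
    w = [0.0] * n_features
    for _ in range(epochs):
        for docs in by_query.values():
            for fa, la in docs:
                for fb, lb in docs:
                    if la <= lb:
                        continue  # only pairs where a is more relevant than b
                    diff = [x - y for x, y in zip(fa, fb)]
                    # Update when a is not yet scored strictly above b.
                    if sum(wi * di for wi, di in zip(w, diff)) <= 0:
                        w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

def score(w, feats):
    return sum(wi * xi for wi, xi in zip(w, feats))

def train_per_cluster(examples, cluster_of, n_features):
    """Train one model per query cluster plus a global fallback model."""
    grouped = defaultdict(list)
    for ex in examples:
        grouped[cluster_of[ex[0]]].append(ex)
    models = {c: train_pairwise(exs, n_features) for c, exs in grouped.items()}
    models["__global__"] = train_pairwise(examples, n_features)
    return models

def rank(models, cluster, docs):
    """Rank (doc_id, feature_vector) pairs with the cluster's model if one exists."""
    w = models.get(cluster, models["__global__"])
    return sorted(docs, key=lambda d: score(w, d[1]), reverse=True)

# Hypothetical toy data; features are [text_match, grade_match].
examples = [
    ("q1", [1.0, 0.0], 1), ("q1", [0.0, 1.0], 0),  # a "math" query
    ("q2", [0.0, 1.0], 1), ("q2", [1.0, 0.0], 0),  # a "history" query
]
cluster_of = {"q1": "math", "q2": "history"}
models = train_per_cluster(examples, cluster_of, n_features=2)
docs = [("d1", [0.0, 1.0]), ("d2", [1.0, 0.0])]
math_ranking = [doc_id for doc_id, _ in rank(models, "math", docs)]
history_ranking = [doc_id for doc_id, _ in rank(models, "history", docs)]
```

In this toy data the two clusters prefer opposite features, so a single global linear model cannot satisfy both, while the per-cluster models can. Real query logs would be far less clean, and any gain from clustering would have to be measured on held-out data.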