Improving educational search and question answering

Ulusoy, Özgür
Bilkent University

Students use general web search engines (GSEs) as their primary research tool when looking for answers to school-related questions. Although GSEs are highly relevant for the general population, they may return results that fall outside the educational context. Social community question answering (CQ&A) websites, a rising trend, are the secondary choice for students who seek answers from peers online. We focus on discovering possible improvements to educational search by leveraging both information sources. The first part of our work involves Q&A websites. To gain contextual and behavioral insights, we extract the content of a commonly used educational Q&A website with a scraper that we implement. We analyze the content in terms of user behavior and try to understand to what extent educational Q&A differs from general-purpose Q&A. In the second part, we implement a classifier for educational questions. This classifier is built with an ensemble method that combines several standard learning algorithms with retrieval-based ones that utilize external resources. We also build a query expander to facilitate classification, and we further improve the classification using search engine results. In the third part, to find out whether search engine ranking can be improved in the education domain using the classification model, we collect and label a set of query results retrieved from a GSE. We propose five ad hoc methods to improve search ranking, based on the idea that the query-document category relation is an indicator of relevance. We evaluate these methods on various query sets and show that some of them significantly improve rankings in the education domain. In the last part, we focus on educational spell checking. In educational search systems, it is common for users to make spelling mistakes.
We analyze actual query logs of two commercial search engines in the education domain for spelling mistakes, using five well-known spell-correction tools that are not education-specific and lack terms used in the education field. We show that extending the spell-check dictionary of one of these tools, even with a small education-oriented word list, can improve the precision, recall, and F1 values of the spell checker.
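
As a rough illustration of the dictionary-extension idea, the following minimal Python sketch scores a toy dictionary-lookup spell checker before and after a few education terms are added. It is not the thesis implementation: the word lists, the query, the function names, and the choice to score only error detection (a flagged word counts as a positive) are all assumptions made for this example.

```python
def flag_misspelled(words, dictionary):
    """Return the set of words that are absent from the dictionary."""
    return {w for w in words if w.lower() not in dictionary}


def precision_recall_f1(flagged, truly_misspelled):
    """Score error detection: a flagged word is a predicted misspelling."""
    tp = len(flagged & truly_misspelled)   # flagged and really misspelled
    fp = len(flagged - truly_misspelled)   # flagged but actually correct
    fn = len(truly_misspelled - flagged)   # misspelled but not flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy data, invented for illustration only.
base_dictionary = {"what", "is", "the", "of", "a", "cell", "plant", "function"}
education_terms = {"mitochondria", "photosynthesis"}
query_words = ["what", "is", "the", "function", "of",
               "mitochondria", "photosynthesis", "plnt", "cel"]
truly_misspelled = {"plnt", "cel"}

# General-purpose dictionary: correct education terms are wrongly flagged.
base_scores = precision_recall_f1(
    flag_misspelled(query_words, base_dictionary), truly_misspelled)

# Extended dictionary: the education terms are now recognized.
extended_scores = precision_recall_f1(
    flag_misspelled(query_words, base_dictionary | education_terms),
    truly_misspelled)

print("base:", base_scores)
print("extended:", extended_scores)
```

In this toy setup the extension removes false positives on correctly spelled education terms, so only precision and F1 change. A checker that also proposes corrections could gain recall as well, since the added terms become available as correction candidates.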
