Browsing by Author "Özcan, R."
Now showing 1 - 8 of 8
Item Open Access
Characterizing, predicting, and handling web search queries that match very few or no results
(John Wiley & Sons, 2018) Sarıgil, Erdem; Altıngövde, I. S.; Blanco, R.; Cambazoğlu, B. B.; Özcan, R.; Ulusoy, Özgür
A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that users are dissatisfied at least 88.5% of the time after submitting a query that matches no results. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful in scenarios such as mobile search or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of bandwidth usage, power consumption, and/or monetary costs. Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) of up to 0.95.

Item Open Access
A financial cost metric for result caching
(ACM, 2013-07-08) Sazoğlu, Fethi Burak; Cambazoğlu, B. B.; Özcan, R.; Altıngövde, I. S.; Ulusoy, Özgür
Web search engines cache results of frequent and/or recent queries. Result caching strategies can be evaluated using different metrics, hit rate being the most well-known. Recent work takes the processing overhead of queries into account when evaluating the performance of result caching strategies and proposes cost-aware caching strategies.
In this paper, we propose a financial cost metric that goes one step further and also takes hourly electricity prices into account when computing the cost. We evaluate the most well-known static, dynamic, and hybrid result caching strategies under this new metric. Moreover, we propose a financial-cost-aware version of the well-known LRU strategy and show that it outperforms the original LRU strategy in terms of the financial cost metric. Copyright © 2013 ACM.

Item Open Access
How k-12 students search for learning?: analysis of an educational search engine log
(ACM, 2014-07) Usta, Arif; Altıngövde, İsmail Şengör; Vidinli, İ. B.; Özcan, R.; Ulusoy, Özgür
In this study, we analyze an educational search engine log to shed light on K-12 students' search behavior in a learning environment. We focus specifically on query, session, user, and click characteristics and compare the trends to the findings in the literature for general web search engines. Our analysis helps in understanding how students search with the purpose of learning in an educational vertical, and reveals new directions for improving search performance in the education domain. Copyright 2014 ACM.

Item Open Access
Improving educational web search for question-like queries through subject classification
(Elsevier, 2019) Yılmaz, Tolga; Özcan, R.; Altıngövde, İ. Ş.; Ulusoy, Özgür
Students use general web search engines as their primary source of research while trying to find answers to school-related questions. Although search engines are highly relevant for the general population, they may return results that are out of educational context. Social community question answering websites, another rising trend, are the second choice for students who try to get answers from peers online. We attempt to discover possible improvements in educational search by leveraging both of these information sources. For this purpose, we first implement a classifier for educational questions.
This classifier is built with an ensemble method that employs several regular learning algorithms and retrieval-based approaches that utilize external resources. We also build a query expander to facilitate classification. We further improve the classification using search engine results and obtain 83.5% accuracy. Although our work is entirely based on the Turkish language, the features could easily be mapped to other languages as well. In order to find out whether search engine ranking can be improved in the education domain using the classification model, we collect and label a set of query results retrieved from a general web search engine. We propose five ad-hoc methods to improve search ranking based on the idea that the query-document category relation is an indicator of relevance. We evaluate these methods for overall performance, for varying query lengths, and for factoid and non-factoid queries. We show that some of the methods significantly improve the rankings in the education domain.

Item Open Access
Learning to rank for educational search engines
(IEEE, 2021-04-27) Usta, Arif; Altıngövde, İ. S.; Özcan, R.; Ulusoy, Özgür
In this digital age, there is an abundance of online educational materials on public and proprietary platforms. To allow effective retrieval of educational resources, it is necessary to build keyword-based search engines over these collections. In modern Web search engines, high-quality rankings are obtained by applying machine learning techniques, known as learning to rank (LTR). In this article, our focus is on constructing machine-learned ranking models to be employed in a search engine in the education domain. Our contributions are threefold. First, we identify and analyze a rich set of features (including click-based and domain-specific ones) to be employed in educational search. LTR models trained on these features outperform various baselines based on ad-hoc retrieval functions and two neural models.
As our second contribution, we utilize domain knowledge to build query-dependent ranking models specialized for certain courses or education levels. Our experiments reveal that query-dependent models outperform both the general ranking model and other baselines. Finally, given the well-known importance of user clicks in LTR, our third contribution addresses singleton queries that lack any click information. To this end, we propose a new strategy to “propagate” click information from other, similar queries to the singleton queries. The proposed click propagation approach yields better ranking performance than the general ranking model and another baseline from the literature. Overall, these findings reveal that both the general and query-dependent ranking models, trained using LTR approaches, yield high effectiveness in educational search, which may ultimately lead to a better learning experience.

Item Open Access
Re-finding behaviour in educational search
(Springer, Cham, 2019) Usta, Arif; Altıngövde, İ. Ş.; Özcan, R.; Ulusoy, Özgür
One of the search tasks in Web search is re-finding: repeated search behaviour aimed at locating documents that users have previously visited. Although several works have addressed this phenomenon in the context of general-purpose Web search, the problem is usually overlooked for vertical search engines. In this work, we report re-finding and newfinding behaviours of users in an educational search context and compare the results with the findings in the literature for general-purpose Web search. Our analysis shows that the re-finding pattern of students differs drastically from that in Web search: only 26% of all queries indicate re-finding behaviour, compared to 40% in Web search.

Item Open Access
Strategies for setting time-to-live values in result caches
(ACM, 2013-10-11) Sazoğlu, Fethi Burak; Cambazoğlu, B. B.; Özcan, R.; Altıngövde, İsmail Şengör; Ulusoy, Özgür
In web query result caching, the staleness of queries is often bounded via a time-to-live (TTL) mechanism, which expires cached query results at some point in time. In this work, we evaluate the performance of three alternative TTL mechanisms: time-based TTL, frequency-based TTL, and click-based TTL. Moreover, we propose hybrid approaches obtained by pairwise combinations of these mechanisms. Our results indicate that combining time-based TTL with frequency-based TTL yields better performance (i.e., lower stale query traffic and less redundant computation) than using any single mechanism in isolation. Copyright is held by the owner/author(s).

Item Open Access
A “suggested” picture of web search in Turkish
(ACM, 2016-05) Sarıgil, E.; Yılmaz, O.; Altıngövde, İ. Ş.; Özcan, R.; Ulusoy, Özgür
Although query log analysis provides crucial insights into Web users’ search interests, conducting such analyses is almost impossible for some languages, as large-scale public query logs are quite scarce. In this study, we first survey the existing query collections in Turkish and discuss their limitations. Next, we adopt a novel strategy to obtain a set of Turkish queries using the query autocompletion services of the four major search engines and provide the first large-scale analysis of Web queries and their results in Turkish.