Browsing by Author "Cambazoğlu, B. B."
Characterizing web search queries that match very few or no results (ACM, 2012-11). Open Access.
Altıngövde, İ. Ş.; Blanco, R.; Cambazoğlu, B. B.; Özcan, Rıfat; Sarıgil, Erdem; Ulusoy, Özgür
Despite continuous efforts to improve web search quality, a non-negligible fraction of user queries end up with very few or even no matching results in leading web search engines. In this work, we provide a detailed characterization of such queries based on an analysis of a real-life query log. Our experimental setup allows us to characterize the queries with few/no results and compare the mechanisms employed by the major search engines in handling them.

Characterizing, predicting, and handling web search queries that match very few or no results (John Wiley & Sons, 2018). Open Access.
Sarıgil, Erdem; Altıngövde, I. S.; Blanco, R.; Cambazoğlu, B. B.; Özcan, R.; Ulusoy, Özgür
A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that users are dissatisfied after submitting a query that matches no results at least 88.5% of the time. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful for scenarios such as mobile or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of bandwidth usage, power consumption, and/or monetary costs. 
Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) up to 0.95.

A financial cost metric for result caching (ACM, 2013-07-08). Open Access.
Sazoğlu, Fethi Burak; Cambazoğlu, B. B.; Özcan, R.; Altıngövde, I. S.; Ulusoy, Özgür
Web search engines cache results of frequent and/or recent queries. Result caching strategies can be evaluated using different metrics, hit rate being the most well-known. Recent works take the processing overhead of queries into account when evaluating the performance of result caching strategies and propose cost-aware caching strategies. In this paper, we propose a financial cost metric that goes one step further and also takes hourly electricity prices into account when computing the cost. We evaluate the most well-known static, dynamic, and hybrid result caching strategies under this new metric. Moreover, we propose a financial-cost-aware version of the well-known LRU strategy and show that it outperforms the original LRU strategy in terms of the financial cost metric. Copyright © 2013 ACM.

Incorporating the surfing behavior of web users into PageRank (ACM, 2013-10-11). Open Access.
Ashyralyyev, Shatlyk; Cambazoğlu, B. B.; Aykanat, Cevdet
In large-scale commercial web search engines, estimating the importance of a web page is a crucial ingredient in ranking web search results. So far, to assess the importance of web pages, two different types of feedback have been taken into account, independent of each other: the feedback obtained from the hyperlink structure among the web pages (e.g., PageRank) or the web browsing patterns of users (e.g., BrowseRank). Unfortunately, both types of feedback have certain drawbacks. While the former lacks user preferences and is vulnerable to malicious intent, the latter suffers from sparsity and hence low web coverage. 
In this work, we combine these two types of feedback under a hybrid page ranking model in order to alleviate the above-mentioned drawbacks. Our empirical results indicate that the proposed model leads to better estimation of page importance according to an evaluation metric that relies on user click feedback obtained from web search query logs. We conduct all of our experiments in a realistic setting, using a very large-scale web page collection (around 6.5 billion web pages) and web browsing data (around two billion web page visits). Copyright is held by the owner/author(s).

A large-scale sentiment analysis for Yahoo! Answers (ACM, 2012). Open Access.
Küçüktunç, O.; Cambazoğlu, B. B.; Weber, I.; Ferhatosmanoğlu, Hakan
Sentiment extraction from online web documents has recently been an active research topic due to its potential use in commercial applications. By sentiment analysis, we refer to the problem of assigning a quantitative positive/negative mood to a short bit of text. Most studies in this area are limited to the identification of sentiments and do not investigate the interplay between sentiments and other factors. In this work, we use a sentiment extraction tool to investigate the influence of factors such as gender, age, education level, the topic at hand, or even the time of the day on sentiments in the context of a large online question answering site. We start our analysis by looking at direct correlations, e.g., we observe more positive sentiments on weekends, very neutral ones in the Science & Mathematics topic, a trend for younger people to express stronger sentiments, or people on military bases to ask the most neutral questions. We then extend this basic analysis by investigating how properties of the (asker, answerer) pair affect the sentiment present in the answer. Among other things, we observe a dependence on the pairing of some inferred attributes estimated from a user's ZIP code. 
We also show that the best answers differ in their sentiments from other answers, e.g., in the Business & Finance topic, best answers tend to have a more neutral sentiment than other answers. Finally, we report results for the task of predicting the attitude that a question will provoke in answers. We believe that understanding factors influencing the mood of users is not only interesting from a sociological point of view, but also has applications in advertising, recommendation, and search. Copyright 2012 ACM.

Strategies for setting time-to-live values in result caches (ACM, 2013-10-11). Open Access.
Sazoğlu, Fethi Burak; Cambazoğlu, B. B.; Özcan, R.; Altıngövde, İsmail Şengör; Ulusoy, Özgür
In web query result caching, the staleness of queries is often bounded via a time-to-live (TTL) mechanism, which expires the validity of cached query results at some point in time. In this work, we evaluate the performance of three alternative TTL mechanisms: time-based TTL, frequency-based TTL, and click-based TTL. Moreover, we propose hybrid approaches obtained by pairwise combination of these mechanisms. Our results indicate that combining time-based TTL with frequency-based TTL yields better performance (i.e., lower stale query traffic and less redundant computation) than using a particular mechanism in isolation. Copyright is held by the owner/author(s).
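
As an illustration of the hybrid idea in the last abstract, the following minimal Python sketch combines a time-based and a frequency-based expiry rule in one result cache. The abstract does not define the mechanisms in detail, so everything here is an assumption made for illustration: the class name `HybridTTLCache`, the parameters `max_age` and `max_hits`, and the rules themselves (an entry expires once it is at least `max_age` seconds old, or once it has been served `max_hits` times). It is not the authors' implementation.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    results: str      # cached result page for a query
    cached_at: float  # clock reading when the entry was stored
    hits: int = 0     # number of times the entry has been served


class HybridTTLCache:
    """Hypothetical result cache: an entry expires by age OR by hit count."""

    def __init__(self, max_age: float, max_hits: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_age = max_age    # assumed time-based TTL, in seconds
        self.max_hits = max_hits  # assumed frequency-based TTL, in served hits
        self.clock = clock        # injectable clock, handy for testing
        self._entries: Dict[str, Entry] = {}

    def put(self, query: str, results: str) -> None:
        # Storing (or refreshing) results resets both the age and the hit count.
        self._entries[query] = Entry(results, self.clock())

    def get(self, query: str) -> Optional[str]:
        entry = self._entries.get(query)
        if entry is None:
            return None  # plain cache miss
        too_old = self.clock() - entry.cached_at >= self.max_age
        too_frequent = entry.hits >= self.max_hits
        if too_old or too_frequent:
            # Either rule firing expires the entry; the caller is expected
            # to recompute the results and call put() again.
            del self._entries[query]
            return None
        entry.hits += 1
        return entry.results
```

In this sketch a `None` return means the caller should recompute the results at the backend and store them with `put`. Expiring on whichever rule fires first is only one possible way to combine the two mechanisms.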