Browsing by Author "Usta, Arif"

Open Access
DBTagger: multi-task learning for keyword mapping in NLIDBs using Bi-directional recurrent neural networks
(Association for Computing Machinery, 2021-01) Usta, Arif; Karakayali, Akifhan; Ulusoy, Özgür
Translating Natural Language Queries (NLQs) to Structured Query Language (SQL) in interfaces deployed in relational databases is a challenging task, which has been widely studied in the database community recently. Conventional rule-based systems utilize a series of solutions as a pipeline to deal with each step of this task, namely stop word filtering, tokenization, stemming/lemmatization, parsing, tagging, and translation. Recent works have mostly focused on the translation step, overlooking the earlier steps by using ad hoc solutions. In the pipeline, one of the most critical and challenging problems is keyword mapping: constructing a mapping between tokens in the query and relational database elements (tables, attributes, values, etc.). We define the keyword mapping problem as a sequence tagging problem, and propose a novel deep learning based supervised approach that utilizes POS tags of NLQs. Our proposed approach, called DBTagger (DataBase Tagger), is an end-to-end and schema-independent solution, which makes it practical for various relational databases. We evaluate our approach on eight different datasets, and report new state-of-the-art accuracy results, 92.4% on average. Our results also indicate that DBTagger is up to 10,000 times faster than its counterparts and scalable to bigger databases.
Open Access
How k-12 students search for learning?: analysis of an educational search engine log
(ACM, 2014-07) Usta, Arif; Altıngövde, İsmail Şengör; Vidinli, İ. B.; Özcan, R.; Ulusoy, Özgür
In this study, we analyze an educational search engine log to shed light on K-12 students' search behavior in a learning environment. We specifically focus on query, session, user and click characteristics and compare the trends to the findings in the literature for general web search engines. Our analysis helps to understand how students search with the purpose of learning in an educational vertical, and reveals new directions to improve the search performance in the education domain. Copyright 2014 ACM.
Open Access
Learning to rank for educational search engines
(IEEE, 2021-04-27) Usta, Arif; Altıngövde, İ. S.; Özcan, R.; Ulusoy, Özgür
In this digital age, there is an abundance of online educational materials in public and proprietary platforms. To allow effective retrieval of educational resources, it is necessary to build keyword-based search engines over these collections. In modern Web search engines, high-quality rankings are obtained by applying machine learning techniques, known as learning to rank (LTR). In this article, our focus is on constructing machine-learned ranking models to be employed in a search engine in the education domain. Our contributions are threefold. First, we identify and analyze a rich set of features (including click-based and domain-specific ones) to be employed in educational search. LTR models trained on these features outperform various baselines based on ad-hoc retrieval functions and two neural models. As our second contribution, we utilize domain knowledge to build query-dependent ranking models specialized for certain courses or education levels. Our experiments reveal that query-dependent models outperform both the general ranking model and other baselines. Finally, given the well-known importance of user clicks in LTR, our third contribution is for handling singleton queries without any click information. To this end, we propose a new strategy to “propagate” click information from other, similar queries to the singleton queries. The proposed click propagation approach yields a better ranking performance than the general ranking model and another baseline from the literature. Overall, these findings reveal that both the general and query-dependent ranking models, trained using LTR approaches, yield high effectiveness in educational search, which may ultimately lead to a better learning experience.
Open Access
Optimization of an educational search engine using learning to rank algorithms
(2015-09) Usta, Arif
Web search is one of the most popular internet activities among users. Due to high usage of search engines, there are huge data available about history of user search issues. Using query logs as a source of implicit feedback, researchers can learn useful patterns about general search behaviors. We employ a detailed query log analysis provided by a commercial educational vertical search engine. We compare the results of our query log analysis with the general web search characteristics. Due to difference in terms of search behavior between web users and students, we propose an educational ranking model using learning to rank algorithms to better reflect the search habits of the students in the educational domain to further enhance the search engine performance. We introduce novel features best suited to the educational domain. We show that our model including educational features outperforms two baseline models which are the original ranking of the commercial educational vertical search engine and the model constructed using the state of the art ranking functions, up to 14% and 11%, respectively. We also employ different learning to rank models for different clusters of queries and the results indicate that having models for each cluster of queries further enhances the performance of our proposed model. Specifically, the course of the query and the grade of the user issuing the query are good sources of feedback to have a better model in the educational domain. We propose a novel Propagation Algorithm to be used for queries having lower frequencies where information derived from query logs is not enough to exploit. We report that our model constructed using the features generated by our proposed algorithm performs better for singleton queries compared to both the educational learning to rank model we introduce and models learned with common features introduced in the literature.
Open Access
Re-finding behaviour in educational search
(Springer, Cham, 2019) Usta, Arif; Altıngövde, İ. Ş.; Özcan, R.; Ulusoy, Özgür
One of the search tasks in Web search is repeat search behaviour to find documents that users once visited, which is called re-finding. Although there have been several works in the context of general-purpose Web search addressing this phenomenon, the problem is usually overlooked for vertical search engines. In this work, we report re-finding and new-finding behaviours of users in an educational search context and compare results with the findings in the literature for general-purpose web search. Our analysis shows that the re-finding pattern of students differs drastically from web search, as only 26% of all queries indicate re-finding behaviour compared to 40% in Web search.
Open Access
Towards deeply intelligent interfaces in relational databases
(2021-08) Usta, Arif
Relational databases are one of the most popular and broadly utilized infrastructures to store data in a structured fashion. In order to retrieve data, users have to phrase their information need in Structured Query Language (SQL). SQL is a powerfully expressive and flexible language, yet one has to know the schema underlying the database on which the query is issued and to be familiar with SQL syntax, which is not trivial for casual users. To this end, we propose two different strategies to provide more intelligent user interfaces to relational databases by utilizing deep learning techniques. As the first study, we propose a solution for keyword mapping in Natural Language Interfaces to Databases (NLIDB), which aims to translate Natural Language Queries (NLQs) to SQL. We define the keyword mapping problem as a sequence tagging problem, and propose a novel deep learning based supervised approach that utilizes part-of-speech (POS) tags of NLQs. Our proposed approach, called DBTagger (DataBase Tagger), is an end-to-end and schema-independent solution. The query recommendation paradigm, a well-known strategy broadly utilized in Web search engines, is helpful for suggesting the queries of expert users to casual users to help them with their information need. As the second study, we propose Conquer, a CONtextual QUEry Recommendation algorithm on relational databases exploiting deep learning. First, we train local embeddings of a database using Graph Convolutional Networks to extract distributed representations of the tuples in latent space. We represent SQL queries with a semantic vector by averaging the embeddings of the tuples returned as a result of the query. We employ cosine similarity over the final representations of the queries to generate recommendations, as a Witness-Based approach. Our results show that Conquer outperforms state-of-the-art techniques in classification accuracy of database rows, used as an indicator of embedding quality.
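The query representation step described in this abstract (average the embeddings of the returned tuples, then compare queries by cosine similarity) can be illustrated with a minimal sketch. This is not the thesis implementation: the tuple embeddings below are hand-made placeholders rather than vectors learned with Graph Convolutional Networks, and the names `mean_vector`, `cosine`, `recommend`, `EMB`, and `LOGGED` are hypothetical.

```python
import math

def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(target_tuples, logged_queries, tuple_embeddings, k=1):
    """Rank logged queries by similarity to the target query.

    Each query is represented by the mean embedding of the tuples it
    returns, as the abstract describes; ranking uses cosine similarity.
    """
    target = mean_vector([tuple_embeddings[t] for t in target_tuples])
    scored = []
    for name, tuples in logged_queries.items():
        vec = mean_vector([tuple_embeddings[t] for t in tuples])
        scored.append((cosine(target, vec), name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Placeholder data (assumption): 2-d vectors standing in for learned
# tuple embeddings, and result sets of two previously issued queries.
EMB = {"t1": [1.0, 0.0], "t2": [0.8, 0.2], "t3": [0.0, 1.0]}
LOGGED = {"q_a": ["t1", "t2"], "q_b": ["t3"]}
```

Calling `recommend(["t1"], LOGGED, EMB, k=1)` ranks `q_a` first, because its averaged result vector points in nearly the same direction as the embedding of `t1`.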
Open Access
Towards interactive data exploration
(Springer, 2019) Binnig, C.; Basık, Fuat; Buratti, B.; Çetintemel, U.; Chung, Y.; Crotty, A.; Cousins, C.; Ebert, D.; Eichmann, P.; Galakatos, A.; Hattasch, B.; Ilkhechi, A.; Kraska, T.; Shang, Z.; Tromba, I.; Usta, Arif; Utama, P.; Upfal, E.; Wang, L.; Weir, N.; Zeleznik, R.; Zgraggen, E.; Castellanos, M.; Chrysanthis, P.; Pelechrinis, K.
Enabling interactive visualization over new datasets at “human speed” is key to democratizing data science and maximizing human productivity. In this work, we first argue why existing analytics infrastructures do not support interactive data exploration and outline the challenges and opportunities of building a system specifically designed for interactive data exploration. Furthermore, we present the results of building IDEA, a new type of system for interactive data exploration that is specifically designed to integrate seamlessly with existing data management landscapes and allow users to explore their data instantly without expensive data preparation costs. Finally, we discuss other important considerations for interactive data exploration systems including benchmarking, natural language interfaces, as well as interactive machine learning.