Transqlate: translating enriched natural language sentences to SQL queries using transformers

buir.advisor: Ulusoy, Özgür
dc.contributor.author: Farshkar Azari, Mousa
dc.date.accessioned: 2022-09-13T11:26:36Z
dc.date.available: 2022-09-13T11:26:36Z
dc.date.copyright: 2022-09
dc.date.issued: 2022-09
dc.date.submitted: 2022-09-12
dc.department: Department of Computer Engineering [en_US]
dc.description: Cataloged from PDF version of article. [en_US]
dc.description: Thesis (Master's): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2022. [en_US]
dc.description: Includes bibliographical references (leaves 41-46). [en_US]
dc.description.abstract: A large amount of the structured data owned by enterprises is stored in Relational Database Management Systems, and a working knowledge of Structured Query Language (SQL) is required to extract the desired information from relational databases. Many non-expert users need to access information in databases but lack the necessary skills or knowledge. Even some expert users may find it challenging to write complex SQL queries when they do not know the schema underlying the database. To this end, a considerable amount of recent research has addressed the translation of queries formulated by users in natural language into SQL queries to be processed by database systems. In this thesis, we present deep learning strategies for natural language to SQL translation. We propose TranSQLate, a novel method that enriches the input sequences to provide more effective Natural Language Interface to Database (NLIDB) systems. We apply our strategies to the vanilla Transformer and T5 transformer models in three different ways. With enriched inputs, we achieve improvements of up to 16.7% in translation accuracy, 6.5 points in SacreBLEU score, and 18 points in n-gram precision compared to the non-enriched versions. Our method surpasses the strategies used in the state-of-the-art systems NALIR, TEMPLAR, and DBTagger in terms of translation accuracy on the IMDB, Scholar, and Yelp datasets. [en_US]
dc.description.degree: M.S. [en_US]
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2022-09-13T11:26:36Z No. of bitstreams: 1 B161280.pdf: 1699704 bytes, checksum: 6fdc76bc98209d29db7d676d97504b27 (MD5) [en]
dc.description.provenance: Made available in DSpace on 2022-09-13T11:26:36Z (GMT). No. of bitstreams: 1 B161280.pdf: 1699704 bytes, checksum: 6fdc76bc98209d29db7d676d97504b27 (MD5) Previous issue date: 2022-09 [en]
dc.description.statementofresponsibility: by Mousa Farshkar Azari [en_US]
dc.embargo.release: 2023-03-12
dc.format.extent: x, 46 leaves : charts ; 30 cm. [en_US]
dc.identifier.itemid: B161280
dc.identifier.uri: http://hdl.handle.net/11693/110500
dc.language.iso: English [en_US]
dc.publisher: Bilkent University [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Structured query language [en_US]
dc.subject: Relational database systems [en_US]
dc.subject: Natural language interface to databases [en_US]
dc.subject: Neural networks [en_US]
dc.subject: Deep learning [en_US]
dc.title: Transqlate: translating enriched natural language sentences to SQL queries using transformers [en_US]
dc.title.alternative: Transqlate: zenginleştirilmiş doğal dil cümlelerini transformerler kullanarak SQL sorgularına çevirme [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name:
B161280.pdf
Size:
1.62 MB
Format:
Adobe Portable Document Format
Description:
Full printable version
License bundle
Name:
license.txt
Size:
1.69 KB
Format:
Item-specific license agreed upon to submission
Description: