Noun phrase chunker for Turkish using dependency parser

Date

2010

Editor(s)

Advisor

Ulusoy, Özgür

Supervisor

Co-Advisor

Co-Supervisor

Instructor

Source Title

Print ISSN

Electronic ISSN

Publisher

Volume

Issue

Pages

Language

English

Type

Journal Title

Journal ISSN

Volume Title

Attention Stats
Usage Stats
6
views
11
downloads

Series

Abstract

Noun phrase chunking is a sub-category of shallow parsing that can be used for many natural language processing tasks. In this thesis, we propose a noun phrase chunker system for Turkish texts. We use a weighted constraint dependency parser to represent the relationship between sentence components and to determine noun phrases. The dependency parser uses a set of hand-crafted rules which can combine morphological and semantic information for constraints. The rules are suitable for handling complex noun phrase structures because of their flexibility. The developed dependency parser can be easily used for shallow parsing of all phrase types by changing the employed rule set. The lack of reliable human tagged datasets is a significant problem for natural language studies about Turkish. Therefore, we constructed the first noun phrase dataset for Turkish. According to our evaluation results, our noun phrase chunker gives promising results on this dataset. The correct morphological disambiguation of words is required for the correctness of the dependency parser. Therefore, in this thesis, we propose a hybrid morphological disambiguation technique which combines statistical information, hand-crafted grammar rules, and transformation based learning rules. We have also constructed a dataset for testing the performance of our disambiguation system. According to tests, the disambiguation system is highly effective.

Course

Other identifiers

Book Title

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Citation

Published Version (Please cite this version)