Automatic rule learning exploiting morphological features for named entity recognition in Turkish

Date

2011

Authors

Tatar, S.
Cicekli, I.

Source Title

Journal of Information Science

Print ISSN

0165-5515

Volume

37

Issue

2

Pages

137 - 151

Language

English

Abstract

Named entity recognition (NER) is one of the basic tasks in the automatic extraction of information from natural language texts. In this paper, we describe an automatic rule learning method that exploits different features of the input text to identify the named entities it contains. Moreover, we explore the use of morphological features for extracting named entities from Turkish texts. We believe that the developed system can also be used for other agglutinative languages. The paper also provides a comprehensive overview of the field by reviewing the NER research literature. We conducted our experiments on the TurkIE dataset, a corpus of articles collected from different Turkish newspapers. Our method achieved an average F-score of 91.08% on the dataset. The results of the comparative experiments demonstrate that the developed technique is successfully applicable to automatic NER and that exploiting morphological features can significantly improve NER for Turkish, an agglutinative language. © The Author(s) 2011.