Statistical morphological disambiguation for agglutinative languages

Hakkani-Tür, D. Z.; Oflazer, K.; Tür, G.

dc.citation.epage	410	en_US
dc.citation.issueNumber	4	en_US
dc.citation.spage	381	en_US
dc.citation.volumeNumber	36	en_US
dc.contributor.author	Hakkani-Tür, D. Z.	en_US
dc.contributor.author	Oflazer, K.	en_US
dc.contributor.author	Tür, G.	en_US
dc.date.accessioned	2019-02-01T12:13:18Z
dc.date.available	2019-02-01T12:13:18Z	en_US
dc.date.issued	2002	en_US
dc.department	Department of Computer Engineering	en_US
dc.description.abstract	We present statistical models for morphological disambiguation in agglutinative languages, with a specific application to Turkish. Turkish presents an interesting problem for statistical models because the potential tag set size is very large, owing to its productive derivational morphology. We propose to handle this by breaking up the morphosyntactic tags into inflectional groups, each of which contains the inflectional features for each (intermediate) derived form. Our statistical models score the probability of each morphosyntactic tag by considering statistics over the individual inflectional groups and surface roots in trigram models. Among the four models that we have developed and tested, the simplest model, which ignores the local morphotactics within words, performs the best. Our best trigram model achieves 93.95% accuracy on our test data, getting all the morphosyntactic and semantic features correct. If we are interested only in syntactically relevant features and ignore a very small set of semantic features, the accuracy increases to 95.07%.	en_US
dc.identifier.doi	10.1023/A:1020271707826	en_US
dc.identifier.issn	0010-4817	en_US
dc.identifier.issn	1572-8412	en_US
dc.identifier.uri	http://hdl.handle.net/11693/48722	en_US
dc.language.iso	English	en_US
dc.publisher	Springer	en_US
dc.publisher	Kluwer Academic Publishers	en_US
dc.relation.isversionof	https://doi.org/10.1023/A:1020271707826	en_US
dc.source.title	Computers and the Humanities	en_US
dc.subject	Agglutinative Languages	en_US
dc.subject	Morphological Disambiguation	en_US
dc.subject	N-Gram Language Models	en_US
dc.subject	Statistical Natural Language Processing	en_US
dc.subject	Turkish	en_US
dc.title	Statistical morphological disambiguation for agglutinative languages	en_US
dc.type	Article	en_US
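
The abstract describes breaking each morphosyntactic tag into inflectional groups. A minimal sketch of that split follows. It assumes that derivation boundaries in an analysis string are marked with `^DB` and that features are joined with `+`. Neither convention is stated in this record, and the example analysis string and the function name `split_into_igs` are purely illustrative.

```python
def split_into_igs(analysis, boundary="^DB"):
    """Split 'root+Feat...^DB+Feat...' into (root, [inflectional groups]).

    Assumption: '^DB' marks a derivation boundary and '+' joins features.
    """
    parts = analysis.split(boundary)
    # The first segment carries the root followed by the first group's features.
    root, _, first_group = parts[0].partition("+")
    # Later segments start with '+', which is dropped here.
    groups = [first_group] + [p.lstrip("+") for p in parts[1:]]
    return root, groups


# Hypothetical analysis string, used only to exercise the function.
example = "sağlam+Adj^DB+Verb+Become^DB+Verb+Caus+Pos^DB+Noun+Inf+A3sg+Pnon+Nom"
root, igs = split_into_igs(example)
print(root)
print(igs)
```

Under these assumptions, statistics can be collected over the individual groups and roots rather than over whole tags, which keeps the set of modelled units much smaller than the set of possible full tags.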

Files

Original bundle

Name: Statistical_morphological_disambiguation_for_agglutinative_languages.pdf
Size: 181.91 KB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Scholarly Publications - Computer Engineering