Automatic categorization of Ottoman literary texts by poet and time period
dc.citation.epage | 57 | en_US |
dc.citation.spage | 51 | en_US |
dc.contributor.author | Can, Ethem F. | en_US |
dc.contributor.author | Can, Fazlı | en_US |
dc.contributor.author | Duygulu, Pınar | en_US |
dc.contributor.author | Kalpaklı, Mehmet | en_US |
dc.date.accessioned | 2016-02-08T12:11:39Z | |
dc.date.available | 2016-02-08T12:11:39Z | |
dc.date.issued | 2012 | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.department | Department of History | en_US |
dc.description | Conference name: 26th International Symposium on Computer and Information Sciences | en_US |
dc.description.abstract | Millions of manuscripts and printed texts are available in the Ottoman language. The automatic categorization of Ottoman texts would make these documents much more accessible in various applications ranging from historical investigations to literary analyses. In this work, we use transcribed versions of Ottoman literary texts in the Latin alphabet and show that it is possible to develop effective Automatic Text Categorization techniques that can be applied to the Ottoman language. For this purpose, we use two fundamentally different machine learning methods, Naïve Bayes and Support Vector Machines, and employ four style markers: most frequent words, token lengths, two-word collocations, and type lengths. In the experiments, we use the collected works (divans) of ten different poets: two poets from each of five different hundred-year periods, ranging from the 15th to the 19th century. The experimental results show that it is possible to obtain highly accurate classifications in terms of poet and time period. Using statistical analysis, we are able to recommend which style marker and machine learning method should be used in future studies. © 2012 Springer-Verlag London Limited. | en_US |
dc.description.provenance | Made available in DSpace on 2016-02-08T12:11:39Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2012 | en |
dc.identifier.doi | 10.1007/978-1-4471-2155-8_6 | en_US |
dc.identifier.doi | 10.1007/978-1-4471-2155-8 | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/28118 | en_US |
dc.language.iso | English | en_US |
dc.publisher | Springer, London | en_US |
dc.relation.isversionof | https://doi.org/10.1007/978-1-4471-2155-8_6 | en_US |
dc.relation.isversionof | https://doi.org/10.1007/978-1-4471-2155-8 | en_US |
dc.source.title | Computer and Information Sciences II | en_US |
dc.subject | Automatic categorization | en_US |
dc.subject | Automatic text categorization | en_US |
dc.subject | Highly accurate | en_US |
dc.subject | Literary analysis | en_US |
dc.subject | Literary texts | en_US |
dc.subject | Machine learning methods | en_US |
dc.subject | Printed texts | en_US |
dc.subject | Style markers | en_US |
dc.subject | Biographies | en_US |
dc.subject | Information science | en_US |
dc.subject | Text processing | en_US |
dc.subject | Learning systems | en_US |
dc.title | Automatic categorization of Ottoman literary texts by poet and time period | en_US |
dc.type | Conference Paper | en_US |
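
The four style markers named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tokenization rule, and the reading of "token lengths" as word lengths over all running words and "type lengths" as word lengths over distinct words are assumptions made for illustration. The sample string is a placeholder, not text from the studied divans.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: a token is a maximal run of Unicode letters, lowercased.
    # Python's default lower() does not apply Turkish dotted/dotless i rules.
    return re.findall(r"[^\W\d_]+", text.lower())


def style_markers(text, top_k=3):
    """Return simple counts for the four style markers (hypothetical helper)."""
    tokens = tokenize(text)
    types = set(tokens)  # distinct word forms
    return {
        # Most frequent words: the top_k words by raw count.
        "most_frequent_words": Counter(tokens).most_common(top_k),
        # Token lengths: how many running words have each length.
        "token_lengths": Counter(len(t) for t in tokens),
        # Type lengths: how many distinct words have each length.
        "type_lengths": Counter(len(t) for t in types),
        # Two-word collocations: counts of adjacent word pairs.
        "collocations": Counter(zip(tokens, tokens[1:])),
    }


if __name__ == "__main__":
    sample = "gül ü bülbül gül ü gülşen"  # placeholder text only
    for name, value in style_markers(sample, top_k=2).items():
        print(name, value)
```

Counts like these would typically be normalized and turned into feature vectors before being passed to a classifier such as Naïve Bayes or a Support Vector Machine; that step is omitted here.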
Files
Original bundle
- Name: Automatic categorization of ottoman literary texts by poet and time period.pdf
- Size: 1.64 MB
- Format: Adobe Portable Document Format
- Description: Full printable version