Automatic categorization of Ottoman literary texts by poet and time period

dc.citation.epage: 57 (en_US)
dc.citation.spage: 51 (en_US)
dc.contributor.author: Can, Ethem F. (en_US)
dc.contributor.author: Can, Fazlı (en_US)
dc.contributor.author: Duygulu, Pınar (en_US)
dc.contributor.author: Kalpaklı, Mehmet (en_US)
dc.date.accessioned: 2016-02-08T12:11:39Z
dc.date.available: 2016-02-08T12:11:39Z
dc.date.issued: 2012 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.department: Department of History (en_US)
dc.description: Conference name: 26th International Symposium on Computer and Information Sciences (en_US)
dc.description.abstract: Millions of manuscripts and printed texts are available in the Ottoman language. The automatic categorization of Ottoman texts would make these documents much more accessible in various applications ranging from historical investigations to literary analyses. In this work, we use transcribed versions of Ottoman literary texts in the Latin alphabet and show that it is possible to develop effective Automatic Text Categorization techniques that can be applied to the Ottoman language. For this purpose, we use two fundamentally different machine learning methods, Naïve Bayes and Support Vector Machines, and employ four style markers: most frequent words, token lengths, two-word collocations, and type lengths. In the experiments, we use the collected works (divans) of ten different poets: two poets from each of five hundred-year periods ranging from the 15th to the 19th century. The experimental results show that it is possible to obtain highly accurate classifications in terms of poet and time period. By using statistical analysis, we are able to recommend which style marker and machine learning method should be used in future studies. © 2012 Springer-Verlag London Limited. (en_US)
dc.description.provenance: Made available in DSpace on 2016-02-08T12:11:39Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2012 (en)
dc.identifier.doi: 10.1007/978-1-4471-2155-8_6 (en_US)
dc.identifier.doi: 10.1007/978-1-4471-2155-8 (en_US)
dc.identifier.uri: http://hdl.handle.net/11693/28118 (en_US)
dc.language.iso: English (en_US)
dc.publisher: Springer, London (en_US)
dc.relation.isversionof: https://doi.org/10.1007/978-1-4471-2155-8_6 (en_US)
dc.relation.isversionof: https://doi.org/10.1007/978-1-4471-2155-8 (en_US)
dc.source.title: Computer and Information Sciences II (en_US)
dc.subject: Automatic categorization (en_US)
dc.subject: Automatic text categorization (en_US)
dc.subject: Highly accurate (en_US)
dc.subject: Literary analysis (en_US)
dc.subject: Literary texts (en_US)
dc.subject: Machine learning methods (en_US)
dc.subject: Printed texts (en_US)
dc.subject: Style markers (en_US)
dc.subject: Biographies (en_US)
dc.subject: Information science (en_US)
dc.subject: Text processing (en_US)
dc.subject: Learning systems (en_US)
dc.title: Automatic categorization of Ottoman literary texts by poet and time period (en_US)
dc.type: Conference Paper (en_US)

Files

Original bundle

Name: Automatic categorization of ottoman literary texts by poet and time period.pdf
Size: 1.64 MB
Format: Adobe Portable Document Format
Description: Full printable version