      Automatic categorization of Ottoman literary texts by poet and time period

      Author
      Can, Ethem F.
      Can, Fazlı
      Duygulu, Pınar
      Kalpaklı, Mehmet
      Date
      2012
      Source Title
      Computer and Information Sciences II
      Publisher
      Springer, London
      Pages
      51 - 57
      Language
      English
      Type
      Conference Paper
      Book Chapter
      Abstract
      Millions of manuscripts and printed texts are available in the Ottoman language. The automatic categorization of Ottoman texts would make these documents much more accessible in applications ranging from historical investigations to literary analyses. In this work, we use transcribed versions of Ottoman literary texts in the Latin alphabet and show that it is possible to develop effective Automatic Text Categorization techniques that can be applied to the Ottoman language. For this purpose, we use two fundamentally different machine learning methods, Naïve Bayes and Support Vector Machines, and employ four style markers: most frequent words, token lengths, two-word collocations, and type lengths. In the experiments, we use the collected works (divans) of ten different poets: two poets from each of five hundred-year periods ranging from the 15th to the 19th century. The experimental results show that it is possible to obtain highly accurate classifications in terms of poet and time period. Using statistical analysis, we are able to recommend which style marker and machine learning method to use in future studies. © 2012 Springer-Verlag London Limited.
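      As a rough illustration of two of the style markers named in the abstract (token lengths and type lengths) combined with a Naïve Bayes classifier, a minimal Python sketch might look like the following. It is not the authors' implementation: the whitespace tokenizer, the function and class names, the add-one smoothing, and the toy training strings are all assumptions made for this example, and the strings are placeholders rather than Ottoman text.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation.

    Hypothetical tokenizer; the paper's preprocessing is not described here.
    """
    tokens = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    return [t for t in tokens if t]


def token_length_features(text):
    """Token lengths: every occurrence of a word contributes its length."""
    return Counter(f"len={len(t)}" for t in tokenize(text))


def type_length_features(text):
    """Type lengths: each distinct word contributes its length once."""
    return Counter(f"len={len(t)}" for t in set(tokenize(text)))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed choice)."""

    def fit(self, feature_counts, labels):
        self.class_docs = Counter(labels)
        self.class_feats = {label: Counter() for label in self.class_docs}
        for feats, label in zip(feature_counts, labels):
            self.class_feats[label].update(feats)
        self.vocab = set()
        for counts in self.class_feats.values():
            self.vocab.update(counts)
        return self

    def predict(self, feats):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_docs.items():
            counts = self.class_feats[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each observed feature.
            score = math.log(n_docs / total_docs)
            for feature, k in feats.items():
                score += k * math.log((counts[feature] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy placeholder data: "poet A" uses short words, "poet B" long ones.
train = [
    ("ab cd ef gh", "A"),
    ("ij kl mn", "A"),
    ("abcdefgh ijklmnop", "B"),
    ("qrstuvwx yzabcdef ghijklmn", "B"),
]
model = NaiveBayes().fit(
    [token_length_features(text) for text, _ in train],
    [label for _, label in train],
)
print(model.predict(token_length_features("op qr st")))
print(model.predict(token_length_features("uvwxyzab")))
```

      Swapping `token_length_features` for `type_length_features` in the calls above changes only the style marker, which is the kind of comparison the abstract describes; most frequent words and two-word collocations could be added as further feature functions with the same shape.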
      Keywords
      Automatic categorization
      Automatic text categorization
      Highly accurate
      Literary analysis
      Literary texts
      Machine learning methods
      Printed texts
      Style markers
      Biographies
      Information science
      Text processing
      Learning systems
      Permalink
      http://hdl.handle.net/11693/28118
      Published Version (Please cite this version)
      https://doi.org/10.1007/978-1-4471-2155-8_6
      https://doi.org/10.1007/978-1-4471-2155-8
      Collections
      • Department of Computer Engineering 1368
      • Department of History 260