Authorship attribution: performance of various features and classification methods
dc.citation.epage | 162 | en_US |
dc.citation.spage | 158 | en_US |
dc.contributor.author | Bozkurt, İlker Nadi | en_US |
dc.contributor.author | Bağlıoğlu, Özgür | en_US |
dc.contributor.author | Uyar, Erkan | en_US |
dc.coverage.spatial | Ankara, Turkey | |
dc.date.accessioned | 2016-02-08T11:41:59Z | |
dc.date.available | 2016-02-08T11:41:59Z | |
dc.date.issued | 2007-11 | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.description | Date of Conference: 7-9 Nov. 2007 | |
dc.description | Conference name: 22nd International Symposium on Computer and Information Sciences, 2007 | |
dc.description.abstract | Authorship attribution is the process of determining the writer of a document. Many classification techniques have been applied to this task in the literature. In this paper, we explore information retrieval methods such as tf-idf representations with support vector machines, as well as parametric and nonparametric methods with supervised and unsupervised (clustering) classification techniques, for authorship attribution. We performed experiments with articles gathered from the Turkish newspaper Milliyet, applying different classifiers to different features extracted from these texts, and combined the results to improve our success rates. We identified which classifiers give satisfactory results on which feature sets. According to our experiments, the success rates change dramatically across combinations; the best among them are the support vector classifier with bag of words and the Gaussian classifier with function words. ©2007 IEEE. | en_US |
dc.description.provenance | Made available in DSpace on 2016-02-08T11:41:59Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2007 | en_US |
dc.identifier.doi | 10.1109/ISCIS.2007.4456854 | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/27016 | en_US |
dc.language.iso | English | en_US |
dc.publisher | IEEE | en_US |
dc.relation.isversionof | https://doi.org/10.1109/ISCIS.2007.4456854 | en_US |
dc.source.title | 22nd International Symposium on Computer and Information Sciences, ISCIS 2007 - Proceedings | en_US |
dc.subject | Authorship attribution | en_US |
dc.subject | Classifier feature relationship | en_US |
dc.subject | Feature reduction | en_US |
dc.subject | Parametric nonparametric classifiers | en_US |
dc.subject | Text categorization | en_US |
dc.subject | Classifiers | en_US |
dc.subject | Communication | en_US |
dc.subject | Cybernetics | en_US |
dc.subject | Experiments | en_US |
dc.subject | Image retrieval | en_US |
dc.subject | Information management | en_US |
dc.subject | Information retrieval | en_US |
dc.subject | Information science | en_US |
dc.subject | Information services | en_US |
dc.subject | Learning systems | en_US |
dc.subject | Search engines | en_US |
dc.subject | Support vector machines | en_US |
dc.subject | Vectors | en_US |
dc.subject | Bag of words | en_US |
dc.subject | Classification methods | en_US |
dc.subject | Classification techniques | en_US |
dc.subject | Feature sets | en_US |
dc.subject | Function words | en_US |
dc.subject | Gaussian | en_US |
dc.subject | International symposium | en_US |
dc.subject | Non parametric methods | en_US |
dc.subject | Retrieval methods | en_US |
dc.subject | Support vector classifier (SVC) | en_US |
dc.subject | Turkish | en_US |
dc.subject | Vector machines | en_US |
dc.subject | Classification (of information) | en_US |
dc.title | Authorship attribution: performance of various features and classification methods | en_US |
dc.type | Conference Paper | en_US |
Files
Original bundle
- Name: Authorship attribution Performance of various features and classification methods.pdf
- Size: 114.56 KB
- Format: Adobe Portable Document Format
- Description: Full printable version