Authorship attribution: performance of various features and classification methods
Bozkurt, İlker Nadi
22nd International Symposium on Computer and Information Sciences, ISCIS 2007 - Proceedings
158 - 162
Authorship attribution is the process of determining the writer of a document. The literature offers many classification techniques for this task. In this paper we explore information retrieval methods, such as the tf-idf structure with support vector machines, and parametric and nonparametric methods with supervised and unsupervised (clustering) classification techniques in authorship attribution. We performed various experiments with articles gathered from the Turkish newspaper Milliyet. We ran different classifiers on different features extracted from these texts and combined the results to improve our success rates. We identified which classifiers give satisfactory results on which feature sets. According to the experiments, success rates change dramatically across combinations; however, the best among them are the support vector classifier with bag of words and the Gaussian classifier with function words. ©2007 IEEE.
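As a rough illustration of the "Gaussian with function words" combination named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a diagonal-covariance Gaussian per author, equal priors for all authors, relative function-word frequencies as features, and a small variance floor. The word list, the English toy texts, and the names `FUNCTION_WORDS`, `features`, `fit`, and `predict` are all hypothetical; the paper's corpus is Turkish and its actual feature list is not given here.

```python
import math
from collections import Counter

# Hypothetical function-word list (illustrative only; not from the paper).
FUNCTION_WORDS = ["the", "and", "of", "but", "however"]

def features(text):
    """Relative frequency of each function word in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def fit(samples):
    """Estimate per-author feature means and variances from (author, text) pairs."""
    by_author = {}
    for author, text in samples:
        by_author.setdefault(author, []).append(features(text))
    model = {}
    for author, rows in by_author.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Assumed variance floor (1e-3) keeps the density finite for constant features.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-3
                     for col, m in zip(columns, means)]
        model[author] = (means, variances)
    return model

def log_likelihood(x, means, variances):
    """Log density of x under independent (diagonal-covariance) Gaussians."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, means, variances))

def predict(model, text):
    """Return the author whose Gaussian gives the text the highest likelihood."""
    x = features(text)
    return max(model, key=lambda author: log_likelihood(x, *model[author]))

# Toy training data, invented for this sketch.
train = [
    ("A", "the cat and the dog and the bird"),
    ("A", "the sun and the moon and the sea"),
    ("B", "however prices rose but wages fell"),
    ("B", "but growth slowed however exports held"),
]
model = fit(train)
pred_a = predict(model, "the river and the hill and the road")
pred_b = predict(model, "however rates rose but demand fell")
```

On this toy data, `pred_a` is `"A"` and `pred_b` is `"B"`, because each test text reproduces the function-word profile of one author's training texts. A real experiment would need a much larger corpus and a held-out evaluation set.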
Classifier feature relationship
Parametric nonparametric classifiers
Support vector machines
Bag of words
Non parametric methods
Support vector classifier (SVC)
Classification (of information)
Published Version (Please cite this version): https://doi.org/10.1109/ISCIS.2007.4456854