First large-scale information retrieval experiments on Turkish texts
Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Pages 627-628
Please cite this item using this persistent URL: http://hdl.handle.net/11693/27221
We present the results of the first large-scale Turkish information retrieval experiments performed on a TREC-like test collection. The test bed, created for this study, contains 95.5 million words, 408,305 documents, and 72 ad hoc queries, and is about 800 MB in size. All documents come from the Turkish newspaper Milliyet. We implement and apply stemmers ranging from simple to sophisticated, together with various query-document matching functions, and show that truncating words at a prefix length of 5 creates an effective retrieval environment in Turkish. However, a lemmatizer-based stemmer provides significantly better effectiveness over a variety of matching functions.
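The fixed-prefix truncation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`turkish_lower`, `truncate_stem`, `stem_tokens`), the whitespace tokenization, and the lowercasing step are assumptions made only for illustration. The prefix length of 5 is the only detail taken from the abstract.

```python
from typing import List


def turkish_lower(text: str) -> str:
    # Assumption: apply Turkish casing rules before stemming.
    # Python's str.lower() maps "I" to "i", but in Turkish the lowercase
    # of "I" is dotless "ı" and the lowercase of "İ" is "i".
    return text.replace("I", "ı").replace("İ", "i").lower()


def truncate_stem(word: str, prefix_len: int = 5) -> str:
    # Fixed-prefix "stemming": keep only the first prefix_len characters.
    # Words shorter than prefix_len are returned unchanged.
    return word[:prefix_len]


def stem_tokens(text: str, prefix_len: int = 5) -> List[str]:
    # Hypothetical pipeline: lowercase, split on whitespace, truncate.
    return [truncate_stem(tok, prefix_len) for tok in turkish_lower(text).split()]


if __name__ == "__main__":
    # Two inflected forms of "kitap" (book) collapse to the same index term.
    print(stem_tokens("Kitaplar kitaplarımızdan"))
```

Because Turkish is agglutinative and suffixing, inflected forms of a word tend to share their leading characters, which is why such a simple truncation can conflate them to one index term. The lemmatizer-based stemmer that the abstract reports as more effective is not sketched here.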
Showing items related by title, author, creator and subject.
An analysis of manipulated information and respective alternative costs in information systems and in decision making structures. Güvenen O.; Öztürk, M.H. (International Institute of Informatics and Systemics, IIIS, 2006) Today Information Technologies create base for the most important decision support systems for the practices in academia, business and politics. The effectiveness and success of operations that are supported by information ...
Altingövde İ.S.; Özel, S.A.; Ulusoy Ö.; Özsoyoğlu G.; Özsoyoğlu, Z.M. (Springer Verlag, 2001) This paper deals with the problem of modeling web information resources using expert knowledge and personalized user information, and querying them in terms of topics and topic relationships. We propose a model for web ...
Ali, S.A.; Ince, E.A. (2007) The statistical characteristics of impulsive noise differ greatly from those of Gaussian noise. Hence, the performance of conventional decoders, optimized for additive white Gaussian noise (AWGN) channels is not promising ...