Information retrieval on Turkish texts

Date
2008-02
Authors
Can, F.
Kocberber, S.
Balcik, E.
Kaynak, C.
Ocalan, H. C.
Vursavas, O. M.
Source Title
Association for Information Science and Technology. Journal
Print ISSN
2330-1635
Publisher
John Wiley & Sons, Inc.
Volume
59
Issue
3
Pages
407 - 421
Language
English
Abstract

In this study, we investigate information retrieval (IR) on Turkish texts using a large-scale test collection that contains 408,305 documents and 72 ad hoc queries. We examine the effects of several stemming options and query-document matching functions on retrieval performance. We show that a simple word truncation approach, a word truncation approach that uses language-dependent corpus statistics, and an elaborate lemmatizer-based stemmer provide similar retrieval effectiveness in Turkish IR. We also investigate the effects of a range of search conditions on retrieval performance; these include scalability issues, query and document length effects, and the use of a stop-word list in indexing. © 2007 Wiley Periodicals, Inc.
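
As an illustration only, the simple word truncation approach named in the abstract can be sketched as keeping a fixed-length prefix of each word. The function names, the default prefix length of 5, and the whitespace tokenization below are assumptions made for this sketch; the abstract does not specify them. The Turkish-aware lowercasing step is also an addition, included because Python's default `str.lower()` does not map `I` to dotless `ı`.

```python
def turkish_lower(word: str) -> str:
    """Lowercase with Turkish casing rules: 'I' -> 'ı' and 'İ' -> 'i'."""
    # Handle the two Turkish-specific letters before the generic lower().
    return word.replace("İ", "i").replace("I", "ı").lower()


def truncate_stem(word: str, prefix_len: int = 5) -> str:
    """Return the first `prefix_len` characters of the lowercased word.

    `prefix_len` is a hypothetical tuning parameter for this sketch.
    Words shorter than the prefix length are returned unchanged.
    """
    return turkish_lower(word)[:prefix_len]


def stem_text(text: str, prefix_len: int = 5) -> list[str]:
    """Split on whitespace (an assumed tokenization) and truncate each token."""
    return [truncate_stem(token, prefix_len) for token in text.split()]
```

For example, `truncate_stem("kitaplarımız")` keeps only the prefix `kitap`, so inflected forms of the same word that share their first five letters are indexed under one term.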
