Information retrieval on Turkish texts

Date

2008-02

Authors

Can, F.
Kocberber, S.
Balcik, E.
Kaynak, C.
Ocalan, H. C.
Vursavas, O. M.

Source Title

Association for Information Science and Technology. Journal

Print ISSN

2330-1635

Publisher

John Wiley & Sons, Inc.

Volume

59

Issue

3

Pages

407 - 421

Language

English

Abstract

In this study, we investigate information retrieval (IR) on Turkish texts using a large-scale test collection that contains 408,305 documents and 72 ad hoc queries. We examine the effects of several stemming options and query-document matching functions on retrieval performance. We show that a simple word truncation approach, a word truncation approach that uses language-dependent corpus statistics, and an elaborate lemmatizer-based stemmer provide similar retrieval effectiveness in Turkish IR. We investigate the effects of a range of search conditions on the retrieval performance; these include scalability issues, query and document length effects, and the use of a stop-word list in indexing. © 2007 Wiley Periodicals, Inc.
