Architecture of a grid-enabled Web search engine
Author
Cambazoglu, B. B.
Karaca, E.
Kucukyilmaz, T.
Turk, A.
Aykanat, Cevdet
Date
2007
Source Title
Information Processing and Management
Print ISSN
0306-4573
Publisher
Elsevier Ltd
Volume
43
Issue
3
Pages
609 - 623
Language
English
Type
Article
Item Usage Stats
139 views
108 downloads
Abstract
Search Engine for South-East Europe (SE4SEE) is a socio-cultural search engine running on the grid infrastructure. It offers a personalized, on-demand, country-specific, category-based Web search facility. The main goal of SE4SEE is to attack the page freshness problem by performing the search on the original pages residing on the Web, rather than on the previously fetched copies as done in the traditional search engines. SE4SEE also aims to obtain high download rates in Web crawling by making use of the geographically distributed nature of the grid. In this work, we present the architectural design issues and implementation details of this search engine. We conduct various experiments to illustrate performance results obtained on a grid infrastructure and justify the use of the search strategy employed in SE4SEE. © 2006 Elsevier Ltd. All rights reserved.
Keywords
Grid computing
Search engine
Text classification
Web crawling
Social aspects
Websites
World Wide Web