Exploiting interclass rules for focused crawling
Date
2004
Authors
Editor(s)
Advisor
Supervisor
Co-Advisor
Co-Supervisor
Instructor
BUIR Usage Stats
4
views
views
18
downloads
downloads
Citation Stats
Series
Abstract
A baseline crawler was developed at the Bilkent University based on a focused-crawling approach. The focused crawler is an agent that targets a particular topic and visits and gathers only a relevant, narrow Web segment while trying not to waste resources on irrelevant materials. The rule-based Web-crawling approach uses linkage statistics among topics to improve a baseline focused crawler's harvest rate and coverage. The crawler also employs a canonical topic taxonomy to train a naïve-Bayesian classifier, which then helps determine the relevancy of crawled pages.
Source Title
IEEE Intelligent Systems
Publisher
IEEE
Course
Other identifiers
Book Title
Degree Discipline
Degree Level
Degree Name
Citation
Permalink
Published Version (Please cite this version)
Language
English