Exploiting interclass rules for focused crawling

Altingövde, I. S.; Ulusoy, Özgür

Date

2004

Authors

Altingövde, I. S.

Ulusoy, Özgür

Abstract

A focused crawler is an agent that targets a particular topic, visiting and gathering only a narrow, relevant segment of the Web while trying not to waste resources on irrelevant material. A baseline crawler based on this focused-crawling approach was developed at Bilkent University. It employs a canonical topic taxonomy to train a naïve Bayes classifier, which then helps determine the relevance of crawled pages. The rule-based Web-crawling approach uses linkage statistics among topics to improve the baseline focused crawler's harvest rate and coverage.
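The idea of using linkage statistics among topics can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table `RULES`, its probabilities, the class names, and the helpers `link_score`, `Frontier`, and `enqueue_links` are all hypothetical, and the page classifier is assumed to exist elsewhere and to supply each fetched page's class. The sketch only shows how a best-first frontier might rank unvisited links by the estimated chance that a page of the parent's class links to a page of the target class.

```python
import heapq

# Hypothetical interclass rules (values invented for illustration):
# RULES[src][dst] is an estimated probability that a page classified
# as `src` links to a page classified as `dst`.
RULES = {
    "sports": {"sports": 0.7, "news": 0.2, "cars": 0.1},
    "news": {"sports": 0.4, "news": 0.5, "cars": 0.1},
    "cars": {"sports": 0.05, "news": 0.15, "cars": 0.8},
}


def link_score(parent_class, target_class, rules=RULES):
    """Score a link by the rule parent_class -> target_class (0.0 if unknown)."""
    return rules.get(parent_class, {}).get(target_class, 0.0)


class Frontier:
    """Best-first URL frontier: the highest-scored URL is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker that keeps insertion order stable

    def push(self, url, score):
        if url in self._seen:
            return  # each URL is queued at most once
        self._seen.add(url)
        # heapq is a min-heap, so the score is negated to pop the maximum.
        heapq.heappush(self._heap, (-score, self._counter, url))
        self._counter += 1

    def pop(self):
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)


def enqueue_links(frontier, parent_class, links, target_class):
    """Queue every out-link of a page, scored by the parent's class."""
    score = link_score(parent_class, target_class)
    for url in links:
        frontier.push(url, score)
```

Under these assumed rules, links found on a "news" page would be visited before links found on a "cars" page when the target topic is "sports", even though neither parent page is itself on topic.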

Source Title

IEEE Intelligent Systems

Publisher

IEEE

Keywords

Best First Search, Breadth First Search, Domain Name Systems (DNS), Web Crawling Approaches, Classification (of information), Data Acquisition, Indexing (of information), Knowledge Based Systems, Network Protocols, Online Searching, Queueing Theory, Websites

Permalink

http://hdl.handle.net/11693/38244

Published Version (Please cite this version)

http://dx.doi.org/10.1109/MIS.2004.62

Collections

Scholarly Publications - Computer Engineering

Language

English

Type

Review
