Data-parallel web crawling models

Cambazoglu, B. B.; Turk, A.; Aykanat, Cevdet

Files

Data-parallel web crawling models.pdf (175.9 KB)

Date

2004

Authors

Cambazoglu, B. B.

Turk, A.

Aykanat, Cevdet

Abstract

The need to quickly locate, gather, and store the vast amount of material on the Web necessitates parallel computing. In this paper, we propose two models, based on multi-constraint graph partitioning, for efficient data-parallel Web crawling. The models aim to balance both the amount of data downloaded and stored by each processor and the number of page requests made by each processor. They also minimize the total volume of communication during the link exchange between the processors. To evaluate the performance of the models, experimental results are presented on a sample Web repository containing around 915,000 pages. © Springer-Verlag 2004.
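The objectives named in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not perform any partitioning. It only scores a given page-to-processor assignment on the three quantities the abstract mentions: stored data per processor, page requests per processor, and links that cross processors. The function names, the toy graph, and the use of a simple cut-link count as a stand-in for link-exchange communication volume are all assumptions made for illustration.

```python
from collections import defaultdict


def evaluate_partition(links, page_size, part):
    """Score a page-to-processor assignment (hypothetical helper).

    links:     list of (source_page, target_page) hyperlinks
    page_size: dict mapping page -> size in bytes
    part:      dict mapping page -> processor id
    """
    stored = defaultdict(int)    # bytes downloaded and stored per processor
    requests = defaultdict(int)  # page requests issued per processor
    for page, proc in part.items():
        stored[proc] += page_size[page]
        requests[proc] += 1
    # Assumption: each link whose endpoints live on different processors
    # costs one unit of communication during link exchange.
    cut_links = sum(1 for src, dst in links if part[src] != part[dst])
    return dict(stored), dict(requests), cut_links


def imbalance(loads):
    """Ratio of the maximum load to the average load (1.0 is perfect balance)."""
    values = list(loads.values())
    return max(values) / (sum(values) / len(values))


# Toy data, invented for this example only.
links = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
page_size = {"a": 10, "b": 30, "c": 20, "d": 20}
part = {"a": 0, "b": 0, "c": 1, "d": 1}

stored, requests, cut_links = evaluate_partition(links, page_size, part)
```

A multi-constraint partitioner would search for an assignment that keeps both imbalance ratios close to 1.0 while keeping the cut-link count low; the sketch only measures those quantities for one fixed assignment.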

Source Title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Publisher

Springer

Keywords

Artificial intelligence, Computers, Data parallel, Multi-constraints, Web Crawling, Web repositories, Parallel processing systems

Permalink

http://hdl.handle.net/11693/24172

Published Version (Please cite this version)

https://doi.org/10.1007/978-3-540-30182-0_80

Collections

Scholarly Publications - Computer Engineering

Language

English

Type

Article
