Data-parallel web crawling models
buir.contributor.author | Aykanat, Cevdet | |
dc.citation.epage | 809 | en_US |
dc.citation.spage | 801 | en_US |
dc.citation.volumeNumber | 3280 | en_US |
dc.contributor.author | Cambazoglu, B. B. | en_US |
dc.contributor.author | Turk, A. | en_US |
dc.contributor.author | Aykanat, Cevdet | en_US |
dc.date.accessioned | 2016-02-08T10:25:09Z | |
dc.date.available | 2016-02-08T10:25:09Z | en_US |
dc.date.issued | 2004 | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.description.abstract | The need to quickly locate, gather, and store the vast amount of material on the Web necessitates parallel computing. In this paper, we propose two models, based on multi-constraint graph partitioning, for efficient data-parallel Web crawling. The models aim to balance the amount of data downloaded and stored by each processor as well as the number of page requests made by the processors. The models also minimize the total volume of communication during the link exchange between the processors. To evaluate the performance of the models, experimental results are presented on a sample Web repository containing around 915,000 pages. © Springer-Verlag 2004. | en_US |
dc.description.provenance | Made available in DSpace on 2016-02-08T10:25:09Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2004 | en_US |
dc.identifier.doi | 10.1007/978-3-540-30182-0_80 | en_US |
dc.identifier.issn | 0302-9743 | en_US |
dc.identifier.issn | 1611-3349 | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/24172 | en_US |
dc.language.iso | English | en_US |
dc.publisher | Springer | en_US |
dc.relation.isversionof | https://doi.org/10.1007/978-3-540-30182-0_80 | en_US |
dc.source.title | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | en_US |
dc.subject | Artificial intelligence | en_US |
dc.subject | Computers | en_US |
dc.subject | Data parallel | en_US |
dc.subject | Multi-constraints | en_US |
dc.subject | Web Crawling | en_US |
dc.subject | Web repositories | en_US |
dc.subject | Parallel processing systems | en_US |
dc.title | Data-parallel web crawling models | en_US |
dc.type | Article | en_US |
Files
Original bundle
- Name: Data-parallel web crawling models.pdf
- Size: 175.9 KB
- Format: Adobe Portable Document Format
- Description: Full printable version