Exploiting interclass rules for focused crawling
buir.contributor.author | Ulusoy, Özgür | |
dc.citation.epage | 73 | en_US |
dc.citation.issueNumber | 6 | en_US |
dc.citation.spage | 66 | en_US |
dc.citation.volumeNumber | 19 | en_US |
dc.contributor.author | Altingövde, I. S. | en_US |
dc.contributor.author | Ulusoy, Özgür | en_US |
dc.date.accessioned | 2018-04-12T13:51:38Z | |
dc.date.available | 2018-04-12T13:51:38Z | en_US |
dc.date.issued | 2004 | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.description.abstract | A baseline crawler was developed at Bilkent University based on a focused-crawling approach. A focused crawler is an agent that targets a particular topic and visits and gathers only a relevant, narrow segment of the Web while trying not to waste resources on irrelevant material. The rule-based Web-crawling approach uses linkage statistics among topics to improve the baseline focused crawler's harvest rate and coverage. The crawler also employs a canonical topic taxonomy to train a naïve Bayes classifier, which then helps determine the relevance of crawled pages. | en_US |
dc.description.provenance | Made available in DSpace on 2018-04-12T13:51:38Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 179475 bytes, checksum: ea0bedeb05ac9ccfb983c327e155f0c2 (MD5) Previous issue date: 2004 | en_US |
dc.identifier.doi | 10.1109/MIS.2004.62 | en_US |
dc.identifier.issn | 1541-1672 | en_US |
dc.identifier.issn | 1941-1294 | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/38244 | en_US |
dc.language.iso | English | en_US |
dc.publisher | IEEE | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1109/MIS.2004.62 | en_US |
dc.source.title | IEEE Intelligent Systems | en_US |
dc.subject | Best First Search | en_US |
dc.subject | Breadth First Search | en_US |
dc.subject | Domain Name Systems (DNS) | en_US |
dc.subject | Web Crawling Approaches | en_US |
dc.subject | Classification (of information) | en_US |
dc.subject | Data Acquisition | en_US |
dc.subject | Indexing (of information) | en_US |
dc.subject | Knowledge Based Systems | en_US |
dc.subject | Network Protocols | en_US |
dc.subject | Online Searching | en_US |
dc.subject | Queueing Theory | en_US |
dc.subject | Websites | en_US |
dc.title | Exploiting interclass rules for focused crawling | en_US |
dc.type | Review | en_US |
Files
Original bundle
- Name: Exploiting interclass rules for focused crawling.pdf
- Size: 153.63 KB
- Format: Adobe Portable Document Format
- Description: Full printable version