Exploiting interclass rules for focused crawling

buir.contributor.author: Ulusoy, Özgür
dc.citation.epage: 73 [en_US]
dc.citation.issueNumber: 6 [en_US]
dc.citation.spage: 66 [en_US]
dc.citation.volumeNumber: 19 [en_US]
dc.contributor.author: Altingövde, I. S. [en_US]
dc.contributor.author: Ulusoy, Özgür [en_US]
dc.date.accessioned: 2018-04-12T13:51:38Z
dc.date.available: 2018-04-12T13:51:38Z [en_US]
dc.date.issued: 2004 [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: A baseline crawler was developed at Bilkent University based on a focused-crawling approach. The focused crawler is an agent that targets a particular topic and visits and gathers only a relevant, narrow Web segment while trying not to waste resources on irrelevant materials. The rule-based Web-crawling approach uses linkage statistics among topics to improve a baseline focused crawler's harvest rate and coverage. The crawler also employs a canonical topic taxonomy to train a naïve-Bayesian classifier, which then helps determine the relevancy of crawled pages. [en_US]
dc.description.provenance: Made available in DSpace on 2018-04-12T13:51:38Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 179475 bytes, checksum: ea0bedeb05ac9ccfb983c327e155f0c2 (MD5) Previous issue date: 2004 [en_US]
dc.identifier.doi: 10.1109/MIS.2004.62 [en_US]
dc.identifier.issn: 1541-1672 [en_US]
dc.identifier.issn: 1941-1294 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/38244 [en_US]
dc.language.iso: English [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/MIS.2004.62 [en_US]
dc.source.title: IEEE Intelligent Systems [en_US]
dc.subject: Best First Search [en_US]
dc.subject: Breadth First Search [en_US]
dc.subject: Domain Name Systems (DNS) [en_US]
dc.subject: Web Crawling Approaches [en_US]
dc.subject: Classification (of information) [en_US]
dc.subject: Data Acquisition [en_US]
dc.subject: Indexing (of information) [en_US]
dc.subject: Knowledge Based Systems [en_US]
dc.subject: Network Protocols [en_US]
dc.subject: Online Searching [en_US]
dc.subject: Queueing Theory [en_US]
dc.subject: Websites [en_US]
dc.title: Exploiting interclass rules for focused crawling [en_US]
dc.type: Review [en_US]

Files

Original bundle

Name: Exploiting interclass rules for focused crawling.pdf
Size: 153.63 KB
Format: Adobe Portable Document Format
Description: Full printable version