Parallel frequent item set mining with selective item replication

buir.contributor.author: Aykanat, Cevdet
dc.citation.epage: 1640 [en_US]
dc.citation.issueNumber: 10 [en_US]
dc.citation.spage: 1632 [en_US]
dc.citation.volumeNumber: 22 [en_US]
dc.contributor.author: Özkural, E. [en_US]
dc.contributor.author: Uçar, B. [en_US]
dc.contributor.author: Aykanat, Cevdet [en_US]
dc.date.accessioned: 2016-02-08T09:52:28Z
dc.date.available: 2016-02-08T09:52:28Z
dc.date.issued: 2011 [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: We introduce a transaction database distribution scheme that divides the frequent item set mining task in a top-down fashion. Our method operates on a graph where vertices correspond to frequent items and edges correspond to frequent item sets of size two. We show that partitioning this graph by a vertex separator is sufficient to decide a distribution of the items such that the subdatabases determined by the item distribution can be mined independently. This distribution entails an amount of data replication, which may be reduced by setting appropriate weights to vertices. The data distribution scheme is used in the design of two new parallel frequent item set mining algorithms. Both algorithms replicate the items that correspond to the separator. NoClique replicates the work induced by the separator and NoClique2 computes the same work collectively. Computational load balancing and minimization of redundant or collective work may be achieved by assigning appropriate load estimates to vertices. The experiments show favorable speedups on a system with small-to-medium number of processors for synthetic and real-world databases. © 2011 IEEE. [en_US]
dc.description.provenance: Made available in DSpace on 2016-02-08T09:52:28Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2011 [en]
dc.identifier.doi: 10.1109/TPDS.2011.32 [en_US]
dc.identifier.issn: 1045-9219 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/21884 [en_US]
dc.language.iso: English [en_US]
dc.publisher: Institute of Electrical and Electronics Engineers [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/TPDS.2011.32 [en_US]
dc.source.title: IEEE Transactions on Parallel and Distributed Systems [en_US]
dc.subject: Frequent item set mining [en_US]
dc.subject: Parallel data mining [en_US]
dc.subject: Mining methods and algorithms [en_US]
dc.subject: Selective data replication [en_US]
dc.subject: Graph partitioning by vertex separator [en_US]
dc.title: Parallel frequent item set mining with selective item replication [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: Parallel frequent item set mining with selective item replication.pdf
Size: 625.03 KB
Format: Adobe Portable Document Format
Description: Full printable version