Scaling sparse matrix-matrix multiplication in the Accumulo database

buir.contributor.author: Demirci, Gündüz Vehbi
buir.contributor.author: Aykanat, Cevdet
dc.citation.epage: 62 [en_US]
dc.citation.issueNumber: 1
dc.citation.spage: 31 [en_US]
dc.citation.volumeNumber: 38
dc.contributor.author: Demirci, Gündüz Vehbi [en_US]
dc.contributor.author: Aykanat, Cevdet [en_US]
dc.date.accessioned: 2020-02-03T12:48:40Z
dc.date.available: 2020-02-03T12:48:40Z
dc.date.issued: 2020
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: We propose and implement a sparse matrix-matrix multiplication (SpGEMM) algorithm running on top of Accumulo’s iterator framework which enables high performance distributed parallelism. The proposed algorithm provides write-locality while ingesting the output matrix back to the database by utilizing row-by-row parallel SpGEMM. The proposed solution also alleviates scanning of input matrices multiple times by making use of Accumulo’s batch scanning capability which is used for accessing multiple ranges of key-value pairs in parallel. Even though the use of batch-scanning introduces some latency overheads, these overheads are alleviated by the proposed solution and by using node-level parallelism structures. We also propose a matrix partitioning scheme which reduces the total communication volume and provides a balance of workload among servers. The results of extensive experiments performed on both real-world and synthetic sparse matrices show that the proposed algorithm scales significantly better than the outer-product parallel SpGEMM algorithm available in the Graphulo library. By applying the proposed matrix partitioning, the performance of the proposed algorithm is further improved considerably. [en_US]
dc.description.provenance: Submitted by Zeynep Aykut (zeynepay@bilkent.edu.tr) on 2020-02-03T12:48:39Z. No. of bitstreams: 1. Scaling_sparse_matrix-matrix_multiplication_in_the_accumulo_database.pdf: 672671 bytes, checksum: 414b7442709efb5d7906ac883b8bda9d (MD5) [en]
dc.description.provenance: Made available in DSpace on 2020-02-03T12:48:40Z (GMT). No. of bitstreams: 1. Scaling_sparse_matrix-matrix_multiplication_in_the_accumulo_database.pdf: 672671 bytes, checksum: 414b7442709efb5d7906ac883b8bda9d (MD5). Previous issue date: 2019 [en]
dc.identifier.doi: 10.1007/s10619-019-07257-y [en_US]
dc.identifier.issn: 0926-8782 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/53002 [en_US]
dc.language.iso: English [en_US]
dc.publisher: Springer [en_US]
dc.relation.isversionof: https://dx.doi.org/10.1007/s10619-019-07257-y [en_US]
dc.source.title: Distributed and Parallel Databases [en_US]
dc.subject: Accumulo [en_US]
dc.subject: Data locality [en_US]
dc.subject: Databases [en_US]
dc.subject: Graph partitioning [en_US]
dc.subject: Graphulo [en_US]
dc.subject: Matrix partitioning [en_US]
dc.subject: NoSQL [en_US]
dc.subject: Parallel and distributed computing [en_US]
dc.subject: Sparse matrices [en_US]
dc.subject: Sparse matrix–matrix multiplication [en_US]
dc.subject: SpGEMM [en_US]
dc.title: Scaling sparse matrix-matrix multiplication in the Accumulo database [en_US]
dc.type: Article [en_US]
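
Illustrative note: the abstract refers to a row-by-row formulation of SpGEMM, in which each row of the output C = A * B is built only from one row of A and the rows of B that it references. The following is a minimal single-process sketch of that general formulation, not the authors' Accumulo implementation. The dictionary-of-rows storage, the function name spgemm_row_by_row, and the sample matrices are assumptions made purely for illustration; no Accumulo or Graphulo API is used.

```python
from collections import defaultdict


def spgemm_row_by_row(A, B):
    """Compute C = A * B for sparse matrices stored as {row: {col: value}}.

    Hypothetical helper for illustration only. Each output row C[i] depends
    only on row i of A and on the rows of B indexed by the nonzero columns
    of A[i], so a worker that owns row i of A can produce (and write) row i
    of C on its own.
    """
    C = {}
    for i, a_row in A.items():
        acc = defaultdict(float)  # sparse accumulator for output row i
        for k, a_ik in a_row.items():
            # Row k of B is the only part of B needed for this term.
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            C[i] = dict(acc)
    return C


# Small made-up example matrices (assumed data, not from the paper).
A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
B = {0: {1: 4.0}, 1: {0: 5.0, 2: 6.0}, 2: {1: 7.0}}
C = spgemm_row_by_row(A, B)
print(C)
```

In this sketch, the set of B rows needed for output row i is exactly the set of nonzero column indices in row i of A. That is the kind of multi-range lookup the abstract associates with batch scanning, although how the paper maps it onto Accumulo is not shown here.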

Files

Original bundle

Name: Scaling_sparse_matrix_matrix_multiplication_in_the_accumulo_database.pdf
Size: 646.85 KB
Format: Adobe Portable Document Format