True load balancing for Matricized Tensor Times Khatri-Rao Product

buir.contributor.author: Abubaker, Nabil
buir.contributor.author: Aykanat, Cevdet
buir.contributor.orcid: Abubaker, Nabil|0000-0002-5060-3059
buir.contributor.orcid: Aykanat, Cevdet|0000-0002-4559-1321
dc.citation.epage: 1986 (en_US)
dc.citation.issueNumber: 8 (en_US)
dc.citation.spage: 1974 (en_US)
dc.citation.volumeNumber: 32 (en_US)
dc.contributor.author: Abubaker, Nabil
dc.contributor.author: Aykanat, Cevdet
dc.contributor.author: Acer, S.
dc.date.accessioned: 2022-01-31T11:22:12Z
dc.date.available: 2022-01-31T11:22:12Z
dc.date.issued: 2021-01-22
dc.department: Department of Computer Engineering (en_US)
dc.description.abstract: MTTKRP is the bottleneck operation in algorithms used to compute the CP tensor decomposition. For sparse tensors, utilizing the compressed sparse fibers (CSF) storage format and the CSF-oriented MTTKRP algorithms is important for both memory and computational efficiency on distributed-memory architectures. Existing intelligent tensor partitioning models assume the computational cost of MTTKRP to be proportional to the total number of nonzeros in the tensor. However, this is not the case for the CSF-oriented MTTKRP on distributed-memory architectures. We outline two deficiencies of nonzero-based intelligent partitioning models when CSF-oriented MTTKRP operations are performed locally: failure to encode processors' computational loads and an increase in total computation due to fiber fragmentation. We focus on the existing fine-grain hypergraph model and propose a novel vertex weighting scheme that enables this model to encode the correct computational loads of processors. We also propose to augment the fine-grain model with fiber nets in order to reduce the increase in total computational load by minimizing fiber fragmentation. In this way, the proposed model encodes the objective of minimizing the load of the bottleneck processor. Parallel experiments with real-world sparse tensors on up to 1024 processors prove the validity of the outlined deficiencies and demonstrate the merit of our proposed improvements in terms of parallel runtimes. (en_US)
dc.description.provenance: Submitted by Evrim Ergin (eergin@bilkent.edu.tr) on 2022-01-31T11:22:12Z No. of bitstreams: 1 True_load_balancing_for_Matricized_Tensor_Times_Khatri-Rao_Product.pdf: 1602420 bytes, checksum: dfec07c5cd0d8ec5398b2ad2ede3e584 (MD5) (en)
dc.description.provenance: Made available in DSpace on 2022-01-31T11:22:12Z (GMT). No. of bitstreams: 1 True_load_balancing_for_Matricized_Tensor_Times_Khatri-Rao_Product.pdf: 1602420 bytes, checksum: dfec07c5cd0d8ec5398b2ad2ede3e584 (MD5) Previous issue date: 2021-01-22 (en)
dc.identifier.doi: 10.1109/TPDS.2021.3053836 (en_US)
dc.identifier.eissn: 1558-2183
dc.identifier.issn: 1045-9219
dc.identifier.uri: http://hdl.handle.net/11693/76913
dc.language.iso: English (en_US)
dc.publisher: IEEE (en_US)
dc.relation.isversionof: https://doi.org/10.1109/TPDS.2021.3053836 (en_US)
dc.source.title: IEEE Transactions on Parallel and Distributed Systems (en_US)
dc.subject: Load balancing (en_US)
dc.subject: Sparse tensors (en_US)
dc.subject: MTTKRP (en_US)
dc.subject: CP decomposition (en_US)
dc.subject: Fine-grain hypergraph partitioning (en_US)
dc.title: True load balancing for Matricized Tensor Times Khatri-Rao Product (en_US)
dc.type: Article (en_US)
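
Illustration of the abstract's claim

The abstract states that the cost of CSF-oriented MTTKRP is not proportional to the nonzero count and that fiber fragmentation increases total computation. The following is a minimal sketch of that idea for a 3-mode tensor, not the authors' implementation. All names (coo_mttkrp, csf_mttkrp, local_fibers) and the toy tensor are hypothetical, and the fiber-grouped loop is a simplified stand-in for a real CSF data structure. It assumes the usual mode-0 formulation, in which each nonzero contributes one rank-length update and each (i, j) fiber contributes one more.

```python
from collections import defaultdict
import numpy as np

def coo_mttkrp(coords, vals, B, C, n_rows):
    """Nonzero-based mode-0 MTTKRP: M[i] += x_ijk * (B[j] * C[k])."""
    M = np.zeros((n_rows, B.shape[1]))
    for (i, j, k), x in zip(coords, vals):
        M[i] += x * B[j] * C[k]
    return M

def csf_mttkrp(coords, vals, B, C, n_rows):
    """Fiber-grouped mode-0 MTTKRP; also returns the number of (i, j) fibers."""
    fibers = defaultdict(list)
    for (i, j, k), x in zip(coords, vals):
        fibers[(i, j)].append((k, x))
    M = np.zeros((n_rows, B.shape[1]))
    for (i, j), entries in fibers.items():
        acc = np.zeros(B.shape[1])
        for k, x in entries:          # one update per nonzero
            acc += x * C[k]
        M[i] += acc * B[j]            # one extra update per fiber
    return M, len(fibers)

# Toy 2 x 2 x 3 tensor with six nonzeros in three fibers (assumed data).
coords = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (1, 1, 1), (1, 1, 2)]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rng = np.random.default_rng(0)
B = rng.standard_normal((2, 4))
C = rng.standard_normal((3, 4))

M_coo = coo_mttkrp(coords, vals, B, C, 2)
M_csf, n_fibers = csf_mttkrp(coords, vals, B, C, 2)

def local_fibers(parts):
    """Fiber count seen by each processor for a given nonzero partition."""
    return [csf_mttkrp([coords[n] for n in p], [vals[n] for n in p], B, C, 2)[1]
            for p in parts]

fibers_a = local_fibers([[0, 1, 2], [3, 4, 5]])  # fibers kept whole
fibers_b = local_fibers([[0, 3, 4], [1, 2, 5]])  # fibers split across parts
```

Both partitions assign three nonzeros to each processor, so a nonzero-based model treats them as equally balanced. In this sketch the first partition yields 1 and 2 local fibers (3 in total), while the second yields 3 and 2 (5 in total), so both the per-processor work and the total work differ even though the nonzero counts match.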

Files

Original bundle

Name: True_load_balancing_for_Matricized_Tensor_Times_Khatri-Rao_Product.pdf
Size: 1.53 MB
Format: Adobe Portable Document Format
