Straggler mitigation through unequal error protection for distributed approximate matrix multiplication

buir.contributor.author: Tegin, Büşra
buir.contributor.author: Duman, Tolga Mete
buir.contributor.orcid: Tegin, Büşra|0000-0002-3342-5414
buir.contributor.orcid: Duman, Tolga Mete|0000-0002-5187-8660
dc.citation.epage: 483 [en_US]
dc.citation.issueNumber: 2 [en_US]
dc.citation.spage: 468 [en_US]
dc.citation.volumeNumber: 40 [en_US]
dc.contributor.author: Tegin, Büşra
dc.contributor.author: Hernandez, Eduin E.
dc.contributor.author: Rini, Stefano
dc.contributor.author: Duman, Tolga Mete
dc.date.accessioned: 2023-02-26T09:45:28Z
dc.date.available: 2023-02-26T09:45:28Z
dc.date.issued: 2022-02-01
dc.department: Department of Electrical and Electronics Engineering [en_US]
dc.description.abstract: Large-scale machine learning and data mining methods routinely distribute computations across multiple agents to parallelize processing. The time required for the computations at the agents is affected by the availability of local resources and/or poor channel conditions, thus giving rise to the “straggler problem.” In this paper, we address this problem for distributed approximate matrix multiplication. In particular, we employ Unequal Error Protection (UEP) codes to obtain an approximation of the matrix product to provide higher protection for the blocks with a higher effect on the multiplication outcome. We characterize the performance of the proposed approach from a theoretical perspective by bounding the expected reconstruction error for matrices with uncorrelated entries. We also apply the proposed coding strategy to the computation of the back-propagation step in the training of a Deep Neural Network (DNN) for an image classification task in the evaluation of the gradients. Our numerical experiments show that it is indeed possible to obtain significant improvements in the overall time required to achieve DNN training convergence by producing approximations of matrix products using UEP codes in the presence of stragglers. [en_US]
dc.description.provenance: Submitted by Cem Çağatay Akgün (cem.akgun@bilkent.edu.tr) on 2023-02-26T09:45:28Z No. of bitstreams: 1 Straggler_Mitigation_Through_Unequal_Error_Protection_for_Distributed_Approximate_Matrix_Multiplication.pdf: 2471562 bytes, checksum: e30288ea04f72c3dcacdc9c7659a1d64 (MD5) [en]
dc.description.provenance: Made available in DSpace on 2023-02-26T09:45:28Z (GMT). No. of bitstreams: 1 Straggler_Mitigation_Through_Unequal_Error_Protection_for_Distributed_Approximate_Matrix_Multiplication.pdf: 2471562 bytes, checksum: e30288ea04f72c3dcacdc9c7659a1d64 (MD5) Previous issue date: 2022-02-01 [en]
dc.identifier.doi: 10.1109/JSAC.2021.3118350 [en_US]
dc.identifier.issn: 07338716
dc.identifier.uri: http://hdl.handle.net/11693/111760
dc.language.iso: English [en_US]
dc.publisher: Institute of Electrical and Electronics Engineers Inc. [en_US]
dc.relation.isversionof: https://dx.doi.org/10.1109/JSAC.2021.3118350 [en_US]
dc.source.title: IEEE Journal on Selected Areas in Communications [en_US]
dc.subject: Distributed computation [en_US]
dc.subject: Approximate matrix multiplication [en_US]
dc.subject: Stragglers [en_US]
dc.subject: Unequal error protection [en_US]
dc.title: Straggler mitigation through unequal error protection for distributed approximate matrix multiplication [en_US]
dc.type: Article [en_US]
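
The abstract above describes giving more protection to the blocks that contribute most to the matrix product. The Python sketch below illustrates that idea only loosely and is not the authors' scheme: it substitutes plain replication for the UEP codes used in the paper, measures a block's importance by the product of Frobenius norms (an assumption), and models each worker as straggling independently with a fixed probability (also an assumption). The names split_blocks, allocate, approximate_product, and straggle_prob are hypothetical.

```python
import random

def matmul(X, Y):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def frobenius(X):
    """Frobenius norm of a matrix stored as a list of rows."""
    return sum(v * v for row in X for v in row) ** 0.5

def split_blocks(A, B):
    """Pair column k of A (n x 1) with row k of B (1 x m), so A B = sum_k a_k b_k."""
    return [([[row[k]] for row in A], [B[k]]) for k in range(len(B))]

def allocate(importance, workers):
    """Give each block one worker, then hand the rest out greedily by importance.

    Assumption: replication stands in for a real UEP code. If there are fewer
    workers than blocks, every block still gets exactly one replica.
    """
    replicas = [1] * len(importance)
    for _ in range(workers - len(importance)):
        k = max(range(len(importance)), key=lambda i: importance[i] / replicas[i])
        replicas[k] += 1
    return replicas

def approximate_product(A, B, workers, straggle_prob, rng):
    """Sum only the block products for which at least one replica returned."""
    blocks = split_blocks(A, B)
    importance = [frobenius(a) * frobenius(b) for a, b in blocks]
    replicas = allocate(importance, workers)
    n, m = len(A), len(B[0])
    result = [[0] * m for _ in range(n)]
    for (a, b), r in zip(blocks, replicas):
        # A replica arrives in time with probability 1 - straggle_prob.
        if any(rng.random() >= straggle_prob for _ in range(r)):
            part = matmul(a, b)
            for i in range(n):
                for j in range(m):
                    result[i][j] += part[i][j]
    return result

if __name__ == "__main__":
    A = [[9, 1, 0], [8, 0, 1]]
    B = [[7, 6], [1, 0], [0, 1]]
    approx = approximate_product(A, B, workers=6, straggle_prob=0.4, rng=random.Random(1))
    print("exact :", matmul(A, B))
    print("approx:", approx)
```

With straggle_prob set to 0.0 every block returns and the result equals the exact product; as the probability grows, low-importance blocks are the first to be dropped because they hold fewer replicas.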

Files

Original bundle

Name: Straggler_Mitigation_Through_Unequal_Error_Protection_for_Distributed_Approximate_Matrix_Multiplication.pdf
Size: 2.36 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission