Authors: Gülcan, Selçuk; Özdal, Muhammet Mustafa; Aykanat, Cevdet
Date accessioned: 2024-03-18
Date available: 2024-03-18
Date issued: 2023-04-20
ISSN: 0167-739X
URI: https://hdl.handle.net/11693/114906

Abstract: We investigate the parallelization of Stochastic Gradient Descent (SGD) for matrix completion on multicore architectures. We provide an experimental analysis of current SGD algorithms to identify their bottlenecks and limitations. Grid-based methods suffer from load imbalance among the 2D blocks of the rating matrix, especially when datasets are skewed and sparse. Asynchronous methods, on the other hand, can face cache issues due to their memory access patterns. We propose bin-packing-based block balancing methods as alternatives to the recently proposed BaPa method. We then introduce Locality-Aware SGD (LASGD), a grid-based asynchronous parallel SGD algorithm that utilizes the cache efficiently by changing the nonzero update sequence without affecting the factor update order, and by carefully arranging the latent factor matrices in memory. Combined with our proposed load balancing methods, our experiments show that LASGD performs significantly better than alternative approaches on parallel shared-memory systems.

Language: en
Rights: CC BY-NC-ND 4.0 DEED (Attribution-NonCommercial-NoDerivs 4.0 International)
Keywords: Matrix completion; Recommendation system; Stochastic gradient descent; Shared memory parallel systems; Load balancing; Locality-aware scheduling
Title: Load balanced locality-aware parallel SGD on multicore architectures for latent factor based collaborative filtering
Type: Article
DOI: 10.1016/j.future.2023.04.007
eISSN: 1872-7115
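As background for the abstract, the following is a minimal sketch of the basic sequential SGD update for latent-factor matrix completion that the parallel methods above build on. It is not the paper's LASGD algorithm; the function name, hyperparameters, and rank-`k` factors `P` and `Q` are illustrative assumptions.

```python
import numpy as np

def sgd_matrix_completion(ratings, n_users, n_items, k=8, lr=0.05,
                          reg=0.01, epochs=1000, seed=0):
    """Plain sequential SGD for rating-matrix completion (illustrative sketch).

    ratings: list of (user, item, value) nonzeros of the sparse rating matrix.
    Returns user factors P (n_users x k) and item factors Q (n_items x k).
    """
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))  # user latent factors
    Q = rng.normal(scale=0.1, size=(n_items, k))  # item latent factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]                 # prediction error on this nonzero
            # Simultaneous gradient step on both factor rows (old values on RHS)
            P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                          Q[i] + lr * (err * P[u] - reg * Q[i]))
    return P, Q
```

Parallel variants such as the grid-based methods discussed in the abstract partition the nonzeros into 2D blocks so that threads update disjoint rows of `P` and `Q` concurrently; the inner loop above is what each thread runs on its own blocks.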