Parallel pruning for k-means clustering on shared memory architectures
We have developed and evaluated two parallelization schemes for a tree-based k-means clustering method on shared memory machines. One scheme is to partition the pattern space across processors. We have determined that spatial decomposition of patterns outperforms random decomposition even though random decomposition has almost no load imbalance problem. The other scheme is the parallel traverse of the search tree. This approach solves the load imbalance problem and performs slightly better than the spatial decomposition, but the efficiency is reduced due to thread synchronizations. In both cases, parallel treebased k-means clustering is significantly faster than the direct parallel k-means. © Springer-Verlag Berlin Heidelberg 2001.