Implementation of a specialized algorithm for clustering using minimum enclosing balls
Yıldırım, Emre Alper
Clustering is the process of organizing objects into groups whose members are similar in some way. The main objective is to correctly identify the underlying structures and patterns among the objects. A cluster is thus a collection of objects that are more similar to each other than to the objects belonging to other clusters. The clustering problem has applications in wide-ranging areas including facility location, classification of massive data, and marketing. Many of these applications require solving large-scale clustering problems. The main focus of this thesis is the computation of k spheres that enclose a given set of m vectors, which represent the set of objects, in such a way that either the radius of the largest sphere or the sum of the radii of the spheres is as small as possible. Solutions to these problems allow one to divide the set of objects into k groups based on the level of similarity among them. Both problems are computationally intractable in general (they are NP-hard). Furthermore, as indicated by previous results in the literature, it is hard not only to find an optimal solution to these problems but also to find a good approximation to either of them. In this thesis, specialized algorithms have been designed and implemented that take into account the special underlying structures of the studied problems. These algorithms are based on an efficient and systematic search for an optimal solution within a Branch-and-Bound framework. In the course of the algorithms, the problem of computing the smallest sphere that encloses a given set of vectors arises repeatedly as a subproblem. Our algorithms rely heavily on recently developed efficient algorithms for this subproblem. Software implementing the proposed algorithms has been developed so that they can be used in practice, and a user-friendly interface has been designed for it.
Extensive computational results reveal that our algorithms are capable of solving large-scale instances of the problems efficiently. Since the architecture of the software has been designed in a flexible and modular fashion, it serves as a solid foundation for further studies in this area.
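To make the two objectives concrete, the following is a minimal Python sketch and not the software described in the thesis. It assumes a fixed assignment of vectors to k groups and evaluates both objectives (largest radius, sum of radii) by approximating the smallest enclosing sphere of each group. The enclosing-sphere routine is a simple farthest-point iteration in the style of Bădoiu and Clarkson, swapped in for illustration; the thesis relies on other, more efficient algorithms for this subproblem. The names `approx_meb` and `clustering_objectives` and the `iterations` parameter are hypothetical.

```python
import math


def _dist(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def approx_meb(points, iterations=1000):
    """Approximate the smallest enclosing sphere of a non-empty list of vectors.

    Illustrative farthest-point iteration (Badoiu-Clarkson style), NOT the
    thesis's solver. Returns (center, radius); the radius always covers every
    point from the returned center, so it is an upper bound on the optimum.
    """
    center = list(points[0])
    for i in range(1, iterations + 1):
        # Move the center a shrinking step toward the current farthest point.
        farthest = max(points, key=lambda p: _dist(p, center))
        step = 1.0 / (i + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    radius = max(_dist(p, center) for p in points)
    return center, radius


def clustering_objectives(points, labels, k):
    """Return (largest radius, sum of radii) for a given assignment to k groups.

    labels[i] in {0, ..., k-1} is the (assumed) group of points[i];
    an empty group contributes radius 0.
    """
    radii = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        radii.append(approx_meb(members)[1] if members else 0.0)
    return max(radii), sum(radii)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
    largest, total = clustering_objectives(pts, [0, 0, 1, 1], k=2)
    print(f"largest radius ~ {largest:.3f}, sum of radii ~ {total:.3f}")
```

A Branch-and-Bound search of the kind the abstract describes would explore many such assignments implicitly rather than evaluate one fixed labeling; this sketch only shows how a single candidate is scored under each objective.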
Keywords: geometric optimization problems; design of algorithms
QA278 .G87 2010
Cluster analysis--Data processing.