Aggregate profile clustering for telco analytics

Date
2013
Authors
Abbasoğlu, M.A.
Gedik, B.
Ferhatosmanoğlu, H.
Source Title
Proceedings of the VLDB Endowment
Print ISSN
2150-8097
Volume
6
Issue
12
Pages
1234 - 1237
Language
English
Type
Article
Abstract

Many telco analytics require maintaining call profiles based on recent customer call patterns. Such call profiles are typically organized as aggregations computed at different time scales over recent customer interactions. Customer call profiles are key inputs for analytics targeted at improving the operations, marketing, and sales of telco providers. Many of these analytics require clustering customer call profiles, so that customers with similar calling patterns can be modeled as a group. Example applications include optimizing tariffs, customer segmentation, and usage forecasting. In this demo, we present our system for scalable aggregate profile clustering in a streaming setting. We focus on managing anonymized segments of customers for tariff optimization. Due to the large number of customers, maintaining profile clusters has high processing and memory resource requirements. To tackle this problem, we apply distributed stream processing. However, in the presence of distributed state, it is a major challenge to partition the profiles over machines (nodes) such that memory and computation balance is maintained while clustering accuracy is kept high. Furthermore, to adapt to potentially changing customer calling patterns, the partitioning of profiles to machines should be continuously revised, yet the migration of profiles should be minimized so as not to disturb the online processing of updates. We provide a re-partitioning technique that achieves all these goals. We keep micro-cluster summaries at each node, collect these summaries at a centralized node, and use a greedy algorithm with novel affinity heuristics to revise the partitioning. We present a demo that showcases our Storm- and HBase-based implementation of the proposed solution in the context of a customer segmentation application. © 2013 VLDB Endowment.
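
The re-partitioning idea in the abstract (micro-cluster summaries gathered at one node, then a greedy pass that balances load, keeps similar summaries together, and avoids unnecessary migration) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the affinity heuristics, so the inverse-distance affinity, the `slack` capacity factor, the `stay_bonus` migration penalty, and all names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class MicroCluster:
    """Hypothetical summary of a group of similar call profiles."""
    cid: str
    centroid: tuple   # aggregate feature vector of the summarized profiles
    size: int         # number of profiles in the summary
    node: int         # node that currently holds these profiles


def repartition(clusters, num_nodes, slack=1.2, stay_bonus=0.5):
    """Greedily assign micro-clusters to nodes.

    Assumed scoring (illustrative only):
      * affinity = 1 / (1 + distance to the nearest centroid already
        placed on the candidate node), 0 for an empty node;
      * stay_bonus is added when the candidate is the current node,
        which discourages migration;
      * a node is feasible only while its load stays within
        slack * (average load), which keeps the partition balanced.
    """
    total = sum(c.size for c in clusters)
    capacity = slack * total / num_nodes
    load = [0] * num_nodes
    placed = [[] for _ in range(num_nodes)]
    assignment = {}

    # Place large summaries first so they are not squeezed out later.
    for c in sorted(clusters, key=lambda mc: -mc.size):

        def score(n):
            affinity = max(
                (1.0 / (1.0 + dist(c.centroid, m.centroid)) for m in placed[n]),
                default=0.0,
            )
            return affinity + (stay_bonus if n == c.node else 0.0)

        feasible = [n for n in range(num_nodes) if load[n] + c.size <= capacity]
        # If no node has room, fall back to the least-loaded node.
        candidates = feasible or [min(range(num_nodes), key=lambda n: load[n])]
        # Break score ties in favor of the lighter node.
        best = max(candidates, key=lambda n: (score(n), -load[n]))

        assignment[c.cid] = best
        load[best] += c.size
        placed[best].append(c)

    return assignment
```

Calling `repartition(summaries, num_nodes)` returns a mapping from each summary's `cid` to its new node; comparing it with each summary's `node` field gives the set of profiles that would have to migrate under these assumed heuristics.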

Keywords
Clustering accuracy, Clustering customers, Customer interaction, Customer segmentation, Different time scale, Distributed stream processing, Greedy algorithms, Online processing, Aggregates, Distributed parameter control systems, Optimization, Sales