Data analytics for alarm management systems

Date

2017-04

Advisor

Gedik, Buğra

Publisher

Bilkent University

Language

English

Abstract

Mobile network operators run Operations Support Systems (OSS) that produce vast amounts of alarm events. These events differ in significance level and domain, and some of them trigger others. Network operators face the challenge of identifying the significance and root causes of these system problems in real time while keeping the number of remedial actions at an optimal level, so that customer satisfaction rates can be guaranteed at a reasonable cost. A solution that combines alarm correlation, rule mining, and root cause analysis is described to support scalable streaming alarm management systems. The solution is applied to Alarm Collector and Analyzer (ALACA), which is operated in the network operation center of a major mobile telecom provider. ALACA is used for alarm event analysis, in which alarms are correlated and processed in a streaming fashion to find root causes. The developed system includes a dynamic index for matching active alarms, an algorithm for generating candidate alarm rules, a sliding-window-based approach to save system resources, and a graph-based solution to identify root causes. ALACA helps operators improve the design of their alarm management systems by allowing continuous analysis of data and event streams, and helps them predict network behavior with respect to potential failures by using the results of root cause analysis. Experimental results that provide insights into the performance of real-time alarm data analytics systems are presented.
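
To make the components named in the abstract more concrete, the following is a minimal illustrative sketch, not ALACA's actual implementation. It assumes alarms arrive as time-ordered `(timestamp, alarm_type)` pairs, counts how often one alarm type is followed by another inside a sliding window to propose candidate rules, and then treats rule sources that are never rule targets as likely root causes. The function names, thresholds, and alarm type names are all hypothetical.

```python
from collections import Counter, deque


def candidate_rules(events, window, min_support=2, min_confidence=0.6):
    """Propose rules 'a is followed by b within `window` time units'.

    events: iterable of (timestamp, alarm_type), sorted by timestamp.
    Returns {(a, b): confidence}, where confidence is the fraction of
    occurrences of a that were followed by b inside the window.
    """
    recent = deque()         # sliding window of (timestamp, alarm_type, matched)
    occurrences = Counter()  # number of occurrences per alarm type
    followed = Counter()     # occurrences of a that were followed by b

    for ts, alarm in events:
        # Evict events that fell out of the window so memory stays bounded.
        while recent and ts - recent[0][0] > window:
            recent.popleft()
        occurrences[alarm] += 1
        for _, prev, matched in recent:
            # Count each earlier occurrence at most once per consequent type.
            if prev != alarm and alarm not in matched:
                matched.add(alarm)
                followed[(prev, alarm)] += 1
        recent.append((ts, alarm, set()))

    return {
        (a, b): n / occurrences[a]
        for (a, b), n in followed.items()
        if n >= min_support and n / occurrences[a] >= min_confidence
    }


def root_causes(rules):
    """Alarm types that start rules but are never the consequent of one.

    Viewing the rules as a directed graph, these are nodes with outgoing
    edges and no incoming edges.
    """
    sources = {a for a, _ in rules}
    targets = {b for _, b in rules}
    return sources - targets
```

Given a stream in which a hypothetical `LINK_DOWN` alarm is repeatedly followed by `CELL_OUT` and `KPI_DROP`, `root_causes(candidate_rules(stream, window=5))` would single out `LINK_DOWN`. A production system would also need the dynamic index of active alarms and alarm clearing semantics that the abstract mentions, which this sketch leaves out.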
