Scaling forecasting algorithms using clustered modeling
Please cite this item using this persistent URL: http://hdl.handle.net/11693/15905
Research on statistical forecasting has traditionally focused on building more accurate models for a given time series. These models are mostly applied to limited data because of their limited efficiency and scalability. However, many enterprise applications, such as Customer Relationship Management (CRM) and Customer Experience Management (CEM), require scalable forecasting on a large number of data series. For example, telecommunication companies need to forecast each customer’s traffic load individually to understand their needs and behavior and to tailor targeted campaigns. Forecasting models are easily applied to aggregate traffic data to estimate the total traffic volume for revenue estimation and resource planning. However, they cannot practically be applied to each user individually, because building accurate models for a large number of users is too time consuming. The problem is exacerbated when the forecasting process is continuous and the models need to be updated periodically. We address the problem of building and updating forecasting models continuously for multiple data series and propose dynamic clustered modeling optimized for forecasting. We introduce representative models, analogous to cluster centers, and apply these models to each individual series through iterative nonlinear optimization. The approach performs modeling and clustering simultaneously, makes forecasts by applying the representative models to each data series, and updates the model parameters for a continuous forecasting process. Our findings indicate that modeling an individual’s behavior within its segment’s model provides more scalability and accuracy than computing a separate model for that individual. Experimental results from a real telecom CRM application show that the method is highly efficient and scalable, and also more accurate than maintaining separate individual models.
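The idea of representative models acting as cluster centers can be illustrated with a minimal sketch. It is not the method of this work: it assumes linear autoregressive AR(p) models fitted by ordinary least squares in place of the iterative nonlinear optimization described above, it uses a round-robin initial assignment for reproducibility, and all function names (`lagged`, `fit_clustered_ar`, `forecast_next`) are hypothetical.

```python
import numpy as np

def lagged(series, p):
    """Build an AR(p) regression: each row holds p consecutive values, y is the next value."""
    n = len(series)
    X = np.column_stack([series[i:n - p + i] for i in range(p)])
    return X, series[p:]

def fit_clustered_ar(series_list, k=2, p=1, n_iter=20):
    """Alternate between fitting one AR(p) model per cluster and reassigning series."""
    data = [lagged(np.asarray(s, dtype=float), p) for s in series_list]
    # Assumption: deterministic round-robin start instead of a random one.
    labels = np.arange(len(data)) % k
    coefs = np.zeros((k, p))
    for _ in range(n_iter):
        # Modeling step: refit each representative model on its members' pooled data.
        for c in range(k):
            members = [data[i] for i in np.flatnonzero(labels == c)]
            if not members:
                continue  # an empty cluster keeps its previous coefficients
            X = np.vstack([m[0] for m in members])
            y = np.concatenate([m[1] for m in members])
            coefs[c] = np.linalg.lstsq(X, y, rcond=None)[0]
        # Clustering step: move each series to the model with the lowest in-sample error.
        errors = np.array([[np.mean((y - X @ coefs[c]) ** 2) for c in range(k)]
                           for X, y in data])
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the models are too
        labels = new_labels
    return coefs, labels

def forecast_next(series, coef):
    """One-step-ahead forecast from the last p observations and a representative model."""
    p = len(coef)
    return float(np.asarray(series[-p:], dtype=float) @ coef)
```

In this sketch only k models are fitted, however many series there are, and a forecast for one series costs a single dot product with its cluster's coefficients. Updating for a continuous process would amount to rerunning the loop on a sliding window of recent data, starting from the previous labels.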