Temporal workload-aware replicated partitioning for social networks

Date

2014-11

Source Title

IEEE Transactions on Knowledge & Data Engineering

Print ISSN

1041-4347

Publisher

Institute of Electrical and Electronics Engineers

Volume

26

Issue

11

Pages

1 - 14

Language

English

Abstract

The most frequent and expensive queries in social networks involve multi-user operations such as requesting the latest tweets or news feeds of friends. The performance of such queries is heavily dependent on the data partitioning and replication methodologies adopted by the underlying systems. Existing solutions for data distribution in these systems involve hash- or graph-based approaches that ignore the multi-way relations among data. In this work, we propose a novel data partitioning and selective replication method that utilizes the temporal information in prior workloads to predict future query patterns. Our method utilizes the social network structure and the temporality of the interactions among its users to construct a hypergraph that correctly models multi-user operations. It then performs simultaneous partitioning and replication of this hypergraph to reduce the query span while respecting load balance and I/O load constraints under replication. To test our model, we enhance the Cassandra NoSQL system to support selective replication, and we implement a social network application (a Twitter clone) that uses our enhanced Cassandra. We conduct experiments in a cloud computing environment (Amazon EC2) to test the developed systems. Comparison of the proposed method with hash- and enhanced graph-based schemes indicates that it significantly improves latency and throughput.
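
To make the notion of query span concrete, the following minimal Python sketch counts how many servers a multi-user query has to contact under a given data placement, with and without replicas. It is an illustration only and not the method of the paper: the function name `query_span`, the `placement` dictionary, the toy users and servers, and the greedy set-cover heuristic are all assumptions introduced here.

```python
from typing import Dict, Set


def query_span(query_users: Set[str], placement: Dict[str, Set[int]]) -> int:
    """Greedy estimate of the number of servers a multi-user query must contact.

    `placement` maps each user to the set of servers that hold a copy of the
    user's data (hypothetical structure; every user needs at least one server).
    """
    remaining = set(query_users)
    servers_used = 0
    while remaining:
        # Count how many still-uncovered users each server could serve.
        coverage: Dict[int, int] = {}
        for user in remaining:
            for server in placement[user]:
                coverage[server] = coverage.get(server, 0) + 1
        # Pick the server covering the most users; break ties by lowest id.
        best = max(coverage, key=lambda s: (coverage[s], -s))
        remaining = {u for u in remaining if best not in placement[u]}
        servers_used += 1
    return servers_used


# Toy workload: each query is the set of users whose data it reads.
queries = [{"a", "b", "c"}, {"b", "d"}]

# Placement without replication: one server per user.
single_copy = {"a": {0}, "b": {1}, "c": {2}, "d": {0}}

# Placement with selective replication: b and c also have a copy on server 0.
replicated = {"a": {0}, "b": {0, 1}, "c": {0, 2}, "d": {0}}

total_single = sum(query_span(q, single_copy) for q in queries)
total_replicated = sum(query_span(q, replicated) for q in queries)
```

In this toy placement, adding two replicas lets both queries be answered from server 0 alone, which is the kind of span reduction the abstract describes. The extra copies also add storage and write load to server 0, which is why the abstract pairs span reduction with load balance and I/O load constraints.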
