Distributed block formation and layout for disk-based management of large-scale graphs

Date

2017

Authors

Yaşar, A.
Gedik, B.
Ferhatosmanoğlu, H.

Source Title

Distributed and Parallel Databases

Print ISSN

0926-8782

Publisher

Springer

Volume

35

Issue

1

Pages

23 - 53

Language

English

Abstract

We are witnessing an enormous growth in social networks as well as in the volume of data generated by them. An important portion of this data is in the form of graphs. In recent years, several graph processing and management systems emerged to handle large-scale graphs. The primary goal of these systems is to run graph algorithms and queries in an efficient and scalable manner. Unlike relational data, graphs are semi-structured in nature. Thus, storing and accessing graph data using secondary storage requires new solutions that can provide locality of access for graph processing workloads. In this work, we propose a scalable block formation and layout technique for graphs, which aims at reducing the I/O cost of disk-based graph processing algorithms. To achieve this, we designed a scalable MapReduce-style method called ICBL, which can divide the graph into a series of disk blocks that contain sub-graphs with high locality. Furthermore, ICBL can order the resulting blocks on disk to further reduce non-local accesses. We experimentally evaluated ICBL to showcase its scalability and layout quality, as well as the effectiveness of automatic parameter tuning for ICBL. We deployed the graph layouts generated by ICBL on the Neo4j graph database management system (Neo4j open source graph database, http://www.neo4j.org/, 2015). Our results show that the layout generated by ICBL reduces query running times over Neo4j by more than 2× compared to the default layout. © 2017, Springer Science+Business Media New York.
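To make the notion of block locality in the abstract concrete, the following minimal sketch counts how many edges of a small graph cross block boundaries under two different vertex orderings. It is not ICBL and does not reproduce the paper's method. The function names (`pack_blocks`, `cross_block_edges`), the fixed block capacity, and the toy graph are all assumptions made for illustration only.

```python
def pack_blocks(order, block_size):
    """Assign vertices to fixed-capacity blocks in the given storage order.

    Hypothetical helper: returns a dict mapping vertex -> block index.
    """
    return {v: i // block_size for i, v in enumerate(order)}


def cross_block_edges(edges, block_of):
    """Count edges whose two endpoints are stored in different blocks.

    Each such edge is a potential non-local access during traversal.
    """
    return sum(1 for u, v in edges if block_of[u] != block_of[v])


# Toy graph (an assumption): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# An interleaved ordering scatters each triangle across both blocks.
interleaved = pack_blocks([0, 3, 1, 4, 2, 5], block_size=3)

# A locality-aware ordering keeps each triangle inside one block.
clustered = pack_blocks([0, 1, 2, 3, 4, 5], block_size=3)

print(cross_block_edges(edges, interleaved))  # 5
print(cross_block_edges(edges, clustered))    # 1
```

In this toy case the clustered ordering leaves only the bridging edge crossing a block boundary, which is the kind of reduction in non-local accesses a locality-aware layout aims for.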