A disk-based graph database system with incremental storage layout optimization
The world has become ever more connected: data generated by people, software systems, and the physical world is more accessible than before and much greater in volume, variety, and velocity. In many application domains, such as telecommunications and social media, live data recording the relationships between people, systems, and the environment is available. This data often takes the form of a temporally evolving graph, where entities are the vertices and the relationships between them are the edges. For this reason, managing dynamic relationships represented by a graph structure has become a common requirement for modern data processing applications, and graph databases have gained importance with the proliferation of such applications. In this work, we developed a disk-based graph database system that can manage incremental updates to the graph structure. The updates arrive in a streaming manner, and the system creates and maintains an optimized storage layout for the graph incrementally. This optimized storage layout enables the database to support traversal-based graph algorithms more efficiently by minimizing the disk I/O required to execute them. The storage layout optimizations we develop take advantage of the spatial locality of edges to minimize traversal I/O cost, and they do so incrementally as edges and vertices are added and removed.
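To make the idea of a locality-aware, incrementally maintained layout concrete, the following is a minimal toy sketch. It is not the system described in this work. All names (`BlockStore`, `capacity`, `locality`, `traversal_block_loads`) are hypothetical, blocks are modeled as in-memory sets rather than disk pages, only insertions are handled, and I/O cost is approximated as the number of block loads during a breadth-first traversal with a one-block buffer.

```python
from collections import defaultdict, deque


class BlockStore:
    """Toy model of a block-based graph layout (illustrative assumption only)."""

    def __init__(self, capacity=4, locality=True):
        self.capacity = capacity      # max vertices per block (hypothetical unit)
        self.locality = locality      # True: co-locate a new vertex with its neighbor
        self.blocks = []              # block id -> set of vertices stored in it
        self.block_of = {}            # vertex -> block id
        self.adj = defaultdict(set)   # undirected adjacency, kept in memory

    def _place(self, v, near=None):
        """Assign a block to vertex v the first time it is seen."""
        if v in self.block_of:
            return
        target = None
        if self.locality and near in self.block_of:
            # Prefer the neighbor's block when it still has room.
            b = self.block_of[near]
            if len(self.blocks[b]) < self.capacity:
                target = b
        if (target is None and not self.locality and self.blocks
                and len(self.blocks[-1]) < self.capacity):
            # Baseline: pack vertices purely in arrival order.
            target = len(self.blocks) - 1
        if target is None:
            # No suitable block found: open a new one.
            self.blocks.append(set())
            target = len(self.blocks) - 1
        self.blocks[target].add(v)
        self.block_of[v] = target

    def add_edge(self, u, v):
        """Apply one streaming edge insertion, updating the layout incrementally."""
        self._place(u, near=v)
        self._place(v, near=u)
        self.adj[u].add(v)
        self.adj[v].add(u)

    def traversal_block_loads(self, start):
        """BFS from start; count block loads assuming a one-block buffer."""
        seen = {start}
        queue = deque([start])
        loads, current = 0, None
        while queue:
            v = queue.popleft()
            b = self.block_of[v]
            if b != current:
                loads += 1
                current = b
            for w in sorted(self.adj[v]):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return loads


def build(edges, locality):
    """Build a store from an edge stream (helper for the example below)."""
    store = BlockStore(capacity=4, locality=locality)
    for u, v in edges:
        store.add_edge(u, v)
    return store


# Two components whose edges arrive interleaved in the stream.
stream = [("a", "b"), ("x", "y"), ("b", "c"), ("y", "z")]
aware_loads = build(stream, locality=True).traversal_block_loads("a")
naive_loads = build(stream, locality=False).traversal_block_loads("a")
```

In this toy stream, the locality-aware placement keeps `a`, `b`, and `c` in one block, so the traversal from `a` loads 1 block, whereas arrival-order packing spreads them over two blocks and loads 2. A real system would also need to handle deletions and block overflow, which this sketch omits.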