Browsing by Subject "Data replication"

Open Access
Data replication versus timing constraints in distributed database systems
(ACM, 1994-03) Ulusoy, Özgür
In a database system supporting a real-time application, each transaction is associated with a timing constraint, typically in the form of a deadline. Replicated database systems possess desirable features for real-time applications, such as a high level of data availability, and potentially improved response time for queries. On the other hand, multiple copy updates lead to a considerable overhead due to the communication required among the data sites holding the copies. In this paper, we investigate the impact of storing multiple copies of data on satisfying the timing constraints of real-time transactions. A detailed performance model of a distributed database system is employed in evaluating the effects of various workload parameters and design alternatives on the system performance. The performance is expressed in terms of the fraction of satisfied transaction deadlines.
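The performance measure named in this abstract, the fraction of satisfied transaction deadlines, can be illustrated with a minimal sketch. The function name, the tuple layout, and the sample values below are hypothetical and are not taken from the paper; the sketch only assumes that each transaction has a finish time and a deadline on the same time scale.

```python
def deadline_success_ratio(transactions):
    """Return the fraction of transactions that finished by their deadline.

    `transactions` is an iterable of (finish_time, deadline) pairs.
    This layout is an assumption made for illustration only.
    """
    transactions = list(transactions)
    if not transactions:
        return 0.0  # no transactions: report 0.0 rather than divide by zero
    met = sum(1 for finish, deadline in transactions if finish <= deadline)
    return met / len(transactions)


# Hypothetical sample: three of the four transactions meet their deadlines.
sample = [(4.0, 5.0), (7.5, 6.0), (3.0, 3.0), (9.0, 12.0)]
ratio = deadline_success_ratio(sample)
print(ratio)
```

A transaction that finishes exactly at its deadline is counted as satisfied here; whether the paper treats that boundary case the same way is not stated in the abstract.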
Open Access
Exploiting replicated data for communication load balancing in image-space parallel direct volume rendering of unstructured grids
(2009) Okuyan, Erkan
The focus of this work is on parallel volume rendering applications in which renderings with different parameters are successively repeated over the same dataset. The only reason for inter-task interaction is the existence of data primitives that are inputs to several tasks. Both computational structure and expected task execution times may change during successive rendering instances. Change in computational structure means change in the data primitive requirements of tasks. Since the individual processors of a parallel system have a limited storage capacity, we can reserve a limited amount of storage for holding replicas at each processor. For the parallelization of a particular rendering instance, the remapping model should utilize the replication pattern of the previous rendering instance(s) for reducing the communication overhead due to the data replication requirement of the current rendering instance. We propose a two-phase model for solving this problem. The hypergraph-partitioning-based model proposed for the first phase aims to minimize the total message volume that will be incurred due to the replication/migration of input data while maintaining balance on computational and receive-volume loads of processors. The network-flow-based model proposed for the second phase aims to minimize the maximum message volume handled by processors via utilizing the flexibility in assigning send-communication tasks to processors, which is introduced by data replication. The validity of our proposed model is verified on image-space parallelization of a direct volume rendering algorithm.
Open Access
Processing real-time transactions in a replicated database system
(Springer/Kluwer Academic Publishers, 1994) Ulusoy, Özgür
A database system supporting a real-time application has to provide real-time information to the executing transactions. Each real-time transaction is associated with a timing constraint, typically in the form of a deadline. It is difficult to satisfy all timing constraints due to the consistency requirements of the underlying database. The goal in scheduling the transactions is to process as many of them as possible within their deadlines. Replicated database systems possess desirable features for real-time applications, such as a high level of data availability, and potentially improved response time for queries. On the other hand, multiple copy updates lead to a considerable overhead due to the communication required among the data sites holding the copies. In this paper, we investigate the impact of storing multiple copies of data on satisfying the timing constraints of real-time transactions. A detailed performance model of a distributed database system is employed in evaluating the effects of various workload parameters and design alternatives on the system performance. The performance is expressed in terms of the fraction of satisfied transaction deadlines. The performance experiments also compare several real-time concurrency control protocols, which differ in how they take the timing constraints of transactions into account in scheduling. © 1994 Kluwer Academic Publishers.
Open Access
Replicated hypergraph partitioning
(2010) Selvitopi, Reha Oğuz
Hypergraph partitioning has recently been used in distributed information retrieval (IR) and spatial databases to correctly capture the communication and disk access costs. In the hypergraph models for these areas, the quality of the partitions obtained using hypergraph partitioning can be crucial for the objective of the targeted problem. Replication is widely used to address different performance issues in distributed IR and database systems. The main motivation behind replication is to improve the performance of the targeted issue at the cost of using more space. In this work, we focus on replicated hypergraph partitioning schemes that improve the quality of hypergraph partitioning by vertex replication. To this end, we propose a replicated partitioning scheme where replication and partitioning are performed in conjunction. Our approach utilizes successful multilevel and recursive bipartitioning methodologies for hypergraph partitioning. The replication is achieved in the uncoarsening phase of the multilevel methodology by extending the efficient Fiduccia-Mattheyses (FM) iterative improvement heuristic. We call this extended heuristic replicated FM (rFM). The proposed rFM heuristic supports move, replication and unreplication operations on the vertices by introducing new algorithms and vertex states. We show that rFM has the same complexity as FM and integrate the proposed replication scheme into the multilevel hypergraph partitioning tool PaToH. We test the proposed replication scheme on realistic datasets and obtain promising results.
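The idea that vertex replication can improve partition quality can be illustrated with a small sketch. This is not the rFM heuristic and does not use PaToH. It assumes a simplified cut-net count in which a net is uncut when at least one part holds a copy of every one of its pins, and the function name, the tiny hypergraph, and the placements are all hypothetical.

```python
def cut_nets(nets, placement):
    """Count nets that no single part can serve locally.

    `nets` is a list of pin tuples; `placement` maps each vertex to the
    set of parts holding a copy of it. Both layouts are assumptions made
    for this illustration only.
    """
    cut = 0
    for pins in nets:
        # Parts that hold a copy of every pin of this net.
        common = set.intersection(*(placement[v] for v in pins))
        if not common:
            cut += 1
    return cut


# Hypothetical 4-vertex hypergraph split into parts 0 and 1.
nets = [("a", "b"), ("b", "c"), ("c", "d")]
placement = {"a": {0}, "b": {0}, "c": {1}, "d": {1}}
before = cut_nets(nets, placement)

# Replicate vertex "b" into part 1 at the cost of extra storage.
replicated = dict(placement, b={0, 1})
after = cut_nets(nets, replicated)

print(before, after)
```

In this toy case the net connecting "b" and "c" is the only one that spans both parts, and placing a second copy of "b" in part 1 removes it from the cut. The trade-off mentioned in the abstract still applies: the improvement is paid for with additional space.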
Open Access
Research issues in peer-to-peer data management
(IEEE, 2007-11) Ulusoy, Özgür
Data management in Peer-to-Peer (P2P) systems is a complicated and challenging issue due to the scale of the network and highly transient population of peers. In this paper, we identify important research problems in P2P data management, and describe briefly some methods that have appeared in the literature addressing those problems. We also discuss some open research issues and directions regarding data management in P2P systems. ©2007 IEEE.