Browsing by Subject "Big Data"
Item Open Access Application of map/reduce paradigm in supercomputing systems (Bilkent University, 2013)
Demirci, Gündüz Vehbi
Map/Reduce is a framework first introduced by Google to rapidly develop big data analytics applications on distributed computing systems. Even though the Map/Reduce paradigm had a game-changing impact on certain fields of computer science such as information retrieval and data mining, it has not yet had a comparable impact on the scientific computing domain. Current implementations of Map/Reduce are designed specifically for commodity PC clusters, where failures of compute nodes are common and inter-processor communication is slow. However, scientific computing applications are usually executed on high performance computing (HPC) systems, which provide high communication bandwidth and low message latency and where processor failures are rare. Therefore, such Map/Reduce frameworks cause performance degradation and are less preferable in the scientific computing domain. For these reasons, implementations of the Map/Reduce paradigm tailored to the scientific computing domain are needed. Among the existing implementations, we focus our attention on the MapReduce-MPI (MR-MPI) library developed at Sandia National Labs. In this thesis, we argue that by using the MR-MPI library, the Map/Reduce programming paradigm can be applied successfully to scientific computing applications that require scalability and performance. We tested the MR-MPI library on HPC systems with several fundamental algorithms that are frequently used in the scientific computing and data mining domains. The implemented algorithms include all-pair-similarity-search (APSS), all-pair-shortest-path (APSP), and PageRank (PR).
Tests were performed on the well-known large-scale HPC systems IBM BlueGene/Q (Juqueen) and Cray XE6 (Hermit) to examine the scalability and speedup of these algorithms.

Item Open Access High throughput udp-based peer-to-peer secure data transfer (Bilkent University, 2018-05)
Doğan, Fadime Tuğba
High throughput sequencing (HTS) platforms have been developed in recent years. These technologies enable researchers to answer a wide range of biological questions by obtaining whole or targeted segments of the genomes of individuals. However, HTS technologies generate very large amounts of data. Even with the best compression algorithms, the data remain very large because the original files are large. As most contributors to genome projects are located in different countries, transferring the data becomes an important problem in genomics. Currently used methods for genome data sharing are transferring the files via File Transfer Protocol (FTP), the Tsunami protocol, or Aspera software; storing them in public databases or clouds; working on files stored on central servers; and circulating external hard disks. However, all of these methods have drawbacks in cost, speed, or privacy. In this thesis, to address this problem, we introduce an application called BioPeer. BioPeer uses an open source UDP-based UDT protocol written by Barchart, Inc. for data transfer. We implement a peer-to-peer file sharing architecture in BioPeer. This architecture is similar to BitTorrent, where large files are transferred in chunks and synchronized between peers within the same project. To ensure that every client can connect to other clients, we employ NAT traversal via the UDP hole punching method, so users behind NAT devices can send data to and receive data from other peers. To provide secure file transfer, BioPeer encrypts files using the Advanced Encryption Standard (AES) cipher. Symmetric encryption keys are exchanged via the RSA (Rivest-Shamir-Adleman) algorithm.
Additionally, a content distribution network (CDN) infrastructure is implemented in order to achieve high throughput with BioPeer.

Item Open Access Social TV ratings: a multi-case analysis from Turkish television industry (Bilkent University, 2016-05)
Temel, Erdem Akın
In recent years, the viewing habits of TV viewers and television itself have changed significantly thanks to the integration of exponentially developing web technologies into continuously evolving mobile devices. Televised content became digitized and freed from time and space, while public expression became available via social media in a form unbound by time and space. This integration and its ever-growing outcomes came to be called Social TV, which includes dialogues among viewers and/or producers, social media-based ratings, screen interactions, and analyses of user-created content both in numbers and in relation to contexts. Academic definitions seem insufficient to capture the general scheme of Social TV. Thus, an important part of this thesis aims to offer a comprehensive definition of this newly developed interaction cluster. Moreover, this thesis argues that Social TV ratings are complementary to traditional set-top-box rating systems and even have the potential to replace them in the future. To support this argument, the historical background of Turkish Social TV is provided, including its current state, along with a detailed discussion of the pros and cons of Social TV ratings compared with traditional rating systems.