High throughput UDP-based peer-to-peer secure data transfer

Date
2018-05
Advisor
Alkan, Can
Publisher
Bilkent University
Language
English
Abstract

High-throughput sequencing (HTS) platforms have been developed in recent years. These technologies enable researchers to answer a wide range of biological questions by obtaining whole or targeted segments of the genomes of individuals. However, HTS technologies generate very large amounts of data. Even with the best compression algorithms, the data remain huge because the original files are so large. As the contributors to most genome projects are located in different countries, data transfer becomes an important problem in genomics. Current methods for sharing genome data are transferring the files via File Transfer Protocol (FTP), the Tsunami protocol, or Aspera software; storing them in public databases or clouds; working on files stored on central servers; and circulating external hard disks. However, all of these methods have drawbacks such as cost, speed, or privacy. In this thesis, to address this problem, we introduce an application called BioPeer. BioPeer uses an open-source, UDP-based UDT protocol implementation written by Barchart, Inc. for data transfer. We implement a peer-to-peer file-sharing architecture in BioPeer. This architecture is similar to BitTorrent: large files are transferred in chunks and synchronized between peers within the same project. To ensure that every client can connect to other clients, we employ NAT traversal via the UDP hole punching method, so users behind NAT devices can send data to and receive data from other peers. To provide secure file transfer, BioPeer encrypts files using the Advanced Encryption Standard (AES) cipher. Symmetric encryption keys are exchanged via the RSA (Rivest-Shamir-Adleman) algorithm. Additionally, a content distribution network (CDN) infrastructure is implemented in order to achieve high throughput with BioPeer.
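
The chunked, BitTorrent-like transfer described in the abstract can be illustrated with a minimal sketch. This is not BioPeer's actual code: the `Chunk` type, the function names, the chunk size, and the use of SHA-256 digests for integrity checking are all assumptions made for illustration, and the sketch omits the UDT transport, NAT traversal, and AES/RSA encryption entirely.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Chunk:
    """One piece of a larger file (hypothetical structure, not BioPeer's)."""
    index: int   # position of the chunk within the file
    data: bytes  # raw chunk contents
    digest: str  # SHA-256 hex digest of data, used to detect corruption


def split_into_chunks(payload: bytes, chunk_size: int) -> List[Chunk]:
    """Split payload into fixed-size chunks; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    for offset in range(0, len(payload), chunk_size):
        piece = payload[offset:offset + chunk_size]
        chunks.append(
            Chunk(offset // chunk_size, piece, hashlib.sha256(piece).hexdigest())
        )
    return chunks


def missing_chunks(total: int, have: Set[int]) -> List[int]:
    """Return the chunk indices a peer still needs to request from others."""
    return [i for i in range(total) if i not in have]


def reassemble(chunks: Iterable[Chunk]) -> bytes:
    """Verify every chunk, then join them in index order.

    Chunks may arrive in any order, as they would from several peers.
    """
    ordered = sorted(chunks, key=lambda c: c.index)
    for chunk in ordered:
        if hashlib.sha256(chunk.data).hexdigest() != chunk.digest:
            raise ValueError(f"chunk {chunk.index} failed its integrity check")
    return b"".join(chunk.data for chunk in ordered)
```

In this sketch a receiving peer would call `missing_chunks` to decide what to request next and `reassemble` once every index is present; a real system would also need to obtain the expected digests from a trusted source rather than from the sending peer.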
