Dept. of Computer Engineering - Ph.D. / Sc.D.

Permanent URI for this collection

https://hdl.handle.net/11693/13860

Open Access
Towards modeling and mitigating misinformation propagation in online social networks
(Bilkent University, 2023-01) Yılmaz, Tolga
Misinformation on the internet and social media has become a pressing concern due to its potential impacts on society, undermining trust and impacting human decisions on global issues such as health, energy, politics, terrorism, and disasters. As a solution to the problem, computational methods have been employed to detect and mitigate the spread of false or misleading information. These efforts have included the development of algorithms to identify fake news and troll accounts, as well as research on the dissemination of misinformation on social media platforms. However, the problem of misinformation on the web and social networks remains a complex and ongoing challenge, requiring continued attention and research. We contribute to three different solution aspects of the problem. First, we design and implement an extensible social network simulation framework called Crowd that helps model, simulate, visualize and analyze social network scenarios. Second, we gamify misinformation propagation as a cooperative game between nodes and identify how misinformation spreads under various criteria. Then, we design a network-level game where the nodes are controlled from a higher perspective. In this game, we train and test a deep reinforcement learning method based on Multi-Agent Deep Deterministic Policy Gradients and show that our method outperforms well-known node-selection algorithms, such as page-rank, centrality, and CELF, over various social networks in defending against misinformation or participating in it. Finally, we promote and propose a blockchain and deep learning hybrid approach that utilizes crowdsourcing to target the misinformation problem while providing transparency, immutability, and validity of votes. We provide the results of extensive simulations under various combinations of well-known attacks on reputation systems and a case study that compares our results with a current study on Twitter.
Open Access
Resource optimization of multi-purpose IoT wireless sensor networks with shared monitoring points
(Bilkent University, 2022-11) Çavdar, Mustafa Can
Wireless sensor networks (WSNs) have many applications and are an essential part of IoT systems. The primary functionality of a WSN is to gather data from certain points that are covered with sensor nodes and transmit the collected data to remote central units for further processing. In IoT use cases, a WSN infrastructure may need to be shared by many applications. Moreover, the data gathered from a certain point or sub-region can satisfy the needs of multiple applications. Hence, sensing the data once in such cases is advantageous to increase the acceptance ratio of the applications and reduce waiting times of applications, makespan, energy consumption, and traffic in the network. We call this the monitoring point-based shared data approach. In this thesis, we focus on both placement and scheduling of the applications, each of which requires some points in the area a WSN covers to be monitored. We propose genetic algorithm-based approaches to deal with these two problems. Additionally, we propose greedy algorithms that will be useful where fast decision-making is required. We conducted extensive simulation experiments and compared our algorithms with methods from the literature. The results show the effectiveness of our algorithms in terms of various metrics.
Open Access
Novel algorithms and models for scaling parallel sparse tensor and matrix factorizations
(Bilkent University, 2022-07) Abubaker, Nabil F. T.
Two important and widely-used factorization algorithms, namely CPD-ALS for sparse tensor decomposition and distributed stratified SGD for low-rank matrix factorization, suffer from limited scalability. In CPD-ALS, the computational load associated with a tensor/subtensor assigned to a processor is a function of the nonzero counts as well as the fiber counts of the tensor when the CSF storage is utilized. The tensor fibers fragment as a result of nonzero distributions, which makes balancing the computational loads a hard problem. Two strategies are proposed to tackle the balancing problem on an existing fine-grain hypergraph model: a novel weighting scheme to cover the cost of fibers in the true load as well as an augmentation to the hypergraph with fiber nets to encode reducing the increase in computational load. CPD-ALS also suffers from high latency overhead due to the high number of point-to-point messages incurred as the processor count increases. A framework is proposed to limit the number of messages to O(log₂ K), for a K-processor system, exchanged in log₂ K stages. A hypergraph-based method is proposed to encapsulate the communication of the new log₂ K-stage algorithm. In the existing stratified SGD implementations, the volume of communication is proportional to one of the dimensions of the input matrix and prohibits scalability. Exchanging the essential data necessary for the correctness of the SSGD algorithm as point-to-point messages is proposed to reduce the volume. This, although invaluable for reducing the bandwidth overhead, would increase the upper bound on the number of exchanged messages from O(K) to O(K²), rendering the algorithm latency-bound. A novel Hold-and-Combine algorithm is proposed to exchange the essential communication volume with up to O(K log K) messages. Extensive experiments on HPC systems demonstrate the importance of the proposed algorithms and models in scaling CPD-ALS and stratified SGD.
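As an illustration of how point-to-point traffic can be confined to a logarithmic number of stages, the following Python sketch simulates a generic hypercube store-and-forward exchange: in stage d, processor p talks only to partner p XOR 2^d and forwards, in one combined message, everything whose destination differs from p in bit d. This is a textbook scheme chosen for illustration and is an assumption; the abstract does not specify the thesis framework, which may differ. The function name `hypercube_exchange` and the message format are hypothetical.

```python
from collections import defaultdict


def hypercube_exchange(K, messages):
    """Route messages among K processors in log2(K) store-and-forward stages.

    messages: list of (src, dst, payload) tuples with 0 <= src, dst < K.
    Returns (delivered, sends): delivered[p] is the sorted list of
    (src, payload) pairs that ended up at processor p, and sends[p] is the
    number of combined messages processor p sent over all stages.
    """
    stages = K.bit_length() - 1
    if K < 1 or (1 << stages) != K:
        raise ValueError("K must be a power of two")

    # held[p] holds the messages currently buffered at processor p.
    held = defaultdict(list)
    for src, dst, payload in messages:
        held[src].append((src, dst, payload))

    sends = [0] * K
    for d in range(stages):
        bit = 1 << d
        next_held = defaultdict(list)
        for p in range(K):
            outgoing = []
            for msg in held[p]:
                # Forward when the destination differs from p in bit d.
                if (msg[1] ^ p) & bit:
                    outgoing.append(msg)
                else:
                    next_held[p].append(msg)
            if outgoing:
                # Everything forwarded in this stage is one combined message.
                sends[p] += 1
                next_held[p ^ bit].extend(outgoing)
        held = next_held

    delivered = {p: sorted((s, pl) for s, _, pl in held[p]) for p in range(K)}
    return delivered, sends
```

With K = 8 and an all-to-all pattern, each processor sends at most 3 combined messages instead of 7 direct ones, at the cost of forwarding data on behalf of other processors.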
Open Access
Fast compound graph layout with constraint support
(Bilkent University, 2022-08) Balcı, Hasan
Visual analysis of relational data becomes more challenging in today's world as the amount of data increases exponentially. Effective visual display of such data is therefore a key requirement to simplify the analysis process. Compound graphs present a practical structure for both representing the relational data with varying levels of groupings or abstractions and managing its complexity. In addition, a good automatic layout of these graphs lets users understand relationships, uncover new insights and find important patterns hidden in the data. To this end, we introduce a new layout algorithm named fCoSE (fast Compound Spring Embedder) for compound graphs with support for user-specified placement constraints. fCoSE combines the speed of spectral layout with the aesthetics and quality of force-directed layout while satisfying specified constraints and properly displaying compound structures. The algorithm first generates a draft layout with the help of a spectral approach, then enforces placement constraints by using newly introduced heuristics and finally polishes the layout via a force-directed layout algorithm modified to maintain enforced constraints. Our experiments performed on both real-life and randomly generated graphs verify that fCoSE outperforms its competitors in terms of both speed and generally accepted graph layout criteria and is fast enough to be used in interactive applications with small to medium-sized graphs.
Open Access
Rendering three-dimensional scenes with tetrahedral meshes
(Bilkent University, 2022-07) Aman, Aytek
We propose compact and efficient tetrahedral mesh representations to improve the ray-tracing performance. We reorder tetrahedral mesh data using a space-filling curve to improve cache locality. Most importantly, we propose efficient ray traversal algorithms. We provide details of the regular ray tracing operations on tetrahedral meshes and the Graphics Processing Unit (GPU) implementation of our traversal method. We demonstrate our findings through a set of comprehensive experiments. Our method outperforms existing tetrahedral mesh-based traversal methods and yields comparable results to the traversal methods based on the state-of-the-art acceleration structures such as k-dimensional (k-d) tree and Bounding Volume Hierarchy (BVH) in terms of speed. Storage-wise, our method uses less memory than its tetrahedral mesh-based counterparts, thus allowing larger scenes to be rendered on the GPU. We also describe additional applications of our technique specifically for volume rendering, two-level hybrid acceleration structures for animation purposes, and point queries in two-dimensional (2-D) and three-dimensional (3-D) triangulations. Finally, we present a practical method to tetrahedralize very large scenes.
Open Access
Collective data forecasting in dynamic transport networks
(Bilkent University, 2021-09) Güvercin, Mehmet
Forecasting is a crucial tool for intelligent transportation systems and their passengers, and is critical for transportation planning and management, as transportation variables (e.g., delay, traffic speed) are among the major costs in transportation. Each transportation variable may propagate further in a dynamic transport network. Hence, the transportation variable pattern of a node and the location of the node in the transport network can provide useful information for other nodes. We address the problem of forecasting a transportation variable of a transport network node, utilizing the network information as well as the transportation variable patterns of similar nodes in the network. We propose ECFM, Exploratory Clustered Forecasting Modeling, on both static and dynamic transportation networks, which makes use of graph-based features for time-series estimation. The ECFM approach builds a representative time-series for each group of nodes in the transport network and fits a common model such as Seasonal Autoregressive Integrated Moving Average (SARIMA), Long Short-Term Memory (LSTM), Regression with Autoregressive Integrated Moving Average errors (REG-ARIMA), or Regression with Long Short-Term Memory errors (REG-LSTM) for each, using the network-based features as regressors. The models are then applied individually to each node's data for predicting the node’s transportation variable. We perform a network-based analysis of the transport network, identify graph-based features, and represent nodes as vectors that are used both for grouping nodes and as regressors in forecasting models. We evaluate the proposed ECFM on two datasets (a flight delay dataset and a traffic speed dataset). The experiments show that ECFM provides accurate forecasts of delays and traffic speeds compared to individual forecasting models. Centrality measures of nodes, such as the betweenness centrality score, are found to be effective regressors in the clustered modeling.
Clustered models built on dynamic networks perform better compared to those built on static networks. ECFM is a conceptual approach and is domain independent. Our proposed approach incorporates information related to the estimated variable that exists in similar nodes of the network. Thus, we can build robust estimation models on enriched data.
Open Access
Towards deeply intelligent interfaces in relational databases
(Bilkent University, 2021-08) Usta, Arif
Relational databases are one of the most popular and broadly utilized infrastructures to store data in a structured fashion. In order to retrieve data, users have to phrase their information need in Structured Query Language (SQL). SQL is a powerfully expressive and flexible language, yet one has to know the schema underlying the database on which the query is issued and to be familiar with SQL syntax, which is not trivial for casual users. To this end, we propose two different strategies to provide more intelligent user interfaces to relational databases by utilizing deep learning techniques. As the first study, we propose a solution for keyword mapping in Natural Language Interfaces to Databases (NLIDB), which aims to translate Natural Language Queries (NLQs) to SQL. We define the keyword mapping problem as a sequence tagging problem, and propose a novel deep learning based supervised approach that utilizes part-of-speech (POS) tags of NLQs. Our proposed approach, called DBTagger (DataBase Tagger), is an end-to-end and schema independent solution. The query recommendation paradigm, a well-known strategy broadly utilized in Web search engines, is helpful to suggest queries of expert users to casual users to help them with their information need. As the second study, we propose Conquer, a CONtextual QUEry Recommendation algorithm on relational databases exploiting deep learning. First, we train local embeddings of a database using Graph Convolutional Networks to extract distributed representations of the tuples in latent space. We represent SQL queries with a semantic vector by averaging the embeddings of the tuples returned as a result of the query. We employ cosine similarity over the final representations of the queries to generate recommendations, as a Witness-Based approach. Our results show that in classification accuracy of database rows as an indicator for embedding quality, Conquer outperforms state-of-the-art techniques.
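The last two steps of the abstract above (averaging tuple embeddings into a query vector, then ranking by cosine similarity) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Conquer implementation: the embeddings are taken as given (the thesis trains them with Graph Convolutional Networks), and the function names `query_vector`, `cosine_similarity`, and `recommend` are hypothetical.

```python
import math


def query_vector(tuple_embeddings):
    """Average the embeddings of the tuples a query returned."""
    if not tuple_embeddings:
        raise ValueError("query returned no tuples")
    n = len(tuple_embeddings)
    dim = len(tuple_embeddings[0])
    return [sum(vec[i] for vec in tuple_embeddings) / n for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(current, logged, top_k=3):
    """Rank logged queries (name -> vector) by similarity to `current`."""
    scored = [(cosine_similarity(current, vec), name)
              for name, vec in logged.items()]
    # Highest similarity first; ties broken by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]
```

For example, `recommend(query_vector(rows), logged_queries)` would return the names of the logged queries whose result sets lie closest to the current query's result set in the embedding space.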
Open Access
Hypergraph partitioning and reordering for parallel sparse triangular solves and tensor decomposition
(Bilkent University, 2021-07) Torun, Tuğba
Several scientific and real-world problems require computations with sparse matrices, or more generally, sparse tensors which are multi-dimensional arrays. For sparse matrix computations, parallelization of sparse triangular systems introduces significant challenges because of the sequential nature of the computations involved. One approach to parallelize sparse triangular systems is to use the sparse triangular SPIKE (stSPIKE) algorithm, which was originally proposed for shared memory architectures. stSPIKE decouples the problem into independent smaller systems and requires the solution of a much smaller reduced sparse triangular system. We extend and implement stSPIKE for distributed-memory architectures. Then we propose distributed-memory parallel Gauss-Seidel (dmpGS) and ILU (dmpILU) algorithms by means of stSPIKE. Furthermore, we propose novel hypergraph partitioning models and in-block reordering methods for minimizing the size and nonzero count of the reduced systems that arise in dmpGS and dmpILU. For sparse tensor computations, tensor decomposition is widely used in the analysis of multi-dimensional data. The canonical polyadic decomposition (CPD) is one of the most popular tensor decomposition methods, which is commonly computed by the CPD-ALS algorithm. Due to the high computational and memory demands of CPD-ALS, it is inevitable to use a distributed-memory-parallel algorithm for efficiency. The medium-grain CPD-ALS algorithm, which adopts multi-dimensional cartesian tensor partitioning, is one of the most successful distributed CPD-ALS algorithms for sparse tensors. We propose a novel hypergraph partitioning model, CartHP, whose partitioning objective correctly encapsulates the minimization of total communication volume of multi-dimensional cartesian tensor partitioning.
Extensive experiments on real-world sparse matrices and tensors validate the parallel scalability of the proposed algorithms as well as the effectiveness of the proposed hypergraph partitioning and reordering models.
Open Access
On the tradeoff between privacy and utility in genomic studies: differential privacy under dependent tuples
(Bilkent University, 2020-08) Alserr, Nour M. N.
The rapid progress in genome sequencing and the decrease in the sequencing costs have led to the high availability of genomic data. Studying these data can greatly help answer the key questions about disease associations and our evolution. However, due to growing privacy concerns about the sensitive information of participants, accessing key results and data of genomic studies (such as genome-wide association studies, GWAS) is restricted to only trusted individuals. On the other hand, paving the way to biomedical breakthroughs and discoveries requires granting open access to genomic datasets. Privacy-preserving mechanisms can be a solution for granting wider access to such data while protecting their owners. In particular, there has been growing interest in applying the concept of differential privacy (DP) while sharing summary statistics about genomic data. DP provides a mathematically rigorous approach to prevent the risk of membership inference while sharing statistical information about a dataset. However, DP has a known drawback as it does not take into account the correlation between dataset tuples, which is a common situation for genomic datasets due to the inherent correlations between the genomes of family members. This may degrade the privacy guarantees offered by DP. In this thesis, focusing on static and dynamic genomic datasets, we show this drawback of DP and we propose techniques to mitigate it. First, using a real-world genomic dataset, we demonstrate the feasibility of an attribute inference attack on differentially private query results by utilizing the correlations between the entries in the dataset. We show the privacy loss in count, minor allele frequency (MAF), and chi-square queries. The results explain the scale of vulnerability when we have dependent tuples in the dataset.
Our results demonstrate that the adversary can infer sensitive genomic data about a user from the differentially private results of a sum query by exploiting the correlations between the genomes of family members. Our results also show that using the results of differentially private MAF queries on static and dynamic genomic datasets and utilizing the dependency between tuples, an adversary can reveal up to 50% more sensitive information about the genome of a target (compared to original privacy guarantees of standard DP-based mechanisms), while differentially private chi-square queries can reveal up to 40% more sensitive information. Furthermore, we show that the adversary can use the inferred genomic data obtained from the attribute inference attack to infer the membership of a target in another genomic dataset (e.g., associated with a sensitive trait). Using a log-likelihood-ratio (LLR) test, our results also show that the inference power of the adversary can be significantly high in such an attack even by using inferred (and hence partially incorrect) genomes. Finally, we propose a mechanism for privacy-preserving sharing of statistics from genomic datasets to attain privacy guarantees while taking into consideration the dependence between tuples. By evaluating our mechanism on different genomic datasets, we empirically demonstrate that our proposed mechanism can achieve up to 50% better privacy than traditional DP-based solutions.
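For readers unfamiliar with the baseline being critiqued, the sketch below shows the standard Laplace mechanism for a count query in Python. It is the textbook DP mechanism, which assumes independent tuples, and is not the dependency-aware mechanism proposed in the thesis; the function names and the toy genotype encoding are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def dp_count(records, predicate, epsilon, rng=None):
    """Release a count with epsilon-DP under the independent-tuple assumption.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical toy data: 1 means the participant carries the minor allele.
genotypes = [0, 1, 1, 0, 1]
noisy = dp_count(genotypes, lambda g: g == 1, epsilon=0.5,
                 rng=random.Random(42))
```

When records are correlated, as with family members, changing one person's genome can shift the expected values of several records at once, so a sensitivity of 1 understates the real leakage; that gap is what the attacks described above exploit.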
Open Access
Deep learning for digital pathology
(Bilkent University, 2020-11) Sarı, Can Taylan
Histopathological examination is today’s gold standard for cancer diagnosis and grading. However, this task is time consuming and prone to errors as it requires detailed visual inspection and interpretation of a histopathological sample provided on a glass slide under a microscope by an expert pathologist. Low-cost and high-technology whole slide digital scanners produced in recent years have eliminated the disadvantages of physical glass slide samples by digitizing histopathological samples and relocating them to digital media. Digital pathology aims at alleviating the problems of traditional examination approaches by providing auxiliary computerized tools that quantitatively analyze digitized histopathological images. Traditional machine learning methods have proposed to extract handcrafted features from histopathological images and to use these features in the design of a classification or a segmentation algorithm. The performance of these methods mainly relies on the features that they use, and thus, their success strictly depends on the ability of these features to successfully quantify the histopathology domain. More recent studies have employed deep architectures to learn expressive and robust features directly from images avoiding complex feature extraction procedures of traditional approaches. Although deep learning methods perform well in many classification and segmentation problems, convolutional neural networks that they frequently make use of require annotated data for training and this makes it difficult to utilize unannotated data that cover the majority of the available data in the histopathology domain. This thesis addresses the challenges of traditional and deep learning approaches by incorporating unsupervised learning into classification and segmentation algorithms for feature extraction and training regularization purposes in the histopathology domain. 
As the first contribution of this thesis, the first study presents a new unsupervised feature extractor for effective representation and classification of histopathological tissue images. This study introduces a deep belief network to quantize the salient subregions, which are identified with domain-specific prior knowledge, by extracting a set of features directly learned on image data in an unsupervised way and uses the distribution of these quantizations for image representation and classification. As its second contribution, the second study proposes a new regularization method to train a fully convolutional network for semantic tissue segmentation in histopathological images. This study relies on the benefit of unsupervised learning, in the form of image reconstruction, for network training. To this end, it puts forward the idea of defining a new embedding, which is generated by superimposing an input image on its segmentation map, that allows uniting the main supervised task of semantic segmentation and an auxiliary unsupervised task of image reconstruction into a single one and proposes to learn this united task by a generative adversarial network. We compare our classification and segmentation methods with traditional machine learning methods and the state-of-the-art deep learning algorithms on various histopathological image datasets. Visual and quantitative results of our experiments demonstrate that the proposed methods are capable of learning robust features from histopathological images and provide more accurate results than their counterparts.
Open Access
Custom hardware optimizations for reliable and high performance computer architectures
(Bilkent University, 2020-09) Ahangari, Hamzeh
In recent years, we have witnessed a huge wave of innovations, such as in Artificial Intelligence (AI) and Internet-of-Things (IoT). In this trend, software tools are constantly and increasingly demanding more processing power, which can no longer be met by traditional processors. In response to this need, a diverse range of hardware, including GPUs, FPGAs, and AI accelerators, is coming to the market every day. On the other hand, while hardware platforms are becoming more power-hungry due to higher performance demands, the concurrent reduction in transistor size and the strong emphasis on reducing the voltage have been sources of reliability concerns in circuits. This is particularly applicable to error-sensitive applications, such as those in the transportation and aviation industries, where an error can be catastrophic. The reliability issues may have other causes too, such as harsh environmental conditions. These two problems of modern electronic circuits, namely the need for higher performance and reliability at the same time, require appropriate solutions. In order to satisfy both the performance and the reliability constraints, either designs based on reconfigurable circuits, such as FPGAs, or designs based on Commercial-Off-The-Shelf (COTS) components, like general-purpose processors, can be an appropriate approach because these platforms can be used in a wide variety of applications. In this regard, three solutions have been proposed in this thesis. These solutions target 1) safety and reliability at the system-level using redundant processors, 2) performance at the architecture-level using multiple accelerators, and 3) reliability at the circuit-level through the use of redundant transistors. Specifically, in the first work, the contribution of some prevalent parameters in the design of safety-critical computers, using COTS processors, is discussed.
Redundant architectures are modeled by Markov chains, and the sensitivity of system safety to parameters has been analyzed. Most importantly, the significant presence of Common Cause Failures (CCFs) has been investigated. In the second work, the design and implementation of an HLS-based, FPGA-accelerated, high-throughput/work-efficient, synthesizable template-based graph processing framework has been presented. The template framework is simplified for easy mapping to FPGA, even for software programmers. The framework is evaluated in particular on Intel's state-of-the-art Xeon+FPGA platform to implement iterative graph algorithms. Besides the high-throughput pipeline, the work-efficient mode significantly reduces total graph processing run-time with a novel active-list design. In the third work, the Joint SRAM (JSRAM) cell, a novel circuit-level technique to exploit the trade-off between reliability and memory size, is introduced. This idea is applicable to any SRAM structure like cache memory, register file, FPGA block RAM, or FPGA look-up table (LUT), and even latches and Flip-Flops. In fault-prone conditions, the structure can be configured in such a way that four cells are combined together at the circuit level to form one large and robust memory bit. Unlike prevalent hardware redundancy techniques, like Triple Modular Redundancy (TMR), there is no explicit majority voter at the output. The proposed solution mainly focuses on transient faults, where the reliable mode can provide auto-correction and full immunity against single faults.
Open Access
Server and wireless network resource allocation strategies in heterogeneous cloud data centers
(Bilkent University, 2020-08) Mergenci, Cem
Resource allocation is one of the most important challenges in operating a data center. We investigate allocation of two main types of resources: servers and network links. The server resource allocation problem is the problem of how to allocate virtual machines (VMs) to physical machines (PMs). By modeling server resources (CPU, memory, storage, IO, etc.) as a multidimensional vector space, we present design criteria for metrics that measure the fitness of an allocation of VMs into PMs. We propose two novel metrics that conform to these design criteria. We also propose VM allocation methods that use these metrics to compare allocation alternatives when allocating a set of VMs into a set of PMs. We compare the performance of our proposed metrics to the ones from the literature using the vector bin packing with heterogeneous bins (VBPHB) benchmark. Results show that our methods find feasible solutions to a greater number of allocation problems than the others. The network resource allocation problem is examined in hybrid wireless data centers. We propose a system model in which each top-of-the-rack (ToR) switch is equipped with two radios operating in the 60-GHz band using 3-channel 802.11ad. Given traffic flows between servers, we allocate wireless links between ToR switches so that the traffic carried over the wireless network is maximized. We also present a method to randomly generate traffic based on a real data center traffic pattern. We evaluate the performance of our proposed traffic allocation methods using randomly generated traffic. Results show that our methods can offload a significant amount of traffic from the wired to the wireless network, while achieving low latency, high throughput, and high bandwidth utilization.
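To make the vector view of server resources concrete, the Python sketch below places VMs on heterogeneous PMs greedily, scoring each feasible PM by the dot product between the VM's demand vector and the PM's remaining capacity. This dot-product score is a generic heuristic in the spirit of the vector bin packing literature and is an assumption for illustration; it is not one of the two metrics proposed in the thesis, which the abstract does not define. The resource dimensions (CPU cores, memory in GB) and all names are hypothetical.

```python
def fits(demand, residual):
    """A VM fits only if every resource dimension fits."""
    return all(d <= r for d, r in zip(demand, residual))


def allocate(vms, pms):
    """Greedy vector bin packing with heterogeneous bins.

    vms: dict mapping VM name -> demand tuple, e.g. (cpu_cores, mem_gb).
    pms: dict mapping PM name -> capacity tuple of the same length.
    Returns a dict VM name -> PM name, or None if some VM cannot be placed.
    """
    residual = {pm: list(cap) for pm, cap in pms.items()}
    placement = {}
    # Place VMs with the largest total demand first; ties broken by name.
    for vm in sorted(vms, key=lambda name: (-sum(vms[name]), name)):
        demand = vms[vm]
        candidates = [pm for pm in residual if fits(demand, residual[pm])]
        if not candidates:
            return None
        # Prefer the PM whose remaining capacity aligns best with the demand.
        best = max(
            candidates,
            key=lambda pm: (sum(d * r for d, r in zip(demand, residual[pm])), pm),
        )
        residual[best] = [r - d for r, d in zip(residual[best], demand)]
        placement[vm] = best
    return placement
```

A CPU-heavy VM is steered toward a CPU-rich PM and a memory-heavy VM toward a memory-rich one, which is the kind of shape matching that a scalar "total load" measure cannot express.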
Open Access
Reducing communication overhead in sparse matrix and tensor computations
(Bilkent University, 2020-08) Karsavuran, Mustafa Ozan
Encapsulating multiple communication cost metrics, i.e., bandwidth and latency, is proven to be important in reducing communication overhead in the parallelization of sparse and irregular applications. The communication hypergraph model was proposed in a two-phase setting for encapsulating multiple communication cost metrics. The reduce-communication hypergraph model suffers from failing to correctly encapsulate send-volume balancing. We propose a novel vertex weighting scheme that enables part weights to correctly encode send-volume loads of processors for send-volume balancing. The model also suffers from increasing the total communication volume during partitioning. To limit this increase, we propose a method that utilizes the recursive bipartitioning (RB) paradigm and refines each bipartition by vertex swaps. For performance evaluation, we consider column-parallel SpMV, which is one of the most widely known applications in which the reduce-task assignment problem arises. Extensive experiments on 313 matrices show that, compared to the existing model, the proposed models achieve considerable improvements in all communication cost metrics. These improvements lead to an average decrease of 30 percent in parallel SpMV time on 512 processors for 70 matrices with high irregularity. We further enhance the reduce-communication hypergraph model so that it also encapsulates the minimization of the maximum number of messages sent by a processor. For this purpose, we propose a novel cutsize metric which we realize using the RB paradigm while partitioning the reduce-communication hypergraph. We also introduce a new type of net for the communication hypergraph which models decreasing the increase in the total communication volume directly with the partitioning objective. Experiments on 300 matrices show that the proposed models achieve considerable improvements in communication cost metrics which lead to better column-parallel SpMM time on 1024 processors.
We propose a hypergraph model for general medium-grain sparse tensor partitioning which does not enforce any topological constraint on the partitioning. The proposed model is based on splitting the given tensor into nonzero-disjoint component tensors. Then a mode-dependent coarse-grain hypergraph is constructed for each component tensor. A net amalgamation operation is proposed to form a composite medium-grain hypergraph from these mode-dependent coarse-grain hypergraphs to correctly encapsulate the minimization of the communication volume. We propose a heuristic which splits the nonzeros of dense slices to obtain sparse slices in component tensors. We also utilize the well-known RB paradigm to improve the quality of the splitting heuristic. We propose a medium-grain tripartite graph model with the aim of a faster partitioning at the expense of increasing the total communication volume. Parallel experiments conducted on 10 real-world tensors on up to 1024 processors confirm the validity of the proposed hypergraph and graph models.
Open Access
Towards unifying mobility datasets
(Bilkent University, 2019-12) Basık, Fuat
With the proliferation of smart phones integrated with positioning systems and the increasing penetration of Internet-of-Things (IoT) in our daily lives, mobility data has become widely available. A vast variety of mobile services and applications either have a location-based context or produce spatio-temporal records as a byproduct. These records contain information about both the entities that produce them and the environment they were produced in. Availability of such data supports smart services in areas including healthcare, computational social sciences and location-based marketing. We postulate that the spatio-temporal usage records belonging to the same real-world entity can be matched across records from different location-enhanced services. This is a fundamental problem in many applications such as linking user identities for security, understanding privacy limitations of location based services, or producing a unified dataset from multiple sources for urban planning and traffic management. Such integrated datasets are also essential for service providers to optimise their services and improve business intelligence. As such, in this work, we explore scalable solutions to link entities across two mobility datasets, using only their spatio-temporal information, to pave the road towards unifying mobility datasets. The first approach is rule-based linkage, based on the concept of k-l diversity, which we developed to capture both spatial and temporal aspects of the linkage. This model is realized by developing a scalable linking algorithm called ST-Link, which makes use of effective spatial and temporal filtering mechanisms that significantly reduce the search space for matching users. Furthermore, ST-Link utilizes sequential scan procedures to avoid random disk access and thus scales to large datasets. The second approach is similarity based linkage that proposes a mobility based representation and similarity computation for entities.
An efficient matching process is then developed to identify the final linked pairs, with an automated mechanism to decide when to stop the linkage. We scale the process with a locality-sensitive hashing (LSH) based approach that significantly reduces candidate pairs for matching. To realize the effectiveness and efficiency of our techniques in practice, we introduce an algorithm called SLIM. We evaluated our work with respect to accuracy and performance using several datasets. Experiments show that both ST-Link and SLIM are effective in practice for performing spatio-temporal linkage and can scale to large datasets. Moreover, the LSH-based scalability brings two to four orders of magnitude speedup.
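The abstract does not state which hash family SLIM uses, so the following is only a minimal sketch of how LSH can shrink the set of candidate pairs before matching. It assumes MinHash with banding over hypothetical (grid cell x, grid cell y, time window) tokens; the function names, the token encoding, and the band parameters are illustrative assumptions, not taken from the thesis. Each entity is assumed to have at least one token.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # Mersenne prime used as the modulus for the hash functions


def token_id(cell_x, cell_y, time_bucket):
    """Map a hypothetical (grid cell, time window) token to a non-negative integer."""
    return (cell_x * 73_856_093 ^ cell_y * 19_349_663 ^ time_bucket * 83_492_791) % PRIME


def make_hashers(num_hashes, seed=0):
    """Draw (a, b) coefficients for hash functions of the form (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(num_hashes)]


def minhash_signature(tokens, hashers):
    """MinHash signature: for each hash function, keep the minimum over the token set."""
    ids = [token_id(*t) for t in tokens]
    return tuple(min((a * i + b) % PRIME for i in ids) for a, b in hashers)


def candidate_pairs(left, right, num_bands=8, rows_per_band=4, seed=0):
    """Return cross-dataset pairs whose signatures agree on at least one band."""
    hashers = make_hashers(num_bands * rows_per_band, seed)
    # Each bucket keeps the entities of the left and right dataset separately.
    buckets = defaultdict(lambda: (set(), set()))
    for side, dataset in enumerate((left, right)):
        for name, tokens in dataset.items():
            sig = minhash_signature(tokens, hashers)
            for band in range(num_bands):
                start = band * rows_per_band
                key = (band, sig[start:start + rows_per_band])
                buckets[key][side].add(name)
    return {(l, r) for ls, rs in buckets.values() for l in ls for r in rs}


# Made-up toy data: u1 and v1 visit exactly the same cells in the same time windows.
left = {"u1": {(1, 2, 3), (4, 5, 6)}, "u2": {(10, 10, 1)}}
right = {"v1": {(1, 2, 3), (4, 5, 6)}, "v2": {(99, 99, 9)}}
pairs = candidate_pairs(left, right)
```

Only pairs that share a bucket are passed on to the more expensive similarity computation; entities with identical token sets always collide, while dissimilar entities collide with low probability.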
Open Access
Discovering regulatory non-coding RNA interactions
(Bilkent University, 2019-09) Olgun, Gülden
The vast majority of eukaryotic transcriptomes comprise noncoding RNAs (ncRNAs) which are not translated into proteins. Despite the accumulating evidence on the functional roles of ncRNAs, we are still far from understanding the whole spectrum of molecular functions ncRNAs can undertake and how they accomplish them. In this thesis we develop computational methods for discovering interactions among ncRNAs and tools to analyze them functionally. In the first part of the thesis, we present an integrative approach to discover long non-coding RNA (lncRNA) mediated sponge interactions where lncRNAs can indirectly regulate mRNA expression levels by sequestering microRNAs (miRNAs), and act as sponges. We conduct partial correlation analysis and kernel independence tests on patient gene expression profiles and further refine the candidate interactions with miRNA target information. We use this approach to find sponge interactions specific to breast-cancer subtypes. We find that although there are sponges common to multiple subtypes, there are also distinct subtype-specific interactions with high prognostic potential. Secondly, we develop a method to identify synergistically acting miRNA pairs. These pairs have weak or no repression on the target mRNA when they act individually, but when together they induce strong repression of their target gene expression. We test the combinations of RNA triplets using non-parametric kernel-based interaction tests. In forming the triplets to test, we consider target predictions between the miRNAs and mRNA. We apply our approach on kidney tumor samples. The discovered triplets have several lines of biological evidence on a functional association among them or their relevance to kidney tumors. In the third part of the thesis, we focus on functional enrichment analysis of noncoding RNAs. While some ncRNAs have been found to play critical regulatory roles in biological processes, most remain functionally uncharacterized. 
This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. We develop a method that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out by using the functional annotations of the coding genes located proximally to the input ncRNAs. To demonstrate how this method could be used to gain insight into the functional importance of a list of interesting ncRNAs, we tackle different biological questions on datasets of cancer and psychiatric disorders. In particular, we also analyze 28 different types of cancers in terms of the molecular processes perturbed and linked to altered lncRNA expression. We hope that the methods developed herein will help elucidate functional roles of ncRNAs and aid the development of therapies based on ncRNAs.
Open Access
Hybrid fog-cloud based data distribution for internet of things applications
(Bilkent University, 2019-09) Karataş, Fırat
Technological advancements keep making machines, devices, and appliances faster, more capable, and more connected to each other. The network of all interconnected smart devices is called the Internet of Things (IoT). It is envisioned that there will be billions of interconnected IoT devices producing and consuming petabytes of data that may be needed by multiple IoT applications. This makes it challenging to store and process such a large amount of data in an efficient and effective way. Cloud computing and its extension to the network edge, fog computing, emerge as new technology alternatives to tackle some of these challenges in transporting, storing, and processing petabytes of IoT data. In this thesis, we propose a geographically distributed hierarchical cloud and fog computing based IoT storage and processing architecture, and propose techniques for placing IoT data into its components, i.e., cloud and fog data centers. Data is considered in different types and each type of data may be needed by multiple applications. Considering this fact, we generate feasible and realistic network models for a large-scale distributed storage architecture, and propose algorithms for efficient and effective placement of data generated and consumed by a large number of geographically distributed IoT nodes. Data used by multiple applications is stored only once in a location that is easily accessed by applications needing that type of data. We performed extensive simulation experiments to evaluate our proposal. The results show that our network architecture and placement techniques can be used to store IoT data efficiently while providing reduced latency for IoT applications without increasing the network bandwidth consumed.
Open Access
Partitioning models for scaling distributed graph computations
(Bilkent University, 2019-08) Demirci, Gündüz Vehbi
The focus of this thesis is intelligent partitioning models and methods for scaling the performance of parallel graph computations on distributed-memory systems. Distributed databases utilize graph partitioning to provide servers with data-locality and workload-balance. Some queries performed on a database may form cascades due to the queries triggering each other. The current partitioning methods consider the graph structure and logs of query workload. We introduce the cascade-aware graph partitioning problem with the objective of minimizing the overall cost of communication operations between servers during cascade processes. We propose a randomized algorithm that integrates the graph structure and cascade processes to use as input for large-scale partitioning. Experiments on graphs representing real social networks demonstrate the effectiveness of the proposed solution in terms of the partitioning objectives. Sparse-general-matrix-multiplication (SpGEMM) is a key computational kernel used in scientific computing and high-performance graph computations. We propose an SpGEMM algorithm for the Accumulo database which enables high performance distributed parallelism through its iterator framework. The proposed algorithm provides write-locality and avoids scanning input matrices multiple times by utilizing Accumulo's batch scanning capability and node-level parallelism structures. We also propose a matrix partitioning scheme that reduces the total communication volume and provides a workload-balance among servers. Extensive experiments performed on both real-world and synthetic sparse matrices show that the proposed algorithm and matrix partitioning scheme provide significant performance improvements. The scalability of parallel SpGEMM algorithms is heavily communication bound. Multidimensional partitioning of SpGEMM's workload is essential to achieve higher scalability. 
We propose hypergraph models that utilize the arrangement of processors and also attain a multidimensional partitioning on SpGEMM's workload. Thorough experimentation performed on both realistic as well as synthetically generated SpGEMM instances demonstrates the effectiveness of the proposed partitioning models.
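To make the two partitioning goals mentioned above concrete (low communication volume and balanced workload), here is a minimal sketch that scores a given partition of a hypergraph. It assumes the connectivity-minus-one cutsize, a common objective in hypergraph partitioning; the abstract does not say which metric the thesis optimizes, and the function names and toy data are hypothetical.

```python
def connectivity_cutsize(nets, part_of):
    """Sum over nets of (number of distinct parts the net touches - 1)."""
    return sum(len({part_of[v] for v in net}) - 1 for net in nets if net)


def imbalance(part_of, num_parts, weights=None):
    """Relative overload of the heaviest part compared with the average load."""
    weights = weights or {}
    loads = [0] * num_parts
    for vertex, part in part_of.items():
        loads[part] += weights.get(vertex, 1)  # unit weight unless specified
    average = sum(loads) / num_parts
    return max(loads) / average - 1


# Made-up example: four vertices, three nets, two parts.
nets = [["a", "b"], ["b", "c", "d"], ["a", "d"]]
part_of = {"a": 0, "b": 0, "c": 1, "d": 1}
cut = connectivity_cutsize(nets, part_of)
skew = imbalance(part_of, num_parts=2)
```

In this toy case the first net lies entirely in part 0 and contributes nothing, while the other two nets each span both parts and contribute one unit, so a partitioner would look for an assignment that lowers this sum without making one part much heavier than the others.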
Open Access
Deep feature representations and multi-instance multi-label learning of whole slide breast histopathology images
(Bilkent University, 2019-03) Mercan, Caner
The examination of a tissue sample has traditionally involved a pathologist investigating the case under a microscope. Whole slide imaging technology has recently been utilized for the digitization of biopsy slides, replicating the microscopic examination procedure with the computer screen. This technology made it possible to scan the slides at very high resolutions, reaching up to 100,000 × 100,000 pixels. The advancements in the imaging technology have allowed the development of automated tools that could help reduce the workload of pathologists during the diagnostic process by performing analysis on the whole slide histopathology images. One of the challenges of whole slide image analysis is the ambiguity of the correspondence between the diagnostically relevant regions in a slide and the slide-level diagnostic labels in the pathology forms provided by the pathologists. Another challenge is the lack of feature representation methods for the variable number of variable-sized regions of interest (ROIs) in breast histopathology images, as the state-of-the-art deep convolutional networks can only operate on fixed-sized small patches, which may cause structural and contextual information loss. The last and arguably the most important challenge involves the clinical significance of breast histopathology, for the misdiagnosis or the missed diagnoses of a case may lead to unnecessary surgery, radiation or hormonal therapy. We address these challenges with the following contributions. The first contribution introduces the formulation of the whole slide breast histopathology image analysis problem as a multi-instance multi-label learning (MIMLL) task where a slide corresponds to a bag that is associated with the slide-level diagnoses provided by the pathologists, and the ROIs inside the slide correspond to the instances in the bag. 
The second contribution involves a novel feature representation method for the variable number of variable-sized ROIs using the activations of deep convolutional networks. Our final contribution includes a more advanced MIMLL formulation that can simultaneously perform multi-class slide-level classification and ROI-level inference. Through quantitative and qualitative experiments, we show that the proposed MIMLL methods are capable of learning from only slide-level information for the multi-class classification of whole slide breast histopathology images and the novel deep feature representations outperform the traditional features in fully supervised and weakly supervised settings.
Open Access
Reducing processor-memory performance gap and improving network-on-chip throughput
(Bilkent University, 2019-02) Mustafa, Naveed U. l.
Performance of computing systems has tremendously improved over the last few decades, primarily due to decreasing transistor size and increasing clock rate. Billions of transistors placed on a single chip and switching at a high clock rate result in overheating of the chip. The demand for performance improvement without increasing the heat dissipation led to the inception of multi/many core design where multiple cores and/or memories communicate through a network on chip (NoC). Unfortunately, performance of memory devices has not improved at the same rate as that of processors and hence has become a performance bottleneck. On the other hand, the varying traffic pattern in real applications limits the network throughput delivered by a routing algorithm. In this thesis, we address the issue of reducing the processor-memory performance gap in two ways: First, by integrating improved and newly developed memory technologies in the memory hierarchy of a computing system. Second, by equipping the execution platform with necessary architectural features and enabling its compiler to parallelize memory access instructions. We also address the issue of improving network throughput by proposing a selection scheme that switches the routing algorithm of an NoC with the changing traffic pattern of an application. We present integration of emerging non-volatile memory (NVM) devices in the memory hierarchy of a computing system in the context of database management systems (DBMS). To this end, we propose modifications in the storage engine (SE) of a DBMS aiming at fast access to data through bypassing the slow disk interfaces while maintaining all the functionalities of a robust DBMS. As a case study, we modify the SE of PostgreSQL and detail the necessary changes and challenges such modifications entail. We evaluate our proposal using a comprehensive emulation platform. 
Results indicate that our modified SE reduces query execution time by up to 45% and 13% when compared to disk and NVM storage, with average reductions of 19% and 4%, respectively. Detailed analysis of these results shows that our modified SE suffers from a data readiness problem. To solve this, we develop a general purpose library that employs helper threads to prefetch data from NVM hardware via a simple application program interface (API). Our library further improves query execution time for our modified SE when compared to disk and NVM storage by up to 54% and 17%, with average reductions of 23% and 8%, respectively. As a second way to reduce the processor-memory performance gap, we propose a compiler optimization aiming at reduction of memory-bound stalls. The proposed optimization generates an efficient instruction schedule through classification of memory references and consists of two steps: affinity analysis and affinity-aware instruction scheduling. We suggest two different approaches for affinity analysis, i.e., source code annotation and automated analysis. Our experimental results show that applying the annotation-based approach to a memory-intensive program reduces stall cycles by 67.44%, leading to a 25.61% improvement in execution time. We also evaluate the automated-analysis approach using eleven different image processing benchmarks. Experimental results show that automated analysis reduces stall cycles, on average, by 69.83%. As all benchmarks are both compute- and memory-intensive, we achieve improvement in execution time by up to 30%, with a modest average of 5.79%. In order to improve network throughput, we propose a selection scheme that switches the routing algorithm with the changing traffic pattern. We use two selection strategies: static and dynamic selection. While static selection is made off-line, the dynamic approach uses run-time information on network congestion to select the routing algorithm. 
Experimental results show that our proposal improves throughput for real applications by up to 37.49%. The key conclusion of this thesis is that improving the performance of a computing system needs a multifaceted approach, i.e., improving the performance of the memory and communication subsystems at the same time. The reduction in the performance gap between processors and memories requires not only integration of improved memory technologies in the system but also software/compiler support. We also conclude that switching the routing algorithm with the changing traffic pattern of an application leads to improvement of NoC throughput.
Open Access
Algorithms for structural variation discovery using multiple sequence signatures
(Bilkent University, 2018-09) Söylev, Arda
Genomic variations including single nucleotide polymorphisms (SNPs), small INDELs and structural variations (SVs) are known to have significant phenotypic effects on individuals. Among them, SVs, which alter more than 50 nucleotides of DNA, are the major source of complex genetic diseases such as Crohn's, schizophrenia and autism. Additionally, the total number of nucleotides affected by SVs is substantially higher than that affected by SNPs (3.5 Mbp SNP, 15-20 Mbp SV). Today, we are able to perform whole genome sequencing (WGS) by utilizing high throughput sequencing (HTS) technology to discover these modifications unimaginably faster, cheaper and more accurately than before. However, as demonstrated in the 1000 Genomes Project, HTS technology still has significant limitations. The major problem is that the short read lengths (<250 bp) produced by the current sequencing platforms, combined with the fact that most genomes include large amounts of repeats, make it very challenging to unambiguously map and accurately characterize genomic variants. Thus, most of the existing SV discovery tools focus on detecting relatively simple types of SVs such as insertions, deletions, and short inversions. In fact, other types of SVs including the complex ones are of crucial importance and several have been associated with genomic disorders. To better understand the contribution of these SVs to the human genome, we need new approaches to accurately discover and genotype such variants. Therefore, there is still a need for accurate algorithms to fully characterize a broader spectrum of SVs and thus improve the calling accuracy of simpler variants. Here we introduce TARDIS that harbors novel algorithms to accurately characterize various types of SVs including deletions, novel sequence insertions, inversions, transposon insertions, nuclear mitochondria insertions, tandem duplications and interspersed segmental duplications in direct or inverted orientations using short read whole genome sequencing datasets. 
Within our framework, we make use of multiple sequence signatures, including read pair, read depth and split read, to increase our SV prediction accuracy. Additionally, we are able to analyze more than one possible mapping location of each read to overcome the problems associated with the repetitive nature of genomes. Recently, due to the limitations of short-read sequencing technology, newer library preparation techniques emerged and 10x Genomics is one of these initiatives. This technique is regarded as a cost-effective alternative to long read sequencing, which can obtain long range contiguity information. We extended TARDIS to be able to utilize Linked-Read information of 10x Genomics to overcome some of the constraints of short-read sequencing technology. We evaluated the prediction performance of our algorithms through several experiments using both simulated and real data sets. In the simulation experiments, TARDIS achieved 97.67% sensitivity with only 1.12% false discovery rate. For experiments that involve real data, we used two haploid genomes (CHM1 and CHM13) and one human genome (NA12878) from the Illumina Platinum Genomes set. Comparison of our results with orthogonal PacBio call sets from the same genomes revealed higher accuracy for TARDIS than state-of-the-art methods. Furthermore, we showed a surprisingly low false discovery rate of our approach for the prediction of tandem, direct and inverted interspersed segmental duplications on CHM1 (less than 5% for the top 50 predictions). The algorithms we describe here are the first to predict the insertion location and the various types of new segmental duplications using HTS data.
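For readers unfamiliar with the two evaluation measures quoted above, the following minimal sketch shows the standard definitions of sensitivity and false discovery rate. The counts in the example are made up for illustration and are not results from the thesis; how a predicted SV is matched to a true one (for example by breakpoint tolerance or reciprocal overlap) is left out, since the abstract does not specify it.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real variants that were found: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no real variants to evaluate against")
    return true_positives / total


def false_discovery_rate(true_positives, false_positives):
    """Fraction of predictions that are wrong: FP / (TP + FP)."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no predictions were made")
    return false_positives / total


# Hypothetical counts: 90 correct calls, 10 missed variants, 10 spurious calls.
example_sensitivity = sensitivity(true_positives=90, false_negatives=10)
example_fdr = false_discovery_rate(true_positives=90, false_positives=10)
```

A high sensitivity together with a low false discovery rate means that most real variants are recovered while few of the reported calls are spurious.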