Department of Computer Engineering
Browsing Department of Computer Engineering by Title
Item Open Access
1.5D parallel sparse matrix-vector multiply (Society for Industrial and Applied Mathematics, 2018)
Kayaaslan, E.; Aykanat, Cevdet; Uçar, B.
There are three common parallel sparse matrix-vector multiply algorithms: 1D row-parallel, 1D column-parallel, and 2D row-column-parallel. The 1D parallel algorithms offer the advantage of having only one communication phase. On the other hand, the 2D parallel algorithm is more scalable, but it suffers from two communication phases. Here, we introduce a novel concept of heterogeneous messages where a heterogeneous message may contain both input-vector entries and partially computed output-vector entries. This concept not only leads to a decreased number of messages but also enables fusing the input- and output-communication phases into a single phase. These findings are exploited to propose a 1.5D parallel sparse matrix-vector multiply algorithm which is called local row-column-parallel. This proposed algorithm requires a constrained fine-grain partitioning in which each fine-grain task is assigned to the processor that contains either its input-vector entry, its output-vector entry, or both. We propose two methods to carry out the constrained fine-grain partitioning. We conduct our experiments on a large set of test matrices to evaluate the partitioning qualities and partitioning times of these proposed 1.5D methods.

Item Open Access
2010 IAPR workshop on pattern recognition in remote sensing, PRRS 2010: preface (2010)
Aksoy, S.; Younan, N. H.; Forstner, W.

Item Open Access
3D Hair sketching for real-time dynamic & key frame animations (Springer, 2008-07)
Aras, R.; Başarankut, B.; Çapın, T.; Özgüç, B.
Physically based simulation of human hair is a well studied and well known problem. But the "pure" physically based representation of hair (and other animation elements) is not the only concern of the animators, who want to "control" the creation and animation phases of the content.
This paper describes a sketch-based tool, with which a user can both create hair models with different styling parameters and produce animations of these created hair models using physically and key frame-based techniques. The model creation and animation production tasks are all performed with direct manipulation techniques in real-time. © 2008 Springer-Verlag.

Item Open Access
3D human pose search using oriented cylinders (IEEE, 2009-09-10)
Pehlivan, Selen; Duygulu, Pınar
In this study, we present a representation based on a new 3D search technique for volumetric human poses which is then used to recognize actions in three dimensional video sequences. We generate a set of cylinder like 3D kernels in various sizes and orientations. These kernels are searched over 3D volumes to find high response regions. The distribution of these responses are then used to represent a 3D pose. We use the proposed representation for (i) pose retrieval using Nearest Neighbor (NN) based classification and Support Vector Machine (SVM) based classification methods, and for (ii) action recognition on a set of actions using Dynamic Time Warping (DTW) and Hidden Markov Model (HMM) based classification methods. Evaluations on IXMAS dataset supports the effectiveness of such a robust pose representation. © 2009 IEEE.

Item Open Access
3D model compression using connectivity-guided adaptive wavelet transform built into 2D SPIHT (Academic Press, 2010-01)
Köse, K.; Çetin, A. Enis; Güdükbay, Uğur; Onural, L.
Connectivity-Guided Adaptive Wavelet Transform based mesh compression framework is proposed. The transformation uses the connectivity information of the 3D model to exploit the inter-pixel correlations. Orthographic projection is used for converting the 3D mesh into a 2D image-like representation. The proposed conversion method does not change the connectivity among the vertices of the 3D model. There is a correlation between the pixels of the composed image due to the connectivity of the 3D mesh.
The proposed wavelet transform uses an adaptive predictor that exploits the connectivity information of the 3D model. Known image compression tools cannot take advantage of the correlations between the samples. The wavelet transformed data is then encoded using a zero-tree wavelet based method. Since the encoder creates a hierarchical bitstream, the proposed technique is a progressive mesh compression technique. Experimental results show that the proposed method has a better rate distortion performance than MPEG-3DGC/MPEG-4 mesh coder.

Item Open Access
A utilization based genetic algorithm for virtual machine placement in cloud systems (2024-01-15)
Çavdar, Mustafa Can; Körpeoğlu, İbrahim; Ulusoy, Özgür
Due to the increasing demand for cloud computing and related services, cloud providers need to come up with methods and mechanisms that increase the performance, availability and reliability of data centers and cloud systems. Server virtualization is a key component to achieve this, which enables sharing of resources of a single physical machine among multiple virtual machines in a totally isolated manner. Optimizing virtualization has a very significant effect on the overall performance of a cloud computing system. This requires efficient and effective placement of virtual machines into physical machines. Since this is an optimization problem that involves multiple constraints and objectives, we propose a method based on genetic algorithms to place virtual machines into physical servers of a data center. By considering the utilization of machines and node distances, our method, called Utilization Based Genetic Algorithm (UBGA), aims at reducing resource waste, network load, and energy consumption at the same time. We compared our method against several other placement methods in terms of utilization achieved, networking bandwidth consumed, and energy costs incurred, using an open-source, publicly available CloudSim simulator.
The results show that our method provides better performance compared to other placement approaches.

Item Open Access
Abstract 207: the cBioPortal for cancer genomics (American Association for Cancer Research (AACR), 2021)
Gao, J.; Mazor, T.; de Bruijn, I.; Abeshouse, A.; Baiceanu, D.; Erkoç, Z.; Gross, B.; Higgins, D.; Jagannathan, P. K.; Kalletla, K.; Kumari, Priti; Kundra, R.; Li, X.; Lindsay, J.; Lisman, A.; Lukasse, P.; Madala, D.; Madupuri, R.; Ochoa, A.; Plantalech, O.; Quach, J.; Rodenburg, S.; Satravada, A.; Schaeffer, F.; Sheridan, R.; Sikina, L.; Sümer, S. O.; Sun, Y.; van Dijk, P.; van Nierop, P.; Wang, A.; Wilson, M.; Zhang, H.; Zhao, G.; van Hagen, S.; van Bochove, K.; Doğrusöz, Uğur; Heath, A.; Resnick, A.; Pugh, T. J.; Sander, C.; Cerami, E.; Schultz, N.
The cBioPortal for Cancer Genomics is an open-source software platform that enables interactive, exploratory analysis of large-scale cancer genomics data sets with a user-friendly interface. It integrates genomic and clinical data, and provides a suite of visualization and analysis options, including OncoPrint, mutation diagram, variant interpretation, survival analysis, expression correlation analysis, alteration enrichment analysis, cohort and patient-level visualization, among others. The public site (https://www.cbioportal.org) hosts data from almost 300 studies spanning individual labs and large consortia. Data is also available in the cBioPortal Datahub (https://github.com/cBioPortal/datahub/). In 2020 we added data from 21 studies, totaling almost 30,000 samples. In addition, we added data to existing TCGA PanCancer Atlas studies, including MSI status, mRNA-seq z-scores relative to normal tissue, microbiome data, and RPPA-based protein expression.
The cBioPortal also supports AACR Project GENIE with a dedicated instance hosting the GENIE cohort of 112,000 clinically sequenced samples from 19 institutions worldwide (https://genie.cbioportal.org). The site is accessed by over 30,000 unique visitors per month. To support these users, we hosted a five-part instructional webinar series. Recordings of these webinars are available on our website and have already been viewed thousands of times. In addition, more than 50 instances are installed at academic institutions and pharmaceutical/biotechnology companies. In support of these local instances, we continue to simplify the installation process: we now provide a docker compose solution which includes all microservices to run the web app as well as data validation, import and migration. We continue to enhance and expand the functionality of cBioPortal. This year we significantly enhanced the group comparison feature; it is now integrated into gene-specific queries and supports comparison of more data types including DNA methylation, microbiome, and any outcome measure. We also expanded support of longitudinal data: the existing patient timeline has been refactored and now supports a wider range of data and visualizations; a new “Genomic Evolution” tab highlights changes in mutation allele frequencies across multiple samples from a patient; and samples can now be selected based on pre- or post-treatment status. Other features released this year include: allowing users to add gene-level plots for continuous molecular profiles in study view, enabling users to select the desired transcript on the Mutations tab, and integration of PathwayMapper. The cBioPortal is fully open source (https://github.com/cBioPortal/) under a GNU Affero GPL license.
Development is a collaborative effort among groups at Memorial Sloan Kettering Cancer Center, Dana-Farber Cancer Institute, Children's Hospital of Philadelphia, Princess Margaret Cancer Centre, Bilkent University and The Hyve.

Item Open Access
Accelerating read mapping with FastHASH (BioMed Central Ltd., 2013)
Xin, H.; Lee, D.; Hormozdiari, F.; Yedkar, S.; Mutlu, O.; Alkan, C.
With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, the current read mapping algorithms have difficulties in coping with the massive amounts of data generated by NGS. We propose a new algorithm, FastHASH, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering, and Cheap K-mer Selection. We implemented FastHASH and merged it into the codebase of the popular read mapping program, mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness. © 2013 Xin et al.

Item Open Access
Accelerating the HyperLogLog cardinality estimation algorithm (Hindawi Limited, 2017)
Bozkus, C.; Fraguela, B. B.
In recent years, vast amounts of data of different kinds, from pictures and videos from our cameras to software logs from sensor networks and Internet routers operating day and night, are being generated.
This has led to new big data problems, which require new algorithms to handle these large volumes of data and as a result are very computationally demanding because of the volumes to process. In this paper, we parallelize one of these new algorithms, namely, the HyperLogLog algorithm, which estimates the number of different items in a large data set with minimal memory usage, as it lowers the typical memory usage of this type of calculation from O(n) to O(1). We have implemented parallelizations based on OpenMP and OpenCL and evaluated them in a standard multicore system, an Intel Xeon Phi, and two GPUs from different vendors. The results obtained in our experiments, in which we reach a speedup of 88.6 with respect to an optimized sequential implementation, are very positive, particularly taking into account the need to run this kind of algorithm on large amounts of data. © 2017 Cem Bozkus and Basilio B. Fraguela.

Item Open Access
Access pattern-based code compression for memory-constrained systems (Association for Computing Machinery, 2008-09)
Ozturk, O.; Kandemir, M.; Chen, G.
As compared to a large spectrum of performance optimizations, relatively less effort has been dedicated to optimize other aspects of embedded applications such as memory space requirements, power, real-time predictability, and reliability. In particular, many modern embedded systems operate under tight memory space constraints. One way of addressing this constraint is to compress executable code and data as much as possible. While researchers on code compression have studied efficient hardware and software based code compression strategies, many of these techniques do not take application behavior into account; that is, the same compression/decompression strategy is used irrespective of the application being optimized. This article presents an application-sensitive code compression strategy based on control flow graph (CFG) representation of the embedded program.
The idea is to start with a memory image wherein all basic blocks of the application are compressed, and decompress only the blocks that are predicted to be needed in the near future. When the current access to a basic block is over, our approach also decides the point at which the block could be compressed. We propose and evaluate several compression and decompression strategies that try to reduce memory requirements without excessively increasing the original instruction cycle counts. Some of our strategies make use of profile data, whereas others are fully automatic. Our experimental evaluation using seven applications from the MediaBench suite and three large embedded applications reveals that the proposed code compression strategy is very successful in practice. Our results also indicate that working at a basic block granularity, as opposed to a procedure granularity, is important for maximizing memory space savings. © 2008 ACM.

Item Open Access
ACMICS: an agent communication model for interacting crowd simulation (Springer, 2017)
Kullu, K.; Güdükbay, Uğur; Manocha, D.
Behavioral plausibility is one of the major aims of crowd simulation research. We present a novel approach that simulates communication between the agents and assess its influence on overall crowd behavior. Our formulation uses a communication model that tends to simulate human-like communication capability. The underlying formulation is based on a message structure that corresponds to a simplified version of Foundation for Intelligent Physical Agents Agent Communication Language Message Structure Specification. Our algorithm distinguishes between low- and high-level communication tasks so that ACMICS can be easily extended and employed in new simulation scenarios. We highlight the performance of our communication model on different crowd simulation scenarios. We also extend our approach to model evacuation behavior in unknown environments.
Overall, our communication model has a small runtime overhead and can be used for interactive simulation with tens or hundreds of agents. © 2017, The Author(s).

Item Open Access
ACMICS: An agent communication model for interacting crowd simulation: JAAMAS track (International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), 2018)
Kullu, K.; Güdükbay, Uğur; Manocha, D.
We present and evaluate a novel approach to simulate communication between the agents. Our approach distinguishes low- and high-level communication tasks. This separation makes it easy to extend and use it in new scenarios. We highlight the benefits of our approach using different simulation scenarios consisting of hundreds of agents. We also model evacuation behavior in unknown environments and highlight the benefits of our approach particularly in simulating such behavior.

Item Open Access
Adaptive compute-phase prediction and thread prioritization to mitigate memory access latency (ACM, 2014-06)
Aktürk, İsmail; Öztürk, Özcan
The full potential of chip multiprocessors remains unexploited due to the thread oblivious memory access schedulers used in off-chip main memory controllers. This is especially pronounced in embedded systems due to limitations in memory. We propose an adaptive compute-phase prediction and thread prioritization algorithm for memory access scheduling for embedded chip multiprocessors. The proposed algorithm efficiently categorize threads based on execution characteristics and provides fine-grained prioritization that allows to differentiate threads and prioritize their memory access requests accordingly. The threads in compute phase are prioritized among the threads in memory phase. Furthermore, the threads in compute phase are prioritized among themselves based on the potential of making more progress in their execution.
Compared to the prior works First-Ready First-Come First-Serve (FR-FCFS) and Compute-phase Prediction with Writeback-Refresh Overlap (CP-WO), the proposed algorithm reduces the execution time of the generated workloads up to 23.6% and 12.9%, respectively. Copyright 2014 ACM.

Item Open Access
Adaptive control of a spring-mass hopper (IEEE, 2011)
Uyanık, İsmail; Saranlı, Uluç; Morgül, Ömer
Practical realization of model-based dynamic legged behaviors is substantially more challenging than statically stable behaviors due to their heavy dependence on second-order system dynamics. This problem is further aggravated by the difficulty of accurately measuring or estimating dynamic parameters such as spring and damping constants for associated models and the fact that such parameters are prone to change in time due to heavy use and associated material fatigue. In this paper, we present an on-line, model-based adaptive control method for running with a planar spring-mass hopper based on a once-per-step parameter correction scheme. Our method can be used both as a system identification tool to determine possibly time-varying spring and damping constants of a miscalibrated system, or as an adaptive controller that can eliminate steady-state tracking errors through appropriate adjustments on dynamic system parameters. We present systematic simulation studies to show that our method can successfully accomplish both of these tasks. © 2011 IEEE.

Item Open Access
Adaptive decomposition and remapping algorithms for object-space-parallel direct volume rendering of unstructured grids (Academic Press, 2007-01)
Aykanat, Cevdet; Cambazoglu, B. B.; Findik, F.; Kurc, T.
Object space (OS) parallelization of an efficient direct volume rendering algorithm for unstructured grids on distributed-memory architectures is investigated.
The adaptive OS decomposition problem is modeled as a graph partitioning (GP) problem using an efficient and highly accurate estimation scheme for view-dependent node and edge weighting. In the proposed model, minimizing the cutsize corresponds to minimizing the parallelization overhead due to the data communication and redundant computation/storage while maintaining the GP balance constraint corresponds to maintaining the computational load balance in parallel rendering. A GP-based, view-independent cell clustering scheme is introduced to induce more tractable view-dependent computational graphs for successive visualizations. As another contribution, a graph-theoretical remapping model is proposed as a solution to the general remapping problem and is used in minimization of the cell-data migration overhead. The remapping tool RM-MeTiS is developed by modifying the GP tool MeTiS and is used in partitioning the remapping graphs. Experiments are conducted using benchmark datasets on a 28-node PC cluster to evaluate the performance of the proposed models. © 2006 Elsevier Inc. All rights reserved.

Item Open Access
Adaptive routing framework for network on chip architectures (ACM, 2016-01)
Mustafa, Naveed Ul; Öztürk, Özcan; Niar, S.
In this paper we suggest and demonstrate the idea of applying multiple routing algorithms during the execution of a real application mapped on a Network-on-Chip (NoC). Traffic pattern of a real application may change during its execution. As performance of an algorithm depends on the traffic pattern, using the same routing algorithm for the entire span of execution may be inefficient. We study the feasibility of this idea for applications such as SPARSE and MPEG-4 decoder, by applying different routing algorithms. By applying more than one routing algorithms, throughput improves up to 17.37% and 6.74% in the case of SPARSE and MPEG-4 decoder applications, respectively, as compared to the application of single routing algorithm.
© 2016 ACM.

Item Open Access
Adaptive time-to-live strategies for query result caching in web search engines (2012)
Alıcı, Sadiye; Altıngövde, I. Ş.; Rıfat, Özcan; Cambazoğlu, B. Barla; Ulusoy, Özgür
An important research problem that has recently started to receive attention is the freshness issue in search engine result caches. In the current techniques in literature, the cached search result pages are associated with a fixed time-to-live (TTL) value in order to bound the staleness of search results presented to the users, potentially as part of a more complex cache refresh or invalidation mechanism. In this paper, we propose techniques where the TTL values are set in an adaptive manner, on a per-query basis. Our results show that the proposed techniques reduce the fraction of stale results served by the cache and also decrease the fraction of redundant query evaluations on the search engine backend compared to a strategy using a fixed TTL value for all queries. © 2012 Springer-Verlag Berlin Heidelberg.

Item Open Access
Adopting integrated application lifecycle management within a large-scale software company: an action research approach (Elsevier, 2018)
Tüzün, Eray; Tekinerdogan, B.; Macit, Y.; Ince, K.
Context: Application Lifecycle Management (ALM) is a paradigm for integrating and managing the various activities related to the governance, development and maintenance of software products. In the last decade, several ALM tools have been proposed to support this process, and an increasing number of companies have started to adopt ALM. Objective: We aim to investigate the impact of adopting ALM in a real industrial context to understand and justify both the benefits and obstacles of applying integrated ALM. Method: As a research methodology, we apply action research that we have carried out within HAVELSAN, a large-scale IT company.
The research was carried out over a period of seven years starting in 2010 when the ALM initiative has been started in the company to increase productivity and decrease maintenance costs. Results: The paper presents the results of the action research that includes the application of ALM practices. The transitions among the different steps are discussed in detail, together with the identified obstacles, benefits and lessons learned. Conclusions: Our seven-year study shows that the adoption of ALM processes is not trivial and its success is related to many factors. An important conclusion is that a piecemeal solution as provided by ALM 1.0 is not feasible for the complex process and tool integration problems of large enterprises. Hence the transition to ALM 2.0 was found necessary to cope with the organizational and business needs. Although ALM 2.0 appeared to be a more mature ALM approach, there are still obstacles that need attention from both researchers and practitioners.

Item Open Access
Advanced CAD environments: a knowledge-based foundation (Elsevier Science Publishers Ltd, 1993-03)
Akman, V.
Current computer-aided design (CAD) systems are extremely powerful tools for the development of products in an industrial setting. However, they still leave a lot to be desired when it comes to fulfilling a designer's demands and dreams. We believe that knowledge engineering, or more broadly artificial intelligence (AI), is a promising candidate for providing advanced design environments. This paper proposes assorted techniques from AI (such as logical knowledge representation, naive physics, and commonsense reasoning) as effective means of obtaining such environments. In order to make adept, temporal comments, an architecture machine must have a certain basic understanding of qualities.
Though at first primitive, this qualitative appreciation itself would evolve within a value system that is very personal, between a man and a machine.

Item Open Access
The advanced video information system: data structures and query processing (Springer, 1996)
Adalı, S.; Candan, K. S.; Chen, Su-Shing; Erol, K.; Subrahmanian, V. S.
We describe how video data can be organized and structured so as to facilitate efficient querying. We develop a formal model for video data and show how spatial data structures, suitably modified, provide an elegant way of storing such data. We develop algorithms to process various kinds of video queries and show that, in most cases, the complexity of these algorithms is linear. A prototype system, called the Advanced Video Information System (AVIS), based on these concepts, has been designed at the University of Maryland.