Browsing by Subject "Database systems"

Open Access
2-D adaptive prediction based Gaussianity tests in microcalcification detection
(SPIE, 1998-01) Gürcan, M. Nafi; Yardımcı, Yasemin; Çetin, A. Enis
With increasing use of Picture Archiving and Communication Systems (PACS), Computer-aided Diagnosis (CAD) methods will be more widely utilized. In this paper, we develop a CAD method for the detection of microcalcification clusters in mammograms, which are an early sign of breast cancer. The method we propose makes use of two-dimensional (2-D) adaptive filtering and a Gaussianity test recently developed by Ojeda et al. for causal invertible time series. The first step of this test is adaptive linear prediction. It is assumed that the prediction error sequence has a Gaussian distribution as the mammogram images do not contain sharp edges. Since microcalcifications appear as isolated bright spots, the prediction error sequence contains large outliers around microcalcification locations. The second step of the algorithm is the computation of a test statistic from the prediction error values to determine whether the samples are from a Gaussian distribution. The Gaussianity test is applied over small, overlapping square regions. The regions, in which the Gaussianity test fails, are marked as suspicious regions. Experimental results obtained from a mammogram database are presented.
Open Access
Adaptive schemes for location update generation in execution location-dependent continuous queries
(Elsevier Inc., 2006-04) Lam, Kam-Yiu; Ulusoy, Özgür
Today's mobile computing systems are expected to process location-dependent continuous queries on moving objects. The result of a location-dependent query depends on the current location of the mobile client which has generated the query as well as the locations of the moving objects on which the query has been issued. When a location-dependent query is specified to be continuous, the result of the query can continuously change. In order to provide accurate and timely query results to a client, the location of the client as well as the locations of moving objects in the system have to be closely monitored. Most of the location generation methods proposed in the literature aim to optimize utilization of the limited wireless bandwidth. The issues of correctness and timeliness of query results reported to clients have been largely ignored. In this paper, we propose an adaptive monitoring method (AMM) and a deadline-driven method (DDM) for managing the locations of moving objects. The aim of our methods is to generate location updates with the consideration of maintaining the correctness of query evaluation results without increasing location update workload. Extensive simulation experiments have been conducted to investigate the performance of the proposed methods as compared to a well-known location update generation method, the plain dead-reckoning (pdr). © 2005 Elsevier Inc. All rights reserved.
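The baseline named in this abstract, plain dead-reckoning, can be illustrated with a minimal sketch. It is not the paper's AMM or DDM method; it assumes the commonly described rule that an object reports its position only when the position extrapolated from its last report drifts beyond a threshold. The function name, the `(t, x, y)` trace format, and the linear extrapolation are assumptions made for illustration.

```python
import math

def dead_reckoning_updates(trace, threshold):
    """Return the indices of samples at which a location update is sent.

    trace: list of (t, x, y) samples of the object's true position.
    threshold: maximum tolerated distance between the true position and
               the position extrapolated from the last reported state.
    (Hypothetical helper; names and data layout are illustrative.)
    """
    if not trace:
        return []
    updates = [0]                 # the first sample is always reported
    t0, x0, y0 = trace[0]
    vx = vy = 0.0                 # velocity is unknown at the first report
    for i in range(1, len(trace)):
        t, x, y = trace[i]
        # Position predicted from the last reported position and velocity.
        px = x0 + vx * (t - t0)
        py = y0 + vy * (t - t0)
        if math.hypot(x - px, y - py) > threshold:
            # Deviation too large: report the new position and velocity,
            # with velocity estimated from the previous sample.
            prev_t, prev_x, prev_y = trace[i - 1]
            dt = t - prev_t
            vx = (x - prev_x) / dt if dt > 0 else 0.0
            vy = (y - prev_y) / dt if dt > 0 else 0.0
            t0, x0, y0 = t, x, y
            updates.append(i)
    return updates
```

For an object moving at constant velocity, only the first two samples trigger updates; a turn triggers one more. A smaller threshold yields more accurate server-side positions at the cost of more update messages.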
Open Access
Algorithms for effective querying of compound graph-based pathway databases
(BioMed Central Ltd., 2009-11-16) Doğrusöz, Uğur; Çetintaş, Ahmet; Demir, Emek; Babur, Özgün
Background: Graph-based pathway ontologies and databases are widely used to represent data about cellular processes. This representation makes it possible to programmatically integrate cellular networks and to investigate them using the well-understood concepts of graph theory in order to predict their structural and dynamic properties. An extension of this graph representation, namely hierarchically structured or compound graphs, in which a member of a biological network may recursively contain a sub-network of a somehow logically similar group of biological objects, provides many additional benefits for analysis of biological pathways, including reduction of complexity by decomposition into distinct components or modules. In this regard, it is essential to effectively query such integrated large compound networks to extract the sub-networks of interest with the help of efficient algorithms and software tools. Results: Towards this goal, we developed a querying framework, along with a number of graph-theoretic algorithms from simple neighborhood queries to shortest paths to feedback loops, that is applicable to all sorts of graph-based pathway databases, from PPIs (protein-protein interactions) to metabolic and signaling pathways. The framework is unique in that it can account for compound or nested structures and ubiquitous entities present in the pathway data. In addition, the queries may be related to each other through "AND" and "OR" operators, and can be recursively organized into a tree, in which the result of one query might be a source and/or target for another, to form more complex queries. The algorithms were implemented within the querying component of a new version of the software tool PATIKAweb (Pathway Analysis Tool for Integration and Knowledge Acquisition) and have proven useful for answering a number of biologically significant questions for large graph-based pathway databases. Conclusion: The PATIKA Project Web site is http://www.patika.org. 
PATIKAweb version 2.1 is available at http://web.patika.org. © 2009 Dogrusoz et al; licensee BioMed Central Ltd.
Open Access
Analysis of concurrency control protocols for real-time database systems
(Elsevier, 1998) Ulusoy, Özgür
This paper provides an approximate analytic solution method for evaluating the performance of concurrency control protocols developed for real-time database systems (RTDBSs). Transactions processed in a RTDBS are associated with timing constraints typically in the form of deadlines. The primary consideration in developing a RTDBS concurrency control protocol is the fact that satisfaction of the timing constraints of transactions is as important as maintaining the consistency of the underlying database. The proposed solution method provides the evaluation of the performance of concurrency control protocols in terms of the satisfaction rate of timing constraints. As a case study, a RTDBS concurrency control protocol, called High Priority, is analyzed using the proposed method. The accuracy of the performance results obtained is ascertained via simulation. The solution method is also used to investigate the real-time performance benefits of the High Priority over the ordinary Two-Phase Locking.
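The abstract does not spell out the High Priority rule, so the following is only a sketch of how such a protocol is commonly described: on a lock conflict, a requester with higher priority aborts the lock holder, and otherwise the requester waits. Exclusive locks only, earliest-deadline priority, and all names (`request_lock`, `locks`, `deadline_of`) are assumptions made for illustration.

```python
def request_lock(locks, item, txn, deadline_of):
    """Resolve an exclusive-lock request under a High Priority style rule.

    locks: dict mapping item -> id of the transaction holding its lock.
    deadline_of: dict mapping transaction id -> deadline
                 (assumption: an earlier deadline means higher priority).
    Returns (decision, other), where decision is "granted" or "wait" and
    other is the aborted holder, the blocking holder, or None.
    """
    holder = locks.get(item)
    if holder is None or holder == txn:
        locks[item] = txn
        return ("granted", None)
    if deadline_of[txn] < deadline_of[holder]:
        # The requester is more urgent: abort the holder, releasing every
        # lock it holds, then grant the requested lock.
        for held in [k for k, v in locks.items() if v == holder]:
            del locks[held]
        locks[item] = txn
        return ("granted", holder)
    # The holder is at least as urgent: the requester blocks.
    return ("wait", holder)
```

Under ordinary Two-Phase Locking the second branch would not exist and every conflicting requester would wait, which is the behavior the paper compares against.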
Open Access
Automatic detection of salient objects and spatial relations in videos for a video database system
(Elsevier BV, 2008-10) Sevilmiş, T.; Baştan, M.; Güdükbay, Uğur; Ulusoy, Özgür
Multimedia databases have gained popularity due to rapidly growing quantities of multimedia data and the need to perform efficient indexing, retrieval and analysis of this data. One downside of multimedia databases is the necessity to process the data for feature extraction and labeling prior to storage and querying. The huge amount of data makes it impossible to complete this task manually. We propose a tool for the automatic detection and tracking of salient objects, and derivation of spatio-temporal relations between them in video. Our system aims to significantly reduce the work of manually selecting and labeling objects by detecting and tracking the salient objects, so that the label for each object needs to be entered only once within each shot instead of in every frame in which the object appears. This is also required as a first step in a fully-automatic video database management system in which the labeling should also be done automatically. The proposed framework covers a scalable architecture for video processing and stages of shot boundary detection, salient object detection and tracking, and knowledge-base construction for effective spatio-temporal object querying. © 2008 Elsevier B.V. All rights reserved.
Open Access
Automatic image captioning
(2004) Pan, J.-Y.; Yang, H.-J.; Duygulu, Pınar; Faloutsos, C.
In this paper, we examine the problem of automatic image captioning. Given a training set of captioned images, we want to discover correlations between image features and keywords, so that we can automatically find good keywords for a new image. We experiment thoroughly with multiple design alternatives on large datasets of various content styles, and our proposed methods achieve up to a 45% relative improvement on captioning accuracy over the state of the art.
Open Access
Automatic multimedia cross-modal correlation discovery
(ACM, 2004-08) Pan, J.-Y.; Yang, H.-J.; Faloutsos, C.; Duygulu, Pınar
Given an image (or video clip, or audio song), how do we automatically assign keywords to it? The general problem is to find correlations across the media in a collection of multimedia objects like video clips, with colors, and/or motion, and/or audio, and/or text scripts. We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. Our "MMG" method requires no tuning, no clustering, no user-determined constants; it can be applied to any multi-media collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. We report auto-captioning experiments on the "standard" Corel image database of 680 MB, where it outperforms domain specific, fine-tuned methods by up to 10 percentage points in captioning accuracy (50% relative improvement).
Open Access
Automatic performance evaluation of Web search engines
(Elsevier, 2004) Can, F.; Nuray, R.; Sevdik, A. B.
Measuring the information retrieval effectiveness of World Wide Web search engines is costly because of the human relevance judgments involved. However, both for business enterprises and people it is important to know the most effective Web search engines, since such search engines help their users find a higher number of relevant Web pages with less effort. Furthermore, this information can be used for several practical purposes. In this study we introduce an automatic Web search engine evaluation method as an efficient and effective assessment tool for such systems. The experiments based on eight Web search engines, 25 queries, and binary user relevance judgments show that our method provides results consistent with human-based evaluations. It is shown that the observed consistencies are statistically significant. This indicates that the new method can be successfully used in the evaluation of Web search engines. © 2003 Elsevier Ltd. All rights reserved.
Open Access
Automatic Ranking of Retrieval Systems in Imperfect Environments
(ACM, 2003-07-08) Nuray, Rabia; Can, Fazlı
The empirical investigation of the effectiveness of information retrieval (IR) systems requires a test collection, a set of query topics, and a set of relevance judgments made by human assessors for each query. Previous experiments show that differences in human relevance assessments do not affect the relative performance of retrieval systems. Based on this observation, we propose and evaluate a new approach to replace the human relevance judgments by an automatic method. Ranking of retrieval systems with our methodology correlates positively and significantly with that of human-based evaluations. In the experiments, we assume a Web-like imperfect environment: the indexing information for all documents is available for ranking, but some documents may not be available for retrieval. Such conditions can be due to document deletions or network problems. Our method of simulating imperfect environments can be used for Web search engine assessment and in estimating the effects of network conditions (e.g., network unreliability) on IR system performance.
Open Access
Bilkent University Multimedia Database Group at TRECVID 2008
(National Institute of Standards and Technology, 2008-11) Küçüktunç, Onur; Baştan, Muhammet; Güdükbay, Uğur; Ulusoy, Özgür
Bilkent University Multimedia Database Group (BILMDG) participated in two tasks at TRECVID 2008: content-based copy detection (CBCD) and high-level feature extraction (FE). Mostly MPEG-7 [1] visual features, which are also used as low-level features in our MPEG-7 compliant video database management system, are extracted for these tasks. This paper discusses our approaches in each task.
Open Access
BilVideo: Design and implementation of a video database management system
(Springer, 2005) Dönderler, M. E.; Şaykol, E.; Arslan, U.; Ulusoy, Özgür; Güdükbay, Uğur
With the advances in information technology, the amount of multimedia data captured, produced, and stored is increasing rapidly. As a consequence, multimedia content is widely used for many applications in today's world, and hence, a need for organizing this data, and accessing it from repositories with vast amount of information has been a driving stimulus both commercially and academically. In compliance with this inevitable trend, first image and especially later video database management systems have attracted a great deal of attention, since traditional database systems are designed to deal with alphanumeric information only, thereby not being suitable for multimedia data. In this paper, a prototype video database management system, which we call BilVideo, is introduced. The system architecture of BilVideo is original in that it provides full support for spatio-temporal queries that contain any combination of spatial, temporal, object-appearance, external-predicate, trajectory-projection, and similarity-based object-trajectory conditions by a rule-based system built on a knowledge-base, while utilizing an object-relational database to respond to semantic (keyword, event/activity, and category-based), color, shape, and texture queries. The parts of BilVideo (Fact-Extractor, Video-Annotator, its Web-based visual query interface, and its SQL-like textual query language) are presented, as well. Moreover, our query processing strategy is also briefly explained. © 2005 Springer Science + Business Media, Inc.
Open Access
The BioPAX community standard for pathway data sharing
(Nature Publishing Group, 2010-09) Demir, Emek; Cary, M. P.; Paley, S.; Fukuda, K.; Lemer, C.; Vastrik, I.; Wu, G.; D'Eustachio, P.; Schaefer, C.; Luciano, J.; Schacherer, F.; Martinez-Flores, I.; Hu, Z.; Jimenez-Jacinto, V.; Joshi-Tope, G.; Kandasamy, K.; Lopez-Fuentes, A. C.; Mi, H.; Pichler, E.; Rodchenkov, I.; Splendiani, A.; Tkachev, S.; Zucker, J.; Gopinath, G.; Rajasimha, H.; Ramakrishnan, R.; Shah, I.; Syed, M.; Anwar, N.; Babur, Özgün; Blinov, M.; Brauner, E.; Corwin, D.; Donaldson, S.; Gibbons, F.; Goldberg, R.; Hornbeck, P.; Luna, A.; Murray-Rust, P.; Neumann, E.; Reubenacker, O.; Samwald, M.; Iersel, Martijn van; Wimalaratne, S.; Allen, K.; Braun, B.; Whirl-Carrillo, M.; Cheung, Kei-Hoi; Dahlquist, K.; Finney, A.; Gillespie, M.; Glass, E.; Gong, L.; Haw, R.; Honig, M.; Hubaut, O.; Kane, D.; Krupa, S.; Kutmon, M.; Leonard, J.; Marks, D.; Merberg, D.; Petri, V.; Pico, A.; Ravenscroft, D.; Ren, L.; Shah, N.; Sunshine, M.; Tang, R.; Whaley, R.; Letovsky, S.; Buetow, K. H.; Rzhetsky, A.; Schachter, V.; Sobral, B. S.; Doğrusöz, Uğur; McWeeney, S.; Aladjem, M.; Birney, E.; Collado-Vides, J.; Goto, S.; Hucka, M.; Novère, Nicolas Le; Maltsev, N.; Pandey, A.; Thomas, P.; Wingender, E.; Karp, P. D.; Sander, C.; Bader, G. D.
Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery. © 2010 Nature America, Inc. All rights reserved.
Open Access
Coding of fingerprint images using binary subband decomposition and vector quantization
(SPIE, 1998-01) Gerek, Ömer N.; Çetin, A. Enis
In this paper, compression of binary digital fingerprint images is considered. High compression ratios for fingerprint images are essential for handling huge numbers of images in databases. In our method, the fingerprint image is first processed by a binary nonlinear subband decomposition filter bank and the resulting subimages are coded using vector quantizers designed for quantizing binary images. It is observed that the discriminating properties of the fingerprint images are preserved at very low bit rates. Simulation results are presented.
Open Access
Computer aided frequency planning for the radio and TV broadcasts
(Institute of Electrical and Electronics Engineers, 1996-06) Altıntaş, Ayhan; Ocalı, O.; Topçu, Satılmış; Tanyer, S. G.; Köymen, Hayrettin
The frequency planning of the VHF and UHF broadcasts in Turkey is described. This planning is done with the aid of computer databases and a digital terrain map. A frequency offset is applied wherever applicable to increase the channel capacity. The offset assignment is done through a simulated annealing algorithm. The international rules and regulations concerning Turkey are also considered.
Open Access
Constrained min-cut replication for K-way hypergraph partitioning
(Institute for Operations Research and the Management Sciences (INFORMS), 2014) Yazici, V.; Aykanat, Cevdet
Replication is a widely-used technique in information retrieval and database systems for providing fault tolerance and reducing parallelization and processing costs. Combinatorial models based on hypergraph partitioning are proposed for various problems arising in information retrieval and database systems. We consider the possibility of using vertex replication to improve the quality of hypergraph partitioning. In this study, we focus on the constrained min-cut replication (CMCR) problem, where we are initially given a maximum replication capacity and a K-way hypergraph partition with an initial imbalance ratio. The objective in the CMCR problem is finding the optimal vertex replication sets for each part of the given partition such that the initial cut size of the partition is minimized, where the initial imbalance is either preserved or reduced under the given replication capacity constraint. In this study, we present a complexity analysis of the CMCR problem and propose a model based on a unique blend of coarsening and integer linear programming (ILP) schemes. This coarsening algorithm is derived from a novel utilization of the Dulmage-Mendelsohn decomposition. Experiments show that the ILP formulation coupled with the Dulmage-Mendelsohn decomposition-based coarsening provides high quality results in practical execution times for reducing the cut size of a given K-way hypergraph partition. © 2014 INFORMS.
Open Access
A database model for querying visual surveillance videos by integrating semantic and low-level features
(Springer, Berlin, Heidelberg, 2005) Şaykol, Ediz; Güdükbay, Uğur; Ulusoy, Özgür
Automated visual surveillance has emerged as a trendy application domain in recent years. Many approaches have been developed on video processing and understanding. Content-based access to surveillance video has become a challenging research area. The results of a considerable amount of work dealing with automated access to visual surveillance have appeared in the literature. However, the event models and the content-based querying and retrieval components have significant gaps remaining unfilled. To narrow these gaps, we propose a database model for querying surveillance videos by integrating semantic and low-level features. In this paper, the initial design of the database model, the query types, and the specifications of its query language are presented. © Springer-Verlag Berlin Heidelberg 2005.
Open Access
Distributed block formation and layout for disk-based management of large-scale graphs
(Springer, 2017) Yaşar, A.; Gedik, B.; Ferhatosmanoğlu, H.
We are witnessing an enormous growth in social networks as well as in the volume of data generated by them. An important portion of this data is in the form of graphs. In recent years, several graph processing and management systems emerged to handle large-scale graphs. The primary goal of these systems is to run graph algorithms and queries in an efficient and scalable manner. Unlike relational data, graphs are semi-structured in nature. Thus, storing and accessing graph data using secondary storage requires new solutions that can provide locality of access for graph processing workloads. In this work, we propose a scalable block formation and layout technique for graphs, which aims at reducing the I/O cost of disk-based graph processing algorithms. To achieve this, we designed a scalable MapReduce-style method called ICBL, which can divide the graph into a series of disk blocks that contain sub-graphs with high locality. Furthermore, ICBL can order the resulting blocks on disk to further reduce non-local accesses. We experimentally evaluated ICBL to showcase its scalability, layout quality, as well as the effectiveness of automatic parameter tuning for ICBL. We deployed the graph layouts generated by ICBL on the Neo4j open source graph database management system (http://www.neo4j.org/, 2015). Our results show that the layout generated by ICBL reduces the query running times over Neo4j by more than 2× compared to the default layout. © 2017, Springer Science+Business Media New York.
Open Access
Effective use of space for pivot-based metric indexing structures
(IEEE, 2008-04) Çelik, Cengiz
Among the metric space indexing methods, AESA is known to produce the lowest query costs in terms of the number of distance computations. However, its quadratic construction cost and space consumption make it infeasible for large datasets. There has been some work on reducing the space requirements of AESA. Instead of keeping all the distances between objects, LAESA appoints a subset of the database as pivots, keeping only the distances between objects and pivots. Kvp uses the idea of prioritizing the pivots based on their distances to objects, only keeping pivot distances that it evaluates as promising. FQA discretizes the distances using a fixed number of bits per distance instead of using the system's floating point types. Varying the number of bits to produce a performance-space trade-off was also studied in Kvp. Recently, BAESA has been proposed based on the same idea, but using different distance ranges for each pivot. The t-spanner based indexing structure compacts the distance matrix by introducing an approximation factor that makes the pivots less effective. In this work, we show that the Kvp prioritization is oriented toward symmetric distance distributions. We offer a new method that evaluates the effectiveness of pivots in a better fashion by making use of the overall distance distribution. We also simulate the performance of our method combined with distance discretization. Our results show that our approach is able to offer very good space-performance trade-offs compared to AESA and tree-based methods. © 2008 IEEE.
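The pivot idea that LAESA-style structures rely on can be sketched as follows. This is a generic illustration of pivot-based filtering via the triangle inequality, not the method proposed in this paper; the function names, the full object-to-pivot table, and the range-query formulation are assumptions made for illustration.

```python
def build_pivot_table(objects, pivots, dist):
    """Precompute the distance from every object to every pivot."""
    return [[dist(o, p) for p in pivots] for o in objects]

def range_query(objects, pivots, table, dist, query, radius):
    """Return (matches, number of distance computations at query time).

    For any pivot p, the triangle inequality gives
    |d(q, p) - d(o, p)| <= d(q, o), so an object whose lower bound
    exceeds the radius is discarded without computing d(q, o).
    """
    q_to_pivot = [dist(query, p) for p in pivots]
    computed = len(pivots)
    matches = []
    for obj, row in zip(objects, table):
        lower = max((abs(dq - do) for dq, do in zip(q_to_pivot, row)),
                    default=0.0)
        if lower > radius:
            continue              # pruned by the pivots
        computed += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, computed
```

With one-dimensional points `[0, 1, 5, 9, 10]`, pivots `[0, 10]`, and a query at 4.5 with radius 1, the pivots prune four of the five objects, so only three distances are computed in total. The table costs one stored distance per object per pivot, which is the space that the methods surveyed in the abstract try to reduce.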
Open Access
Generating time-varying road network data using sparse trajectories
(IEEE, 2016-12) Eser, Elif; Kocayusufoğlu, F.; Eravci, Bahaedd; Ferhatosmanoglu, Hakan; Larriba-Pey, J. L.
While research on time-varying graphs has attracted recent attention, the research community has limited or no access to real datasets to develop effective algorithms and systems. Using noisy and sparse GPS traces from vehicles, we develop a time-varying road network dataset where edge weights differ over time. We present our methodology and share this dataset, along with a graph manipulation tool. We estimate the traffic conditions using the sparse GPS data available by characterizing the sparsity issues and assessing the properties of travel sequence data in the frequency domain. We develop interpolation methods to complete the sparse data into a complete graph dataset with realistic time-varying edge values. We evaluate the performance of time-varying and static shortest path solutions over the generated dynamic road network. The shortest paths using the dynamic graph produce very different results than the static version. We provide an independent Java API and a graph database to analyze and manipulate the generated time-varying graph data easily, not requiring any knowledge about the internals of the graph database system. We expect our solution to support researchers in pursuing problems of time-varying graphs in terms of theoretical, algorithmic, and systems aspects. The data and Java API are available at: http://elif.eser.bilkent.edu.tr/roadnetwork. © 2016 IEEE.
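Why a time-varying graph can yield different shortest paths than a static one can be seen in a small sketch. This is not the paper's Java API; it is a generic time-dependent variant of Dijkstra's algorithm, and the graph layout (a list of per-slot travel times on each edge), the function name, and the one-hour slot length are assumptions made for illustration. It also assumes FIFO edges, meaning that departing later never leads to arriving earlier.

```python
import heapq

def earliest_arrival(graph, source, target, depart, slot_len=3600):
    """Earliest arrival time at target when leaving source at time depart.

    graph: dict mapping node -> list of (neighbor, travel_times), where
           travel_times[k] is the edge's travel time (seconds) in time slot k
           and slots repeat cyclically (hypothetical data layout).
    Returns None if the target is unreachable.
    """
    best = {source: depart}
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, float("inf")):
            continue              # stale heap entry
        for v, times in graph.get(u, []):
            # The edge weight depends on the time at which we enter the edge.
            slot = int(t // slot_len) % len(times)
            arrival = t + times[slot]
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return None
```

In the test graph used to check this sketch, the direct edge is fastest in the first slot, while a detour through a third node is faster once the direct edge becomes congested in the second slot.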
Open Access
HandVR: a hand-gesture-based interface to a video retrieval system
(Springer UK, 2015) Genç, S.; Baştan, M.; Güdükbay, Uğur; Atalay, V.; Ulusoy, Özgür
Using one’s hands in human–computer interaction increases both the effectiveness of computer usage and the speed of interaction. One way of accomplishing this goal is to utilize computer vision techniques to develop hand-gesture-based interfaces. A video database system is one application where a hand-gesture-based interface is useful, because it provides a way to specify certain queries more easily. We present a hand-gesture-based interface for a video database system to specify motion and spatiotemporal object queries. We use a regular, low-cost camera to monitor the movements and configurations of the user’s hands and translate them to video queries. We conducted a user study to compare our gesture-based interface with a mouse-based interface on various types of video queries. The users evaluated the two interfaces in terms of different usability parameters, including the ease of learning, ease of use, ease of remembering (memory), naturalness, comfortable use, satisfaction, and enjoyment. The user study showed that querying video databases is a promising application area for hand-gesture-based interfaces, especially for queries involving motion and spatiotemporal relations.