Browsing by Subject "Similarity measure"

Open Access
Alignment of uncalibrated images for multi-view classification
(IEEE, 2011) Arık, Sercan Ömer; Vural, E.; Frossard, P.
Efficient solutions for the classification of multi-view images can be built on graph-based algorithms when little information is known about the scene or cameras. Such methods typically require a pair-wise similarity measure between images, where a common choice is the Euclidean distance. However, the accuracy of the Euclidean distance as a similarity measure is restricted to cases where images are captured from nearby viewpoints. In settings with large transformations and viewpoint changes, alignment of images is necessary prior to distance computation. We propose a method for the registration of uncalibrated images that capture the same 3D scene or object. We model the depth map of the scene as an algebraic surface, which yields a warp model in the form of a rational function between image pairs. The warp model is computed by minimizing the registration error, where the registered image is a weighted combination of two images generated with two different warp functions estimated from feature matches and image intensity functions in order to provide robust registration. We demonstrate the flexibility of our alignment method by experimentation on several wide-baseline image pairs with arbitrary scene geometries and texture levels. Moreover, the results on multi-view image classification suggest that the proposed alignment method can be effectively used in graph-based classification algorithms for the computation of pairwise distances where it achieves significant improvements over distance computation without prior alignment. © 2011 IEEE.
Open Access
Categorization in a hierarchically structured text database
(2001) Kutlu, Ferhat
Over the past two decades, the amount of data stored in databases and the online flow of data have increased enormously, driven by improvements in the Internet. This growth has created a need for intelligent tools to manage data of that size and its flow. A hierarchical approach is the best way to satisfy these needs, and it is widespread among people who work with databases and the Internet. The Usenet newsgroup system is one of the online databases with a built-in hierarchical structure. Our point of departure is this hierarchical structure, which makes categorization tasks easier and faster. In fact, most Internet search engines also exploit the inherent hierarchy of the Internet. The growing size of data makes most traditional categorization algorithms obsolete. We therefore developed a new categorization learning algorithm that constructs an index tree from the Usenet news database and then determines the newsgroups related to a new news item by categorizing it over the index tree. In the learning phase, it takes an agglomerative, bottom-up hierarchical approach. In the categorization phase, it performs overlapping, supervised categorization. The k-nearest neighbor categorization algorithm is used as a baseline for comparing the complexity and accuracy of our algorithm. This comparison covers not only two different algorithms but also the hierarchical approach versus the flat approach, similarity measures versus distance measures, and the importance of accuracy versus the importance of speed. Our algorithm favors the hierarchical approach and a similarity measure, and it greatly outperforms the k-nearest neighbor categorization algorithm in speed with minimal loss of accuracy.
Open Access
Görsel arama sonuçlarının çoklu örnekle öğrenme yöntemiyle yeniden sıralanması [Re-ranking image search results with multiple instance learning]
(IEEE, 2012-04) Şener, Fadime; Cinbiş, N. I.; Duygulu-Şahin, Pınar
In this study, we propose a weakly supervised multiple instance learning (MIL) method to improve the results of text-based image search engines. In this approach, the ranked image list returned by a search engine for a keyword query is treated as weakly positive input data; together with additional negative images that do not belong to the query category, it is used to construct multiple instance learning bags. The multiple instance problem is then converted to a standard supervised learning problem by mapping each bag into a feature space defined by the instances in the training bags, using a bag-instance similarity measure. Finally, a linear support vector machine (SVM) is used to build a classifier for each query. Based on the classification scores, we re-rank the images and improve precision over the search engine results. In this context, we also present experiments conducted to find a pattern in multiple instance bag sizes that yields better average precision. © 2012 IEEE.
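The bag-to-feature-space mapping described in this abstract can be illustrated with a minimal sketch. The abstract does not say which bag-instance similarity is used, so the Gaussian max-similarity below (in the style of MILES embeddings) is an assumption, as are the names `bag_instance_similarity`, `embed_bag`, and the `sigma` parameter.

```python
import math

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def bag_instance_similarity(bag, concept, sigma=1.0):
    """Assumed similarity: the closest instance in the bag decides the score."""
    return max(math.exp(-sq_dist(x, concept) / sigma ** 2) for x in bag)

def embed_bag(bag, concepts, sigma=1.0):
    """Map a bag to one coordinate per candidate instance in the training pool."""
    return [bag_instance_similarity(bag, c, sigma) for c in concepts]

# Toy data: two bags of 2-D instances; the pool is every training instance.
bags = [[(0.0, 0.0), (1.0, 1.0)], [(5.0, 5.0)]]
pool = [x for bag in bags for x in bag]
embedded = [embed_bag(bag, pool) for bag in bags]
```

Each embedded bag is an ordinary fixed-length vector, so any standard supervised learner, such as a linear SVM, could then be trained on `embedded` with one label per bag.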
Open Access
L1 norm based multiplication-free cosine similarity measures for big data analysis
(IEEE, 2014-11) Akbaş, Cem Emre; Bozkurt, Alican; Arslan, Musa Tunç; Aslanoğlu, Hüseyin; Çetin, A. Enis
The cosine similarity measure is widely used in big data analysis to compare vectors. In this article, a new set of vector similarity measures is proposed. The new measures are based on a multiplication-free operator that requires only additions and sign operations. A vector 'product' using the multiplication-free operator is also defined. The new vector product induces the ℓ1-norm. As a result, the new cosine-like similarity measures are normalized by the ℓ1-norms of the vectors. They can be computed using the MapReduce framework. Simulation examples are presented. © 2014 IEEE.
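To make the idea concrete, here is a minimal sketch of one such measure. The abstract does not spell out the operator, so the definition used here, a ⊕ b = sgn(a)·sgn(b)·(|a| + |b|) with a ⊕ 0 = 0, and the normalization by the sum of the two ℓ1-norms are assumptions; the function names are hypothetical.

```python
def mf_op(a, b):
    """Assumed multiplication-free operator: sign logic plus one addition."""
    if a == 0 or b == 0:
        return 0.0
    magnitude = abs(a) + abs(b)
    # Same signs give a positive result, opposite signs a negative one.
    return magnitude if (a > 0) == (b > 0) else -magnitude

def mf_product(x, y):
    """Vector 'product': sum of the element-wise operator results."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(mf_op(a, b) for a, b in zip(x, y))

def l1_norm(x):
    return sum(abs(a) for a in x)

def mf_cosine(x, y):
    """Cosine-like similarity in [-1, 1], normalized by the l1-norms (assumed form)."""
    denom = l1_norm(x) + l1_norm(y)
    if denom == 0:
        return 0.0
    return mf_product(x, y) / denom

x = [1.0, -2.0, 3.0]
self_product = mf_product(x, x)   # equals 2 * l1_norm(x) under this definition
```

Under this definition the product of a vector with itself is twice its ℓ1-norm, which is the sense in which the product induces the ℓ1-norm. The per-element terms are independent, so the sums could be split across workers and added together, which fits a map-and-reduce style of computation.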
Open Access
New event detection and topic tracking in Turkish
(John Wiley & Sons, Inc., 2010) Can, F.; Kocberber, S.; Baglioglu, O.; Kardas, S.; Ocalan, H. C.; Uyar, E.
Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These problems focus on finding the first stories of new events and identifying all subsequent stories on a certain topic defined by a small number of sample stories. In this work, we introduce the first large-scale TDT test collection for Turkish, and investigate the NED and TT problems in this language. We present our test-collection-construction approach, which is inspired by the TDT research initiative. We show that, in TDT for Turkish, a simple word truncation stemming method can compete with a lemmatizer-based stemming approach when used with some similarity measures. Our findings show that, contrary to our earlier observations on Turkish information retrieval, word stopping has an impact on effectiveness in NED. We demonstrate that the confidence scores of two different similarity measures can be combined in a straightforward manner for higher effectiveness. The influence of several similarity measures on effectiveness is also investigated. We show that it is possible to deploy TT applications in Turkish that can be used in operational settings. © 2010 ASIS&T.
Open Access
Stylistic document retrieval for Turkish
(IEEE, 2009-09) Zamalieva, Daniya; Kalaycılar, Fırat; Kale, Aslı; Pehlivan, Selen; Can, Fazlı
In information retrieval (IR) systems, a collection of documents is compared with a query and ranked according to a particular similarity measure. Since texts with the same content can be written by different authors, the writing styles of the documents vary accordingly. This observation motivates investigating text by means of style. In this paper, we analyze text documents in terms of the stylistic features of the written text and measure the effectiveness of these features in an IR system. Our main focus is on Turkish text documents. Although there are many studies on extending IR systems with style-based enhancements, there is no similar application for Turkish that performs retrieval based purely on style. © 2009 IEEE.
Open Access
Topic tracking using chronological term ranking
(2013-10) Acun, Bilge; Başpınar, Alper; Oğuz, Ekin; Saraç, M. İlker; Can, Fazlı
Topic tracking (TT) is an important component of topic detection and tracking (TDT) applications. TT algorithms aim to determine all subsequent stories of a certain topic based on a small number of initial sample stories. We propose an alternative similarity measure based on the chronological term ranking (CTR) concept to quantify the relatedness among news articles for topic tracking. The CTR approach is based on the observation that, in general, important issues are presented at the beginning of news articles. Following this observation, we modify the traditional Okapi BM25 similarity measure using the CTR concept. Using a large standard test collection, we show that our method provides a statistically significant improvement over the Okapi BM25 measure. The highly successful performance indicates that the approach can be used in real applications. © 2013 Springer-Verlag London.
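The abstract does not give the modified formula, so the following is only an illustrative sketch of how a position-aware weight might be attached to Okapi BM25. The BM25 part follows the common formulation with a non-negative IDF; the `earliness` factor and the `ctr_boost` parameter are hypothetical and are not the authors' CTR definition.

```python
import math
from collections import Counter

def bm25_ctr(query, doc, docs, k1=1.2, b=0.75, ctr_boost=0.0):
    """BM25 score of a tokenized doc for a tokenized query.

    With ctr_boost=0 this is plain BM25. A positive ctr_boost rewards terms
    whose first occurrence is near the start of the document (hypothetical).
    """
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        if term not in tf:
            continue
        df = sum(1 for d in docs if term in d)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        length_norm = k1 * (1 - b + b * len(doc) / avgdl)
        tf_part = tf[term] * (k1 + 1) / (tf[term] + length_norm)
        # 1.0 when the term opens the document, approaching 0 toward the end.
        earliness = 1.0 - doc.index(term) / len(doc)
        score += idf * tf_part * (1.0 + ctr_boost * earliness)
    return score

docs = [
    ["quake", "hits", "city"],
    ["city", "council", "meets", "after", "quake"],
    ["sports", "news", "today"],
]
plain = [bm25_ctr(["quake"], d, docs) for d in docs]
boosted = [bm25_ctr(["quake"], d, docs, ctr_boost=1.0) for d in docs]
```

In this toy collection the boost widens the gap between the story that opens with "quake" and the one that mentions it last, which is the intuition the abstract describes.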