Browsing by Subject "Image processing--Digital techniques."
Now showing 1 - 20 of 45
Item Open Access: Activity analysis for assistive systems (2014). İşcen, Ahmet.

Although understanding and analyzing human actions is a popular research topic in computer vision, most of the research has focused on recognizing "ordinary" actions such as walking and jumping. Extending these methods to more specific domains, such as assistive technologies, is not a trivial task. In most cases, these applications involve fine-grained activities with low inter-class variance and high intra-class variance. In this thesis, we propose to use motion information from snippets, or small video intervals, in order to recognize actions from daily activities. The proposed method encodes the motion by considering motion statistics such as the variance and the length of trajectories. It also encodes position information using a spatial grid. We show that such an approach is especially helpful for the domain of medical device usage, which contains actions with fast movements. Another contribution is to model the sequential information of actions by the order in which they occur. This is especially useful for fine-grained activities, such as cooking, where the visual information may not be enough to distinguish between different actions. For the visual side of the problem, we propose to combine multiple visual descriptors by weighting their confidence values. Our experiments show that the temporal sequence model and the fusion of multiple descriptors significantly improve performance when used together.

Item Open Access: BilVideo-7: video parsing, indexing and retrieval (2010). Baştan, Muhammet.

Video indexing and retrieval aims to provide fast, natural and intuitive access to large video collections. This is becoming more and more important as the amount of video data grows at a stunning rate. This thesis introduces the BilVideo-7 system to address the issues related to video parsing, indexing and retrieval.
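The snippet-based motion encoding described in the activity-analysis abstract above can be sketched numerically. The snippet length and the choice of per-snippet statistics (total trajectory length and variance of the frame-to-frame displacement magnitudes) are illustrative assumptions, not the thesis's exact features.

```python
import numpy as np

def snippet_motion_features(trajectory, snippet_len=5):
    """Encode motion in fixed-length snippets by simple statistics:
    total trajectory length and variance of displacement magnitudes."""
    traj = np.asarray(trajectory, dtype=float)
    feats = []
    for s in range(0, len(traj) - snippet_len + 1, snippet_len):
        seg = traj[s:s + snippet_len]
        disp = np.diff(seg, axis=0)              # frame-to-frame displacements
        norms = np.linalg.norm(disp, axis=1)     # displacement magnitudes
        feats.append((float(norms.sum()), float(norms.var())))
    return feats

# A straight-line trajectory: constant speed, so zero magnitude variance.
line = [(t, 0.0) for t in range(10)]
feats = snippet_motion_features(line, snippet_len=5)
```

For a constant-velocity trajectory the variance term is zero, while an erratic trajectory of the same length would produce a large variance, which is the kind of contrast such statistics capture.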
BilVideo-7 is a distributed, MPEG-7 compatible video indexing and retrieval system that supports complex multimodal queries in a unified framework. The video data model is based on an MPEG-7 profile designed to represent videos by decomposing them into Shots, Keyframes, Still Regions and Moving Regions. The MPEG-7 compatible XML representations of videos according to this profile are obtained by the MPEG-7 compatible video feature extraction and annotation tool of BilVideo-7 and stored in a native XML database. Users can formulate text, color, texture, shape, location, motion and spatio-temporal queries on an intuitive, easy-to-use visual query interface, whose composite query interface can be used to formulate very complex queries containing any type and number of video segments with their descriptors, specifying the spatio-temporal relations between them. The multithreaded query processing server parses incoming queries into subqueries and executes each subquery in a separate thread. It then fuses the subquery results in a bottom-up manner to obtain the final query result and sends it to the originating client. The system is unique in that it provides very powerful querying capabilities with a wide range of descriptors and multimodal query processing in an MPEG-7 compatible, interoperable environment.

Item Open Access: Calculation of scalar optical diffraction field from its distributed samples over the space (2010). Esmer, Gökhan Bora.

As a three-dimensional viewing technique, holography provides successful three-dimensional perception. The technique is based on duplication of the information-carrying optical waves that come from an object. Therefore, calculation of the diffraction field due to the object is an important process in digital holography. To obtain an exact reconstruction of the object, the exact diffraction field created by the object has to be calculated.
In the literature, one of the commonly used approaches for calculating the diffraction field due to an object is to superpose the fields created by the elementary building blocks of the object; such procedures may be called the "source model" approach, and the field computed this way can differ from the exact field over the entire space. In this work, we propose four algorithms to calculate the exact diffraction field due to an object; these may be called the "field model" approach. In the first algorithm, the diffraction field given over the manifold that defines the surface of the object is decomposed onto a function set derived from propagating plane waves. The second algorithm is based on pseudo-inversion of the system matrix which relates the given field samples to the field over a transversal plane. The third and fourth algorithms are iterative. In the third algorithm, the diffraction field is calculated by the method of projections onto convex sets. In the fourth algorithm, the pseudo-inverse of the system matrix is computed by the conjugate gradient method. Depending on the number and locations of the given samples, the proposed algorithms provide the exact field solution over the entire space. To compute the exact field, the number of given samples has to be larger than the number of plane waves that form the diffraction field over the entire space. The solution is affected by dependencies between the given samples; to decrease these dependencies, the samples over the manifold may be taken randomly.
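The pseudo-inversion idea above, recovering plane-wave coefficients from distributed field samples, can be sketched with a toy 1-D model. The wave numbers, sample positions and problem sizes below are illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

# Model the field as a superposition of N propagating plane waves and
# recover the coefficients from M > N samples by pseudo-inversion.
rng = np.random.default_rng(0)
N = 8                                        # number of plane waves
M = 16                                       # number of given samples (M > N)
kx = np.linspace(-np.pi / 2, np.pi / 2, N)   # assumed transverse wave numbers
x = rng.uniform(0.0, 20.0, M)                # random sample positions

A = np.exp(1j * np.outer(x, kx))             # system matrix: coeffs -> samples
c_true = rng.standard_normal(N) + 1j * rng.standard_normal(N)
samples = A @ c_true                         # the "given" field samples

# Least-squares (pseudo-inverse) solution for the plane-wave coefficients.
c_est, *_ = np.linalg.lstsq(A, samples, rcond=None)
err = np.linalg.norm(c_est - c_true)
```

With more samples than plane waves and randomly placed samples, the system matrix is well conditioned and the coefficients are recovered essentially exactly, mirroring the exactness condition stated in the abstract.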
The iterative algorithms outperform the others in terms of computational complexity when the number of given samples is larger than 1.4 times the number of plane waves forming the diffraction field over the entire space.

Item Open Access: A comparative study on human activity classification with miniature inertial and magnetic sensors (2011). Yüksek, Murat Cihan.

This study provides a comparative assessment of different techniques for classifying human activities performed while wearing miniature inertial and magnetic sensors on the body. The classification techniques compared are: the naive Bayesian (NB) classifier, artificial neural networks (ANNs), a dissimilarity-based classifier (DBC), various decision-tree methods, Gaussian mixture models (GMMs), and support vector machines (SVMs). The algorithms for these techniques are provided in two commonly used open-source environments: the Waikato environment for knowledge analysis (WEKA), a Java-based software package, and the pattern recognition toolbox (PRTools), a MATLAB toolbox. Human activities are classified using five sensor units worn on the chest, the arms, and the legs. Each sensor unit comprises a tri-axial gyroscope, a tri-axial accelerometer, and a tri-axial magnetometer. A feature set extracted from the raw sensor data using principal component analysis (PCA) is used in the classification process. Three different cross-validation techniques are employed to validate the classifiers. A performance comparison of the classification techniques is provided in terms of their correct differentiation rates, confusion matrices, and computational cost. The methods that result in the highest correct differentiation rates are ANN (99.2%), SVM (99.2%), and GMM (99.1%). The magnetometer is the most useful type of sensor for classification, whereas the gyroscope is the least useful.
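The PCA feature reduction step mentioned in the comparative study above might look like the following sketch; the toy feature matrix and the number of retained components are assumptions for illustration.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project feature vectors onto the top principal components,
    the kind of PCA reduction applied to raw sensor features."""
    Xc = X - X.mean(axis=0)                      # center the features
    cov = np.cov(Xc, rowvar=False)               # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)             # eigenvalues, ascending
    top = vecs[:, np.argsort(vals)[::-1][:n_components]]
    return Xc @ top

# Toy "sensor" features: 3 dimensions where one axis is nearly redundant,
# reduced to 2 components.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))
X[:, 2] = X[:, 0] + 0.01 * rng.standard_normal(100)   # redundant axis
Z = pca_reduce(X, 2)
```

The reduced features `Z` would then be fed to any of the compared classifiers (NB, ANN, SVM, and so on).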
Considering the locations of the sensor units on the body, the sensors worn on the legs seem to provide the most valuable information.

Item Open Access: Constrained Delaunay triangulation for diagnosis and grading of colon cancer (2009). Erdoğan, Süleyman Tuncer.

In our century, the increasing incidence of cancer makes it inevitable to employ computerized tools that help pathologists more accurately diagnose and grade cancerous tissues. These mathematical tools offer more stable and objective frameworks, which reduce intra- and inter-observer variability. There has been a large set of studies on automated cancer diagnosis and grading, especially based on textural and/or structural tissue analysis. Although previous structural approaches show promising results for different types of tissues, they are still unable to make use of the potential information provided by tissue components other than cell nuclei. This additional information is, however, one of the major information sources for tissue types with differentiated components; for example, luminal regions are useful for describing glands in colon tissue. This thesis introduces a novel structural approach, a new type of constrained Delaunay triangulation, for the utilization of non-nuclear tissue components. The approach first defines two sets of nodes, on cell nuclei and on luminal regions. It then constructs a constrained Delaunay triangulation on the nucleus nodes with the lumen nodes forming its constraints. Finally, it classifies the tissue samples using features extracted from this newly introduced constrained Delaunay triangulation. Working with 213 colon tissues taken from 58 patients, our experiments demonstrate that the constrained Delaunay triangulation approach leads to higher accuracies of 87.83 percent and 85.71 percent for the training and test sets, respectively.
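As a simplified illustration of triangulation-based tissue features, here is an unconstrained Delaunay triangulation over hypothetical "nucleus" coordinates with a few graph-style features extracted from it. SciPy provides no constrained Delaunay triangulation, so the lumen-node constraints of the thesis are omitted; the point coordinates are invented.

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical nucleus centroids: four "nuclei" around a central one.
nuclei = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                   [0.5, 0.5]])
tri = Delaunay(nuclei)          # unconstrained stand-in for the thesis's CDT

# Simple structural features of the kind fed to a classifier:
n_triangles = len(tri.simplices)
edge_lengths = []
for simplex in tri.simplices:
    for i in range(3):
        a, b = nuclei[simplex[i]], nuclei[simplex[(i + 1) % 3]]
        edge_lengths.append(float(np.linalg.norm(a - b)))
mean_edge = sum(edge_lengths) / len(edge_lengths)
```

In the thesis, features derived from such a triangulation (with lumen constraints included) are what distinguish normal from cancerous gland architecture.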
The experiments also show that this new structural representation, which allows the definition of new features, provides a more robust graph-based methodology for the examination of cancerous tissues and performs better than its predecessors.

Item Open Access: CUDA based implementation of flame detection algorithms in day and infrared camera videos (2011). Hamzaçebi, Hasan.

Automatic fire detection in videos is an important but challenging task. Video-based high-performance fire detection algorithms are important for the detection of forest fires, and their use can be extended to places such as state and heritage buildings where surveillance cameras are installed. In uncontrolled fires, early detection is crucial for extinguishing the fire immediately. However, most current fire detection algorithms suffer from either high false alarm rates or low detection rates due to the optimization constraints imposed by real-time performance. This problem is aggravated by the high computational cost in large areas where multi-camera surveillance is required. In this study, our aim is to speed up existing color video fire detection algorithms by implementing them in CUDA, which exploits the parallel computational power of Graphics Processing Units (GPUs). Our method not only speeds up the existing algorithms but can also relax the optimization constraints for real-time performance, increasing the detection probability without affecting false alarm rates. In addition, we have studied several methods that detect flames in infrared video and proposed an improvement to decrease the false alarm rate and increase the detection rate.

Item Open Access: Detection and classification of objects and texture (2009). Tuna, Hakan.

Object and texture recognition are two important subjects in computer vision.
An efficient and fast algorithm that computes a short feature vector for image classification is crucial for smart video surveillance systems. In this thesis, feature extraction methods for object and texture classification are investigated, compared and developed. A method for object classification based on shape characteristics is developed. Object silhouettes are extracted from videos using background subtraction. Contours of the objects are obtained from these silhouettes, and these 2-D contour signals are transformed into 1-D signals using a type of radial transformation. The discrete cosine transform is used to capture the frequency characteristics of these signals, and a support vector machine (SVM) is employed to classify objects according to this frequency information. This method is implemented and integrated into a real-time system together with object tracking. For the texture recognition problem, we define a new computationally efficient operator forming a semigroup on the real numbers. The new operator does not require any multiplications. A codifference matrix based on the new operator is defined, and an image descriptor using the codifference matrix is developed. Texture recognition and license plate identification examples based on the new descriptor are presented. We compared our method with the regular covariance matrix method; ours has lower computational complexity and is experimentally shown to perform as well as the regular covariance method.

Item Open Access: Detection of tree trunks as visual landmarks in outdoor environments (2010). Yıldız, Tuğba.

One of the basic problems to be addressed for a robot navigating in an outdoor environment is the tracking of its position and state.
A fundamental first step in applying algorithms that solve this problem, such as various visual Simultaneous Localization and Mapping (SLAM) strategies, is the extraction and identification of suitable stationary "landmarks" in the environment. This is particularly challenging outdoors, where geometrically consistent features such as lines are infrequent. In this thesis, we focus on using trees as persistent visual landmark features in outdoor settings. Existing work to this end uses only intensity information in images and does not work well in low-contrast settings. In contrast, we propose a novel method that incorporates color and intensity information as well as regional attributes of an image towards robust detection of tree trunks. We describe both extensions to the well-known edge-flow method and complementary Gabor-based edge detection methods to extract dominant edges in the vertical direction. The final stages of our algorithm then group these vertical edges into potential tree trunks by integrating perceptual organization with all available image features. We characterize the detection performance of our algorithm on two datasets: a homogeneous dataset with different images of the same tree types, and a heterogeneous dataset with images taken from a much more diverse set of trees under more dramatic variations in illumination, viewpoint and background conditions. Our experiments show that our algorithm correctly finds up to 90% of trees with a false-positive rate lower than 15% on both datasets.
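A crude stand-in for the vertical-edge stage of the trunk detector above: a Sobel-style horizontal-gradient filter applied to a synthetic image containing a bright vertical stripe. The kernel and test image are illustrative assumptions, not the thesis's Gabor filters.

```python
import numpy as np

def vertical_edges(img):
    """Emphasize vertical edges with a horizontal-gradient (Sobel-style)
    kernel; strong responses mark candidate trunk boundaries."""
    k = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = abs((img[y:y + 3, x:x + 3] * k).sum())
    return out

# Synthetic image: dark background with a bright vertical "trunk".
img = np.zeros((20, 20))
img[:, 8:12] = 1.0
resp = vertical_edges(img)
cols = resp.sum(axis=0)            # column-wise vertical-edge energy
strongest = int(np.argmax(cols))   # first column with maximal energy
```

Columns with high accumulated energy correspond to the stripe's left and right boundaries, which a grouping stage would then pair into a trunk hypothesis.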
These results establish that integrating all available color, intensity and structure information yields a high-performance tree trunk detection system, suitable for use within a SLAM framework, that outperforms methods using image intensity alone.

Item Open Access: Deterritorialisation of image: mapping out new media (2003). Polat, Bican.

This study endeavours to elaborate the possibilities that new media would have within the practice of art. Diverging from pre-existing media, new media emerges as being based on the idea of digitisation. Depending upon the principles of variability and modularity, it retains the capacity to link documents, images, sounds and texts in a variety of non-linear paths. The study aims to elaborate on new media within the context of the Deleuzian "logic of multiplicities".

Item Open Access: Example based retargeting human motion to arbitrary mesh models (2013). Yaz, İlker O.

Animation of mesh models can be accomplished in many ways, including character animation with skinned skeletons, deformable models, or physics-based simulation. Generating animations with all of these techniques is time-consuming and laborious for novice users; however, adapting the wide range of already available human motion capture data can simplify the process significantly. This thesis presents a method for retargeting human motion to arbitrary 3D mesh models with as little user interaction as possible. Traditional motion retargeting systems try to preserve the original motion as is while satisfying several motion constraints. In our approach, we use a few pose-to-pose examples provided by the user to extract the desired semantics behind the retargeting process, not limiting the transfer to be literal. Hence, mesh models with structures and/or motion semantics different from the humanoid skeleton become possible targets. Also, considering mesh models that are widely available and lack any additional structure (e.g.
skeleton), our method avoids requiring such a structure by providing a built-in surface-based deformation system. Since deformation for animation can require more than rigid behaviour, we augment existing rigid deformation approaches to provide volume-preserving and cartoon-like deformation. To demonstrate our approach, we retarget several motion capture sequences to three well-known models and investigate how automatic retargeting methods developed for humanoid models work on ours.

Item Open Access: Feature point classification and matching (2007). Ay, Avşar Polat.

A feature point is a salient point that can be separated from its neighborhood. Widely used definitions assume that feature points are corners. However, some non-feature points also satisfy this assumption; hence non-feature points, which are highly undesired, are often detected as feature points. Texture properties around detected points can be used to eliminate non-feature points by determining the distinctiveness of the detected points within their neighborhoods. There are many texture description methods, such as autoregressive models, Gibbs/Markov random field models, and time-frequency transforms. To increase the performance of feature-point-related applications, two new feature point descriptors are proposed and used in non-feature point elimination and feature point sorting and matching. To keep the descriptor algorithm computationally feasible, a single image resolution scale is selected for analyzing the texture properties around the detected points. To create a scale-space, wavelet decomposition is applied to the given images and neighborhood scale-spaces are formed for every detected point. The analysis scale of a point is selected according to changes in the kurtosis values of histograms extracted from the neighborhood scale-space.
Using the descriptors, detected non-feature points are eliminated, feature points are sorted, and, with the inclusion of conventional descriptors, feature points are matched. According to the scores obtained in the experiments, the proposed detection-matching scheme performs more reliably than the Harris detector with gray-level patch matching; however, the SIFT detection-matching scheme performs better than the proposed scheme.

Item Open Access: Fire and flame detection methods in images and videos (2010). Habiboğlu, Yusuf Hakan.

In this thesis, automatic fire detection methods are studied in the color, spatial and temporal domains. We first investigated the fire and flame colors of pixels. A chromatic model, Fisher's linear discriminant, a Gaussian mixture color model and artificial neural networks are implemented and tested for flame color modeling. For images, a system that extracts patches and classifies them using textural features is proposed; its performance is reported for different thresholds and different features. For videos, a real-time detection system that uses information in the color, spatial and temporal domains is proposed. This system, which is developed by modifying previously implemented systems, divides video into spatio-temporal blocks and uses features extracted from these blocks to detect fire.

Item Open Access: Histopathological image classification using salient point patterns (2011). Çığır, Celal.

Over the last decade, computer aided diagnosis (CAD) systems have gained great importance in helping pathologists improve the interpretation of histopathological tissue images for cancer detection. These systems offer valuable opportunities to reduce and eliminate the inter- and intra-observer variations in diagnosis, which are very common in the current practice of histopathological examination. Many studies have been dedicated to developing such systems for cancer diagnosis and grading, especially based on textural and structural tissue image analysis.
Although recent textural and structural approaches yield promising results for different types of tissues, they are still unable to make use of the potential biological information carried by different tissue components. These components help better represent a tissue, and hence help better quantify the tissue changes caused by cancer. This thesis introduces a new textural approach, called Salient Point Patterns (SPP), for the utilization of tissue components in representing colon biopsy images. The approach first defines a set of salient points that correspond to the nuclear, stromal, and luminal components of a colon tissue. It then extracts features around these salient points to quantify the images. Finally, it classifies the tissue samples using the extracted features. Working with 3236 colon biopsy samples taken from 258 different patients, our experiments demonstrate that the Salient Point Patterns approach improves classification accuracy compared to its counterparts that do not make use of tissue components in defining their texture descriptors. These experiments also show that different sets of features can be used within the SPP approach for better representation of a tissue image.

Item Open Access: Human activity classification with miniature inertial sensors (2009). Tunçel, Orkun.

This thesis provides a comparative study on activity recognition using miniature inertial sensors (gyroscopes and accelerometers) and magnetometers worn on the human body. The classification methods used and compared in this study are: a rule-based algorithm (RBA) or decision tree, the least-squares method (LSM), the k-nearest neighbor algorithm (k-NN), dynamic time warping (DTW-1 and DTW-2), and support vector machines (SVM). In the first part of this study, eight different leg motions are classified using only two single-axis gyroscopes.
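The k-nearest neighbor classifier, one of the methods compared in the activity recognition study above, reduces to a few lines in its 1-NN form; the toy gyroscope features and labels below are invented for illustration.

```python
import math

def one_nn(train, labels, query):
    """Classify a feature vector by its single nearest neighbour
    under Euclidean distance (k-NN with k = 1)."""
    best, best_label = float("inf"), None
    for vec, lab in zip(train, labels):
        d = math.dist(vec, query)
        if d < best:
            best, best_label = d, lab
    return best_label

# Hypothetical gyroscope features: (mean rate, variance) per motion segment.
train = [(0.1, 0.02), (0.9, 0.30), (0.15, 0.03), (1.1, 0.25)]
labels = ["sit", "walk", "sit", "walk"]
pred = one_nn(train, labels, (1.0, 0.28))
```

Even this minimal form exhibits the trade-off noted in the abstract: no training time at all, but classification time that grows with the size of the training set.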
In the second part, human activities are classified using five sensor units worn on different parts of the body. Each sensor unit comprises a tri-axial gyroscope, a tri-axial accelerometer and a tri-axial magnetometer. Different feature sets are extracted from the raw sensor data and used in the classification process. A number of feature extraction and reduction techniques (principal component analysis), as well as different cross-validation techniques, have been implemented and compared. A performance comparison of these classification methods is provided in terms of their correct differentiation rates, confusion matrices, pre-processing and training times, and classification times. Among the classification techniques we have considered and implemented, SVM generally gives the highest correct differentiation rate, followed by k-NN. The classification time for RBA is the shortest, followed by SVM or LSM, k-NN or DTW-1, and DTW-2. SVM requires the longest training time, whereas DTW-2 takes the longest classification time. Although there is no significant difference between the correct differentiation rates obtained with different cross-validation techniques, repeated random sub-sampling uses the shortest classification time, whereas leave-one-out requires the longest.

Item Open Access: Image information mining using spatial relationship constraints (2012). Karakuş, Fatih.

A huge amount of data is collected by Earth observation satellites, which continuously send data to receiving stations on Earth. Mining these data is therefore increasingly important for effective processing of the collected multi-spectral images. The most popular approaches to this problem use image meta-data such as geographical coordinates; however, such approaches offer no good solution for determining what the images actually contain.
Some studies go a step beyond meta-data based approaches by moving the focus to content-based approaches, for example utilizing region information of the sensed images. In this thesis, we propose a novel, generic and extensible image information mining system that uses spatial relationship constraints. In this system, we use not only the region content but also the relationships between regions. First, we extract the region information of the images and then extract pairwise relationship information for those regions, such as left, right, above, below, near, far and distance. This feature extraction process is generic, independent of how the region segmentation is obtained. In addition, since new features and new approaches are continuously being developed by image information mining researchers, extensibility plays a big role in the design of our system. We also propose a novel feature vector structure in which a feature vector consists of several sub-feature vectors. Each sub-feature vector can be exclusively selected for the search process, and each can have a different distance metric for comparisons with the corresponding sub-feature vector of another feature vector. Therefore, the system lets users choose which information about a region and its pairwise relationships with other regions is used when they perform a search. The proposed system is illustrated with region-based retrieval scenarios on very high spatial resolution satellite images.

Item Open Access: Image processing methods for food inspection (2012). Yorulmaz, Onur.

With the advances in computer technology, signal processing techniques are widely applied to many food safety applications.
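The pairwise region relationships (left/right, above/below, near/far, distance) used by the mining system above can be illustrated on region centroids. The near/far threshold is an assumed parameter, not a value from the thesis.

```python
import math

def pairwise_relation(a, b, near_threshold=50.0):
    """Describe the relation of region centroid b to centroid a using
    left/right, above/below, near/far predicates plus the distance.
    Image coordinates are assumed, so y grows downward."""
    (ax, ay), (bx, by) = a, b
    rel = []
    rel.append("right" if bx > ax else "left")
    rel.append("below" if by > ay else "above")
    dist = math.hypot(bx - ax, by - ay)
    rel.append("near" if dist < near_threshold else "far")
    return rel, dist

# Two hypothetical region centroids in pixel coordinates.
rel, dist = pairwise_relation((10.0, 10.0), (40.0, 50.0))
```

In the proposed system such predicates, computed for every region pair, populate one of the sub-feature vectors that a user can enable or disable at query time.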
In this thesis, new methods are developed to solve two food safety problems using image processing techniques. The first problem is the detection of fungal infection in popcorn kernel images; the damage, called blue-eye, is caused by a fungus. A cepstrum-based feature extraction method is applied to the kernel images for classification purposes. The results of this technique are compared with those of a covariance-based feature extraction method and with previous solutions to the problem. The tests are made on two databases, of reflectance-mode and transmittance-mode images, which differ in the method of image acquisition. A support vector machine (SVM) is used for image feature classification. It is experimentally observed that an overall success rate of 96% is possible with the covariance matrix based feature extraction method on the transmittance database, and 94% is achieved on the reflectance database. The second food inspection problem is the detection on cookies of acrylamide, a neurotoxin generated by cooking at high temperatures.

Item Open Access: Image searching with signature filtering and multidimensional indexing (1997). Günyaktı, Çağlar.

Item Open Access: Improving the resolution of diffraction patterns from many low resolution recordings (2010). Yücesoy, Veysel.

Holography attempts to record and reconstruct wave fields. The resolution limitation of the recording equipment causes problems in the reconstruction process. An automatic method is proposed for the registration and stitching of low resolution diffraction patterns to form a higher resolution one. There is no prior knowledge about the 3D position of the object in the recordings, and it is assumed that there is only one particle in the object field. The method uses the Wigner transform, Canny edge detection and the Hough transform to register the patterns, and additional iterative methods depending on the local variance of the reconstructed patterns to stitch them.
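The cepstrum-based feature extraction applied to the popcorn kernel images earlier on this page can be sketched in 1-D: the real cepstrum is the inverse FFT of the log magnitude spectrum, with the first few coefficients kept as features. The profile and coefficient count are assumptions for illustration.

```python
import numpy as np

def cepstrum_features(row, n_coeffs=8):
    """Real cepstrum of a 1-D intensity profile: IFFT of the log
    magnitude spectrum, truncated to the first n_coeffs values."""
    spectrum = np.fft.fft(row)
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # avoid log(0)
    cep = np.fft.ifft(log_mag).real
    return cep[:n_coeffs]

# Hypothetical intensity profile sampled across a kernel image row.
row = np.sin(np.linspace(0.0, 4.0 * np.pi, 64)) + 2.0
feats = cepstrum_features(row)
```

In the thesis a 2-D analogue of these coefficients, extracted per kernel image, is what the SVM classifies as infected or healthy.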
The performance of the overall system is evaluated by computer simulations against object radius, noise in the original pattern, recording noise, and the presence of multiple particles in the object field.

Item Open Access: Intelligent sensing for robot mapping and simultaneous human localization and activity recognition (2011). Altun, Kerem.

We consider three different problems in two different sensing domains, namely ultrasonic sensing and inertial sensing. Since the applications considered in each domain are inherently different, this thesis is composed of two main parts. The approach common to both parts is that raw data acquired from simple sensors is processed intelligently to extract useful information about the environment. In the first part, we employ active snake contours and Kohonen's self-organizing feature maps (SOMs) for representing and evaluating discrete point maps of indoor environments efficiently and compactly. We develop a generic error criterion for comparing two different sets of points based on the Euclidean distance measure. The point sets can be chosen as (i) two different sets of map points acquired with different mapping techniques or different sensing modalities, (ii) two sets of curve points fitted to maps extracted by different mapping techniques or sensing modalities, or (iii) a set of extracted map points and a set of fitted curve points. The error criterion makes it possible to compare the accuracy of maps obtained with different techniques among themselves, as well as with an absolute reference. We optimize the parameters of the active snake contours and SOMs using uniform sampling of the parameter space and particle swarm optimization. A demonstrative example from ultrasonic mapping is given based on experimental data and compared with a very accurate laser map, considered an absolute reference. Both techniques can fill the erroneous gaps in discrete point maps.
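The generic point-set error criterion described above can be realized, for illustration, as a symmetrized mean nearest-neighbour distance between the two sets; the exact form used in the thesis may differ.

```python
import math

def set_distance(P, Q):
    """Error criterion between two point sets: the average of the mean
    nearest-neighbour distances computed in both directions (an assumed
    symmetric form of the Euclidean criterion)."""
    def mean_nn(A, B):
        return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)
    return 0.5 * (mean_nn(P, Q) + mean_nn(Q, P))

# Two maps of the same wall segment, one offset by 1 unit in x.
P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
Q = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
err = set_distance(P, Q)
```

Such a criterion can score an ultrasonic map against a laser reference, or either map against fitted snake or SOM curve points, exactly the three pairings listed in the abstract.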
Snake curve fitting results in more accurate maps than SOMs because it is more robust to outliers. The two methods and the error criterion are sufficiently general that they can also be applied to discrete point maps acquired with other mapping techniques and other sensing modalities. In the second part, we use body-worn inertial/magnetic sensor units for recognition of daily and sports activities, as well as for human localization in GPS-denied environments. Each sensor unit comprises a tri-axial gyroscope, a tri-axial accelerometer, and a tri-axial magnetometer. The error characteristics of the sensors are modeled using the Allan variance technique, and the parameters of low- and high-frequency error components are estimated. We then provide a comparative study of different techniques for classifying human activities performed while wearing miniature inertial and magnetic sensors on the body. Human activities are classified using five sensor units worn on the chest, the arms, and the legs. We compute a large number of features from the sensor data and reduce them using both principal components analysis (PCA) and sequential forward feature selection (SFFS). We consider eight different pattern recognition techniques and compare them in terms of correct classification rates, computational costs, and training and storage requirements. Results with sensors mounted on various locations on the body are also provided. The results indicate that if the system is trained on data from an individual person, it is possible to obtain over 99% correct classification with a simple quadratic classifier such as the Bayesian decision method. However, if training data for that person are not available beforehand, one has to resort to more complex classifiers, with an expected correct classification rate of about 85%. We also consider the human localization problem using body-worn inertial/magnetic sensors.
Inertial sensors are characterized by drift error caused by integrating their rate output to obtain position information. Because of this drift, the position and orientation data obtained from inertial sensor signals are reliable over only short periods of time. Therefore, position updates from externally referenced sensors are essential. However, if the map of the environment is known, the activity context of the user provides information about position. In particular, switches in activity context correspond to discrete locations on the map. By performing activity recognition simultaneously with localization, one can detect these context switches and use the corresponding position information as position updates in the localization filter. The localization filter also involves a smoother, which combines the two estimates obtained by running the zero-velocity update (ZUPT) algorithm both forward and backward in time. We performed experiments with eight subjects in an indoor and an outdoor environment involving "walking," "turning," and "standing" activities. Using the error criterion from the first part of the thesis, we show that the position errors can be decreased by about 85% on average. We also present the results of a 3-D experiment performed in a realistic indoor environment and demonstrate that over 90% error reduction in position is achievable by performing activity recognition simultaneously with localization.

Item Open Access: A key-pose based representation for human action recognition (2011). Kurt, Mehmet Can.

This thesis utilizes a key-pose based representation to recognize human actions in videos. We believe that the pose of the human figure is a powerful source for describing the nature of the ongoing action in a frame. Each action can be represented by a unique set of frames that includes all the possible spatial configurations of the human body parts throughout the time the action is performed.
Such a set of frames, referred to as the "key poses" of an action, uniquely distinguishes that action from the rest. To extract key poses, we define a similarity value between the poses in a pair of frames by using the lines forming the human figure along with a shape matching method. With the help of a clustering algorithm, we group the similar frames of each action into a number of clusters and use the centroids as the key poses for that action. Moreover, in order to exploit the motion information present in the action, we include simple line displacement vectors for each frame in the key-pose selection process. Experiments on the Weizmann and KTH datasets show the effectiveness of our key-pose based approach in representing and recognizing human actions.
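The clustering step that selects key poses can be sketched with a plain k-means over per-frame pose descriptors, taking the centroids as the key poses. The 2-D descriptors and first-k initialization are illustrative assumptions; the thesis uses line-based pose similarity rather than raw Euclidean distance.

```python
import numpy as np

def key_poses(frames, k, iters=20):
    """Pick k 'key poses' as k-means centroids over per-frame pose
    descriptors. Initialization from the first k frames is assumed."""
    X = np.asarray(frames, dtype=float)
    centers = X[:k].copy()
    for _ in range(iters):
        # Assign every frame to its nearest centroid, then recompute.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(axis=0)
    return centers

# Two well-separated pose clusters -> two key poses near their means.
frames = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers = key_poses(frames, k=2)
```

At recognition time, frames of an unseen video would be matched against each action's key poses, and the best-accumulating action label wins.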