Browsing by Subject "Missing data"
Item Open Access
Interactive training of advanced classifiers for mining remote sensing image archives (ACM, 2004)
Aksoy, Selim; Koperski, K.; Tusk, C.; Marchisio, G.
Advances in satellite technology and the availability of downloaded images constantly increase the sizes of remote sensing image archives. Automatic content extraction, classification and content-based retrieval have become highly desired goals for the development of intelligent remote sensing databases. The common approach for mining these databases uses rules created by analysts. However, incorporating GIS information and human expert knowledge with digital image processing improves remote sensing image analysis. We developed a system that uses decision tree classifiers for interactive learning of land cover models and mining of image archives. Decision trees provide a promising solution for this problem because they can operate on both numerical (continuous) and categorical (discrete) data sources, and they do not require any assumptions about either the distributions or the independence of attribute values. This is especially important for the fusion of measurements from different sources like spectral data, DEM data and other ancillary GIS data. Furthermore, using surrogate splits provides the capability of dealing with missing data during both training and classification, and enables handling instrument malfunctions or cases where one or more measurements do not exist for some locations.
Quantitative and qualitative performance evaluation showed that decision trees provide powerful tools for modeling both pixel and region contents of images and for mining remote sensing image archives.

Item Open Access
Land cover classification with multi-sensor fusion of partly missing data (American Society for Photogrammetry and Remote Sensing, 2009-05)
Aksoy, S.; Koperski, K.; Tusk, C.; Marchisio, G.
We describe a system that uses decision tree-based tools for seamless acquisition of knowledge for classification of remotely sensed imagery. We concentrate on three important problems in this process: information fusion, model understandability, and handling of missing data. The importance of multi-sensor information fusion and the use of decision tree classifiers for such problems have been well studied in the literature. However, these studies have been limited to cases where all data sources have full coverage of the scene under consideration. Our contribution in this paper is to show how decision tree classifiers can be learned with alternative (surrogate) decision nodes, resulting in models that can deal with missing data during both training and classification and so handle cases where one or more measurements do not exist for some locations. We present a detailed performance evaluation of the effectiveness of these classifiers for information fusion and feature selection, and study three different methods for handling missing data in comparative experiments. The results show that surrogate decisions incorporated into decision tree classifiers provide powerful models for fusing information from different data layers while being robust to missing data.
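The surrogate-decision idea named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single node with a threshold split, marks missing measurements with `None`, and only considers surrogates that split in the same direction as the primary split (full CART-style surrogates also consider reversed splits). The names `best_surrogate` and `route` and the two-feature layout (a spectral value and an elevation value) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[float]]  # None marks a missing measurement
Split = Tuple[int, float]        # (feature index, threshold)


def goes_left(value: float, threshold: float) -> bool:
    """Threshold test used by both primary and surrogate splits."""
    return value <= threshold


def best_surrogate(rows: List[Row], primary: Split,
                   candidates: List[int]) -> Tuple[int, float, float]:
    """Find the (feature, threshold, agreement) that best mimics `primary`.

    Agreement is the fraction of rows, among those where both features are
    observed, that the candidate sends to the same side as the primary split.
    Returns feature -1 if no candidate can be evaluated.
    """
    p_feat, p_thr = primary
    best = (-1, 0.0, 0.0)
    for feat in candidates:
        usable = [r for r in rows
                  if r[p_feat] is not None and r[feat] is not None]
        if not usable:
            continue
        for thr in sorted({r[feat] for r in usable}):
            agree = sum(
                goes_left(r[feat], thr) == goes_left(r[p_feat], p_thr)
                for r in usable
            ) / len(usable)
            if agree > best[2]:
                best = (feat, thr, agree)
    return best


def route(row: Row, primary: Split, surrogate: Split,
          default_left: bool) -> bool:
    """Send a row left (True) or right (False) at this node.

    Uses the primary split if its feature is observed, otherwise the
    surrogate, otherwise the default (e.g. majority) direction.
    """
    for feat, thr in (primary, surrogate):
        if feat >= 0 and row[feat] is not None:
            return goes_left(row[feat], thr)
    return default_left


# Hypothetical training rows: (spectral value, elevation in metres).
rows = [(0.1, 100.0), (0.2, 150.0), (0.3, 120.0),
        (0.7, 400.0), (0.8, 500.0), (0.9, None)]
primary = (0, 0.5)
feat, thr, agreement = best_surrogate(rows, primary, candidates=[1])
```

With these toy rows the elevation threshold 150.0 reproduces the primary split exactly on the rows where both values are observed, so a location with no spectral measurement can still be routed using elevation alone.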
© 2009 American Society for Photogrammetry and Remote Sensing.

Item Open Access
Non-uniformly sampled sequential data processing (Bilkent University, 2019-09)
Şahin, Safa Onur
We study classification and regression for variable length sequential data, which is either non-uniformly sampled or contains missing samples. Most sequential data processing studies assume that the data sequence is uniformly sampled and complete, i.e., that it does not contain missing input values. However, non-uniformly sampled sequences and the missing data problem appear in a wide range of fields such as medical imaging and financial data. To resolve these problems, certain preprocessing techniques, statistical assumptions and imputation methods are usually employed. However, these approaches suffer because the statistical assumptions do not hold in general and the imputation of artificially generated and unrelated inputs deteriorates the model. To mitigate these problems, in Chapter 2, we introduce a novel Long Short-Term Memory (LSTM) architecture. In particular, we extend the classical LSTM network with additional time gates, which incorporate the time information as a nonlinear scaling factor on the conventional gates. We also provide forward pass and backward pass update equations for the proposed LSTM architecture. We show that our approach is superior to the classical LSTM architecture when there is correlation between time samples. In Chapter 3, we investigate regression for variable length sequential data containing missing samples and introduce a novel tree architecture based on LSTM networks. In our architecture, we employ a variable number of LSTM networks, which use only the existing inputs in the sequence, in a tree-like architecture without any statistical assumptions or imputations on the missing data.
In particular, we incorporate the missingness information by selecting a subset of these LSTM networks based on the presence pattern of a certain number of previous inputs.

Item Open Access
UKSB sinir ağları ile eksik veri kümesi üzerinde sıralı bağlanım [Sequential regression on a dataset with missing data using LSTM neural networks] (IEEE, 2019-04)
Şahin, Safa Onur
In this paper, we study sequential regression with Long Short-Term Memory (LSTM) neural networks on datasets that contain missing information. In sequential regression applications that use LSTM networks, the dataset is generally assumed to be complete. However, missing data is a frequently encountered problem in real-life applications involving sequential data. In the methods proposed to address this problem, the missing data cannot be modeled well enough to capture the pattern in the sequential data, and therefore large performance gains are not observed. In this paper, the missing data is modeled by the LSTM-based neural network that performs the regression itself, and the data generated in this way is used during regression. In experiments on sequential datasets obtained from real-life applications, the proposed algorithm is observed to achieve a significant performance improvement over conventional methods.
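One possible reading of "the missing data is modeled by the network that performs the regression itself" is that a missing input is replaced by the model's own prediction of it. The sketch below illustrates only that control flow, under stated assumptions: the abstract does not give the actual mechanism, the LSTM cell is swapped for a toy exponential-smoothing step to keep the example self-contained, and the names `smoothing_step` and `run_with_self_imputation` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

State = float
Step = Callable[[State, float], Tuple[State, float]]


def smoothing_step(state: State, x: float, alpha: float = 0.5) -> Tuple[State, float]:
    """Toy stand-in for an LSTM cell (assumption, not the paper's model).

    Updates the state by exponential smoothing and returns the new state
    together with the prediction for the next input.
    """
    new_state = alpha * x + (1.0 - alpha) * state
    return new_state, new_state


def run_with_self_imputation(seq: Sequence[Optional[float]],
                             step: Step = smoothing_step,
                             init_state: State = 0.0) -> List[float]:
    """Run a recurrent predictor over a sequence with missing samples.

    A missing sample (None) is replaced by the model's own prediction
    from the previous time step before the state is updated.
    """
    state = init_state
    prediction = init_state
    outputs = []
    for x in seq:
        x_used = prediction if x is None else x
        state, prediction = step(state, x_used)
        outputs.append(prediction)
    return outputs


outputs = run_with_self_imputation([2.0, None, 4.0])
```

For the sequence `[2.0, None, 4.0]` with the default settings, the outputs are `[1.0, 1.0, 2.5]`: the second step feeds the first prediction back in place of the missing value. A real LSTM cell would replace `smoothing_step` while the loop stays the same.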