Browsing by Subject "Learning Systems"
Now showing 1 - 9 of 9
Item Open Access
Classification by feature partitioning (Springer, 1996) Guvenir, H. A.; Şirin, İ.
This paper presents a new form of exemplar-based learning, based on a representation scheme called feature partitioning, and a particular implementation of this technique called CFP (for Classification by Feature Partitioning). Learning in CFP is accomplished by storing the objects separately in each feature dimension as disjoint sets of values called segments. A segment is expanded through generalization or specialized by dividing it into sub-segments. Classification is based on a weighted voting among the individual predictions of the features, which are simply the class values of the segments corresponding to the values of a test instance for each feature. An empirical evaluation of CFP and its comparison with two other classification techniques that consider each feature separately are given. © 1996 Kluwer Academic Publishers.

Item Open Access
Clustered linear regression (Elsevier, 2002) Ari, B.; Güvenir, H. A.
Clustered linear regression (CLR) is a new machine learning algorithm that improves the accuracy of classical linear regression by partitioning the training space into subspaces. CLR makes some assumptions about the domain and the data set. First, the target value is assumed to be a function of the feature values. Second, this function is assumed to have a linear approximation in each subspace. Finally, there must be enough training instances to determine the subspaces and their linear approximations successfully. Tests indicate that when these assumptions hold, CLR outperforms all other well-known machine learning algorithms. Partitioning may continue until a linear approximation fits all the instances in the training set, which generally occurs when the number of instances in a subspace is less than or equal to the number of features plus one. Otherwise, each new subspace will have a better-fitting linear approximation; however, this causes overfitting and gives less accurate results on the test instances. The stopping point can be determined as no significant decrease, or an increase, in relative error. CLR uses a small portion of the training instances to determine the number of subspaces. The need for a large number of training instances makes this algorithm suitable for data mining applications. © 2002 Elsevier Science B.V. All rights reserved.

Item Open Access
Concept representation with overlapping feature intervals (Taylor & Francis Inc., 1998) Güvenir, H. A.; Koç, H. G.
This article presents a new form of exemplar-based learning method, based on overlapping feature intervals. In this model, a concept is represented by a collection of overlapping intervals for each feature and class. Classification with Overlapping Feature Intervals (COFI) is a particular implementation of this technique. In this incremental, inductive, and supervised learning method, the basic unit of the representation is an interval. The COFI algorithm learns the projections of the intervals in each feature dimension for each class. Initially, an interval is a point on a feature-class dimension; then it can be expanded through generalization. No specialization of intervals is done on feature-class dimensions by this algorithm. Classification in the COFI algorithm is based on a majority voting among the local predictions that are made individually by each feature. An evaluation of COFI and its comparison with other similar classification techniques is given.
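Both the CFP and COFI abstracts above classify by letting each feature predict on its own and then voting. As a concrete illustration of that shared idea, here is a minimal sketch of per-feature interval voting; the plain min/max interval generalization and all names are assumptions made for this example, not the authors' implementations.

```python
# Minimal sketch of per-feature interval voting, in the spirit of the
# CFP/COFI abstracts above. Interval construction (plain min/max
# generalization per class) and all names are illustrative assumptions.
from collections import defaultdict

class FeatureIntervalVoter:
    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        # intervals[f][c] = (lo, hi): range of class c's values on feature f
        self.intervals = [defaultdict(lambda: (float("inf"), float("-inf")))
                          for _ in range(n_features)]
        for row, label in zip(X, y):
            for f, v in enumerate(row):
                lo, hi = self.intervals[f][label]
                self.intervals[f][label] = (min(lo, v), max(hi, v))  # generalize
        return self

    def predict(self, x):
        votes = defaultdict(int)
        for f, v in enumerate(x):
            for label, (lo, hi) in self.intervals[f].items():
                if lo <= v <= hi:  # this feature's local prediction
                    votes[label] += 1
        return max(self.classes, key=lambda c: votes[c])  # majority vote
```

For example, `FeatureIntervalVoter().fit([[1, 5], [2, 6], [8, 1]], ["a", "a", "b"]).predict([1.5, 5.5])` returns `"a"`, since both feature intervals for class "a" cover the query value while neither interval for class "b" does.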
Item Open Access
An eager regression method based on best feature projections (Springer, Berlin, Heidelberg, 2001) Aydın, Tolga; Güvenir, H. Altay
This paper describes a machine learning method called Regression by Selecting Best Feature Projections (RSBFP). In the training phase, RSBFP projects the training data on each feature dimension and aims to find the predictive power of each feature by constructing simple linear regression lines: one per continuous feature, and one per category for each categorical feature, since the predictive power of a continuous feature is constant while it varies across the distinct values of a categorical feature. The simple linear regression lines are then sorted according to their predictive power. In the querying phase, the best linear regression line, and thus the best feature projection, is selected to make predictions. © Springer-Verlag Berlin Heidelberg 2001.

Item Open Access
Instance-based regression by partitioning feature projections (Springer, 2004) Uysal, İ.; Güvenir, H. A.
A new instance-based learning method is presented for regression problems with high-dimensional data. As an instance-based approach, the conventional method, KNN, is very popular for classification. Although KNN performs well on classification tasks, it does not perform as well on regression problems. We have developed a new instance-based method, called Regression by Partitioning Feature Projections (RPFP), which is designed to meet the need for a lazy method that achieves high levels of accuracy on regression problems. RPFP gives better performance than well-known eager approaches found in machine learning and statistics, such as MARS, rule-based regression, and regression tree induction systems. The most important property of RPFP is that it is a projection-based approach that can handle interactions. We show that it outperforms existing eager or lazy approaches on many domains when there are many missing values in the training data.

Item Open Access
Learning problem solving strategies using refinement and macro generation (Elsevier BV, 1990) Güvenir, H. A.; Ernst, G. W.
In this paper we propose a technique for learning efficient strategies for solving a certain class of problems. The method, RWM, makes use of two separate methods, namely refinement and macro generation. The former is a method for partitioning a given problem into a sequence of easier subproblems. The latter is for efficiently learning composite moves which are useful in solving the problem. These methods and a system that incorporates them are described in detail. The kind of strategies learned by RWM are based on the GPS problem solving method. Examples of strategies learned for different types of problems are given. RWM has learned good strategies for some problems which are difficult by human standards. © 1990.

Item Open Access
Maximizing benefit of classifications using feature intervals (Springer, Berlin, Heidelberg, 2003) İkizler, Nazlı; Güvenir, H. Altay
There is a great need for classification methods that can properly handle asymmetric cost and benefit constraints of classifications. In this study, we aim to emphasize the importance of classification benefits by means of a new classification algorithm, Benefit-Maximizing classifier with Feature Intervals (BMFI), that uses feature-projection-based knowledge representation. Empirical results show that BMFI has promising performance compared to recent cost-sensitive algorithms in terms of the benefit gained.
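The RSBFP abstract above describes ranking simple per-feature regression lines by their predictive power and answering queries with the best line. A rough sketch of that idea for continuous features follows; the use of training mean squared error as the measure of predictive power, and the function names, are assumptions for illustration (the per-category handling of categorical features described in the abstract is omitted here).

```python
# Rough sketch of ranking feature projections by the fit of a simple
# linear regression on each one, loosely following the RSBFP abstract.
# MSE as "predictive power" and all names are illustrative assumptions.
import numpy as np

def rank_feature_projections(X, y):
    """Fit y ~ a * X[:, f] + b on each feature projection and sort the
    lines by training error (used here as a proxy for predictive power)."""
    lines = []
    for f in range(X.shape[1]):
        a, b = np.polyfit(X[:, f], y, deg=1)       # simple linear regression
        mse = np.mean((a * X[:, f] + b - y) ** 2)  # lower = more predictive
        lines.append((mse, f, a, b))
    return sorted(lines)                           # best projection first

def predict_with_best(lines, query):
    """Querying phase: predict with the best feature projection's line."""
    _, f, a, b = lines[0]
    return a * query[f] + b
```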
Item Open Access
An overview of regression techniques for knowledge discovery (Cambridge University Press, 1999) Uysal, İ.; Güvenir, H. A.
Predicting or learning numeric features is called regression in the statistical literature, and it is the subject of research in both machine learning and statistics. This paper reviews the important techniques and algorithms for regression developed by both communities. Regression is important for many applications, since many real-life problems can be modeled as regression problems. The review includes Locally Weighted Regression (LWR), rule-based regression, Projection Pursuit Regression (PPR), instance-based regression, Multivariate Adaptive Regression Splines (MARS), and recursive partitioning regression methods that induce regression trees (CART, RETIS, and M5).

Item Open Access
Regression on feature projections (Elsevier, 2000) Guvenir, H. A.; Uysal, I.
This paper describes a machine learning method, called Regression on Feature Projections (RFP), for predicting a real-valued target feature given the values of multiple predictive features. In RFP, training is based on simply storing the projections of the training instances on each feature separately. Prediction of the target value for a query point is obtained through two averaging procedures executed sequentially. The first averaging process finds the individual predictions of the features by using the K-Nearest Neighbor (KNN) algorithm. The second averaging process combines the predictions of all features. During the first averaging step, each feature is associated with a weight in order to determine the prediction ability of the feature at the local query point. The weights, found for each local query point, are used in the second prediction step and give the method an adaptive, context-sensitive nature. We have compared RFP with KNN and rule-based regression algorithms. Results on real data sets show that RFP achieves better or comparable accuracy and is faster than both KNN and rule-based regression algorithms.
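The RFP abstract describes prediction as two sequential averaging steps: a KNN prediction on each feature projection, then a locally weighted combination across features. A compact sketch of that two-step scheme is below; the inverse-variance local weighting is an assumption chosen for this example, not the weighting used in the paper.

```python
# Sketch of two-step prediction on feature projections, in the spirit of
# the RFP abstract above: per-feature KNN averages, then a locally
# weighted combination. The weighting scheme is an illustrative assumption.
import numpy as np

def rfp_style_predict(X, y, query, k=3):
    num, den = 0.0, 0.0
    for f in range(X.shape[1]):
        # Step 1: K nearest neighbours of the query on this projection alone
        idx = np.argsort(np.abs(X[:, f] - query[f]))[:k]
        pred_f = y[idx].mean()             # this feature's own prediction
        w_f = 1.0 / (y[idx].var() + 1e-9)  # trust features whose neighbours
        num += w_f * pred_f                # agree on the target value
        den += w_f
    # Step 2: combine the per-feature predictions with the local weights
    return num / den
```

Called as, for instance, `rfp_style_predict(np.array(X), np.array(y), np.array(q))`, the function returns a single real-valued prediction for the query point `q`.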