A review of software packages for data mining

Date

2003

Authors

Haughton, D.
Deichmann, J.
Eshghi, A.
Sayek, S.
Teebagy, N.
Topi, H.

Source Title

The American Statistician

Print ISSN

0003-1305

Electronic ISSN

1537-2731

Publisher

American Statistical Association

Volume

57

Issue

4

Pages

290 - 309

Language

English

Abstract

We present to the statistical community an overview of five data mining packages with the intent of leaving the reader with a sense of the different capabilities, the ease or difficulty of use, and the user interface of each package. We are not attempting to perform a controlled comparison of the algorithms in each package to decide which has the strongest predictive power, but instead hope to give an idea of the approach to predictive modeling used in each of them. The packages are compared in the areas of descriptive statistics and graphics, predictive models, and association (market basket) analysis. As expected, the packages affiliated with the most popular statistical software packages (SAS and SPSS) provide the broadest range of features with remarkably similar modeling and interface approaches, whereas the other packages all have their special sets of features and specific target audiences whom we believe each of the packages will serve well. It is essential that an organization considering the purchase of a data mining package carefully evaluate the available options and choose the one that provides the best fit with its particular needs.
