Modeling interestingness of streaming association rules as a benefit maximizing classification problem
Author(s)
Advisor
Date
2009
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract
In a typical application of association rule learning from market basket data,
a set of transactions for a fixed period of time is used as input to rule learning
algorithms. For example, the well-known Apriori algorithm can be applied to
learn a set of association rules from such a transaction set. However, learning
association rules from a set of transactions is not a one-time process. For
example, a market manager may perform the association rule learning process
once every month over the set of transactions collected through the previous
month. For this reason, we will consider the problem where transaction sets
are input to the system as a stream of packages. The sets of transactions may
come in varying sizes and in varying periods. Once a set of transactions arrives,
the association rule learning algorithm is run on the last set of transactions,
resulting in a new set of association rules. Therefore, the set of association
rules learned will accumulate and increase in number over time, making the
mining of interesting ones out of this enlarging set of association rules impractical
for human experts. We refer to this sequence of rules as an “association rule
set stream” or “streaming association rules”. The main motivation behind
this research is to develop a technique that overcomes this interesting-rule selection
problem. A successful association rule mining system should select and
present only the interesting rules to the domain experts. However, the definition
of interestingness of association rules in a given domain usually differs from
one expert to another and also changes over time for a given expert. In this thesis, we
propose a post-processing method to learn a subjective model for the interestingness
concept description of the streaming association rules. The uniqueness
of the proposed method is its ability to formulate the interestingness issue of
association rules as a benefit-maximizing classification problem and obtain a
different interestingness model for each user. In this new classification scheme,
the determining features are selected objective interestingness factors related to
the interestingness of the association rules, including the rule’s content itself,
and the target feature is the interestingness label of those rules. The proposed
method works incrementally and employs user interactivity at a certain
level. It is evaluated on a real supermarket dataset. The results show that the
model can successfully select the interesting rules.
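
The streaming setting described in the abstract can be illustrated with a minimal sketch. It is not the thesis's method and it is not Apriori: it mines only single-item rules by brute force, and it uses support and confidence as two examples of objective interestingness factors. The function names (`mine_pair_rules`, `process_stream`), the thresholds, and the toy transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return {(antecedent, consequent): (support, confidence)}.

    Hypothetical helper: considers only rules of the form a -> b with
    one item on each side, counted by brute force (not Apriori).
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        item_counts.update(items)
        # Every ordered pair (a, b) is a candidate rule a -> b.
        pair_counts.update(permutations(sorted(items), 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n                  # fraction of transactions with a and b
        confidence = count / item_counts[a]  # fraction of a-transactions that also have b
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules


def process_stream(packages, **thresholds):
    """Mine each arriving package on its own and accumulate the rule sets."""
    rule_set_stream = []
    for package in packages:
        rule_set_stream.append(mine_pair_rules(package, **thresholds))
    return rule_set_stream


# Two toy packages of transactions, arriving one after the other.
packages = [
    [["bread", "milk"], ["bread", "milk", "eggs"], ["milk"], ["bread", "milk"]],
    [["bread", "eggs"], ["eggs"], ["bread", "eggs"]],
]
stream = process_stream(packages, min_support=0.5, min_confidence=0.6)
for index, rules in enumerate(stream):
    print(index, rules)
```

Each package yields its own rule set, so the list grows with every arrival. This accumulation is what motivates learning a per-user interestingness model rather than leaving the selection to manual inspection.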
Keywords
Interestingness learning
data mining
association rules
classification learning
incremental learning