Updating large itemsets with early pruning

Date

1999-07

Supervisor

Arkun, Erol

Publisher

Bilkent University

Language

English

Abstract

With the computerization of many business and government transactions, huge amounts of data have been stored in computers. Existing database systems do not give users the tools and functionality they need to capture all of the stored information easily. Automatic knowledge discovery techniques have therefore been developed to capture and use the voluminous information hidden in large databases. Discovery of association rules is an important class of data mining, which is the process of extracting interesting and frequent patterns from data. Association rules aim to capture the co-occurrences of items and have wide applicability in many areas. Discovering association rules depends on efficiently computing the large itemsets (sets of items that occur frequently in the database), which is a computationally expensive operation in large databases. Maintaining the large itemsets in large dynamic databases is therefore an important issue. In this thesis, we propose an efficient algorithm that updates the large itemsets by considering the set of previously discovered itemsets. The main idea is to prune an itemset as soon as it is known to be small in the updated database, and to keep the set of candidate large itemsets as small as possible. The proposed algorithm outperforms the existing update algorithms in terms of the number of scans over the databases and the number of candidate large itemsets generated and counted. Moreover, it can easily be applied to other data mining tasks that are based on the large itemset framework.
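
The update idea described in the abstract can be illustrated with a small Python sketch. This is not the algorithm proposed in the thesis, whose details the abstract does not give. It is a simplified, level-wise illustration built on one well-known property: an itemset that is small in the old database and small in the newly added transactions cannot be large in the updated database, so it can be pruned without rescanning the old database. All names (`count_support`, `next_candidates`, `update_large_itemsets`) and the toy data are hypothetical, and the sketch assumes that `old_large` holds every itemset that was large in the old database at the same minimum support, together with its support count.

```python
from itertools import combinations


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    counts = {c: 0 for c in candidates}
    for t in transactions:
        for c in candidates:
            if c <= t:
                counts[c] += 1
    return counts


def next_candidates(large_prev, k):
    """Build size-k candidates whose (k-1)-subsets are all large."""
    items = sorted({i for s in large_prev for i in s})
    result = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in large_prev
               for sub in combinations(combo, k - 1)):
            result.add(frozenset(combo))
    return result


def update_large_itemsets(old_db, old_large, increment, min_support):
    """Return {itemset: count} for itemsets large in old_db + increment.

    Assumption: old_large maps every itemset that was large in old_db
    (at the same min_support) to its support count in old_db.
    """
    threshold = min_support * (len(old_db) + len(increment))
    inc_threshold = min_support * len(increment)
    items = {i for t in old_db + increment for i in t}
    candidates = {frozenset([i]) for i in items}
    updated = {}
    k = 1
    while candidates:
        inc_counts = count_support(increment, candidates)
        level_large = {}
        needs_old_scan = set()
        for c in candidates:
            if c in old_large:
                # Old count is known: only the increment had to be scanned.
                total = old_large[c] + inc_counts[c]
                if total >= threshold:
                    level_large[c] = total
            elif inc_counts[c] >= inc_threshold:
                # Small before, large in the increment: must check old_db.
                needs_old_scan.add(c)
            # Otherwise: small in both parts, pruned without an old_db scan.
        if needs_old_scan:
            old_counts = count_support(old_db, needs_old_scan)
            for c in needs_old_scan:
                total = old_counts[c] + inc_counts[c]
                if total >= threshold:
                    level_large[c] = total
        updated.update(level_large)
        k += 1
        candidates = next_candidates(level_large, k)
    return updated


# Toy data (hypothetical): four old transactions, four new ones.
old_db = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
old_large = {
    frozenset("a"): 3, frozenset("b"): 3, frozenset("c"): 2,
    frozenset("ab"): 2, frozenset("ac"): 2,
}
increment = [{"b", "c"}, {"b", "c"}, {"b", "c", "d"}, {"c"}]
updated = update_large_itemsets(old_db, old_large, increment, 0.5)
```

With a minimum support of 0.5, item `d` is discarded after scanning only the increment, `a` stops being large, and `{b, c}` becomes large after a single check against the old database. A practical implementation would use a more efficient candidate generation and counting structure than the brute-force loops shown here.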
