In this study, we propose a simple and novel data structure using hyper-links, H-struct, and a new mining algorithm, H-mine, which takes advantage of this data structure and dynamically adjusts links in the mining process. A distinct feature of this method is that it has a very limited and precisely predictable main memory cost and runs very quickly in memory-based settings. Moreover, it can be scaled up to very large databases using database partitioning. When the data set becomes dense, (conditional) FP-trees can be constructed dynamically as part of the mining process. Our study shows that H-mine has an excellent performance for various kinds of data, outperforms currently available algorithms in different settings, and is highly scalable to mining large databases. This study also proposes a new data mining methodology, space-preserving mining, which may have a major impact on the future development of efficient and scalable data mining methods.
No Reference information available - sign in for access.
No Citation information available - sign in for access.
No Supplementary Data.
No Article Media
Frequent pattern mining;
Document Type: Research Article
School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
University of Illinois, Urbana, IL, USA
Hong Kong University of Science and Technology, Hong Kong
Osaka University, Osaka, Japan
Peking University, Beijing, China
Publication date: 01 June 2007