Significant Pattern Mining (Westfall-Young Light)


Felipe Llinares-López, Mahito Sugiyama, Laetitia Papaxanthos and Karsten Borgwardt

Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing

Summary

In this project, we developed an approach that improves statistical power in significant pattern mining by using permutation testing.

Significant pattern mining algorithms must deal with a vast search space, often containing billions or even trillions of candidate patterns. However, these patterns are often heavily interrelated, resulting in pronounced statistical redundancies. Existing approaches either (1) ignore these redundancies, leading to overly conservative significance thresholds and a loss of statistical power, or (2) are computationally demanding in both runtime and memory usage, limiting their applicability to small datasets.

Here, we propose a novel, fast and memory-efficient permutation testing algorithm for significant pattern mining that overcomes both limitations.
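To make the idea of permutation testing concrete, the following is a minimal sketch of the classical (naive) Westfall-Young min-p procedure on a tiny, explicitly enumerated set of patterns. It is not the algorithm from the paper, which is designed to avoid exactly this kind of brute-force recomputation; it only illustrates how permuting the class labels yields a significance threshold that accounts for redundancy between patterns. The function names, the toy data, and the choice of Fisher's exact test are assumptions made for illustration.

```python
import random
from math import comb


def fisher_pvalue(a, x, n1, n):
    """Two-sided Fisher exact p-value for a pattern that occurs in x of n
    samples, a of which belong to the n1 positive samples."""
    denom = comb(n, x)

    def prob(k):
        # Hypergeometric probability of seeing k positives among the x occurrences.
        return comb(n1, k) * comb(n - n1, x - k) / denom

    lo, hi = max(0, x - (n - n1)), min(x, n1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def pattern_pvalues(patterns, labels):
    """p-value of every pattern (a 0/1 occurrence vector) against 0/1 labels."""
    n, n1 = len(labels), sum(labels)
    return [
        fisher_pvalue(sum(o & y for o, y in zip(occ, labels)), sum(occ), n1, n)
        for occ in patterns
    ]


def westfall_young(patterns, labels, alpha=0.05, n_perm=1000, seed=0):
    """Naive min-p permutation procedure (hypothetical helper).

    Returns the estimated significance threshold and the indices of the
    patterns whose observed p-value falls strictly below it.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    min_pvalues = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association between patterns and labels
        min_pvalues.append(min(pattern_pvalues(patterns, shuffled)))
    min_pvalues.sort()
    # At most floor(alpha * n_perm) permutations have a minimum p-value strictly
    # below this value, so the empirical family-wise error rate is <= alpha.
    threshold = min_pvalues[int(alpha * n_perm)]
    observed = pattern_pvalues(patterns, labels)
    return threshold, [i for i, p in enumerate(observed) if p < threshold]


# Toy data (assumed): 20 samples, the first 10 are positive.
labels = [1] * 10 + [0] * 10
patterns = [
    list(labels),                   # perfectly associated with the labels
    list(labels),                   # exact duplicate: fully redundant with the first
    [i % 2 for i in range(20)],     # unrelated to the labels
]
threshold, significant = westfall_young(patterns, labels, n_perm=200, seed=1)
print(threshold, significant)
```

Because the threshold is derived from the minimum p-value over all patterns in each permutation, adding the duplicate pattern does not make the threshold any stricter, whereas a Bonferroni correction would divide the significance level by the number of patterns regardless of how redundant they are. The cost of this naive version is one full pass over all patterns per permutation, which is what becomes prohibitive at the scale described above.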

Code

A beta version of the code is available in our GitHub repository.

Publication

Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing

Felipe Llinares-López, Mahito Sugiyama, Laetitia Papaxanthos and Karsten Borgwardt
Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD2015), 2015, 725-734
Online  |  ETH Research Collection  |  Project page  |  GitHub

    

Contact for questions regarding usage or reporting bugs.
