Significant Pattern Mining (Westfall-Young Light)

Felipe Llinares-López, Mahito Sugiyama, Laetitia Papaxanthos and Karsten Borgwardt

Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing

Summary

In this project, we developed an approach that improves statistical power in significant pattern mining by using permutation testing.

Significant pattern mining algorithms must deal with a vast search space, often containing billions or even trillions of candidate patterns. However, these patterns are often heavily inter-related, resulting in pronounced statistical redundancies. Existing approaches either (1) ignore these redundancies, leading to over-conservative significance thresholds and a loss of statistical power, or (2) are computationally demanding in both runtime and memory usage, which limits their applicability to small datasets.

Here, we proposed a novel, fast and memory-efficient permutation testing algorithm for significant pattern mining that overcomes both limitations.
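
To make the idea concrete, the following is a minimal Python sketch of the classical (naive) Westfall-Young min-p permutation procedure. It is not the fast and memory-efficient algorithm proposed in the paper. It assumes that each pattern is given as a binary occurrence vector over the samples and is scored with a two-sided Fisher's exact test. These choices, all function names and the toy data are assumptions made for illustration only.

```python
import random
from math import comb


def fisher_p(a, x, n1, n):
    """Two-sided Fisher exact p-value for a 2x2 contingency table.

    a  = occurrences of the pattern in the positive class
    x  = occurrences of the pattern overall
    n1 = size of the positive class, n = total number of samples
    """
    denom = comb(n, x)

    def prob(k):
        # Hypergeometric probability of seeing k occurrences in the positive class
        return comb(n1, k) * comb(n - n1, x - k) / denom

    lo, hi = max(0, x - (n - n1)), min(x, n1)
    p_obs = prob(a)
    # Sum over all tables that are at most as likely as the observed one
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def pattern_pvalues(patterns, labels):
    """One p-value per pattern for the given binary class labels."""
    n, n1 = len(labels), sum(labels)
    pvals = []
    for occ in patterns:
        x = sum(occ)
        a = sum(o for o, y in zip(occ, labels) if y == 1)
        pvals.append(fisher_p(a, x, n1, n))
    return pvals


def westfall_young_threshold(patterns, labels, alpha=0.05, n_perm=200, seed=0):
    """Naive Westfall-Young threshold: reject patterns with p < returned value."""
    rng = random.Random(seed)
    perm = list(labels)
    min_p = []
    for _ in range(n_perm):
        rng.shuffle(perm)  # break any association between patterns and labels
        min_p.append(min(pattern_pvalues(patterns, perm)))
    min_p.sort()
    k = int(alpha * n_perm)
    # At most k permutations have a minimum p-value strictly below min_p[k],
    # so the estimated family-wise error rate of "p < min_p[k]" is at most alpha.
    return min_p[k]


# Toy data (hypothetical): 40 samples, 20 per class, and 50 identical patterns,
# i.e. a fully redundant search space.
labels = [1] * 20 + [0] * 20
base = [1] * 15 + [0] * 5 + [1] * 5 + [0] * 15
patterns = [list(base) for _ in range(50)]

alpha = 0.05
pvals = pattern_pvalues(patterns, labels)
bonferroni = alpha / len(patterns)
wy_threshold = westfall_young_threshold(patterns, labels, alpha=alpha)

print("Bonferroni threshold:", bonferroni)
print("Westfall-Young threshold:", wy_threshold)
print("Pattern p-value:", pvals[0])
```

In this toy example all 50 patterns are copies of one another, so the effective number of tests is one. A Bonferroni correction still divides alpha by 50, whereas the permutation-based threshold adapts to the redundancy. The cost of the naive procedure is that every pattern is re-tested for every permutation, which is the computational burden described above.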

Code

A beta version of the code is available in our GitHub repository here and here.

Reference

Felipe Llinares-López, Mahito Sugiyama, Laetitia Papaxanthos and Karsten Borgwardt
Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing,
Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD2015), 2015, 725-734.

@inproceedings{Llinares-Lopez-2015-KDD,
author = {Llinares-L\'{o}pez, Felipe and Sugiyama, Mahito and Papaxanthos, Laetitia and Borgwardt, Karsten},
title = {Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing},
booktitle = {Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
series = {KDD '15},
year = {2015},
isbn = {978-1-4503-3664-2},
location = {Sydney, NSW, Australia},
pages = {725--734},
numpages = {10},
url = {http://doi.acm.org/10.1145/2783258.2783363},
doi = {10.1145/2783258.2783363},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {multiple hypothesis testing, p-value, significant pattern mining, westfall-young permutation},
}

Further information and the code can be found on the project page.

Contact for questions regarding usage or reporting bugs.

 
 