Rapid Outlier Detection


Fast computation of distance-based outlierness scores via sampling

Mahito Sugiyama, Karsten Borgwardt

Rapid Distance-Based Outlier Detection via Sampling


This is an efficient algorithm for outlier detection. It performs sampling only once and measures the outlierness of each data point by the distance from that point to its nearest neighbor in the sample set. The algorithm has the following advantages:

  • Scalable; the time complexity is linear in the number of data points,
  • Effective; it is empirically shown to be the most effective on average among existing distance-based outlier detection methods, and
  • Easy to use; you only need to input the number of samples, and a small sample size (the default value is 20) is shown to be a good choice.
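
The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the C or R code distributed on this page: the function name `sampling_outlier_scores`, its parameters, the use of Euclidean distance, and sampling without replacement are all choices made for this sketch. Only the default sample size of 20 comes from the description above.

```python
import math
import random


def sampling_outlier_scores(points, sample_size=20, seed=None):
    """Return one outlierness score per point.

    Each score is the Euclidean distance from the point to its nearest
    neighbor in a random sample that is drawn once from the data.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Draw the sample a single time (assumption: without replacement),
    # capped at the number of available points.
    sample = rng.sample(points, min(sample_size, len(points)))
    # One pass over the data, `len(sample)` distance evaluations per point.
    return [min(math.dist(p, s) for s in sample) for p in points]


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (10.0, 10.0)]
    scores = sampling_outlier_scores(data, sample_size=2, seed=0)
    for point, score in zip(data, scores):
        print(point, round(score, 3))
```

With a fixed sample size, the work per data point is constant, which matches the linear time complexity listed above. Note that in this sketch a point that happens to be drawn into the sample is its own nearest neighbor and receives a score of 0; whether the distributed implementations treat sampled points differently is not stated here, so consult the paper or the code for the exact definition.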


C implementation: code.zip (ZIP, 421 KB)

R package: spoutlier.zip (ZIP, 6 KB)

Also available on GitHub

Further information and reference

Please see the following paper for detailed information, and cite it in your published research.


Mahito Sugiyama and Karsten Borgwardt
Rapid Distance-Based Outlier Detection via Sampling,
Advances in Neural Information Processing Systems 26 (NIPS 2013), 467-475. (Online)

Further information and the code can be found on the project page.

title = {Rapid {D}istance-{B}ased {O}utlier {D}etection via {S}ampling},
author = {Sugiyama, Mahito and Borgwardt, Karsten},
booktitle = {Advances in Neural Information Processing Systems 26},
editor = {C. J. C. Burges and L. Bottou and M. Welling and Z. Ghahramani and K. Q. Weinberger},
pages = {467--475},
year = {2013},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/5127-rapid-distance-based-outlier-detection-via-sampling.pdf}

Contact: Mahito Sugiyama

Page URL: https://www.bsse.ethz.ch/mlcb/research/machine-learning/rapid-outlier-detection-via-sampling.html