OBE: Outlier by Example

Cui Zhu, Hiroyuki Kitagawa, Spiros Papadimitriou & Christos Faloutsos

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3056)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Outlier detection in large datasets is an important problem. Several recent approaches employ very reasonable definitions of an outlier. However, a fundamental issue is that the notion of which objects are outliers typically varies between users or even between datasets. In this paper, we present a novel solution to this problem by bringing users into the loop. Our OBE (Outlier By Example) system is, to the best of our knowledge, the first that allows users to give some examples of what they consider to be outliers. It can then directly incorporate a small number of such examples to discover the hidden concept and spot further objects that exhibit the same “outlier-ness” as the examples. We describe the key design decisions and algorithms in building such a system and demonstrate on both real and synthetic datasets that OBE can indeed discover outliers that match the users’ intentions.

Author information

Authors and Affiliations

Graduate School of Systems and Information Engineering, University of Tsukuba
Cui Zhu
Institute of Information Sciences and Electronics, University of Tsukuba
Hiroyuki Kitagawa
School of Computer Science, Carnegie Mellon University
Spiros Papadimitriou & Christos Faloutsos

Editor information

Editors and Affiliations

School of Engineering and Information Technology, Deakin University, VIC 3125, Australia
Honghua Dai
University of Illinois at Urbana-Champaign, 61801, Urbana, IL, USA
Ramakrishnan Srikant
Faculty of Engineering and Information Technology, Centre for Quantum Computation and Intelligent Systems, and Australian ACS National Committee for Artificial Intelligence, University of Technology, Sydney, Australia
Chengqi Zhang

Cite this paper

Zhu, C., Kitagawa, H., Papadimitriou, S., Faloutsos, C. (2004). OBE: Outlier by Example. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_29

DOI: https://doi.org/10.1007/978-3-540-24775-3_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive
