Image by Author

Supervised vs Unsupervised Methods for Anomaly Detection

and figuring out what is right for your data situation

Chetana Didugu
5 min read · Apr 4, 2023


Ever since the Data Science fever caught on, the most prolific application area, apart from loan credit scoring, has been Anomaly Detection.

What is Anomaly Detection

For those who are new to Data Science and to Anomaly Detection, Anomaly Detection is the process of identifying anything that is out of the ordinary. Since we are talking about anomalies in the Data Science domain, we stick to the properties of data here. In plain English, the problem statement can be framed as: How do I detect a change in my data?

The most immediate application area for anomaly detection is Fraud Detection. Fraudulent activity generally looks different from legitimate activity. For example, suppose a person’s account on an e-commerce website gets hacked. The hacker will most likely have a different purchase pattern (in terms of order value, order frequency, payment method, shipping address, or maybe even cookie information) than the legitimate owner of the account. How would an algorithm detect this? Now think about how it would do so for a huge e-commerce company with millions or billions of users and just as many transactions taking place each day.
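To make the idea of a "different purchase pattern" concrete, here is a minimal sketch in Python. It is not a method taken from this article: it assumes a single feature (order value), a simple z-score rule, and an arbitrary threshold of 3 standard deviations. The function name `is_anomalous` and the sample order values are made up for illustration.

```python
from statistics import mean, stdev

def is_anomalous(history, new_value, threshold=3.0):
    """Flag new_value if it lies more than `threshold` standard
    deviations away from the mean of the account's past values.

    Assumption: `history` holds past order values for one account.
    The threshold of 3.0 is an illustrative choice, not a recommendation.
    """
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past values are identical, so any different value stands out.
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold

# Hypothetical order values (in some currency) for one account.
past_orders = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5]

print(is_anomalous(past_orders, 44.0))   # False: close to the usual order value
print(is_anomalous(past_orders, 900.0))  # True: far outside the usual range
```

A rule like this looks at one number per account and needs no labels. A real system would likely combine several of the signals mentioned above (order frequency, payment method, shipping address) and would have to run over millions of accounts, which is where the choice of method starts to matter.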


Written by Chetana Didugu

Data Scientist, Product Manager, Polyglot and a Tabibito | Ex-Zalando, Ex-Amazon https://www.linkedin.com/in/kavitha-chetana-didugu/ | https://github.com/kavit
