Anomaly Detection System
Pharmaceutical Data Analytics Platform
Developed an account-level anomaly detection system for pharmaceutical data, identifying unusual patterns across multiple datasets to support data quality and business insights.

Overview
Designed and implemented an anomaly detection system to identify spikes, drops, and gaps in pharmaceutical sales and distribution data across accounts and regions. The system aimed to improve data quality validation and uncover meaningful business insights.
Problem
Existing anomaly checks were limited, not customizable at the account level, and lacked explainability. Stakeholders needed a more robust system that could detect nuanced patterns and clearly communicate their real-world impact.
Process
Defined key business questions and anomaly types, then built baseline generation methods using historical data. Implemented rule-based detection for spikes, drops, and gaps with adjustable tolerance levels. Developed account-level behavior profiling and introduced subnational aggregation to identify top contributing accounts. Iterated weekly with data stewards to refine detection logic.
Outcome
Expanded anomaly detection coverage and improved visibility into data inconsistencies across multiple sources. Enabled stakeholders to better understand anomalies through more explainable outputs and contributed to identifying additional anomalies beyond existing systems.
Lessons Learned
Learned how to balance statistical methods with business context, the importance of explainability in analytics, and how iterative feedback loops with stakeholders significantly improve system effectiveness.
Tools Used
- Python
- SQL
- Databricks
- PySpark
- Healthcare Data
- Data Modeling