Research & Papers

[P] Open Source Fraud Detection System handling 0.17% class imbalance with Random Forest

Production-ready Python app handles extreme class imbalance using Random Forest and class weighting for real-world fraud detection.

Deep Dive

Developer Arpahls built CFD, an open-source credit card fraud detection system. It is a Python application built on the PaySim dataset and designed for production use. Only about 0.17% of transactions are labeled fraudulent, and the system handles that imbalance with a Random Forest using class weighting. The modular codebase reports ~0.99 AUC, includes full integration tests, and decouples ingestion, feature engineering, and evaluation. It is intended as a template for structuring real-world machine learning projects.
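To make the class weighting idea concrete, here is a minimal sketch of inverse-frequency ("balanced") weights at a 0.17% fraud rate. The labels are hypothetical toy data, the helper name `balanced_class_weights` is illustrative, and the summary does not say which weighting scheme CFD actually uses, so treat this as one common approach rather than the project's implementation.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive proportionally larger weights, so each class
    contributes the same total weight during training.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels (not PaySim data): 10,000 transactions,
# 17 of them fraudulent, i.e. a 0.17% positive rate.
labels = [1] * 17 + [0] * 9_983

weights = balanced_class_weights(labels)

# How much more a single fraud example counts than a single legitimate one.
fraud_to_legit_ratio = weights[1] / weights[0]

print(f"legit weight: {weights[0]:.4f}")
print(f"fraud weight: {weights[1]:.2f}")
print(f"ratio:        {fraud_to_legit_ratio:.1f}")
```

This is the same formula scikit-learn applies when an estimator such as `RandomForestClassifier` is given `class_weight="balanced"`. At this imbalance each fraud example counts several hundred times more than a legitimate one, which is why threshold-independent metrics like AUC are usually reported alongside it.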

Why It Matters

Offers a modular, integration-tested blueprint for structuring fraud detection models intended for production environments.