Ayodeji Ogunlami mixes approaches:
Developing this hybrid system requires a set of rules as well as a machine learning model. I will use a vehicle insurance dataset from Kaggle for this demonstration.
The dataset can be downloaded from this link: https://www.kaggle.com/datasets/shivamb/vehicle-claim-fraud-detection
The ML model will be a random forest classifier, built on Azure Databricks using PySpark.
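As a rough illustration of how the two halves of such a hybrid might fit together, here is a minimal plain-Python sketch. It is not the code from the linked post: the `Claim` fields, the rule thresholds, and the `hybrid_decision` helper are all hypothetical, and `model_score` stands in for whatever fraud probability the random forest would produce.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Claim:
    # Hypothetical fields for illustration; not the Kaggle dataset's column names.
    claim_amount: float
    days_since_policy_start: int
    past_claims: int


# Each rule is a (name, predicate) pair. Thresholds are invented for this sketch.
RULES: List[Tuple[str, Callable[[Claim], bool]]] = [
    ("claim_soon_after_policy_start", lambda c: c.days_since_policy_start < 15),
    ("many_past_claims", lambda c: c.past_claims > 4),
]


def triggered_rules(claim: Claim) -> List[str]:
    """Return the names of all rules that fire for this claim."""
    return [name for name, predicate in RULES if predicate(claim)]


def hybrid_decision(claim: Claim, model_score: float, threshold: float = 0.5) -> dict:
    """Flag a claim if any rule fires OR the model score reaches the threshold.

    model_score is assumed to be a fraud probability in [0, 1] from a
    separately trained classifier.
    """
    hits = triggered_rules(claim)
    flagged = bool(hits) or model_score >= threshold
    return {"flagged": flagged, "rule_hits": hits, "model_score": model_score}


if __name__ == "__main__":
    early_claim = Claim(claim_amount=1000.0, days_since_policy_start=5, past_claims=0)
    # The model score is low, but a rule still catches the claim.
    print(hybrid_decision(early_claim, model_score=0.1))
```

Combining the two signals with a simple OR is only one possible design; rule hits could instead be fed to the model as extra features or used to route claims to manual review.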
This seems to be the most sensible approach, especially given how rare actual fraud incidents are and what that imbalance does to classification algorithms.
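To make the imbalance point concrete, here is a small sketch using made-up label counts (not figures from the Kaggle dataset). It shows that a classifier which never flags fraud can still score high on accuracy while catching nothing, and it computes inverse-frequency class weights, one common way to counter that.

```python
# Hypothetical counts, chosen only to illustrate the effect of class imbalance.
n_legit, n_fraud = 940, 60
labels = [0] * n_legit + [1] * n_fraud

# A degenerate "model" that predicts "not fraud" for every claim.
predictions = [0] * len(labels)

# Accuracy: share of predictions that match the label.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall on the fraud class: share of actual fraud cases that were flagged.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / n_fraud

# Inverse-frequency class weights: total / (number_of_classes * class_count).
class_counts = {0: n_legit, 1: n_fraud}
weights = {
    cls: len(labels) / (len(class_counts) * count)
    for cls, count in class_counts.items()
}

if __name__ == "__main__":
    print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
    print(weights)
```

With these counts the do-nothing model reaches 94% accuracy with zero recall on fraud, which is why metrics such as recall and precision, and safeguards like explicit rules, matter more here than raw accuracy.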