Anomaly Detection With Kafka Streams

Ajmal Karuthakantakath shows us an application which performs fairly simple anomaly detection using Kafka Streams:

The problem is in the banking loan payment domain, where customers have taken a loan and they need to make monthly payments to repay the loan amount.

Assume there are millions of customers in the system and all these customers need to make monthly payments to their account. Each customer may have a different monthly due date for their loan.

Each customer payment will appear as a PaymentScheduleEvent event. Customers can make more than one PaymentScheduleEvent per month. Each monthly due date for a customer will appear as a PaymentDueEvent.

An arbitrarily chosen anomaly condition for this example is that if the amount due is more than $150 for any customer at any point in time, this generates an anomaly.
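To make the rule concrete, here is a minimal plain-Java sketch of the bookkeeping it implies. It does not use the Kafka Streams API and is not the author's implementation. The class name `AmountDueTracker`, its method names, and the use of cents are all hypothetical, and it assumes that a PaymentDueEvent adds to a customer's amount due while a PaymentScheduleEvent subtracts from it.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the anomaly rule; no Kafka Streams API is used here.
class AmountDueTracker {
    // Threshold from the example: more than $150 outstanding is an anomaly.
    static final long THRESHOLD_CENTS = 15_000L;

    // Running amount due per customer, in cents (hypothetical state layout).
    private final Map<String, Long> amountDueCents = new HashMap<>();

    // Assumption: a PaymentDueEvent increases what the customer owes.
    boolean onPaymentDue(String customerId, long dueCents) {
        return apply(customerId, dueCents);
    }

    // Assumption: a PaymentScheduleEvent (a payment) decreases what the customer owes.
    boolean onPaymentSchedule(String customerId, long paidCents) {
        return apply(customerId, -paidCents);
    }

    // Updates the running balance and reports whether it is now anomalous.
    private boolean apply(String customerId, long deltaCents) {
        long updated = amountDueCents.merge(customerId, deltaCents, Long::sum);
        return updated > THRESHOLD_CENTS;
    }

    // Current amount due for a customer, or 0 if no events have been seen.
    long amountDue(String customerId) {
        return amountDueCents.getOrDefault(customerId, 0L);
    }
}
```

Each method returns `true` when the customer's amount due exceeds $150 after the event is applied. In a streaming application this per-customer state would likely live in a keyed aggregation rather than an in-memory map, but the comparison against the threshold is the same.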

Click through for instructions, the application, and further resources. If you want to learn Kafka Streams, this should keep you busy for a little while.


