K-Means Clustering In R

Raghavan Madabusi provides an example of how k-means clustering can help segment data points in an understandable manner:

A Call Detail Record (CDR) is the information captured by telecom companies during a customer’s call, SMS, and Internet activity. This information provides greater insight into the customer’s needs when used with customer demographics. Most telecom companies use CDR information for fraud detection by clustering user profiles, for reducing customer churn based on usage activity, and for targeting profitable customers using RFM analysis.

In this blog, we will discuss clustering customer activity over 24 hours using the unsupervised K-means clustering algorithm. It is used to understand customer segments with respect to their usage by hour.

For example, a customer segment with high activity may generate more revenue. A customer segment with high activity in the night hours might be fraudulent.

This article won’t really explain k-means clustering in any detail, but it does give you an example of applying the technique in R.
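To give a rough sense of what clustering customers by hourly usage looks like, here is a minimal sketch in base R. It is not the code from the linked article: the data is synthetic, the helper names (`profile`, `simulate`) and the three usage patterns are made up for illustration, and converting counts to per-hour shares is my own assumption about a reasonable preprocessing step.

```r
# Hypothetical example: synthetic hourly activity, NOT the article's CDR data.
set.seed(42)
hours <- 0:23
n_per_group <- 50

# Expected extra activity for each hour, peaking at `peak` (illustrative helper).
profile <- function(peak, scale) scale * exp(-((hours - peak)^2) / (2 * 3^2))

# Simulate n_per_group customers: one row per customer, one column per hour.
simulate <- function(peak, scale) {
  t(replicate(n_per_group, rpois(24, lambda = 1 + profile(peak, scale))))
}

activity <- rbind(
  simulate(peak = 13, scale = 20),  # mostly daytime usage
  simulate(peak = 20, scale = 20),  # mostly evening usage
  simulate(peak = 2,  scale = 20)   # mostly night-time usage
)
colnames(activity) <- paste0("h", hours)

# Assumption: cluster on the share of activity per hour, so that segments
# reflect the shape of the day rather than total volume.
shares <- activity / rowSums(activity)

# k-means with several random starts; centers = 3 matches the simulated groups.
fit <- kmeans(shares, centers = 3, nstart = 25)

# Cluster sizes, and the hour at which each cluster centroid peaks.
print(table(fit$cluster))
peak_hours <- hours[apply(fit$centers, 1, which.max)]
print(sort(peak_hours))
```

On real CDR data you would not know the number of segments in advance, so `centers` would need to be chosen by inspection (for example, by comparing `fit$tot.withinss` across several values). A cluster whose centroid peaks in the night hours is the kind of segment the excerpt flags for a closer look.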
