Hierarchical Clustering

Chaitanya Sagar explains hierarchical clustering with examples in R:

I hope you now have a better understanding of clustering algorithms than when you started. We discussed divisive and agglomerative clustering techniques and four linkage methods, namely single, complete, average, and Ward’s method. Next, we implemented the discussed techniques in R using a numeric dataset. Note that we didn’t have any categorical variables in the dataset we used. You need to treat categorical variables in order to incorporate them into a clustering algorithm. Lastly, we discussed a couple of plots to visualise the clusters/groups formed. Note here that we have assumed the value of ‘k’ (the number of clusters) is known. However, this is not always the case. There are a number of heuristics and rules of thumb for picking the number of clusters. A given heuristic will work better on some datasets than others. It’s best to take advantage of domain knowledge to help set the number of clusters, if that’s possible. Otherwise, try a variety of heuristics, and perhaps a few different values of k.

There’s a lot to pick out of this post, but you can walk through it step by step. H/T R-Bloggers
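For a quick feel of what the excerpt describes, here is a minimal sketch using only base R. It is not the code from the linked post: the built-in iris measurements stand in for "a numeric dataset", k = 3 is assumed known, and the "largest jump in merge heights" rule at the end is just one common rule of thumb for picking k, chosen here for illustration.

```r
# Numeric-only stand-in data (an assumption, not the post's dataset):
# the four measurement columns of the built-in iris data, standardised.
x <- scale(iris[, 1:4])
d <- dist(x)  # Euclidean distance matrix

# The four linkage methods named in the excerpt.
# "ward.D2" is hclust's Ward criterion for unsquared distances.
linkages <- c("single", "complete", "average", "ward.D2")
trees <- lapply(linkages, function(m) hclust(d, method = m))
names(trees) <- linkages

# Cut every tree into k groups; k is assumed known here.
k <- 3
labels <- sapply(trees, cutree, k = k)  # one column of labels per linkage

# Cluster sizes per linkage, to compare how the methods split the data.
sizes <- apply(labels, 2, function(col) tabulate(col, nbins = k))
print(sizes)

# One possible heuristic when k is unknown: look for the largest drop
# between successive merge heights, starting from the top of the tree.
heights <- rev(trees[["ward.D2"]]$height)  # tallest merge first
gaps <- -diff(heights[1:10])               # drops between the top 10 merges
k_gap <- which.max(gaps) + 1               # cutting in the i-th gap gives i + 1 clusters
print(k_gap)
```

Comparing the columns of `sizes` shows how much the linkage choice matters on the same data, and `plot(trees[["ward.D2"]])` draws the dendrogram if you want to check the suggested cut by eye. Treat `k_gap` as a starting point to compare against domain knowledge and other values of k, not as an answer.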


December 2017