K-Means Clustering With Python

Kevin Feasel

2016-07-04

Python

David Crook discusses k-means clustering and how to implement it using Python:

K-Means takes in an unlabeled data set and a whole real number, k.  K is the number of centroids, or clusters you wish to find.  If you do not know how many clusters there should be, it is possible to do some pre-processing to find that more automatically, however that is out of the scope of this article.  Once you have a data set and defined the size of k, K-Means begins its iterative process.  It starts by selecting centroids by moving them to the average of the data associated with them.  It then reshuffles all of the data into new groups based on the proximity to each centroid.

This is a long, detailed post, and it is worth reading in its entirety.
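To make the loop in the quoted passage concrete, here is a minimal sketch in plain Python. It is not David Crook's implementation. The function names (`k_means`, `squared_distance`), the random choice of starting centroids, the squared Euclidean distance, and the stopping rule are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into k groups.

    Returns (centroids, assignments), where assignments[i] is the index
    of the centroid that points[i] belongs to.
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    assignments = None

    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Assumption: stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave a centroid in place if its cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )

    return centroids, assignments


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centroids, assignments = k_means(data, k=2)
    print(centroids)
    print(assignments)
```

The two steps inside the loop correspond to the "reshuffle by proximity" and "move to the average" phases in the quote. In practice a library implementation such as scikit-learn's `KMeans` is likely a better choice, since it handles initialization and empty clusters more carefully than this sketch does.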
