K-Means Clustering With Python

Kevin Feasel

2016-07-04

Python

David Crook discusses k-means clustering and how to implement it using Python:

K-Means takes in an unlabeled data set and a whole real number, k.  K is the number of centroids, or clusters you wish to find.  If you do not know how many clusters there should be, it is possible to do some pre-processing to find that more automatically, however that is out of the scope of this article.  Once you have a data set and defined the size of k, K-Means begins its iterative process.  It starts by selecting centroids by moving them to the average of the data associated with them.  It then reshuffles all of the data into new groups based on the proximity to each centroid.
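
To make the loop in that excerpt concrete, here is a minimal pure-Python sketch of the two alternating steps (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is not David's implementation. The function name `k_means`, its parameters, and the choice to seed the centroids with k randomly sampled data points are all assumptions made for illustration; the excerpt does not say how the initial centroids are chosen.

```python
import random
from math import dist  # Euclidean distance, available in Python 3.8+


def k_means(points, k, max_iterations=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k points sampled from the data.
    centroids = rng.sample(points, k)
    assignments = None

    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: dist(point, centroids[c]))
            for point in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the grouping is stable
        assignments = new_assignments

        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )

    return centroids, assignments


# Two visibly separated groups of 2-D points.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(data, k=2)
```

The loop stops as soon as an assignment pass changes nothing, with `max_iterations` as a safety cap. On real data the result can depend on the starting centroids, so it is common to run the algorithm several times with different seeds and keep the tightest clustering.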

This is a long, detailed post, and worth reading in its entirety.

