Bayesian Nonparametric Models

Luba Belokon asked Vadim Smolyakov to explain Bayesian Nonparametric models and here’s the result:

Bayesian Nonparametrics are a class of models for which the number of parameters grows with data. A simple example is non-parametric K-means clustering [1]. Instead of fixing the number of clusters K, we let data determine the best number of clusters. By letting the number of model parameters (cluster means and covariances) grow with data, we are better able to describe the data as well as generate new data given our model.

Of course, to avoid over-fitting, we penalize the number of clusters K via a regularization parameter which controls the rate at which new clusters are created. 

This is an interesting discussion of the Dirichlet process, particularly as applied to K-mean clustering.  It helps you figure out your best choice for K, no small task.

Related Posts

Linear Programming in Python

Francisco Alvarez shows us an example of linear programming in Python: The first two constraints, x1 ≥ 0 and x2 ≥ 0 are called nonnegativity constraints. The other constraints are then called the main constraints. The function to be maximized (or minimized) is called the objective function. Here, the objective function is x1 + x2. Two classes of […]

Read More

Exploratory Data Analysis with inspectdf

Laura Ellis continues a dive into Exploratory Data Analysis, this time using the inspectdf package: I like this package because it’s got a lot of functionality and it’s incredibly straightforward to use. In short, it allows you to understand and visualize column types, sizes, values, value imbalance & distributions as well as correlations. Better yet, […]

Read More


October 2017
« Sep Nov »