David Smith notes that there are several data sets that Microsoft Research has made available:
Other data sets of note include:
- A collection of 38M tweets related to the 2012 US election
- 3-D capture data from individuals performing a variety of hand gestures
- Infer.NET, a framework for running Bayesian inference in graphical models
- Images for 1 million celebrities, and associated tags
- MS MARCO, a new large-scale dataset for reading comprehension and question answering
Click through for more information, and then check out the data sets.