Microsoft Research Open Data Sets

David Smith notes that Microsoft Research has made several data sets available:

Other data sets of note include:

  • A collection of 38M tweets related to the 2012 US election

  • 3-D capture data from individuals performing a variety of hand gestures

  • Infer.NET, a framework for running Bayesian inference in graphical models

  • Images for 1 million celebrities, and associated tags

  • MS MARCO, a new large-scale dataset for reading comprehension and question answering

Click through for more information, and then check out the data sets.

