Kaggle-Maintained Data

Kevin Feasel

2018-11-09

Data

Noah Daniels announces Maintained by Kaggle data sets:

The “Maintained by Kaggle” badge means that Kaggle is now and will continue to actively maintain that dataset. This includes regular updates to descriptions and metadata, quicker response rates in discussion, and accurate current data from the source. Our goal is to create seamless workflows that allow everyone to do data science on Kaggle and be confident in the data they work with.

They have several data sets available from different open data projects for several cities, as well as NOAA and the World Bank.  If you’re looking for data sets to play with, this is a good option.

Related Posts

Building Test Data Following A Normal Distribution In T-SQL

I (finally) have a technical blog post: In order to show you the solution, I want to build up a reasonable sized sample.  Any solution looks great when reading five records, but let’s kick that up a notch.  Or, more specifically, a million notches:  I’m going to use a CTE tally table and load 5 […]

Read More

t-closeness And Data Anonymity

Kevin Feasel

2018-11-02

Data

John Cook shares some thoughts about k-anonymity and t-closeness: The idea of k-anonymity is that every database record appears at least k times. If you have a lot of records and few fields, your value of k could be high. But as you get more fields, it becomes more likely that a combination of fields is unique. If k = 1, then k-anonymity offers […]

Read More

Categories

November 2018
MTWTFSS
« Oct Dec »
 1234
567891011
12131415161718
19202122232425
2627282930