Data Curation

Christina Prevalsky makes the case for data curation:

The gaining popularity of self-service analytical tools such as Tableau increases the necessity of having curated data in your database. These tools aim to allow the end users to intuitively query data “at the speed of thought” from the data warehouse and visualize the results quickly. That type of capability allows users to go through several different iterations of the data to really explore the data and generate unique insights. These tools do not work well when the underlying database tables have not been curated properly.

This is a difficult and lengthy process, but it’s vital; data minus context is a lot less relevant than you’d hope.

Related Posts

Testing Spatial Equilibrium Concepts With tidycensus

Ignacio Sarmiento Barbieri walks us through the concept of spatial equilibrium and tests using data from the tidycensus package: Let’s take the model to the data and reproduce figures 2.1. and 2.2 of “Cities, Agglomeration, and Spatial Equilibrium”. The focus are two cities, Chicago and Boston. These cities are chosen because both differ in how easy […]

Read More

Interacting With SQL Server From Pandas

Tomaz Kastrun shows how to use pyodbc to interact with a SQL Server database from Pandas: In the SQL Server Management Studio (SSMS), the ease of using external procedure sp_execute_external_script has been (and still will be) discussed many times. But the reason for this short blog post is the fact that, changing Python environments using Conda package/module management within Microsoft […]

Read More

Categories

October 2016
MTWTFSS
« Sep Nov »
 12
3456789
10111213141516
17181920212223
24252627282930
31