Graphing Row Counts With R

Kevin Feasel

2016-11-17

R

I look at one use of R for DBAs:

I have a client data warehouse which holds daily rollups of revenue and cost for customers.  We’ve had some issues with the warehouse lately where data was not getting loaded due to system errors and timeouts, and our services team gave me a list of some customers who had gaps in their data due to persistent processing failures.  I figured out the root cause behind this (which will show up as tomorrow’s post), but I wanted to make sure that we filled in all of the gaps.

My obvious solution is to write a T-SQL query, getting some basic information by day for each customer.  I could scan through that result set, but the problem is that people aren’t great at reading tables of numbers; they do much better looking at pictures.  This is where R comes into play.

Click through for the code and a walkthrough of what each line is doing.

Related Posts

From Excel to R: Three Examples

Abdul Majed Raja has a few examples of things which are easy to do in Excel and how you can do them in R: Create a difference variable between the current value and the next valueThis is also known as lead and lag – especially in a time series dataset this varaible becomes very important in feature engineering. In […]

Read More

Calculating AUC in R

Andrew Treadway shows how you can calculate Area Under the Curve in R: AUC is an important metric in machine learning for classification. It is often used as a measure of a model’s performance. In effect, AUC is a measure between 0 and 1 of a model’s performance that rank-orders predictions from a model. For […]

Read More

Categories

November 2016
MTWTFSS
« Oct Dec »
 123456
78910111213
14151617181920
21222324252627
282930