Bike Rental Demand Estimation

Kevin Feasel



The Revolution Analytics blog has a Microsoft-driven article on estimating bike rental demand with Microsoft R Server:

In addition to the original features in the raw data, we add number of bikes rented in each of the previous 12 hours as features to provide better predictive power. We create acomputeLagFeatures() helper function to compute the 12 lag features and use it as the transformation function in rxDataStep().

Note that rxDataStep() processes data chunk by chunk and lag feature computation requires data from previous rows. In computLagFeatures(), we use the internal function .rxSet() to save the last n rows of a chunk to a variable lagData. When processing the next chunk, we use another internal function .rxGet() to retrieve the values stored in lagData and compute the lag features.

This is a great article for anybody wanting to dig into analytics, because they show their work.

Related Posts

Defining Tidy Data

John Mount shares thoughts about the concept of tidy data: A question is: is such a data set “tidy”? The paper itself claims the above definitions are “Codd’s 3rd normal form.” So, no the above table is not “tidy” under that paper’s definition. The the winner’s date of birth is a fact about the winner […]

Read More

Visualizing Earthquake Data

Giorgio Garziano continues a series on analyzing earthquake data: This is the third part of our post series about the exploratory analysis of a publicly available dataset reporting earthquakes and similar events within a specific 30 days time span. In this post, we are going to show static, interactive and animated earthquakes maps of different flavors by […]

Read More


May 2016
« Apr Jun »