Bike Rental Demand Estimation

Kevin Feasel



The Revolution Analytics blog has a Microsoft-driven article on estimating bike rental demand with Microsoft R Server:

In addition to the original features in the raw data, we add number of bikes rented in each of the previous 12 hours as features to provide better predictive power. We create a computeLagFeatures() helper function to compute the 12 lag features and use it as the transformation function in rxDataStep().

Note that rxDataStep() processes data chunk by chunk and lag feature computation requires data from previous rows. In computeLagFeatures(), we use the internal function .rxSet() to save the last n rows of a chunk to a variable lagData. When processing the next chunk, we use another internal function .rxGet() to retrieve the values stored in lagData and compute the lag features.
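To make the chunking problem concrete, here is a minimal sketch of the same idea in plain Python. It is not the article's R code and does not use RevoScaleR. The function name lag_features_by_chunk and the variable carry are hypothetical, with carry standing in for the lagData variable that the article saves with .rxSet() and reads back with .rxGet().

```python
from typing import Iterable, Iterator, List, Optional


def lag_features_by_chunk(
    chunks: Iterable[List[float]], n_lags: int = 12
) -> Iterator[List[List[Optional[float]]]]:
    """Yield, for each chunk, one list of lag values per row.

    Each row's list holds the value 1 step back, 2 steps back, ... n_lags
    steps back. Rows near the very start of the data get None for lags
    that do not exist yet.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")

    # State carried from one chunk to the next (hypothetical stand-in
    # for the article's lagData variable).
    carry: List[float] = []

    for chunk in chunks:
        # Prepend the saved tail so rows at the top of this chunk can
        # look back into the previous chunk.
        history = carry + list(chunk)
        offset = len(carry)

        rows = []
        for i in range(len(chunk)):
            pos = offset + i
            rows.append(
                [history[pos - k] if pos - k >= 0 else None
                 for k in range(1, n_lags + 1)]
            )

        # Save the last n_lags values for the next chunk.
        carry = history[-n_lags:]
        yield rows


if __name__ == "__main__":
    # Two chunks of hourly rental counts, using 2 lags to keep output short.
    for rows in lag_features_by_chunk([[1, 2, 3], [4, 5]], n_lags=2):
        print(rows)
```

Without the carried state, the first rows of every chunk would lose their lags. With it, the chunked result matches what you would get from processing the whole series in one pass.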

This is a great article for anybody wanting to dig into analytics, because the authors show their work.
