In addition to the original features in the raw data, we add the number of bikes rented in each of the previous 12 hours as features to provide better predictive power. We create a computeLagFeatures() helper function to compute the 12 lag features and use it as the transformation function in rxDataStep(). Note that rxDataStep() processes data chunk by chunk, and lag feature computation requires data from previous rows. In computeLagFeatures(), we use the internal function .rxSet() to save the last n rows of a chunk to a variable lagData. When processing the next chunk, we use another internal function .rxGet() to retrieve the values stored in lagData and compute the lag features.
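
The carry-over idea can be sketched in base R without RevoScaleR. This is not the article's actual computeLagFeatures(); it is a minimal illustration under stated assumptions: the count column is called cnt, the function name computeLagFeaturesSketch is hypothetical, and an ordinary environment stands in for the .rxSet()/.rxGet() storage that persists between chunks.

```r
# Base-R sketch of chunk-wise lag features.
# ASSUMPTIONS: the rental count column is named `cnt`, and the `state`
# environment is a stand-in for the .rxSet()/.rxGet() carry-over.
state <- new.env()

computeLagFeaturesSketch <- function(chunk, nLag = 12, env = state) {
  n <- nrow(chunk)
  # Retrieve the tail of the previous chunk; pad with NA on the first chunk.
  prev <- if (exists("lagData", envir = env, inherits = FALSE)) {
    get("lagData", envir = env)
  } else {
    rep(NA_real_, nLag)
  }
  padded <- c(prev, chunk$cnt)
  for (k in seq_len(nLag)) {
    # Row i of the chunk sits at padded[nLag + i], so its k-th lag
    # is padded[nLag + i - k].
    chunk[[paste0("cnt_lag", k)]] <- padded[(nLag - k) + seq_len(n)]
  }
  # Save the last nLag values so the next chunk can see them.
  assign("lagData", tail(padded, nLag), envir = env)
  chunk
}

# Usage: ten hourly counts processed as two chunks of five rows each.
hours <- data.frame(cnt = as.numeric(1:10))
chunks <- split(hours, rep(1:2, each = 5))
chunked <- do.call(rbind, lapply(chunks, computeLagFeaturesSketch, nLag = 3))
```

Because the last nLag counts are stored after each chunk, the first rows of the second chunk get their lags from the end of the first chunk instead of NA. Only the very first rows of the data set have missing lags.
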
This is a great article for anybody wanting to dig into analytics, because the authors show their work.