Dynamic Programming In R With RcppDynProg

2019-01-03

In the above we have an input (or independent variable) `x` and an observed outcome (or dependent variable) `y_observed` (portrayed as points). `y_observed` is the unobserved ideal value `y_ideal` (portrayed by the dashed curve) plus independent noise. The modeling goal is to get close to the `y_ideal` curve using the `y_observed` observations. Obviously this can be done with a smoothing spline, but let’s use `RcppDynProg` to find a piecewise linear fit.
To encode this as a dynamic programming problem we need to build a cost matrix that records, for every consecutive interval of `x`-values, the estimated out-of-sample quality of fit. This is supplied by the function `RcppDynProg::lin_costs()` (using the PRESS statistic), but let’s take a quick look at the idea.
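
To make the cost matrix idea concrete, here is a minimal base-R sketch. It is not the package's implementation: the helper names `press_linear` and `interval_costs`, the minimum interval length of 3, and the simulated sine data are all assumptions made for illustration. It computes the PRESS statistic (the sum of squared leave-one-out prediction errors) for a straight-line fit on each consecutive interval, using the standard identity that the leave-one-out residual equals the ordinary residual divided by one minus the leverage.

```r
# Illustrative sketch only; not how RcppDynProg::lin_costs() is implemented.

# PRESS for a straight-line fit: sum of squared leave-one-out residuals,
# computed from one fit via e_i / (1 - h_i), where h_i is the leverage.
press_linear <- function(x, y) {
  fit <- lm(y ~ x)
  h <- hatvalues(fit)
  sum((residuals(fit) / (1 - h))^2)
}

# costs[i, j] holds the PRESS of a linear fit on points i..j.
# Intervals shorter than min_len (an assumed cutoff) are left as NA.
interval_costs <- function(x, y, min_len = 3) {
  n <- length(x)
  costs <- matrix(NA_real_, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (j - i + 1 >= min_len) {
        costs[i, j] <- press_linear(x[i:j], y[i:j])
      }
    }
  }
  costs
}

# Made-up example data: a noisy sine curve standing in for y_ideal plus noise.
set.seed(2019)
x <- seq(0, 6, length.out = 25)
y_observed <- sin(x) + rnorm(length(x), sd = 0.1)
costs <- interval_costs(x, y_observed)
```

A dynamic programming solver can then pick the set of breakpoints that minimizes the total cost over the chosen intervals, usually with a per-segment penalty so that it does not over-split. The double loop refits a model for every interval, which is fine for a small illustration but slow on long inputs.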

It’s an interesting package whose purpose is to turn an input data stream into a set of linear functions which approximate the stream. I’m not sure I’ll ever have a chance to use it, but it’s good to know that it’s there if I do ever need it.

Linear Regression Assumptions

2019-06-17

Stephanie Glen has a chart which explains the four key assumptions behind when Ordinary Least Squares is the Best Linear Unbiased Estimator: If any of the main assumptions of linear regression are violated, any results or forecasts that you glean from your data will be extremely biased, inefficient or misleading. Navigating all of the different assumptions […]

Visualizing with Heatmaps in R

2019-06-17

Anisa Dhana shows how you can create a quick heatmap plot in R:

To give your own colors use the `scale_fill_gradientn` function.

`ggplot(dat, aes(Age, Race)) + geom_raster(aes(fill = BMI)) + scale_fill_gradientn(colours=c("white", "red"))`

This is a quick example using ggplot2 but there are other heatmap libraries available too.