Data Frames

Kevin Feasel



Saravanan Subramanian has an introduction to data frames in R:

The R data frame is a high-level data structure that is equivalent to a table in a database system. It is highly useful when working with machine learning algorithms, and it is flexible and easy to use.

The standard definition describes data frames as “tightly coupled collections of variables which share many of the properties of matrices and of lists, used as the fundamental data structure by most of R’s modeling software.”
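
To make the "properties of matrices and of lists" part of that definition concrete, here is a minimal sketch in base R. The `employees` data frame and its column names are made up for illustration and do not come from the linked article.

```r
# Hypothetical example data; names and values are illustrative only.
employees <- data.frame(
  name   = c("Ann", "Bob", "Cara"),
  dept   = c("Sales", "IT", "IT"),
  salary = c(52000, 61000, 58000),
  stringsAsFactors = FALSE
)

# Matrix-like: rows and columns, addressed as [row, column].
dim(employees)           # number of rows, then number of columns
employees[2, "salary"]   # a single cell

# List-like: each column is a named element, and columns may differ in type.
is.list(employees)       # TRUE
names(employees)         # the column names
class(employees$name)    # "character"
class(employees$salary)  # "numeric"
```

The table analogy holds reasonably well: each column has a single type, and every column has the same number of rows.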

Data frames are a powerful abstraction. They make R a lot easier to pick up for database professionals, who already think in sets, than for application developers, who are used to thinking iteratively and working with one object at a time.
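
As a rough illustration of that difference in mindset, the sketch below computes the same total twice in base R: once row by row, and once against whole columns. The `orders` data frame is a hypothetical example, not something from the linked article.

```r
# Hypothetical example data; names and values are illustrative only.
orders <- data.frame(
  region = c("East", "West", "East", "West"),
  amount = c(100, 250, 75, 300)
)

# Iterative style: visit one row at a time and accumulate.
total_east <- 0
for (i in seq_len(nrow(orders))) {
  if (orders$region[i] == "East") {
    total_east <- total_east + orders$amount[i]
  }
}

# Set-based style: filter and aggregate whole columns at once,
# much like SELECT SUM(amount) ... WHERE region = 'East'.
total_east_set <- sum(orders$amount[orders$region == "East"])

# Grouped aggregate, comparable to GROUP BY region.
totals_by_region <- aggregate(amount ~ region, data = orders, FUN = sum)
```

Both approaches give the same answer here. The set-based form is shorter and maps closely onto how a SQL query would express the same question.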

May 2016