Data Frame Serialization In R

Kevin Feasel

2017-02-03

R

David Smith shows a new contender for serializing data frames in R, fst:

And now there’s a new package to add to the list: the fst package. Like the data.table package (the fast data.frame replacement for R), the primary focus of the fst package is speed. The chart below compares the speed of reading and writing data to/from CSV files (with fwrite/fread), feather, fst, and the native R RDS format. The vertical axis is throughput in megabytes per second; more is better. As you can see, fst outperforms the other options for both reading (orange) and writing (green).

These early numbers look great, so this is a project worth keeping an eye on.
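
If you want to try this kind of comparison yourself, here is a minimal sketch rather than a reproduction of David's benchmark. It round-trips a made-up data frame through base R's RDS format and, only if the fst package happens to be installed, through fst as well. The data frame, its column names, and the row count are assumptions chosen for illustration. The `write_fst()` and `read_fst()` names come from current fst releases; early versions of the package used different function names, so check the documentation for the version you have.

```r
# Illustrative data frame (hypothetical columns and size, not from the post).
set.seed(42)
n <- 1e5
df <- data.frame(
  id = seq_len(n),
  value = rnorm(n),
  group = sample(letters, n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Round trip through the native RDS format, timing each direction.
rds_path <- tempfile(fileext = ".rds")
rds_write <- system.time(saveRDS(df, rds_path))[["elapsed"]]
rds_read <- system.time(df_rds <- readRDS(rds_path))[["elapsed"]]
stopifnot(isTRUE(all.equal(df, df_rds)))
cat("RDS write (s):", rds_write, " read (s):", rds_read, "\n")

# Same round trip through fst, skipped when the package is not installed.
if (requireNamespace("fst", quietly = TRUE)) {
  fst_path <- tempfile(fileext = ".fst")
  fst_write <- system.time(fst::write_fst(df, fst_path))[["elapsed"]]
  fst_read <- system.time(df_fst <- fst::read_fst(fst_path))[["elapsed"]]
  stopifnot(isTRUE(all.equal(df, df_fst)))
  cat("fst write (s):", fst_write, " read (s):", fst_read, "\n")
} else {
  cat("fst is not installed; skipping the fst round trip.\n")
}
```

A single timed run on a data frame this small is noisy, so treat the output as a sanity check and not as a benchmark; repeating the runs on larger data would give a fairer picture.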

