Data Frame Serialization In R

Kevin Feasel



David Smith shows a new contender for serializing data frames in R, fst:

And now there’s a new package to add to the list: the fst package. Like the data.table package (the fast data.frame replacement for R), the primary focus of the fst package is speed. The chart below compares the speed of reading and writing data to/from CSV files (with fwrite/fread), feather, fst, and the native R RDS format. The vertical axis is throughput in megabytes per second (more is better). As you can see, fst outperforms the other options for both reading (orange) and writing (green).

These early numbers look great, so this is a project worth keeping an eye on.
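To get a feel for the comparison on your own machine, here is a minimal sketch that round-trips a synthetic data frame through fst and the native RDS format and prints elapsed times. The data frame, its column names, and the row count are made up for illustration and are not the benchmark behind the chart. The sketch assumes a recent fst release from CRAN, where the functions are named `write_fst` and `read_fst`; the earliest releases used `write.fst` and `read.fst` instead.

```r
# Sketch only: compare fst and RDS round-trip times on synthetic data.
# Assumes the fst package is installed, e.g. install.packages("fst").
library(fst)

set.seed(42)
n <- 1e6  # illustrative row count, not the size used in the original chart

# Hypothetical data frame with integer, double, and character columns
df <- data.frame(
  id    = seq_len(n),
  value = rnorm(n),
  group = sample(letters, n, replace = TRUE),
  stringsAsFactors = FALSE
)

fst_path <- tempfile(fileext = ".fst")
rds_path <- tempfile(fileext = ".rds")

# Time the writes
fst_write <- system.time(write_fst(df, fst_path))
rds_write <- system.time(saveRDS(df, rds_path))

# Time the reads
fst_read <- system.time(df_fst <- read_fst(fst_path))
rds_read <- system.time(df_rds <- readRDS(rds_path))

# Both formats should hand back the same values
stopifnot(
  isTRUE(all.equal(df, df_fst, check.attributes = FALSE)),
  isTRUE(all.equal(df, df_rds, check.attributes = FALSE))
)

# Elapsed seconds for each operation (lower is better)
timings <- rbind(fst_write, rds_write, fst_read, rds_read)[, "elapsed"]
print(timings)
```

Results will vary with disk speed, compression settings, and column types, so treat a single run as a rough indication rather than a benchmark.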

Related Posts


John Mount explains the vtreat package that he and Nina Zumel have put together: When attempting predictive modeling with real-world data you quickly run into difficulties beyond what is typically emphasized in machine learning coursework: Missing, invalid, or out of range values. Categorical variables with large sets of possible levels. Novel categorical levels discovered during test, cross-validation, or […]


R 3.4.4 Now Available

David Smith notes that R 3.4.4 is now generally available: R 3.4.4 has been released, and binaries for Windows, Mac, and Linux are now available for download on CRAN. This update (codenamed “Someone to Lean On”, likely a Peanuts reference, though I couldn’t find which one with a quick search) is a minor bugfix release, and shouldn’t cause […]



February 2017