John Mount takes us through a couple of data shaping packages:
The advantages of data_algebra and cdata are:
– The user specifies the desired transform declaratively, by example and in data: work through an example, then write down what you want (we have a tutorial on this here).
– The transform systems can print what a transform is going to do. This makes reasoning about data transforms much easier.
– Because the transforms are themselves written as data, they can be easily shared between systems (such as R and Python).
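To give a rough feel for what "a transform written as data" can mean, here is a minimal sketch in plain Python. It does not use the cdata or data_algebra APIs. The `control` table layout, the `to_tall` helper, and the sample values are all hypothetical, chosen only to illustrate the three points above: the transform is a small table, it can be printed before it runs, and it can be serialized (here as JSON) to pass between systems.

```python
import json

# Wide records: one row per item, one column per measurement (made-up values).
wide = [
    {"id": 1, "Sepal.Length": 5.1, "Sepal.Width": 3.5},
    {"id": 2, "Sepal.Length": 4.9, "Sepal.Width": 3.0},
]

# The transform written down as data: a hypothetical "control table".
# Each row reads: when `part` has this value, `value` comes from that wide column.
control = [
    {"part": "Length", "value": "Sepal.Length"},
    {"part": "Width", "value": "Sepal.Width"},
]


def to_tall(rows, control, key_col, id_cols):
    """Emit one output row per (input row, control row) pair."""
    tall = []
    for row in rows:
        for spec in control:
            # Carry the identifying columns through unchanged.
            out = {c: row[c] for c in id_cols}
            # The key column takes its value directly from the control row.
            out[key_col] = spec[key_col]
            # Every other control cell names the wide column to read from.
            for out_col, source_col in spec.items():
                if out_col != key_col:
                    out[out_col] = row[source_col]
            tall.append(out)
    return tall


# The specification is plain data, so it can be inspected before it runs
# and round-tripped through JSON for use from another language.
control_json = json.dumps(control)
tall = to_tall(wide, json.loads(control_json), key_col="part", id_cols=["id"])

for spec in control:
    print(spec)
for row in tall:
    print(row)
```

The real packages handle much more than this sketch (the reverse direction, multi-column records, and database back ends, for instance), so treat it only as an illustration of the idea, not of their interfaces.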
Let’s re-work a small R cdata example, using the Python package data_algebra.
Click through for the example.