Python And The Tidyverse

Kevin Feasel



Leo at Locke Data looks at a couple of Python packages which implement Tidyverse concepts:

The Dplython README provides some clear examples of how the package can be used. Below is a summary of the common functions:

  • select() – used to get specific columns of the data-frame.

  • sift() – used to filter rows, keeping only those where a variable in that row satisfies a condition.

  • sample_n() and sample_frac() – used to provide a random sample of rows from the data-frame.

  • arrange() – used to sort results.

  • mutate() – used to create new columns based on existing columns.
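
To make the verbs above concrete, here is a rough sketch of what each one does, written in plain pandas rather than Dplython itself. This is not Dplython's API or the example from its README. The toy data frame, its column names, and the variable names are all invented for illustration; only the pandas calls are real.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "carat": [0.3, 1.2, 0.7, 2.0],
    "price": [400, 5200, 2100, 15000],
})

# select(): keep specific columns of the data frame.
selected = df[["name", "price"]]

# sift(): keep only the rows that satisfy a condition.
big = df[df["carat"] > 0.5]

# sample_n() and sample_frac(): take a random sample of rows,
# either a fixed count or a fraction of the total.
two_rows = df.sample(n=2, random_state=0)
half = df.sample(frac=0.5, random_state=0)

# arrange(): sort the rows, here by price from highest to lowest.
by_price = df.sort_values("price", ascending=False)

# mutate(): add a new column computed from existing columns.
with_ppc = df.assign(price_per_carat=df["price"] / df["carat"])

print(with_ppc)
```

Each step returns a new data frame and leaves `df` unchanged, which is the same habit the Tidyverse verbs encourage when they are chained together.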

I think the Tidyverse is immediately accessible for data platform professionals, so it’s good to see these concepts making their way to Python as well as R.


June 2018