Storytelling With Data

Vik Paruchuri walks through exploratory data analysis using New York City schools data:

Heatmaps are good for mapping out gradients, but we’ll want something with more structure to plot out differences in SAT score across the city. School districts are a good way to visualize this information, as each district has its own administration. New York City has several dozen school districts, and each district is a small geographic area.

We can compute SAT score by school district, then plot this out on a map. In the below code, we’ll:

  • Group full by school district.

  • Compute the average of each column for each school district.

  • Convert the school_dist field to remove leading 0s, so we can match our geograpghic district data.

Also check out part 1 if you missed it.

Related Posts

Custom ggplot2 Fonts

Daniel Oehm shares two techniques for using custom fonts in your ggplot2 visuals: ggplot – You can spot one from a mile away, which is great! And when you do it’s a silent fist bump. But sometimes you want more than the standard theme. Fonts can breathe new life into your plots, helping to match […]

Read More

The Costs of Specialization within Data Science

Eric Colson argues in favor of data science generalists rather than specialists: But the goal of data science is not to execute. Rather, the goal is to learn and develop profound new business capabilities. Algorithmic products and services like recommendations systems, client engagement bandits, style preference classification, size matching, fashion design systems, logistics optimizers, seasonal trend detection, and more can’t be […]

Read More

Categories

July 2016
MTWTFSS
« Jun Aug »
 123
45678910
11121314151617
18192021222324
25262728293031