Vik Paruchuri walks through exploratory data analysis using New York City schools data:
Heatmaps are good for mapping out gradients, but we’ll want something with more structure to plot out differences in SAT score across the city. School districts are a good way to visualize this information, as each district has its own administration. New York City has several dozen school districts, and each district is a small geographic area.
We can compute the average SAT score for each school district, then plot it on a map. In the code below, we’ll:
- Group full by school district.
- Compute the average of each column for each school district.
- Convert the school_dist field to remove leading 0s, so we can match our geographic district data.
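
The three steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not necessarily the author's exact code: it assumes full is a pandas DataFrame with a school_dist column holding zero-padded strings such as "01", and the tiny table here is a made-up stand-in with placeholder values rather than the real schools data. The sat_score and school_name columns and the district_data name are likewise assumptions for the example.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's `full` DataFrame.
# All values are placeholders for illustration, not real NYC schools data.
full = pd.DataFrame({
    "school_dist": ["01", "01", "02", "10"],
    "sat_score": [1200.0, 1300.0, 1100.0, 1400.0],
    "school_name": ["A", "B", "C", "D"],
})

# Steps 1 and 2: group by district and average every numeric column.
# numeric_only=True skips text columns such as school_name.
district_data = full.groupby("school_dist").mean(numeric_only=True)

# Turn the group key back into an ordinary column.
district_data = district_data.reset_index()

# Step 3: strip leading zeros ("01" -> "1") so the district codes can be
# matched against district identifiers in the geographic data.
district_data["school_dist"] = district_data["school_dist"].apply(
    lambda d: str(int(d))
)

print(district_data)
```

Converting through int and back to str is one simple way to drop the leading zeros; it would fail on a district code that is missing or not numeric, so the real data may need cleaning first.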
Also check out part 1 if you missed it.