Not Catching Them All

Hanjo Odendaal explains clustering techniques using Pokemon:

To collect the data on all the first generation Pokemon, I employ Hadley Wickham’s rvest package. I find it very intuitive, and it can handle all of my needs in collecting and extracting the data from a Pokemon wiki. I will grab all the Pokemon up to Gen II, which constitutes 251 individuals. I did find the website structure a bit of a pain, as each Pokemon had a very different-looking web page. But, with some manual hacking, I eventually got the data in a nice format.

This probably means a lot more to you if you grew up in front of a Game Boy, but there’s some good technique in here regardless.

