Not Catching Them All

Hanjo Odendaal explains clustering techniques using Pokemon:

To collect the data on all the first generation pokemon, I employ Hadley Wickam’s rvest package. I find it very intuitive and can handle all of my needs in collecting and extracting the data from a pokemon wiki. I will grab all the Pokemon up until to Gen II, which constitutes 251 individuals. I did find the website structure a bit of a pain as each pokemon had very different looking web pages. But, with some manual hacking, I eventually got the data in a nice format.

This probably means a lot more to you if you grew up in front of a Game Boy, but there’s some good technique in here regardless.

Related Posts

A Quick Look At Data Visualization Tools

Vincent Wong walks us through data visualization tools on the market today: Highcharts When we talk about Echarts, we will usually compare it with Highcharts. The relationship between them is a bit like the relationship between WPS and Office. Highcharts is also a visualization library which you have to pay for it if you are […]

Read More

Interpreting The Area Under The Receiver Operating Characteristic Curve

Roos Colman explains what a Receiver Operating Characteristic (ROC) curve is and how we interpret the Area Under the Curve (AUC): The AUC can be defined as “The probability that a randomly selected case will have a higher test result than a randomly selected control”. Let’s use this definition to calculate and visualize the estimated […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031