Jennifer Cooper shows how you can use R to scrape data from a text-based PDF:
Here’s a diagram of the workflow I used:
1. Start with PDF
2. Use tabulizer to extract tables
3. Clean up data into “tidy” format using tidyverse (mainly dplyr)
4. Visualize trends with ggplot2
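
As a rough illustration of steps 2 through 4, here is a minimal sketch in R. It is not Jennifer's code: the file name report.pdf, the column names, and the numbers are all made up for the example, and the extraction call is left commented out because it needs a real PDF (and tabulizer needs a working Java installation). The sketch assumes the extracted table arrives "wide", with one column per year and numbers stored as text.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Step 2 (not run here): extract_tables() returns a list with one element
# per table it finds. "report.pdf" is a placeholder file name.
# tables <- tabulizer::extract_tables("report.pdf", output = "data.frame")
# raw <- tables[[1]]

# Hypothetical stand-in for an extracted table: one row per region,
# one column per year, numbers still stored as text with commas.
raw <- data.frame(
  region = c("North", "South"),
  X2016 = c("1,200", "950"),
  X2017 = c("1,350", "1,010"),
  stringsAsFactors = FALSE
)

# Step 3: reshape to one row per region and year, then turn the
# text columns into proper numeric types.
tidy <- raw %>%
  pivot_longer(cols = -region, names_to = "year", values_to = "value") %>%
  mutate(
    year = as.integer(sub("^X", "", year)),
    value = as.numeric(gsub(",", "", value))
  )

# Step 4: one line per region showing the trend over time.
trend_plot <- ggplot(tidy, aes(x = year, y = value, colour = region)) +
  geom_line() +
  geom_point()
```

Printing trend_plot draws the chart. With a real PDF, most of the effort usually goes into step 3, since extracted tables often carry stray header rows or merged columns that need cleanup specific to that document.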
Read on for more detail on each step in the process. H/T R-Bloggers.