PDF Scraping with R

Jennifer Cooper shows how you can use R to scrape data from a text-based PDF:

Here’s a diagram of the workflow I used:
1. Start with PDF
2. Use tabulizer to extract tables
3. Clean up data into “tidy” format using tidyverse (mainly dplyr)
4. Visualize trends with ggplot2
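
As a rough illustration of those four steps, here is a minimal sketch in R. It is not the code from the original post: the helper `extract_first_table()`, the file path it would take, and the sample `raw` data frame with its column names and values are all hypothetical, made up so the tidy and plot steps can run without a PDF. It also assumes `tidyr::pivot_longer()` for reshaping, where the post only says the cleanup is done mainly with dplyr.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Steps 1-2 (hypothetical helper, not called below): pull the first table
# from one page of a text-based PDF. tabulizer needs a working Java setup.
extract_first_table <- function(pdf_path, page = 1) {
  tables <- tabulizer::extract_tables(pdf_path, pages = page,
                                      output = "data.frame")
  tables[[1]]
}

# Made-up stand-in for what an extracted table might look like:
# one row per region, one column per year, numbers stored as text.
raw <- data.frame(
  region = c("North", "South"),
  `2019` = c("1,200", "950"),
  `2020` = c("1,350", "1,010"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Step 3: reshape to one row per region-year and convert text to numbers.
tidy <- raw %>%
  pivot_longer(cols = -region, names_to = "year", values_to = "value") %>%
  mutate(
    year  = as.integer(year),
    value = as.numeric(gsub(",", "", value))
  )

# Step 4: plot the trend per region.
p <- ggplot(tidy, aes(x = year, y = value, colour = region)) +
  geom_line() +
  geom_point()
```

With a real PDF, `raw` would come from something like `extract_first_table("report.pdf")` (path hypothetical), and the cleanup step would need adjusting to whatever headers and stray rows the extraction actually produces.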

Read on for more detail on each step in the process. H/T R-Bloggers.