Never having done a web scrape before, this was the perfect subject for my first time – grabbing all the sessions submitted to PASS Summit 2017…and doing it with PowerShell! Here is the script I used for this. I have accounted for the following:
Apostrophes (aka single quote). They will break your insert unless you have two of them, and for some reason, people seem to use them all over the place.
Formatting the string data for insert. No, your data will not magically come out right in your insert with single quotes so you need to add them.
Additional ID and deleted fields.
Speaker URL and ID. Will be using this to scrape speaker details later.
Accurate lower and upper bounds. These were arrived at by trial and error (you’re welcome), as well as the clean up of the data I scraped. More on this later.
Powershell probably wouldn’t be my first language for web scrapes—that’d be Python—but Amy shows how to get a scrape going.