Scraping Session Data

Amy Herold has scraped PASS Summit 2017 submissions using PowerShell:

Never having done a web scrape before, this was the perfect subject for my first time – grabbing all the sessions submitted to PASS Summit 2017…and doing it with PowerShell! Here is the script I used for this. I have accounted for the following:

  • Apostrophes (aka single quote). They will break your insert unless you have two of them, and for some reason, people seem to use them all over the place.

  • Formatting the string data for insert. No, your data will not magically come out right in your insert with single quotes so you need to add them.

  • Additional ID and deleted fields.

  • Speaker URL and ID. Will be using this to scrape speaker details later.

  • Accurate lower and upper bounds. These were arrived at by trial and error (you’re welcome), as well as the clean up of the data I scraped. More on this later.

PowerShell probably wouldn’t be my first language for web scrapes (that’d be Python), but Amy shows how to get a scrape going.
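
For comparison, here is a minimal Python sketch of the first two items in Amy's list: doubling apostrophes and wrapping string values in single quotes before they go into an insert. It is not her script. The table name, column names, and sample session are hypothetical, made up purely for illustration. If you are executing the statements from code rather than generating a script, parameterized queries are generally the safer route.

```python
def sql_quote(value):
    """Format a Python value as a T-SQL literal, doubling any apostrophes."""
    if value is None:
        return "NULL"
    if isinstance(value, int):
        # Numeric IDs and flags go in unquoted.
        return str(value)
    # A single apostrophe ends the string literal, so each one is doubled.
    return "'" + str(value).replace("'", "''") + "'"


def build_insert(table, row):
    """Build one INSERT statement from a dict of column name -> value."""
    columns = ", ".join(row.keys())
    values = ", ".join(sql_quote(v) for v in row.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"


# Hypothetical scraped session; table and column names are assumptions.
session = {
    "SessionID": 12345,
    "Title": "What's New in Query Store",
    "Speaker": "Pat O'Brien",
    "Deleted": 0,
}
statement = build_insert("dbo.SummitSessions", session)
print(statement)
```

Every apostrophe in the title and speaker name comes out doubled, so the generated statement parses instead of ending the string early.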
