Press "Enter" to skip to content

Curated SQL Posts

ScriptDOM Now Open Source

Drew Skwiers-Koballa has great news for us:

ScriptDOM is a powerful .NET library for code parsing, generating an abstract syntax tree (AST) that can be leveraged to apply code formatting, detect antipatterns, and more. We are thrilled to announce that the source code for ScriptDOM has been released into open source under the MIT license and is available on GitHub.  In addition, ScriptDOM is now distributed by Microsoft as a standalone NuGet package.

This is big and good news. We’ve been able to use ScriptDOM for quite a while, but now that we can extend and improve the library, that’s great.

Comments closed

Loading WhoIsActive Data on Azure SQL DB

Andrea Allred wants to know who’s doing what on this system:

I needed to collect sp_WhoIsActive into a table, but the twist was that it is on my Azure Managed Database, so I had to get creative with how I did it. We needed an Azure Pipeline to run it, but we wanted to record it every minute and firing a pipeline every minute adds up fast. So we decided that we would kick it off once an hour and have the process wait for a minute and then fire until the hour ended. Then it fire again at the top of the next hour and the same process would happen.

That’s an interesting way to do it. Another alternative might have been an Azure function app, which you could schedule to run every minute. I think that’d be a lot less expensive than running an Azure Pipeline, and this goes to show you that there are many ways to solve the same problem in Azure.

Comments closed

9 Gotchas Working with Postgres

Phil Booth categorizes various mistakes as learning experiences:

Previously on Extreme Learning, I discussed all the ways I’ve broken production using healthchecks. In this post I’ll do the same for PostgreSQL.

The common thread linking most of these gotchas is scalability. They’re things that won’t affect you while your database is small. But if one day you want your database not to be small, it pays to think about them in advance. Otherwise they’ll came back and bite you later, potentially when it’s least convenient. Plus in many cases it’s less work to do the right thing from the start, than it is to change a working system to do the right thing later on.

Click through for the nine lessons learned, eight of which are still relevant as of PostgreSQL version 12. Many of these also have analogues in the SQL Server world, e.g., don’t overuse triggers, use non-recursive methods for path traversal when possible, do add indexes on foreign keys.

Comments closed

Passing the Buck: Hyperparameters Edition

John Mount is not a fan of hyperparamters:

In my opinion one can see this scam of hiding some debt in with an asset spreading.

Earliest modeling systems, such as linear regression, had no hyper-parameters. An under specified algorithm was not considered a fully specified method.

Click through for John’s thoughts on the matter. I’m sympathetic to this argument and want to bring in an extra point John didn’t make. With hyperparameter tuning, you also introduce the risk of spurious correlation between the label and input features. This is particularly relevant if changing the seed or making hyperparameter tweaks results in a major change in model effectiveness.

Comments closed

Removing Chartjunk

Elizabeth Ricks takes out the trash:

Why is clutter so hard for us to let go of? Perhaps because we think something has always been there, so it must belong there and we’re afraid of what might happen if we eliminate it. Or perhaps we don’t have a good framework for evaluating whether something is useful or not. 

This same concept applies to our graphs and business communications. We tend to blindly accept the default settings of our tools and very rarely consider if the included elements actually have a purpose. The “Windows XP” question, in this case, is: does this element add enough informative value to make up for its presence?” 

This is one of the most important ideas Edward Tufte championed—other than “pie charts are dumb, so don’t use them”. I don’t completely agree with Tufte’s definition of the term, which is (paraphrasing and going by memory) any marking on the page not absolutely necessary to convey the most relevant details to a viewer. But in this post, Elizabeth shows quite a bit we could remove while losing no critical information.

Comments closed

Getting Started with Postgres’s psql

Ryan Booz provides a primer:

PostgreSQL has a separate command-line tool that’s been available for decades and is included with any installation of PostgreSQL. Many long-term PostgreSQL users, developers, and administrators rely on psql to help them quickly connect to databases, examine the schema, and execute SQL queries.

Knowing how to install and use basic psql commands is an essential skill to have for anyone that will connect to PostgreSQL.

Most of the article covers installation and starting up, but before you can write queries, you’ve got to connect to the server and database.

Comments closed