Press "Enter" to skip to content

Month: May 2020

Notes on Derived Tables

Itzik Ben-Gan continues a series on table expressions:

The term derived table is used in SQL and T-SQL with more than one meaning. So first I want to make it clear which one I’m referring to in this article. I’m referring to a specific language construct that you define typically, but not only, in the FROM clause of an outer query. I’ll provide the syntax for this construct shortly.

The more general use of the term derived table in SQL is the counterpart to a derived relation from relational theory. A derived relation is a result relation that is derived from one or more input base relations, by applying relational operators from relational algebra like projection, intersection and others to those base relations. Similarly, in the general sense, a derived table in SQL is a result table that is derived from one or more base tables, by evaluating expressions against those input base tables.

There’s a lot to digest in this post, so check it out.

Comments closed

Who and What is Using Your tempdb Space?

Guy Glantser helps us troubleshoot who is using tempdb space and for what purpose:

In the past two weeks I encountered 3 different scenarios in which tempdb has grown very large, so I decided to write about troubleshooting such scenarios.

Before I describe the methods for troubleshooting tempdb space usage, let’s begin with an overview of the types of objects that consume space in tempdb.

The answer, of course, was Professor Plum in the Aviary with the eager spool.

Comments closed

Never Forget a WHERE Clause Again

If you frequently forget to include the WHERE clause when you run queries, Rob Farley has an answer for you:

You meant to only update a few rows, but you weren’t paying attention to what you’d selected before hitting the Execute button. The WHERE clause got missed from what you executed, and you’ve updated every row. Oops.

Now, I totally hear you that you could’ve put this in a transaction, checked your results and then only done a COMMIT when you were sure. You could even have put that COMMIT inside an IF block that checked @@ROWCOUNT… yes, yes, yes. You could have used an SSMS add-in to let you execute the whole query that you’re currently on. You could’ve used a different tool entirely, such as Azure Data Studio. There are plenty of ways to avoid this. But I’m going to show you something that I find works for me, especially when I’m giving the script to someone else to run.

It’s more effort and makes the code harder to read, but if you have a real concern about somebody (and that somebody can include you) goofing this up, it does the job.

Comments closed

Building Queues in the Database

Erik Darling has created a series on queue tables in the database. Part one builds out a table:

A little explanation of the table: this is good for a reusable, ordered queue, that’ll give us some feedback on how long things took. It could also be used is there were a pending element, but we’d probably wanna change the indexing so that we could find either the last start time, or the last end time efficiently.

The thing that probably needs the most explanation here is the indexing and constraints. Because I know you, and you’re staring at the way I have my primary key, and you’re getting an itch and a twitch. That’s okay, it’s you’ve been conditioned for years to put the most selective column first. Unfortunately, that won’t work here.

Part two takes us through querying the queue:

For the table, our main priorities were indexing to make sure we can find work easily, and not allowing duplicate items.

For our worker proc, the main goals are going to be

– Looping until we run out of work
– Finding and reserving work with minimal locking
– Making that process atomic

Thankfully, this is a lot easier than it sounds. The table design does a lot of the work for us.

Erik’s design is not one I typically reach for, though my constraints are a bit different from his—typically, I’m using queue tables to run on periodic schedules and grab batches of records which finish processing before the next timed batch begins, and processes are idempotent, so if a queued item re-runs on occasion, it’s okay. But this is a really good technique if you need a more robust solution.

Comments closed

Counting Table Tennis Ball Bounces

Evgeni Chasnovski has some fun counting:

On May 7th 2020 Dan made a successful attempt to beat a world record for the longest duration to control a table tennis ball with a bat. He surpassed current record duration of 5h2m37s by 18 minutes and 27 seconds for a total of 5h21m4s. He also live streamed the event on his “TableTennisDaily” YouTube channel, which later was uploaded (important note for the future: this video is a result of live stream and not a “shot and uploaded” one). During cheering for Dan in real time I got curious about actual number of bounces he made.

And thus the quest begins.

As counting manually is error-prone and extremely boring, I decided to do this programmatically. The idea of solution is straightforward: somehow extract audio from the world record video, detect bounces (as they have distinctive sound) and count them.

Click through for the process as well as a link to a Git repo with the Python code.

Comments closed

Choosing Colors for Visuals

Mike Cisneros has some guidance around color selection when designing visuals:

Regardless of how you select it, this key color will be used to denote the data points, or the data series, on which you feel it is the most important for your audience to focus.

All of the other colors we use will be based on where they are on the color wheel in relation to this key color, how many colors we intend to use, and what kind of relationship the rest of the data has to the data represented by the key color. 

I’ll admit that I just reach for the pre-created palettes which have done the work for me already.

Comments closed

Thread Pool Exhaustion and Availability Groups

Sean Gallardy lays down the gauntlet:

You’re probably wondering why you couldn’t spawn a new thread, why this error happened, why you shouldn’t just up the max worker threads, and probably also thinking this must be some kind of “bug” in SQL Server. So here’s where our awkward conversation starts… It’s you. Contrary to every relationship breakup you’ve ever had, it’s definitely you. I’m not saying this to be mean but to really drive the point home here. The major reasons for this occurring are large blocking chains, too much workload for the server size (databases, users, etc.), and/or your virtual infrastructure sucks. There aren’t too many reasons for getting yourself into this situation, and while what I’ll be putting forth here isn’t exhaustive of all edge cases and scenarios, these are by far the majority of all the items in the wild that I’ve either worked on or have been involved in at some level. Side Note: If you’ve read this far, are shaking your head, calling me names that an irate sailor might utter, and telling yourself that upping the max worker threads as the product error suggests and Microsoft should fix their bugs then you can stop reading here as you’re probably not open to learning why you have issues in your environments.

One more scenario I’ve seen is mirroring thousands of databases on a single instance. That scenario fit none of Sean’s criteria—there was very little blocking, most of the databases were small and infrequently-used, and the infrastructure was the right size. It was just a huge number of databases and each database requiring a minimum of X worker threads. Mind you, it was still a bad idea…

Comments closed

Publishing a dacpac with .NET Core

Erik Ejlskov Jensen shows how to deploy a Visual Studio database project from .NET Core:

In this post, I will describe how you can build a SQL Server Database project in order to create a .dacpac file, using .NET Core only – dotnet build.

For a while now, it has been possible to publish a .dacpac (meaning apply it to an new or existing database) using the cross-platform version of sqlpackage.

But building a database project (.sqlproj) was only possible on Windows, as the .sqlproj project type is based on the classic .NET Framework .csproj project type.

However, thanks to a smart open source effort, you can now also build a .dacpac file, even on a Mac or Linux build agent.

Read on to learn more.

Comments closed

Tips for a Smooth Power BI Report Server Experience

Josh Crittenden has a few tips for people installing Power BI Report Server:

As is the case with any enterprise BI product, careful planning and consideration of how the technology will be leveraged should be made up-front. This will help avoid speed bumps during implementation and ensure a smooth ride as you roll out your BI and analytics solutions. Whether you’ve been working with PBIRS from day one, or are starting to use the platform for the first time, the goal of this post is to showcase tips and techniques that you can turn around and apply in your own environment.

Additionally, for those of you considering a hybrid Power BI implementation (Power BI service and Power BI Report Server), we will compare specific features of each platform below.  

Read on for the tips.

Comments closed

Capturing CHECKDB Output

Erin Stellato shows how you can track the results of those automated CHECKDB runs you’re doing:

First, you need to be running CHECKDB on a regular basis.  If you’re going to ask what you mean by regular basis, I’ll direct you to a post of Paul’s that is still relevant: Importance of running regular consistency checks.  There are a few factors to consider, but in general I recommend at least once a week.  If you’re a 9 to 5 shop and have the maintenance window to run CHECKDB daily, feel free to do that.

Erin walks us through it and also recommends checking out Ola’s scripts for integrity checks. I’d add to that Minion CheckDB.

Comments closed