Curated SQL Posts

Aggregate Window Functions

I have a series on window functions:

Here, we get the sum of LineProfit by CustomerID. Because SUM() is an aggregate function, we need a GROUP BY clause for all non-aggregated columns. This is an aggregate function. The full set of them in T-SQL is available here, but you’ll probably be most familiar with MIN(), MAX(), SUM(), AVG(), and COUNT().

To turn this into a window function, we slap on an OVER() and boom! Note: “boom!” only works on SQL Server 2012 and later, so if you’re still on 2008 R2, it’s more of a fizzle than a boom.

Read on for several examples of this nature.
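
To make the contrast in the excerpt concrete, here is a minimal sketch of the same SUM() used first as a plain aggregate and then as a window function. It runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server (this assumes a SQLite build of 3.25 or later, which added window functions). The InvoiceLines table name and the sample rows are invented for illustration; only the CustomerID and LineProfit column names come from the excerpt.

```python
import sqlite3

# Hypothetical sample data: the column names echo the excerpt
# (CustomerID, LineProfit) but the table name and rows are invented.
rows = [
    (1, 10.0),
    (1, 15.0),
    (2, 7.5),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE InvoiceLines (CustomerID INTEGER, LineProfit REAL)")
conn.executemany("INSERT INTO InvoiceLines VALUES (?, ?)", rows)

# Aggregate function: GROUP BY collapses each customer to a single row.
grouped = conn.execute(
    """
    SELECT CustomerID, SUM(LineProfit) AS TotalProfit
    FROM InvoiceLines
    GROUP BY CustomerID
    ORDER BY CustomerID
    """
).fetchall()

# Window function: OVER() keeps every input row and attaches the
# per-customer total to each one, with no GROUP BY required.
windowed = conn.execute(
    """
    SELECT CustomerID, LineProfit,
           SUM(LineProfit) OVER (PARTITION BY CustomerID) AS TotalProfit
    FROM InvoiceLines
    ORDER BY CustomerID, LineProfit
    """
).fetchall()

conn.close()

print(grouped)   # one row per customer
print(windowed)  # one row per invoice line, each carrying its customer's total
```

The grouped query returns two rows for the three input rows, while the windowed query returns all three, which is the practical difference between the two forms.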

API Servers and the Importance of Learning

Steve Jones tells a story:

While talking with a client recently about their performance challenges, I was relieved to find that the database wasn’t the problem. Instead, their API server was overloaded by the number of calls taking place in their application. While the database did provide the backing for the API calls, there was a fair amount of caching. However, as they’d moved to microservices, more and more of the interaction between modules was taking place as a network call to a single server, which became overloaded.

Steve goes on to the broader point of people freely donating their time and expertise to explain how to solve problems. And the above is a major problem of moving to microservices: everything gets several times chattier. The biggest tricks I have there are to embrace asynchronous processing via queues and ensure that messages passed back and forth are as small as possible, which means getting rid of the idea of passing big lists of fully hydrated objects around.
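
As a rough illustration of that last point, the sketch below uses Python's standard queue and threading modules in place of a real message broker. The producer enqueues only an identifier per order, and the consumer looks up the full record itself. The ORDERS dictionary, the message shape, and every name here are hypothetical, and none of it comes from Steve's post.

```python
import json
import queue
import threading

# Hypothetical in-memory "database"; a real consumer would query its own store.
ORDERS = {
    101: {"customer": "Contoso", "lines": ["widget"] * 50},
    102: {"customer": "Fabrikam", "lines": ["gadget"] * 80},
}

work_queue = queue.Queue()
processed = []


def worker():
    # The consumer hydrates each object on demand from its own data source.
    while True:
        message = work_queue.get()
        if message is None:  # sentinel: no more work
            work_queue.task_done()
            break
        order_id = json.loads(message)["order_id"]
        order = ORDERS[order_id]
        processed.append((order_id, len(order["lines"])))
        work_queue.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# The producer enqueues identifiers only and does not wait on the consumer.
small_messages = [json.dumps({"order_id": order_id}) for order_id in ORDERS]
for message in small_messages:
    work_queue.put(message)
work_queue.put(None)

work_queue.join()
consumer.join()

# For comparison: the payload if the full objects had been sent instead.
hydrated_message = json.dumps(list(ORDERS.values()))
print(sum(len(m) for m in small_messages), "characters vs", len(hydrated_message))
```

The trade-off is an extra lookup on the consumer side in exchange for much smaller messages on the wire.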

Get Power BI Data from Google Sheets

Reza Rad shows off a new connector:

Power BI can now get data from Google Sheets. This functionality was released just yesterday and announced on both the Power BI and Power Query blogs. The feature is still in preview (Beta), but it is worthwhile looking at how it works in a quick article and video.

There are several steps involved but it’s still a lot simpler than the old method of parsing a website, especially if you had any sort of security on the spreadsheets.

LOBs and OPTION(RECOMPILE)

Paul White has a warning for us:

All that is fairly well-known. The point of this short post is to draw your attention to another side-effect of adding OPTION (RECOMPILE) — the parameter embedding optimization (PEO).

When PEO is used, SQL Server takes the value of any variables and parameters and embeds the runtime values in the query text, pretty much as if you had entered them by hand before compiling. This is often very useful for plan quality, but there is a potential drawback when large object types (LOBs) are in play.

Click through for the explanation and a simple demo.

Signs It’s Time to Move to Enterprise Edition

Everywhere are signs, says Erik Darling:

SQL Server Standard Edition hobbles batch mode pretty badly. DOP is limited to two, and there’s no SIMD support. It’s totally possible to have batch mode queries running slower than row mode queries, because the row mode queries can use much higher DOPs and spread the row workload out.

I’d almost rather use indexed views in Standard Edition for large aggregations, because there are no Edition-locked enhancements. You’ll probably wanna use the NOEXPAND hint either way.

Click through for several factors which may cause you to want Enterprise Edition over Standard Edition. Conversely, if none of those apply to you, Standard Edition could work well for you.

Serverless SQL Pool CI/CD via GitHub Actions

Kevin Chant reminds me I need to spend more time with GitHub Actions:

I want to cover one way you can do CI/CD for Azure Synapse Analytics serverless SQL pools using GitHub Actions in this post. For various reasons.

For a start, in a previous post I wrote about how you can CI/CD for serverless SQL pools using Azure DevOps. So, I thought I would balance things out and show how you can do the same thing within GitHub.

In addition to this, there have been a few discussions about using GitHub Actions instead of Azure Pipelines within the Microsoft Data Platform community recently. For example, the topic came up during the DataWeekender conference.

With this in mind, I want to show how easy it can be to migrate an Azure DevOps pipeline to GitHub Actions.

Click through for the example.

Stats Q&A

Erin Stellato has a two-parter on statistics in SQL Server. Part 1 deals with questions on stats creation:

Last week I presented a session, Demystifying Statistics in SQL Server, at the PASS Community Summit, and I had a lot of great questions; so many that I’m creating multiple posts to answer them. This first post is dedicated to questions specific to creating statistics in SQL Server.

Part 2 deals with stats updates:

Last week I presented a session, Demystifying Statistics in SQL Server, at the PASS Community Summit, and I had a lot of great questions; so many that I’m creating multiple posts to answer them. This second post is dedicated to questions specific to updating statistics in SQL Server. Of note…I have a couple previous posts which also include helpful information:

Click through for lots of questions and lots of good answers.

Index Usage across Replicas

Jess Pomfret does the math:

Last week, I was working on a project to analyse indexes on a database that was part of an availability group. The main goal was to find unused indexes that could be removed, but I was also interested in gaining an overall understanding of how the system was indexed.

Unused indexes not only take up disk space, but they also add overhead to write operations and require maintenance which can add additional load on your system. We can also use this analysis to look for a high number of lookups which could indicate we need to adjust indexes slightly.

Click through to see how you can connect together index usage stats from the primary and secondary replicas of an availability group.
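
The reason the replicas need to be combined is that an index with no reads on the primary may still serve queries on a readable secondary. The sketch below is not the approach from Jess's post. It is a small Python illustration that assumes the usage counters have already been collected from each replica into dictionaries. The counter names mirror columns in sys.dm_db_index_usage_stats, while the table names, index names, and numbers are invented.

```python
from collections import Counter

# Hypothetical per-replica snapshots keyed by (table, index). Counter names
# mirror sys.dm_db_index_usage_stats columns; every value is invented.
primary = {
    ("Orders", "IX_Orders_CustomerID"): {
        "user_seeks": 120, "user_scans": 0, "user_lookups": 0, "user_updates": 300,
    },
    ("Orders", "IX_Orders_Status"): {
        "user_seeks": 0, "user_scans": 0, "user_lookups": 0, "user_updates": 300,
    },
    ("Orders", "IX_Orders_Legacy"): {
        "user_seeks": 0, "user_scans": 0, "user_lookups": 0, "user_updates": 300,
    },
}
secondary = {
    ("Orders", "IX_Orders_Status"): {
        "user_seeks": 0, "user_scans": 45, "user_lookups": 0, "user_updates": 0,
    },
}

READ_COUNTERS = ("user_seeks", "user_scans", "user_lookups")


def combine(*replicas):
    """Sum the usage counters for each index across all replicas given."""
    totals = {}
    for replica in replicas:
        for index, counters in replica.items():
            totals.setdefault(index, Counter()).update(counters)
    return totals


def unused(totals):
    """Return indexes with no reads at all in the combined counters."""
    return sorted(
        index
        for index, counters in totals.items()
        if sum(counters[name] for name in READ_COUNTERS) == 0
    )


combined = combine(primary, secondary)
unused_on_primary = unused(combine(primary))
unused_everywhere = unused(combined)

print(unused_on_primary)
print(unused_everywhere)
```

In this made-up data, IX_Orders_Status looks like a removal candidate when only the primary is considered, but the secondary's scans take it off the list once the counters are combined.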

Auditing Data and Data Access Quiz

Kenneth Fisher has a pop quiz for us:

I was honored to speak at PASS Summit last week (Thanks again Redgate), and if you’ve ever been to one of my sessions you’ll know there is always a “quiz” at the end, i.e. a crossword puzzle. Well... here is the puzzle itself, and attached (at the bottom) is the answer key.

I answered “c” for all of the questions and it worked out really well. When in doubt, Charlie out!

Fill Factor and When It Matters

Raul Gonzalez has a confession to make:

I love SQL Server internals, I do and I just said it.

Why? Because thanks to all the tools, documentation and community members that share their knowledge, folks like me can understand how a super complex piece of software like a relational database engine works (or at least a small part of it).

Click through for a discussion of fill factor and one area where Raul thinks it falls short. I’m not sure that I agree but would need to think about it to give a clear explanation as to why.
