Press "Enter" to skip to content

Category: Versions

sparklyr 1.6 Released

Carly Driggers announces a new release of sparklyr:

Sparklyr, an LF AI & Data Foundation Incubation Project, has released version 1.6! Sparklyr is an R Language package that lets you analyze data in Apache Spark, the well-known engine for big data processing, while using familiar tools in R. The R Language is widely used by data scientists and statisticians around the world and is known for its advanced features in statistical computing and graphics. 

Click through to see the changes.


Kafka Sans ZooKeeper

Ben Stopford and Ismael Juma give us a preview:

So we’re very pleased to say that the early access of the KIP-500 code has been committed to trunk and is expected to be included in the upcoming 2.8 release. For the first time, you can run Kafka without ZooKeeper. We call this the Kafka Raft Metadata mode, typically shortened to KRaft (pronounced like craft) mode.

Beware, there are some features that are not available in this early-access release. We do not yet support the use of ACLs and other security features or transactions. Also, both partition reassignment and JBOD are unsupported in KRaft mode (these are anticipated to be available in an Apache Kafka release later in the year). Hence, consider the quorum controller experimental software—we don’t advise subjecting it to production workloads. If you do try out the software, however, you’ll find a host of new advantages: It’s simpler to deploy and operate, you can run Kafka in its entirety as a single process, and it can accommodate significantly more partitions per cluster (see measurements below).

Read on for more information. This is a big deal for Kafka.
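
If you want to kick the tires yourself, the early-access workflow boils down to formatting a metadata log directory and then starting a single combined broker/controller process. Here's a rough sketch in Python that drives the scripts shipped with the 2.8 distribution; the KAFKA_HOME path and the sample KRaft properties file location are assumptions about a typical install, not anything from the announcement.

```python
import subprocess
from pathlib import Path

# Assumed install location of a Kafka 2.8 distribution; adjust to your environment.
KAFKA_HOME = Path("/opt/kafka_2.13-2.8.0")
BIN = KAFKA_HOME / "bin"
KRAFT_CONFIG = KAFKA_HOME / "config" / "kraft" / "server.properties"  # sample KRaft config shipped with 2.8

def run(*args: str) -> str:
    """Run one of the Kafka shell scripts and return its stdout."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout.strip()

# 1. Generate a cluster ID and format the metadata/log directories.
#    KRaft mode requires this step before the first start.
cluster_id = run(str(BIN / "kafka-storage.sh"), "random-uuid")
run(str(BIN / "kafka-storage.sh"), "format", "-t", cluster_id, "-c", str(KRAFT_CONFIG))

# 2. Start a single combined broker/controller process. Note: no ZooKeeper anywhere.
subprocess.Popen([str(BIN / "kafka-server-start.sh"), str(KRAFT_CONFIG)])
```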


Using Query Store to Track Regressions after Upgrades

Grant Fritchey has another use for Query Store:

There are a lot of uses for Query Store, but one of the most interesting is as an upgrade tool. We all know that upgrades in SQL Server can be more than a little bit nerve wracking. No matter how much you tested stuff in lower environments, deploying an update to production might result in performance issues as your code hits a regression. This is even more true when upgrading from versions of SQL Server prior to 2014 to anything 2014 and above. That’s because of the new cardinality estimation engine introduced in 2014. Most queries won’t notice it. Some queries will benefit from the better estimates. A few, problematic, queries will suffer. This is where Query Store can be used as an upgrade tool.

Read on to learn how.
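
Grant walks through the full process. As a rough illustration of the idea (not his exact scripts), the Python/pyodbc sketch below captures a baseline under the old compatibility level, flips to the new one, and then asks Query Store which queries got slower. The connection string, database, and target compatibility level are placeholders.

```python
import pyodbc

# Hypothetical connection string; point it at the database you're upgrading.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=MyDb;Trusted_Connection=yes;",
    autocommit=True,  # ALTER DATABASE can't run inside a multi-statement transaction
)
cur = conn.cursor()

# Turn on Query Store and capture a baseline on the old compatibility level,
# then move to the new level so the post-2014 cardinality estimator takes over.
cur.execute("ALTER DATABASE CURRENT SET QUERY_STORE = ON;")
cur.execute("ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;")

# After the workload has run for a while, flag queries whose plans span both
# compatibility levels and whose average duration moved the most.
cur.execute("""
    SELECT TOP (20)
           q.query_id,
           MAX(rs.avg_duration) - MIN(rs.avg_duration) AS duration_delta_us
    FROM sys.query_store_query AS q
    JOIN sys.query_store_plan AS p  ON p.query_id = q.query_id
    JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
    GROUP BY q.query_id
    HAVING COUNT(DISTINCT p.compatibility_level) > 1
    ORDER BY duration_delta_us DESC;
""")
for query_id, delta_us in cur.fetchall():
    print(f"query {query_id}: avg duration changed by {delta_us} microseconds")
```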


Columnstore in Standard Edition

Erik Darling looks at how powerful (or not) columnstore indexes are in SQL Server Standard Edition:

The top plan is from Standard Edition, and runs for a minute in a full serial plan. There is a non-parallel plan reason in the operator properties: MaxDOPSetToOne.

I do not have DOP set to one anywhere, that’s just the restriction kicking in. You can try it out for yourself if you have Standard Edition sitting around somewhere. I’m doing all my testing on SQL Server 2019 CU9. This is not ancient technology at the time of writing.

The bottom plan is from Enterprise/Developer Edition, where the plan is able to run partially in parallel, and takes 28 seconds (about half the time of the serial plan).

You get what you pay for in this case.
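
If you want to confirm the restriction on your own Standard Edition instance, the non-parallel plan reason shows up in the actual execution plan XML. Here's a hedged Python/pyodbc sketch (the connection string and table name are made up) that pulls the plan and looks for MaxDOPSetToOne.

```python
import pyodbc

# Hypothetical connection string and table; the table is assumed to have a columnstore index.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=MyDb;Trusted_Connection=yes;",
    autocommit=True,
)
cur = conn.cursor()

print(cur.execute("SELECT SERVERPROPERTY('Edition');").fetchone()[0])

# Ask for the actual execution plan, run a columnstore-backed query, then read the
# showplan XML that comes back as an extra result set.
cur.execute("SET STATISTICS XML ON;")
cur.execute("SELECT COUNT(*) FROM dbo.SomeColumnstoreTable;")
cur.fetchall()          # the query's own results
cur.nextset()           # advance to the showplan result set
plan_xml = cur.fetchone()[0]

if 'NonParallelPlanReason="MaxDOPSetToOne"' in plan_xml:
    print("Plan forced serial: the Standard Edition columnstore DOP restriction is in effect.")
else:
    print("No MaxDOPSetToOne reason found (e.g. Enterprise/Developer Edition).")
```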


Apache Spark 3.1 Released

Hyukjin Kwon, et al., announce Apache Spark 3.1:

Various new SQL features are added in this release. The widely used standard CHAR/VARCHAR data types are added as variants of the supported String types. More built-in functions (e.g., width_bucket (SPARK-21117) and regexp_extract_all (SPARK-24884)) were added. The current number of built-in operators/functions has now reached 350. More DDL/DML/utility commands have been enhanced, including INSERT (SPARK-32976), MERGE (SPARK-32030) and EXPLAIN (SPARK-32337). Starting from this release, in Spark WebUI, the SQL plans are presented in a simpler and structured format (i.e., using EXPLAIN FORMATTED).

There have been quite a few advancements around the SQL side.
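
As a quick taste of the additions called out above, here's a small PySpark sketch that exercises width_bucket and regexp_extract_all (using the values from the documented examples) and then asks for a plan via EXPLAIN FORMATTED.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-3-1-sql-features").getOrCreate()

# width_bucket (SPARK-21117): bucket a value into N equal-width buckets over a range.
spark.sql("SELECT width_bucket(5.3, 0.2, 10.6, 5) AS bucket").show()

# regexp_extract_all (SPARK-24884): return every regex match, not just the first.
spark.sql(r"SELECT regexp_extract_all('100-200, 300-400', '(\\d+)-(\\d+)', 1) AS starts").show()

# EXPLAIN FORMATTED: the structured plan output now also surfaced in the Spark WebUI SQL tab.
spark.sql("EXPLAIN FORMATTED SELECT width_bucket(id, 0, 100, 10) FROM range(100)").show(truncate=False)
```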


The Editions of PowerShell

Jeffrey Hicks gives us an update on the PowerShell landscape:

The PowerShell community is beginning another year in the world of PowerShell 7. Most of you know what that means. However, there are newcomers to our community practically every day. Or I know there are occasional or reluctant users who might not pay enough attention to understand the world of PowerShell as it stands today. I wrote this post as a kind of virtual sticky note for the PowerShell community. Feel free to reference this post in your own work so that you don’t have to explain or define “Windows PowerShell” and “PowerShell”.

Click through to learn how to differentiate the two.
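
The short version of the split: Windows PowerShell (powershell.exe, PSEdition "Desktop", version 5.1) versus PowerShell 7 (pwsh, PSEdition "Core"). One quick way to see which of the two a machine has is to ask each executable for its edition; here's a small sketch that shells out to both, assuming they're installed and on the PATH.

```python
import shutil
import subprocess

# Windows PowerShell ships as powershell.exe and reports PSEdition "Desktop";
# PowerShell 7 ships as pwsh and reports PSEdition "Core".
for exe in ("powershell", "pwsh"):
    if shutil.which(exe) is None:
        print(f"{exe}: not found on this machine")
        continue
    result = subprocess.run(
        [exe, "-NoProfile", "-Command", "$PSVersionTable.PSEdition"],
        capture_output=True, text=True,
    )
    print(f"{exe}: PSEdition = {result.stdout.strip()}")
```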


Security Update for SQL Server

Randolph West takes a look at a patch:

Microsoft announced updates today for all supported versions of SQL Server, for a privilege escalation vulnerability that leverages Extended Events. For security reasons no further details have been provided, but you can expect more information in the near future, now that this update is public.

Be sure to grab the latest update for your version of SQL Server.


December 2020 SQL Tools Releases

Drew Skwiers-Koballa gives us an update on where SQL Server tooling is at:

The December releases of Azure Data Studio 1.25 and SQL Server Management Studio (SSMS) 18.8 are now generally available.  Additionally, the mssql extension for Visual Studio Code has recently been updated to version 1.10.0. Read on to learn more about each of these updates and grab the latest versions of SSMS, Azure Data Studio, or the mssql extension for VS Code.

Read on to learn more.


Apache Flink 1.12.0 Released

Marta Paes and Aljoscha Krettek announce a new release of Apache Flink:

– The community has added support for efficient batch execution in the DataStream API. This is the next major milestone towards achieving a truly unified runtime for both batch and stream processing.

– Kubernetes-based High Availability (HA) was implemented as an alternative to ZooKeeper for highly available production setups.

– The Kafka SQL connector has been extended to work in upsert mode, supported by the ability to handle connector metadata in SQL DDL. Temporal table joins can now also be fully expressed in SQL, no longer depending on the Table API.

– Support for the DataStream API in PyFlink expands its usage to more complex scenarios that require fine-grained control over state and time, and it’s now possible to deploy PyFlink jobs natively on Kubernetes.

Read on for more details on these as well as other changes.
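
On the PyFlink front, a minimal DataStream job in Python looks roughly like the sketch below. This assumes PyFlink 1.12 installed from pip, and it uses the print sink (assumed available in your build) purely to keep the example self-contained.

```python
from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.functions import MapFunction

class Double(MapFunction):
    """A trivial transformation, just to exercise the Python DataStream API."""
    def map(self, value):
        return value * 2

env = StreamExecutionEnvironment.get_execution_environment()

# Build a tiny bounded stream, transform it, and send the results to the print sink.
stream = env.from_collection([1, 2, 3, 4, 5], type_info=Types.INT())
stream.map(Double(), output_type=Types.INT()).print()

env.execute("pyflink_datastream_sketch")
```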
