Press "Enter" to skip to content

Curated SQL Posts

Always Encrypted with Secure Enclaves

Ned Otter has started a new series on Always Encrypted with Secure Enclaves in SQL Server 2019:

SQL 2019 supports an enhanced version of Always Encrypted, known as “Secure Enclaves”. What is an enclave? It’s like a consulate: “….a state that is enclosed within the territory of another state”.

It takes the form of a protected region of memory within the SQL Server environment, requiring special credentials for access. Data in the secure enclave lives in an unencrypted state.

However, as I’ll discuss later in this series, depending on how your organization implements Always Encrypted with Secure Enclaves, it might not be as secure as you had hoped.

That’s pretty ominous. The first post is a fairly high-level overview that gets you familiar with enclaves.
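
If you want to see where your own instance stands, the enclave type is exposed as a server configuration option. A minimal T-SQL sketch, assuming a SQL Server 2019 instance (a value of 1 means a VBS enclave is enabled):

-- Check whether an enclave is configured on this instance.
SELECT [name], [value], [value_in_use]
FROM sys.configurations
WHERE [name] = 'column encryption enclave type';

-- Turning it on is an sp_configure call, followed by a restart of the instance.
EXEC sp_configure 'column encryption enclave type', 1;
RECONFIGURE;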

Why So Few Columnstore Indexes Around?

Grant Fritchey has a bit of a rant about people not using Columnstore indexes as much as they should:

It was already common knowledge that columnstore indexes didn’t work for most of us.

Fact is, that’s not true. Now that we have clustered columnstore and non-clustered columnstore, you can go nuts. Most of your data access is through analytical channels? Awesome, use a clustered columnstore. Sometimes though, you need point lookups. Not a problem, add a nonclustered b-tree index to the clustered columnstore. Go here to learn more about Columnstore Indexes.

In short, today, we can completely orient our data storage with our principal data access. Yet, most people are not using these things at all.

One of my interview questions is about columnstore indexes. I’ve learned to preface it with “What’s the latest version of SQL Server you’ve worked with?” A lot of people answer 2012. Even among the people who use 2016, the normal answer is that they haven’t learned about columnstore yet. And that goes back to the learning gap Grant describes: it’s not that hard to grab a book on SQL Server 2019, spin up a Docker container, and dive in. Or watch a course, spin up a Docker container, and follow along. Or read a blog post, spin up a Docker container, and…well, you get the idea.
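
For reference, the pattern Grant describes takes only a couple of statements. A minimal sketch against a hypothetical dbo.Sales table (the table and column names are made up for illustration):

-- Store the whole table as a clustered columnstore index for analytical scans.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales
    ON dbo.Sales;

-- Layer a nonclustered b-tree index on top for point lookups (SQL Server 2016 and later).
CREATE NONCLUSTERED INDEX IX_Sales_OrderID
    ON dbo.Sales (OrderID);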

When Missing Index Requests Go Missing

Erik Darling doesn’t want you to have to put your missing index requests on the back of a milk carton:

Rebuilding indexes will clear missing index requests. So before you go Hi-Ho-Silver-Away rebuilding every index the second an iota of fragmentation sneaks in, think about the information you’re clearing out every time you do that.

You may also want to think about Why Defragmenting Your Indexes Isn’t Helping, anyway. Unless you’re using Columnstore.

There are several reasons why you might not get a missing index request, and Erik enumerates them for us.
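
If you want to see what’s sitting in the missing index DMVs before a rebuild clears it out, a quick sketch like this works (these DMVs are standard; the ranking is just one reasonable choice):

-- What missing index requests does SQL Server currently have?
SELECT d.[statement] AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
ORDER BY s.avg_user_impact * s.user_seeks DESC;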

Data Copy & Package Execution in ADF

Cathrine Wilhelmsen continues a series on Azure Data Factory. First, we get to see how to copy data from on-prem SQL Servers:

In the previous post, we looked at the three different types of integration runtimes. In this post, we will first create a self-hosted integration runtime. Then, we will create a new linked service and dataset using the self-hosted integration runtime. Finally, we will look at some common techniques and design patterns for copying data from and into an on-premises SQL Server.

And when I say “on-premises”, I really mean “in a private network”. It can either be a SQL Server on-premises on a physical server, or “on-premises” in a virtual machine.

Then, we learn how to run SSIS packages in Azure Data Factory:

Two posts ago, we looked at the three types of integration runtimes and created an Azure integration runtime. In the previous post, we created a self-hosted integration runtime for copying SQL Server data. In this post, we will complete the integration runtime part of the series. We will look at what SSIS Lift and Shift is, how to create an Azure-SSIS integration runtime, and how you can start executing SSIS packages in Azure Data Factory.

And with this post, that’s all three integration runtime types covered.

Problems with sp_estimate_data_compression_savings

Andy Mallon knows it’s getting close to Festivus, and he has some grievances to air:

If you’re working with compressed indexes, SQL Server provides a system stored procedure to help test the space savings of implementing data compression: sp_estimate_data_compression_savings. Starting in SQL Server 2019, it can even be used to estimate savings with columnstore.

I really don’t like sp_estimate_data_compression_savings. In fact, I kind of hate it. It’s not always very accurate–and even when it is accurate, the results can be misleading. Before I get ranty about why I don’t like it, let’s look at it in action.

Andy makes good points in this post, so check it out.
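
For reference, the call itself is simple, which is part of why it’s so tempting. A minimal sketch against a hypothetical dbo.Sales table (NULL for @index_id and @partition_number covers all indexes and partitions):

-- Estimate PAGE compression savings for every index on a hypothetical table.
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'Sales',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';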

Schiphol Takeoff: Low-Code Automated Deployment

Tim van Cann and Daniel van der Ende have an open source project for automatic deployment on Azure:

To give a bit more insight into why we built Schiphol Takeoff, it’s good to take a look at an example use case. This use case ties a number of components together:

– Data arrives in a (near) real-time stream on an Azure Eventhub.
– A Spark job running on Databricks consumes this data from Eventhub, processes the data, and outputs predictions.
– A REST API is running on Azure Kubernetes Service, which exposes the predictions made by the Spark job.

Conceptually, this is not a very complex setup. However, there are quite a few components involved:

– Azure Eventhub
– Azure Databricks
– Azure Kubernetes Service

Each of these individually has some form of automation, but there is no unified way of coordinating and orchestrating deployment of the code to all at the same time. If, for example, you were to change the name of the consumer group for Azure Eventhub, you could script that. However, you’d also need to manually update your Spark job running on Databricks to ensure it could still consume the data.

This looks pretty nice. I’ll need to dive into it some more.

New Features in Kafka 2.4

Manikumar Reddy announces new features in Apache Kafka 2.4:

KIP-392: Allow consumers to fetch from closest replica

Historically, consumers were only allowed to fetch from leaders. In multi-datacenter deployments, this often means that consumers are forced to incur expensive cross-datacenter network costs in order to fetch from the leader. With KIP-392, Kafka now supports reading from follower replicas. This gives the broker the ability to redirect consumers to nearby replicas in order to save costs.

It’s not the biggest release of Kafka ever, but there are some really nice updates here.

Testing Power BI Report Performance in the Browser

Chris Webb continues a series on testing Power BI report performance in a browser. Part 2 walks us through some of the mechanics of the process:

Before you publish your report, in Power BI Desktop add a blank page with no visuals on to it. It doesn’t need to be the page that is opened when the report opens and you will be able to delete it later. Why do this? When you’re testing how long it takes for your report page to render, you’re probably doing so because you want to improve performance. Some things in the report page that influence performance you have the power to change, such as the design of the dataset, the DAX in the measures, the number and type of visuals on a page; some things will always happen when a report runs and you have to accept that overhead. Testing how long a blank page takes to render will give you an idea of how long this latter category of “things that always happen” takes, and you can subtract this time from the time your chosen report page takes to run.

Part 3 is a demonstration of the process:

…so you go ahead and publish. You view the report after publishing and it still seems fast. Then the complaints start coming in: the report is slow!?! It seems to be users who are viewing the report on their phone who are having the most problems. So, following the instructions in my last post, you open up Chrome DevTools and run an audit using a simulated slow 4G connection:

That’s an important part of testing. We normally develop on a fast network, but our users may be on rather slow ones.

Copy Reports with Shared Data Sets Between Workspaces

Gilbert Quevauvilliers found an easier way to handle a cross-workspace scenario:

I was working on some documentation for a customer and I came across a very quick and easy way to create a copy of a report, which also creates a connection to the shared dataset, that I could then copy to another “New Workspace”.

Before I found out this gem, I had to do this manually via PowerShell, which worked really well, but I had to do a whole lot of extra work to find the GUIDs, then test it and make sure it worked. With this new method it is simple and quick. It is a WIN-WIN.

You can follow along as I show you how to do it below.

Click through for the demonstration.
