Press "Enter" to skip to content

Author: Kevin Feasel

Comparing Column-Level Encryption to Always Encrypted

Tom Collins compares and contrasts:

A common question from developers & data owners  is what benefits does Always Encrypted offer over column level encryption  aka cell level encryption?    First thing to understand is what are the basic differences between the two methods – Column-level encryption vs Always encrypted

For as much as I appreciate Always Encrypted, it seems I use column-level encryption about an order of magnitude more often.

Comments closed

Expanding Column Width in Powershell Results

Kenneth Fisher supersizes the screen:

Notice the ellipsis (the three dots). That’s showing us that the name was too long and ended up being truncated. Given that I’ve been doing this for a little while now I’m almost completely certain that if I send this as it is the users are going to want to know full names. And with my luck I’ll end up having to give them each truncated string individually. On the theory that if I have time to do it twice I probably have time to do it right the first time, let’s figure out how to expand the columns. Fortunately, as with most things Powershell, there’s a cmdlet for that.

Read on to see what the process looks like.

Comments closed

Buffer Pool Parallel Scans in SQL Server 2022

David Pless talks about an internal optimization in SQL Server 2022:

Operations such as database startup/shutdown, creating a new database, file drop operations, backup/restore operations, Always On failover events, DBCC CHECKDB and DBCC Check Table, log restore operations, and other internal operations (e.g., checkpoint) will all benefit from Buffer Pool Parallel Scan.

In SQL Server 2019 and previous releases, operations that require scanning the buffer pool can be slow, especially on large memory machines such as the M-series Azure SQL virtual machine and large on-premises SQL Server environments. Even log restore operations and availability group failover operations can be impacted. Currently, there’s no way to eliminate this issue prior to SQL Server 2022, and dropping buffers using DBCC DROPCLEANBUFFERS would likely result in some degree of performance degradation as any subsequent query executions will have to reread the data from the database files increasing I/O.

Read on to understand why these operations can be slow on high-memory boxes and how much of a benefit you might get on certain administrative activities.

Comments closed

Visualizing Kafka Stream Lineage

David Araujo and Julia Peng show off stream lineage in Confluent Cloud:

Stream Lineage is a tool Confluent built to address the lack of data visibility in Kafka and event-driven architectures. Confluent’s Stream Lineage provides an interactive map of all your data flows that enable users to:

1. Understand what data flows are running both now or at any point in the past

2. Trace where each data flow originated from

3. Track how data is transformed along its journey

4. Observe where each data flow ends up

Read on to see how it works.

Comments closed

The Power of Hash Match Joins

Jared Poche looks at hash matching:

When I began working at Microsoft, I was very much a novice at performance troubleshooting. There was a lot to learn, and hash match joins were pointed out to me multiple times as the potential cause for a given issue. So, for a while I had it in my head, “hash match == bad”. But this really isn’t the case.

Hash matches aren’t inefficient; they are the best way to join large result sets together. The caveat is that you have a large result set, and that itself may not be optimal. Should it be returning this many rows? Have you included all the filters you can? Are you returning columns you don’t need?

I might throw in one caveat about hash match joins and being the best performers for two really large datasets joining together: merge join can be more efficient so long as both sets are guaranteed to be ordered in the same way without an explicit sort operator. That last clause is usually the kicker.

Comments closed

Ordered Columnstore Indexes in SQL Server 2022

Brent Ozar appreciates order:

So essentially, every column has a whole bunch of indexes on it.

But there’s no order whatsoever as to which rows end up in which index.

This isn’t a problem for relatively small tables, but as you get to billion-row data warehouse fact tables where columnstore should really shine, performance gradually degrades. In data warehouses, fact tables often have a commonly filtered column, like SaleDate. However, until SQL Server 2022, even if you wanted a small SaleDate range, your query would likely check hundreds or thousands of row groups, each of which had a huge range of data.

But do read the whole thing, as it seems it’s not working correctly in CTP 2.0 of SQL Server 2022. It is quite useful in Azure Synapse Analytics dedicated SQL pools, at least—that I can confirm.

Comments closed

Removing a Data Disk from a Running Azure VM

Joey D’Antoni tightrope walks without a net for fun:

I was working with a client recently, were we had to reconfigure storage within a VM (which is always a messy proposition). In doing so, we were adding and removing disks from the VM. this all happened mostly during a downtime window, so it wasn’t a big deal to down a VM, which is how you can remove a disk from a VM via the portal. However, upon further research, I learned that through the portal you can remove a disk from a running VM.

Read on to see how. Though I’d generally still recommend shutting the VM off first just to be sure.

Comments closed

Refreshing Power BI Usage Metrics Report Credentials

Stephanie Bruno fixes a credentials issue:

Have you ever had a Usage Metrics Report just stop refreshing? At the bottom you’ll see a little message that tells you to check the credentials. But you don’t have access to the Usage Metrics Report dataset in the list of datasets in the workspace because it’s hidden! So even if you wanted to delete it and nicely ask it to start again, you can’t do it from the workspace. What can you do? Well, you can delete the Usage Metrics Report dataset with the Power BI Rest API.

Read on to see how.

Comments closed