Press "Enter" to skip to content

Author: Kevin Feasel

Inserting into Azure Blob Storage from SQL Server 2022

I continue a series on data virtualization in SQL Server 2022:

Several years ago, I wrote a blog post on how to insert data into Azure Blob Storage from SQL Server using PolyBase. That technique used PolyBase V1: the Java connector for Hadoop. With SQL Server 2022 eliminating that connector, we’re going to learn the new method.

This is one of the larger practical differences in data virtualization with SQL Server 2022.

Comments closed

SQL Server 2022 Query Store Hints

David Pless takes us through some new query hints:

Query Store hints provide a direct method for developers and DBAs to shape query plans without changing application code.  

Query Store hints are a new feature that extends the power of Query Store—but this means that Query Store hints does require the Query Store feature to be enabled and that your query and query plan are captured in the Query Store.

Just like plan guides, Query Store hints are persisted and will survive restarts, but Query Store hints are much easier to use than plan guides.

Read on to see which options are available and how they work.

Comments closed

Backup Options for Cosmos DB

Manvendra Singh takes a backup:

This article will explore backup options available in the Azure Cosmos DB service. Backups are very important to safeguard our data in case of data corruption, data deletion, system failure, or any unforeseen circumstances like DR. We have planned, configured, and managed it for our on-prem databases whether it is SQL Server, Oracle, DB2, or system files on various machines. DBAs and Infrastructure admins have ensured to keep a backup of all these systems to safeguard their data. Similarly, we must also secure our data hosted in a cloud environment for any services whether it is Azure VMs, Azure SQL, Cosmos Db accounts, or any other services. Today we will talk about backup options available to secure cosmos DB databases and their contents.

Click through for those two options.

Comments closed

Oracle on Azure Frequently Asked Questions

Kellyn Pot’vin-Gorman spreads information:

A lot of DBAs aren’t as familiar with Oracle DataGuard as many would think.  Even though it’s a phenomenal product, they may have never used it, so knowing the ins and outs of the best Oracle product to use with Oracle on Azure is important.

I highly recommend the following documentation and guidelines from Oracle.  The Product team in charge of DataGuard is fantastic at Oracle, so why go anywhere else to learn about this?

Oracle Data Guard Concepts and Administration, 19c

If you are in the situation where you’re thinking about moving your Oracle servers to Azure, this is a good starting point.

Comments closed

Inverted Indexes for Full-Text Search

Maria Zakourdaev twists some text inside-out:

Sometimes there are properties in the document with unstructured text, like newspaper articles, blog posts, or book abstracts. The inverted index is easy to build and is similar to data structures search engines use.

Such document structures can help in various complex search patterns, like common word detection, full-text searches, or document similarity searches, using humming distance or l2distance algorithms. Inverted indexes are useful when the number of keywords is not too large and when the existing data is either totally immutable or rarely changed, but frequently searched.

This post and Maria’s MSSQLTips post both cover the high-level concept, focusing on tradeoffs between different data models. I like this sort of idea a lot and like telling people that sometimes, the right answer in a relational database involves thinking backwards.

Comments closed

The Joy of Treemaps

Simon Rowe answers describes one of my favorite often-inappropriate visuals:

Dr Shneiderman developed the “treemap” in order to visualise this large amount of data—with multiple levels of folders and subfolders—in an efficient way, without taking up too much screen real estate. The treemap uses a series of nested rectangles, sized proportionally to the corresponding data value, to deliver an organised and multi-level view into any hierarchical data set.

Treemaps get misused a lot but are really valuable in specific scenarios. Click through to learn when (and when not) to use a treemap.

Comments closed

Visualizing Delay Times on Subway Stations

Benjamin Smith looks for delays:

Any Torontonian who has commuted regularly on the TTC has probably experienced their fair share of delays on the subway. Having experienced a few recently I was inspired to visualize the average delay times across all stops on the subway. What are the stations with the longest delays on average this past year? Could we make a nice visual with it?

Click through for the end result as well as the process to get there.

Comments closed

PolyBase and Named Instances

I show how to connect to a named instance using PolyBase in SQL Server 2019 or 2022:

We have two SQL Server instances running on the same machine. Before we get started, I do want to point out one thing: PolyBase can only work on one instance for a given server (physical machine or virtual machine) because the PolyBase engine and data movement services are system-level services. This means you cannot have PolyBase installed on your main instance as well as your named instance.

Click through for two methods.

Comments closed

Building UNPIVOT Syntax

Michael J. Swart has a function:

Just like PIVOT syntax, UNPIVOT syntax is hard to remember.
When I can, I prefer to pivot and unpivot in the application, but here’s a function I use sometimes when I want don’t want to scroll horizontally in SSMS.

Click through for the function. This is an area where I wish there was built-in * logic for PIVOT and UNPIVOT. Or at least a “Select all columns but the following” as that would make things easier.

Comments closed

Lock Escalation Thresholds

Paul White gets into the weeds:

This article isn’t about the act of lock escalation itself, which is already well documented and generally well understood. Some myths (like row locks escalating to page locks) persist, but challenging those yet again probably wouldn’t change much.

Instead, the question I’ll address here is exactly how and when lock escalation is triggered. Much of the documentation is incorrect or at least imprecise about this and I’ve been unable to find a correct description in other writings.

There are good reasons you haven’t seen a simple demo of lock escalation taking place at 5000 locks. I’ve seen suggestions such as lock escalation isn’t deterministic, or some types of locks don’t count toward the 5000 lock threshold. Neither of those assertions is true, but the details are interesting, as I’ll explain.

As always, Paul brings clarity to a difficult topic.

Comments closed