Curated SQL Posts

Version Store and ONLINE Operations

Josh Darnell takes us through how SQL Server manages ONLINE = ON operations (such as index building and rebuilding) using the version store:

The votes table has about 10 million rows in it, so this takes a bit of time (10-15 seconds if nothing else is happening). If I check sys.dm_tran_version_store_space_usage and sp_WhoIsActive, I can see that:

– the version store is not growing, and
– the ALTER statement is chugging along making progress

There are costs to setting ONLINE = ON. I think they’re almost always worth it, but it’s important to remember that they are there.

The Costs of Bad Statistics

Monica Rathbun explains what happens when statistics go wrong:

Overestimation of Rows (Estimated > Actual) leads to:

– Selection of parallel plan when a serial plan might be more optimal
– Inappropriate join strategy selections
– Inefficient Index Navigation (scan versus seek)
– Inflated Memory Grants

Read the whole thing. The optimizer doesn’t get to look at actual data when determining plans (save for something like adaptive joins, but that’s pretty rare), so statistics are its link to reality.
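
To make the terminology concrete, here is a minimal Python sketch (not anything from Monica's post) that labels an estimate by comparing the estimated row count against the actual one. An overestimate here means the estimated count is well above the actual count. The function name and the 10x tolerance are assumptions chosen for illustration, not thresholds SQL Server uses.

```python
def classify_estimate(estimated_rows, actual_rows, tolerance=10.0):
    """Label a cardinality estimate by comparing it with the actual row count.

    `tolerance` is a hypothetical cutoff: a ratio beyond it in either
    direction is flagged as a misestimate.
    """
    # Clamp to 1 so a zero row count does not cause a division by zero
    ratio = max(estimated_rows, 1) / max(actual_rows, 1)
    if ratio >= tolerance:
        return "overestimate"    # estimated far above actual
    if ratio <= 1 / tolerance:
        return "underestimate"   # estimated far below actual
    return "reasonable"


# Made-up row counts for illustration
too_high = classify_estimate(estimated_rows=100_000, actual_rows=100)
too_low = classify_estimate(estimated_rows=100, actual_rows=100_000)
close = classify_estimate(estimated_rows=120, actual_rows=100)
```

In practice you would pull the two numbers from an actual execution plan rather than typing them in.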

Strong and Weak Power BI Relationships

Alberto Ferrari takes us through the two different kinds of relationships in Power BI:

A relationship in a Tabular model can be strong or weak. In a strong relationship the engine knows that the one-side of the relationship contains unique values. If the engine cannot ensure that the one-side of the relationship contains unique values for the key, then the relationship is weak. A relationship can be weak because either the engine cannot ensure the uniqueness of the constraint – due to technical reasons we outline later – or the developer defined it as such. A weak relationship is not used as part of table expansion. Let us elaborate on this.

Something I’d like to see improved in Power BI is to differentiate strong versus weak relationships in the UI. Having no way to differentiate is okay if you only have a few tables or if you designed everything, but coming in late and reviewing a big model, it’s annoying to double-click each link to see if it’s strong or weak.

Starting SQL Server in Single-User Mode

Ranga Babu has a few methods for starting SQL Server in single-user mode:

It is advisable to use SQLCMD when you want to query a SQL Server instance that is started in single-user mode, because connecting directly and querying through SQL Server Management Studio uses more than one connection. To query SQL Server in single-user mode using SQL Server Management Studio, open SQL Server Management Studio, and do not connect to SQL Server directly. Close the connection window and click on New Query as shown in the image below, which opens a query editor in SQL Server Management Studio:

I recommend practicing this a few times, as the only time you’d actually start SQL Server in single-user mode is during an emergency and that means people breathing down your neck (figuratively if not literally).

Getting to Basics with Excel Charts

Alex Velez removes junk from Excel charts:

Custom chart templates aren’t a new feature, but I’m not sure how widely known they are. In a guest post, Bill Dean briefly recommended using these to create a non-standard Excel chart, The Bullet Graph. Another use-case is to create what I call a “clean-slate-template.” This is a chart template that incorporates many best practices and allows you—the creator—to focus on the strategic use of color and words while saving time on formatting.

This is nice because it eliminates the need to click-click-click on every chart, removing the same things over and over.

Spark Streaming DStreams

Manish Mishra explains the fundamental abstraction of Spark Streaming:

Before going into details of the operations available on the DStream API, let us look at the input sources from which we can start a stream. There are multiple ways to get inputs, e.g. Kafka, Flume, or simple idle files. To get the details on the available input sources supported by Spark, you can refer to this section. As part of this blog, we will take the example of Kafka.

Read on to see an example of pulling data from Kafka and converting inputs into microbatches.
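
As a rough picture of what "microbatches" means, here is a small sketch in plain Python. It does not use the Spark API or Kafka; it only shows the idea of slicing a stream of timestamped records into fixed-width batches, one batch per interval. The function name, the sample events, and the two-second interval are all made up for illustration.

```python
from collections import defaultdict


def to_microbatches(records, batch_interval):
    """Group (timestamp, value) records into fixed-width batches by timestamp."""
    batches = defaultdict(list)
    for timestamp, value in records:
        # Every record falls into the batch whose window contains its timestamp
        batch_start = int(timestamp // batch_interval) * batch_interval
        batches[batch_start].append(value)
    # Return batches in time order as (window start, values) pairs
    return [(start, batches[start]) for start in sorted(batches)]


# Hypothetical events: (seconds since start, payload)
events = [(0.5, "a"), (1.2, "b"), (2.1, "c"), (3.9, "d"), (4.0, "e")]
batches = to_microbatches(events, batch_interval=2)
# batches == [(0, ['a', 'b']), (2, ['c', 'd']), (4, ['e'])]
```

Spark Streaming does this slicing for you as data arrives; the sketch only illustrates the shape of the result.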

Why Root Containers are Troublesome

Andrew Pruski explains to us why it can be bad to have a container user running as root:

Recently I noticed that Microsoft uploaded a new dockerfile to the mssql-docker repository on GitHub. This dockerfile was under the mssql-server-linux-non-root directory and (you guessed it) allows SQL Server containers to run as non-root.

But why is running a container as root bad? Let’s run through an example.

Just as with physical devices and VMs before them, Docker containers can do a lot of damage if you’re logged in as root.

VM Storage Performance in the Cloud

Joey D’Antoni explains how storage architecture has changed from on-prem to the cloud:

This architecture design dates back to when a storage LUN was literally built from a few disks, and we wanted to ensure that there were enough I/O operations per second to service the needs of the SQL Server, because we only had the available I/O of a few disks.

As virtualization became popular, storage architectures changed and a SAN LUN was carved out into many small extents (typically 512 KB to 1 MB, depending on vendor) across the entire array. What this meant was that with modern storage there was no need to separate log and data files. Some DBAs still did, but in an on-premises world there was no penalty for this.

It’s important to keep up on these changes.

Converting JSON to Result Sets

Jack Vamvas shows how you can import data in JSON format and get tabular data in SQL Server:

It is possible to read a JSON file using T-SQL. There are a number of different methods. By using the OPENROWSET functionality and the ISJSON and OPENJSON functions, you can quickly read the file, check if the JSON is valid, and then unpack the JSON into a SQL table.

Read on for an example. This also performs reasonably well in practice, at least in my experience.
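
Jack's example is in T-SQL. For anyone who wants the shape of the idea outside the database, here is a loose Python analogue of the same three steps: take the text, check that it parses as JSON, and unpack it into rows. This is a sketch under my own assumptions, not a translation of OPENROWSET, ISJSON, or OPENJSON; the function name and the sample document are hypothetical.

```python
import json


def json_to_rows(text, columns):
    """Validate JSON text and unpack an array of objects into row tuples.

    Returns None when the text is not valid JSON or is not an object/array.
    """
    try:
        parsed = json.loads(text)  # validity check, roughly the job ISJSON does
    except json.JSONDecodeError:
        return None
    if isinstance(parsed, dict):
        parsed = [parsed]          # treat a single object as one row
    if not isinstance(parsed, list):
        return None
    # Keys missing from an object become None, much like NULL in a result set
    return [
        tuple(item.get(col) for col in columns)
        for item in parsed
        if isinstance(item, dict)
    ]


# Hypothetical sample; in practice this text would be read from a file
sample = '[{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob", "city": "Oslo"}]'
rows = json_to_rows(sample, ["id", "name", "city"])
# rows == [(1, 'Ann', None), (2, 'Bob', 'Oslo')]
```

The explicit column list is a deliberate choice here: naming the columns up front keeps every row the same width even when objects carry different keys.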

Non-Root SQL Server 2019 Containers

Vin Yu announces a change to Microsoft’s container configuration for SQL Server 2019:

The application process within most Docker containers is running as a root user meaning the process has root privileges within the container user space. The root user within the container is also the same root (uid 0) on the host machine, and if the user can break out of the container, they would have root permissions on the host. Running as root is convenient for development, testing and CI/CD use cases but for production use cases, it is safest to run SQL Server as a non-root process within the container. In this blog, we’re going to share with you how you can preview this upcoming improvement by creating your own non-root SQL Server container.

Vin has a quick demonstration of how it works.
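
Separately from Vin's demonstration, the "uid 0" point is easy to check from inside any process. Below is a minimal Python sketch of a startup guard that refuses to run as root unless explicitly allowed. It is an illustration of the idea only, not how the SQL Server image works; the function names and the example uid 10001 are assumptions.

```python
import os

ROOT_UID = 0  # uid 0 is root on Linux


def is_root(uid):
    """Return True when the given user id is root."""
    return uid == ROOT_UID


def startup_check(uid, allow_root=False):
    """Return a status message, or raise if running as root is not allowed."""
    if is_root(uid) and not allow_root:
        raise PermissionError("refusing to start as root (uid 0)")
    return f"starting as uid {uid}"


# Hypothetical non-root uid for illustration
message = startup_check(10001)

if __name__ == "__main__":
    # os.geteuid() is available on Unix-like systems only
    print(startup_check(os.geteuid(), allow_root=True))
```

A guard like this would be a belt-and-braces addition; the real fix is building and running the image with a non-root user in the first place.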
