Press "Enter" to skip to content

Author: Kevin Feasel

Replication Publisher To Azure SQL DB

Jes Borland continues her series on transactional replication from on-prem SQL Server + Availability Groups into Azure SQL Database:

After initializing, check the Snapshot Agent and Log Reader Agent for success. (To do so, go to Replication, right-click the publication name, and select Snapshot Agent Status and Log Reader Agent Status.) I ran into problems with the Snapshot account not having high enough permissions in the databases (it needs db_owner), and then not having enough permissions on the snapshot folder (it needs Full). (This forum post, answered by Hilary Cotter, helped: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/899857db-e38e-4026-a34c-2a8c2628c6fc/access-denied-to-sql-replication-snapshot-folder?forum=sqlreplication.)

Except for the final section, it’s pretty much the same as dealing with on-prem SQL Server sans Availability Groups.
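
If you hit the same permission errors, the fix looks something like the following (the account, database, and share names here are hypothetical):

    -- Grant the Snapshot Agent account db_owner in the published database.
    USE MyPublishedDB;
    GO
    CREATE USER [DOMAIN\SnapshotAgentSvc] FOR LOGIN [DOMAIN\SnapshotAgentSvc];
    ALTER ROLE db_owner ADD MEMBER [DOMAIN\SnapshotAgentSvc];
    GO
    -- The same account also needs Full Control on the snapshot share, e.g.:
    -- icacls "\\FileServer\ReplData" /grant "DOMAIN\SnapshotAgentSvc:(OI)(CI)F"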

STRING_AGG() Performance

Aaron Bertrand wants to know how the STRING_AGG() function performs:

We can see that our FORCESCAN hint really did make things worse – while we shifted the cost away from the clustered index seek, the sort was actually much worse, even though the estimated costs deemed them relatively equivalent. More importantly, we can see that STRING_AGG() does offer a performance benefit, whether or not the concatenated strings need to be ordered in a specific way. As with STRING_SPLIT(), which I looked at back in March, I am quite impressed that this function scales well prior to “v1.”

Given that the early releases tend to be “get the thing working” and later CTPs are about “make the thing faster,” it’s nice to see that STRING_AGG() is already ready for prime time, and it makes me wonder if they’ll make it even faster by RTM.
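
For reference, here is the function in action (table and column names are invented for the example):

    -- Build one comma-separated list of order IDs per customer,
    -- ordered by date within each group.
    SELECT s.CustomerID,
           STRING_AGG(CONVERT(VARCHAR(12), s.OrderID), ',')
               WITHIN GROUP (ORDER BY s.OrderDate) AS OrdersInDateOrder
    FROM dbo.Sales AS s
    GROUP BY s.CustomerID;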

Unencrypted Backups With TDE

Steve Jones shows what you need to do to take an unencrypted backup on a database with TDE configured:

When SQL Server goes to restore the file, it reads part of the header. In here, the process must detect the DEK and try to decrypt that key. However, since this new instance does not have the certificate, this doesn’t work and an error is thrown, despite not needing the key since the data isn’t encrypted.

The issue here is the DEK still exists in the source database.

Read the whole thing for the solution.
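
Without giving too much away, the general shape of the fix is to remove encryption entirely, including the DEK, before taking the backup.  A sketch, with a placeholder database name:

    -- Turn off TDE, then remove the data encryption key itself.
    ALTER DATABASE MyTdeDB SET ENCRYPTION OFF;
    GO
    -- Wait until sys.dm_database_encryption_keys reports
    -- encryption_state = 1 (unencrypted) for this database.
    USE MyTdeDB;
    GO
    DROP DATABASE ENCRYPTION KEY;
    GO
    -- This backup will now restore without the source certificate.
    BACKUP DATABASE MyTdeDB TO DISK = N'C:\Backups\MyTdeDB.bak';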

Windows Virtual Accounts

Wayne Sheffield describes virtual accounts and how SQL Server can make use of them:

SQL Server will use these groups in many places so that permissions are granted to the group, instead of the actual service account. This simplifies things greatly if you change the service account – SQL Server Configuration Manager will just change the membership of this group instead of having to hunt down and change everywhere that it knows permissions are needed for the service account. Using these groups instead of the service account will also simplify your life if you ever change the service account – otherwise, all of those specific permissions that you granted on local resources (paths, registry, etc.) would have to be changed. With the group in place, those permissions stay exactly as they are.

I consider virtual accounts, particularly when you stick to using the virtual account itself rather than a domain account, to be a really good security feature, as they prevent system administrators from getting lazy and using the same service account everywhere.  This in turn blocks an attacker from using a pass-the-hash strategy to pivot from one SQL Server instance to another.
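
As a quick illustration, granting a local resource to the service SID rather than to the underlying account looks like this from an elevated command prompt (the path is made up):

    :: Grant Modify on a backup folder to the SQL Server service SID
    :: (default instance); the grant survives any service account change.
    icacls "D:\SQLBackups" /grant "NT SERVICE\MSSQLSERVER:(OI)(CI)M"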

Mastering Tools

The folks at Sharp Sight Labs explain that future obsolescence of a tool does not mean you should not master it:

The heart of his critique is this: data science is changing very fast, and any tool that you learn will eventually become obsolete.

This is absolutely true.

Every tool has a shelf life.

Every. single. one.

Moreover, it’s possible that tools are going to become obsolete more rapidly than in the past, because the world has just entered a period of rapid technological change. We can’t be certain, but if we’re in a period of rapid technological change, it seems plausible that toolset-changes will become more frequent.

The thing I would tie this to is George Stigler’s classic paper on the economics of information.  There’s a cost of knowing, which the commenter notes, but there’s also a cost to search, even when you know where to look.  Being effective in any role, be it data scientist or anything else, involves understanding the marginal benefit of pieces of information.  This blog post gives you a concrete example of that in the realm of data science.

Collecting Deadlock Information

Kendra Little has scripts to collect deadlock graph information using extended events or a server-side trace:

Choose the script that works for you. You can:

  1. Use a simple Extended Events trace to get deadlock graphs via the sqlserver.xml_deadlock_report event

  2. Use a Server Side SQL Trace to get deadlock graphs (for older versions of SQL Server, or people who like SQL Trace)

  3. Use a (much more verbose) Extended Events trace to get errors, completed statements, and deadlock graphs. You only need something like this if the input buffer showing in the deadlock graph isn’t enough, and you need to collect the other statements involved in the transactions. You do this by matching the transaction id for statements to the xactid for each item in the Blocked Process Report. Warning, this can generate a lot of events and slow performance.

I’d default to script #1 and look at #3 in extreme scenarios.
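
For a sense of what option #1 involves, a minimal session looks something like this (a sketch rather than Kendra’s exact script; the session name is made up):

    -- Capture deadlock graphs to an event file target.
    CREATE EVENT SESSION [collect_deadlocks] ON SERVER
    ADD EVENT sqlserver.xml_deadlock_report
    ADD TARGET package0.event_file (SET filename = N'collect_deadlocks');
    GO
    ALTER EVENT SESSION [collect_deadlocks] ON SERVER STATE = START;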

Polybase With Azure Blob Storage

I look at using Polybase to read data from Azure Blob Storage:

To this point, I have focused my Polybase series on interactions with on-premises Hadoop, as it’s the use case most apropos to me.  I want to start expanding that out to include other interaction mechanisms, and I’m going to start with one of the easiest:  Azure Blob Storage.

Ayman El-Ghazali has a great blog post on the topic, which he turned into a full-length talk.  As such, this post will fill in the gaps rather than start from scratch.  In today’s post, my intention is to retrieve data from Azure Blob Storage and get an idea of what’s happening.  From there, we’ll spend a couple more posts on Azure Blob Storage, looking a bit deeper into the process.  That said, my expectation going into this series is that much of what we do with Azure Blob Storage will mimic what we did with Hadoop, as there are no Polybase core concepts unique to Azure Blob Storage, at least none that I am aware of.

Spoilers:  I’m still not aware of any core concepts unique to Azure Blob Storage.
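
For orientation, pointing Polybase at a blob container looks roughly like this (the storage account, container, and key are placeholders):

    -- Polybase treats Blob Storage as a HADOOP-type external data source.
    -- A database master key must exist before creating the credential:
    -- CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
    CREATE DATABASE SCOPED CREDENTIAL AzureBlobCredential
    WITH IDENTITY = 'user', SECRET = '<storage account access key>';
    GO
    CREATE EXTERNAL DATA SOURCE AzureBlob
    WITH (
        TYPE = HADOOP,
        LOCATION = 'wasbs://mycontainer@mystorageacct.blob.core.windows.net',
        CREDENTIAL = AzureBlobCredential
    );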

In-Memory OLTP For Reporting

Daniel Janek shows that memory-optimized tables aren’t just for OLTP scenarios:

I wouldn’t have thought that Hekaton could take my report query down from 30+ min to 3 seconds, but in the end it did. *Note that the source data is static and repopulated just twice a week. With that said, I didn’t bother looking into any limitations that “report style” queries may impose on OLTP operations. I’ll leave that to you.

With SQL Server 2016 (an important caveat), memory-optimized tables can work great for reporting scenarios.  The important factor is having enough RAM to store the data.
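
For the curious, a durable memory-optimized table for this kind of static reporting data might look like the following (the schema is invented for illustration):

    -- Requires a MEMORY_OPTIMIZED_DATA filegroup in the database.
    CREATE TABLE dbo.ReportFacts
    (
        FactID   INT IDENTITY NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
        SaleDate DATE NOT NULL,
        Amount   DECIMAL(18, 2) NOT NULL,
        INDEX ix_ReportFacts_SaleDate NONCLUSTERED (SaleDate)
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);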

Transactional Replication To Azure SQL Database

Jes Borland has a five-part series on replicating databases in an Availability Group to Azure SQL Database.  Part 1 involves planning:

There are tasks you’ll need to take care of in SQL Server, the AG, and the SQL DB before you can begin.

This blog series assumes you already have an AG set up – it won’t go through the setup of that. It also assumes you have an Azure SQL server and a SQL Database created – it won’t go through that setup either.

Ideally, the publishers, distributor, and subscribers will all be the same version and edition of SQL Server. If not, you have to configure from the highest-version server, or you will get errors.

Part 2 prepares the replication distributor:

The first step in this process is to set up the remote distributor. As I mentioned in the first blog, you do not want your distribution database on one of the AG replicas. You need to set this up on a server that is not part of the AG.

Start by logging on to the distributor server – in this demo, SQL2014demo.
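
For a sense of what that looks like, configuring the remote distributor boils down to a couple of calls run on the distributor itself (the server name comes from Jes’s demo; the password is a placeholder):

    -- Make this instance a distributor and create the distribution database.
    EXEC sp_adddistributor @distributor = N'SQL2014demo',
                           @password = N'<strong distributor password>';
    EXEC sp_adddistributiondb @database = N'distribution';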

Stay tuned for the remainder of the series.
