Author: Kevin Feasel

Agent Schedule Unique IDs

Chris Sommer has a gripe about the schedule_uid column in SQL Agent jobs:

When you script out a SQL Agent job, you’ll notice that the job schedule will have a schedule_uid parameter (provided your job has a schedule). The gotcha lies in that schedule_uid. If you create another job schedule with the same schedule_uid, it will overwrite the schedule for any jobs that are using it; that is, any other jobs using that schedule_uid will start using the new schedule. Normally I consider UIDs to be unique and the chances of a collision low, but if you do a fair amount of copying jobs between SQL Servers, there’s a good chance this will bite you eventually. That’s what happened to us (more than once).

Let’s see if I can explain it better with an example.

Something to think about when scripting SQL Agent jobs.
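
To make the gotcha concrete, here is a stripped-down sketch of the kind of call SSMS scripts out for a job schedule; the job name, schedule name, and GUID below are made up for illustration:

    -- Fragment of a scripted job schedule (illustrative names and GUID)
    EXEC msdb.dbo.sp_add_jobschedule
        @job_name = N'NightlyLoad',
        @name = N'Nightly at 2 AM',
        @enabled = 1,
        @freq_type = 4,                 -- daily
        @freq_interval = 1,
        @active_start_time = 20000,     -- 02:00:00
        @schedule_uid = N'8D4F31A2-6C1B-4E7A-9F05-2B9C7D1E3A64';

    -- Re-running a script with the same @schedule_uid for another job (or on another
    -- server) ties that job to the same shared schedule, so changing it later changes
    -- every job carrying that GUID.  Dropping the @schedule_uid argument and letting
    -- SQL Server generate a fresh one avoids the collision.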

Testing Nested Sets

Nate Johnson continues his nested set series:

Now the complex.  To validate that our tree is properly structured, the following statements need to be true:

  1. Each node’s Right value is greater than its Left.

  2. More to the point, each node’s Right value is greater than all of its ancestors’ Left values.

  3. Similarly, each node’s Left value is less than all of its descendants’ Left values (and Right values, obviously!)

  4. Leaf nodes have no gaps between Left & Right: Right = Left + 1

  5. Depth is easy to verify because we already wrote the rCTE to calculate it!

  6. And of course, no orphans – all ParentIDs lead to an actual parent node, except of course if they’re NULL (root nodes).

Read on for further explanation of these points.
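
As a rough illustration of a few of those checks (not Nate’s actual code), here is a sketch against a hypothetical dbo.Tree table with ID, ParentID, Left, and Right columns; each query returns violating rows, so an empty result means the rule holds:

    -- Rules 1 and 4: Right must exceed Left, and a leaf (no children) must have Right = Left + 1
    SELECT t.ID
    FROM dbo.Tree AS t
    WHERE t.[Right] <= t.[Left]
       OR (NOT EXISTS (SELECT 1 FROM dbo.Tree AS c WHERE c.ParentID = t.ID)
           AND t.[Right] <> t.[Left] + 1);

    -- Rules 2 and 3: intervals must nest, never cross.  If a node's Left falls inside
    -- an ancestor's range, its Right must fall inside that range as well.
    SELECT c.ID
    FROM dbo.Tree AS c
    JOIN dbo.Tree AS p
        ON c.[Left] > p.[Left] AND c.[Left] < p.[Right]
    WHERE c.[Right] >= p.[Right];

    -- Rule 6: no orphans -- every non-NULL ParentID must point at an existing node
    SELECT t.ID
    FROM dbo.Tree AS t
    WHERE t.ParentID IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM dbo.Tree AS p WHERE p.ID = t.ParentID);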

Azure VM Encryption

Melissa Coates looks at different encryption methods available for Azure Virtual Machines:

Initially I opted for Storage Service Encryption due to its sheer simplicity. This is done by enabling encryption when you initially provision the storage account. After setting it up, I proceeded on to other configuration items, one of which was setting up backups via the Azure Recovery Vault. It turns out that encrypted backups in the Recovery Vault are not (yet?) supported for VMs encrypted with Storage Service Encryption (as of Feb 2017).

Next I decided to investigate Disk Encryption because it supports encrypted backups in the Recovery Vault. It’s more complex to set up because you need a Service Principal in AAD, as well as Azure Key Vault integration. (More details on that in my next post.)

Click through for a point-by-point comparison between the two methods.

Replication Error When Listing Directory Contents

Andrew Peterson troubleshoots a replication issue:

You’re trying to set up SQL Server Replication on a server, and it fails. Looking through the error message, you find this:

        An exception occurred while executing a Transact-SQL statement or batch.
        (Microsoft.SqlServer.ConnectionInfo)

        Destination path ………….is not valid. Unable to list directory contents. Specify a valid destination path.
        Changed database context to ‘master’. (Microsoft SQL Server, Error: 14430)

Read on for the solution.
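
Error 14430 complains about a destination path the Distributor cannot list, so one general place to start (not necessarily Andrew’s exact fix) is the snapshot working directory registered for the publisher and whether the accounts involved can actually reach it:

    -- Run at the Distributor: among other properties, this returns the publisher's
    -- working directory (snapshot folder).  Verify the path exists and that the
    -- relevant service and agent accounts can list and write to it.
    EXEC sp_helpdistpublisher;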

Thoughts On Deprecated Technologies

Dan Guzman discusses a couple of deprecated components which are still hanging around:

The message is loud and clear that ODBC is the supported and preferred path for native applications going forward. The Data Access Technologies Road Map provides an overview and history of Microsoft data access technologies, which I recommend you peruse to ensure you are not inadvertently using deprecated or unsupported technologies for new development and, for existing applications, consider moving from legacy data access technologies to current ones when practical.

The current Microsoft ODBC Driver for SQL Server as of this writing is ODBC Driver 13 for SQL Server. Note that both the 13.0 and 13.1 versions of this driver have the same “ODBC Driver 13 for SQL Server” display name listed under installed programs and ODBC Data Source Administrator. If installed, the driver will be listed under installed programs along with the corresponding driver version (when viewed in detail mode). The 13.1 version adds support for the Always Encrypted feature. These ODBC Drivers are available from the link I mentioned earlier.

They will pry OLE DB from my cold, dead hands.  For my money, it’s still the best SQL Server data access technology for Integration Services, beating out ODBC both in terms of ease of use and performance.

Encryption In ElasticMapReduce

Sai Sriparasa shows how to enable encryption in an ElasticMapReduce cluster:

In this post, I go through the process of setting up the encryption of data at multiple levels using security configurations with EMR. Before I dive deep into encryption, here are the different phases where data needs to be encrypted.

Data at rest

  • Data residing on Amazon S3—S3 client-side encryption with EMR
  • Data residing on disk—the Amazon EC2 instance store volumes (except boot volumes) and the attached Amazon EBS volumes of cluster instances are encrypted using Linux Unified Key Setup (LUKS)

Data in transit

  • Data in transit from EMR to S3, or vice versa—S3 client-side encryption with EMR

  • Data in transit between nodes in a cluster—in-transit encryption via Secure Sockets Layer (SSL) for MapReduce and Simple Authentication and Security Layer (SASL) for Spark shuffle encryption

  • Data being spilled to disk or cached during a shuffle phase—Spark shuffle encryption or LUKS encryption

Turns out this is rather straightforward.

Data Frame Serialization In R

David Smith shows a new contender for serializing data frames in R, fst:

And now there’s a new package to add to the list: the fst package. Like the data.table package (the fast data.frame replacement for R), the primary focus of the fst package is speed. The chart below compares the speed of reading and writing data to/from CSV files (with fwrite/fread), feather, fst, and the native R RDS format. The vertical axis is throughput in megabytes per second — more is better. As you can see, fst outperforms the other options for both reading (orange) and writing (green).

These early numbers look great, so this is a project worth keeping an eye on.

JSON In Clustered Columnstore Indexes

Jovan Popovic gives a use case for JSON data being part of a clustered columnstore index:

This is equivalent to collections that you might find in classic NoSQL databases because they store each JSON document as a single entity and optionally create indexes on these documents. The only difference is the CLUSTERED COLUMNSTORE index on this table, which provides the following benefits:

  1. Data compression – CCI uses various techniques to analyze your data and choose optimal compression algorithms to compress data.

  2. Batch mode analytics – queries executed on a CCI process rows in batches of 100 to 900 rows, which might be much faster than row-mode execution.

I think it’s worth reading this in conjunction with Niko Neugebauer’s comments regarding strings in columnstore.
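
For reference, here is a minimal sketch of the pattern Jovan describes; the table and column names are mine, and I’m keeping the JSON column under 8,000 bytes so the columnstore index doesn’t need LOB support:

    -- A "collection" table: one JSON document per row, stored under a clustered columnstore index
    CREATE TABLE dbo.OrderCollection
    (
        OrderID bigint IDENTITY NOT NULL,
        OrderDoc nvarchar(4000) NOT NULL,
        CONSTRAINT CHK_OrderCollection_JSON CHECK (ISJSON(OrderDoc) = 1),
        INDEX CCI_OrderCollection CLUSTERED COLUMNSTORE
    );

    -- Analytic query over a JSON property; the CCI provides compression and batch-mode scans
    SELECT JSON_VALUE(OrderDoc, '$.Customer.Country') AS Country,
           COUNT(*) AS OrderCount
    FROM dbo.OrderCollection
    GROUP BY JSON_VALUE(OrderDoc, '$.Customer.Country');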

Compatibility Level And Forced Plans

Erin Stellato has an experiment with Query Store plan forcing:

A question came up recently about plan guides and compatibility mode, and it got me thinking about forced plans in Query Store and compatibility mode.  Imagine you upgraded to SQL Server 2016 and kept the compatibility mode for your database at 110 to use the legacy Cardinality Estimator.  At some point, you have a plan that you force for a specific query, and that works great.  As time goes on, you do testing with the new CE and eventually are ready to make the switch to compatibility mode 130.  When you do that, does the forced plan continue to use compatibility mode 110?  I had a guess at the answer but thought it was worth testing.

There are some interesting results here.
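
The moving parts of the experiment look roughly like this (the query and plan IDs are placeholders; see her post for the actual test):

    -- Stay on the legacy cardinality estimator
    ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 110;

    -- Force a plan for a query that Query Store has captured (placeholder IDs)
    EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 417;

    -- Later: switch to the new CE and see what happens to the forced plan
    ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 130;

    -- Inspect forced plans and the compatibility level Query Store recorded for them
    SELECT plan_id, query_id, is_forced_plan, compatibility_level
    FROM sys.query_store_plan
    WHERE is_forced_plan = 1;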

Thinking About The GitLab Outage

Brent Ozar shares his thoughts on the recent Gitlab outage:

You can read more about the details in GitLab’s outage timeline doc, which they heroically shared while they worked on the outage. Oh, and they streamed the whole thing live on YouTube with over 5,000 viewers.

There are so many amazing lessons to learn from this outage: transparency, accountability, processes, checklists, you name it. I’m not sure that you, dear reader, can actually put a lot of those lessons to use, though. After all, your company probably isn’t going to let you live stream your outages. (I do pledge to you that I’m gonna do my damnedest to do that ourselves with our own services, though.)

There are some good pointers in here.
