Curated SQL – Page 899 – A Fine Slice Of SQL Server

I have wrote a number of blogs on the topic of business continuity in SQL Database before (HA/DR for Azure SQL Database, Azure SQL Database high availability, Azure SQL Database disaster recovery) but with a number of new features I felt it was time for a new blog on the subject, focusing on disaster recovery and not high availability.
Business continuity in Azure SQL Database and SQL Managed Instance refers to the mechanisms, policies, and procedures that enable your business to continue operating in the face of disruption, particularly to its computing infrastructure. In the most of the cases, SQL Database and SQL Managed Instance will handle the disruptive events that might happen in the cloud environment and keep your applications and business processes running.

James takes us through options available for Azure SQL Database as well as managed instances.

Comments closed

The Tuple Mover in SQL Server 2019

Published 2020-07-21 by Kevin Feasel

Taryn Pratt gives us closure on an issue from a few months back:

I suggest reading my other post first, it’ll only take a few minutes. I’ll wait…
However, if you really don’t want to read it, here’s a quick recap on the initial issue.
In early February 2020, a lot of data was deleted from some clustered columnstore indexes in our PRIZM database. Some of the tables were rebuilt, but 11 tables weren’t since we don’t have maintenance windows, and that would involve downtime. The rebuilds would happen once we upgraded to SQL Server 2019, to take advantage of the ability to rebuild those columnstore indexes online.

Taryn now has the full story and I recommend giving it a read.

Comments closed

The Basics of Gremlin

Published 2020-07-21 by Kevin Feasel

Raul Gonzalez introduces us to Gremlin:

Graph databases in Cosmos DB benefit from the same features, like the SQL API, it is globally distributed, scales independently throughput and storage, provides guaranteed latency, automatic indexing and more. So when relational databases choke with certain queries, No-SQL databases come to play.
Gremlin is the query language used by Apache Tinkerpop and it is implemented in Azure Cosmos DB. This language enables us to transverse graphs and answer complex queries that would be otherwise very expensive to run in traditional relational database engines.

Read on for a detailed example.

Comments closed

Securing S3 Buckets

Published 2020-07-20 by Kevin Feasel

Adam Youngberg relates an experience with securing public S3 buckets:

As a response to our initial alert, we took action to identify all of our S3 buckets and the public / non-public status. Since Databricks is a cloud-native company, we had already deployed JupiterOne, a commercial cloud asset management solution that allowed us to quickly query and determine which assets were public. Open-source tools are available, such as Cartography from Lyft, which allow for similar query capabilities. With the outputs of our queries, we were able to quickly identify and triage our public buckets and determine whether they should remain public.

Read on for the process, as well as some issues they experienced in rollout.

Comments closed

All About The Compute Scalar Operator

Published 2020-07-20 by Kevin Feasel

Hugo Kornelis continues a quest to document query plan operators:

The Compute Scalar operator is used to compute new values based on other columns in the same row. These new values are then added as extra column(s) in the output rows.
The expressions used to compute the new values can only refer to constant values and to columns in the input rows of the Compute Scalar operator. Other than that, there are, to my knowledge, no restrictions. The expressions can vary from very simple to extremely complex. The expressions can even include references to scalar user-defined functions, to CLR user-defined functions, and to built-in CLR functions.

Read on for a good deal of information about the operator.

Comments closed

Monitoring Identity Columns

Published 2020-07-20 by Kevin Feasel

Michael J. Swart has a script to monitor identity columns:

My friend pointed out an interesting RCA by Github where a
database table’s auto-incrementing ID column exceeded [maxint]. When we attempted to insert larger integers into the column, the database rejected the value
This led to a discussion about setting up monitoring for this kind of problem in our software. We have a place for monitoring and health-checks for all our databases. We just need to know how to define them.
So how do I create a script that reports any tables whose current identity values are above a certain threshold? This is what I came up with. Maybe you’ll find it useful too.

Click through for the script, but also read the comments for a simplification. The next step in the process is to figure out which tables can handle it and auto-reseed when you get to a certain threshold. Most commonly, this would be queue tables, where the active set of IDs in use is a relatively small number and there are no ID stragglers hanging around, so resetting is safe.

Comments closed

Getting the Last Query Plan Stats in SQL Server 2019

Published 2020-07-20 by Kevin Feasel

John Morehouse walks us through retrieving the actual query plan stats of the last run of an execution plan:

Currently, if you are not on SQL Server 2019 and wanted to see an execution plan, you would attempt to dive into the execution plan cache to retrieve an estimated plan. Keep in mind that this just an estimated plan and the actual plan, while the shape should be the same, will have runtime metrics. These Actual runtime metrics could be, different than what is shown in the estimated plan, so it is important to get the actual whenever possible.
With the introduction of lightweight statistics, SQL Server can retain the metrics of the actual execution plan if enabled. Note that this could introduce a slight increase in overhead, however, I haven’t yet seen it be determinantal. These metrics are vastly important when performing query tuning.

Read on to see the specific set of metrics you can pull and how to do it. This does require SQL Server 2019.

Comments closed

Azure Data Factory and Key Vault References

Published 2020-07-20 by Kevin Feasel

Gerhard Brueckl shows how we can get around a limitation in the Azure Data Factory user interface:

As You can see, the setting “AccessToken” can use a Key Vault reference whereas settings like “Databricks Workspace URL” and “Cluster” do not support them. This is usually fine because the guys at Microsoft also thought about this and support Key Vault references for the settings that are actually security relevant or sensitive. Also, providing the option to use Key Vault references everywhere would flood the GUI. So this is just fine.
But there can be good reasons where you want to get values from the Key Vault also for non-sensitive settings, especially when it comes to CI/CD and multiple environments. From my experience, when you implement a bigger ADF project, you will probably have a Key Vault for your sensitive settings and all other values are provided during the deployment via ARM parameters.
So you will end up with a mix of Key Vault references and ARM template parameters which very likely will be derived from the Key Vault at some point anyway. To solve this, you can modify the JSON of an ADF linked service directly and inject KeyVault references into almost every property!

Click through to see how that works, as well as the ramifications.

Comments closed

Adding Tooltips to Column Headers in Power Query

Published 2020-07-20 by Kevin Feasel

Chris Webb has some fun with M:

How does this work? Here’s what each of the steps do:
– The source step creates a simple table with two columns called firstname and lastname using #table (see here for more details on that).
– The tabletype step declares a new table type with two columns (the same two columns in the table from the previous step) and then adds a metadata record to this type. In that record the Documentation.FieldDescription field contains the text values that will appear as tooltips when you hover over each column.
– The replacetype step replaces the type of the table returned by source with the type declared in tabletype.

Click through for the code.

Comments closed

Docker Compose and SQL Server

Published 2020-07-20 by Kevin Feasel

Andrew Pruski makes it easy to launch a fully-featured Docker container running SQL Server:

The solution here is to create a custom image with the volume created and permissions set.
But wouldn’t it be easier to just have to run one command to spin up a custom 2019 image, with volumes created and permissions set?
Enter Docker Compose.

Andrew has a GitHub repo with everything set up and includes plenty of screenshots to demonstrate.

Comments closed

M	T	W	T	F	S	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

Curated SQL Posts