Curated SQL – Page 649 – A Fine Slice Of SQL Server

Ways to avoid the MERGE Operator

Published 2021-08-05 by Kevin Feasel

Michael J. Swart has important bullet points:

Aaron Bertrand has a post called Use Caution with SQL Server’s MERGE Statement. It’s a pretty thorough compilation of all the problems and defects associated with the MERGE statement that folks have reported in the past. But it’s been a few years since that post and in the spirit of giving Microsoft the benefit of the doubt, I revisited each of the issues Aaron brought up.
Some of the items can be dismissed based on circumstances. I noticed that:
– Some of the issues are fixed in recent versions (2016+).
– Some of the issues that have been marked as won’t fix have been fixed anyway (the repro script associated with the issue no longer fails).
– Some of the items are complaints about confusing documentation.
– Some are complaints are about issues that are not limited to the MERGE statement (e.g. concurrency and constraint checks).

Spoilers: some + some + some + some is still a lot less than all. Read the whole thing.

Comments closed

An Overview of Amazon Athena

Published 2021-08-04 by Kevin Feasel

Aveek Das takes us through the basics of Amazon Athena:

Serverless. Since Amazon Athena is offered as a fully managed cloud service, customers do not need to take the pain of installing and maintaining separate infrastructures for this. You can start by logging into the AWS Web console and proceeded to Amazon Athena.
Pay Per Query. You only pay for queries you execute. This is very cost-effective, as you can easily figure out your monthly expenses based on your usage pattern. On average, users pay 5 USD for each terabyte of data scanned. This can be further optimized by creating partitions or compressing your dataset.
Interactive Performance. We do not need to worry about the resources that work behind the scenes. When a query is executed, Athena automatically runs the query in parallel across multiple resources, bringing the results faster.

Read on to see an example of Athena in action.

Comments closed

Azure Functions and (Lack of) F# Support

Published 2021-08-04 by Kevin Feasel

Jamie Dixon has a shaggy dog tale:

When Azure Functions first came out, F# had pretty good support – templates, the ability to run a .fsx file, cool examples written by Don… Fast forward to 2021 and we have bupkis. I recently wrote a dictionary of amino acid weights that would be perfect as a service: pass in a kmer length and weight, get all hits from amino acids.
I first attempted to create a function app, select a .fsx template, and write the code in my browser. Alas, only .csx scripting is available.

Not to be too cutesy about it, but it would be nice if the product which allows for the execution of functions in a cloud service would support the .NET language which most explicitly embraces the notion of functions. If you feel similarly, there is an open feedback ticket.

Comments closed

Fun with Hash Tables in Powershell

Published 2021-08-04 by Kevin Feasel

Robert Cain continues a series on Powershell data structures:

Hash tables are powerful, flexible items. In some languages, hash tables are also called dictionaries. In this post we’ll start with a few basics, then get into some fun things you can do with hash tables.
A quick note, some documentation calls them a hash table, others read hashtable, one word. For this post I’ll use the two word hash table, but it’s the same thing no matter what documentation or blog you read.

Click through for plenty of examples of what you can do with them.

Comments closed

Finding Bad (Worse?) NOLOCK Statements across Instances

Published 2021-08-04 by Kevin Feasel

Aaron Bertrand powers up for about six episodes straight, but the results are amazing:

In Part 1 of this series, I showed how to use a Visitor pattern to walk through one or more T-SQL statements to identify a problematic pattern where NOLOCK hints are applied to the target of an update or delete. The method in my initial examples was very manual, though, and won’t scale if this problem might be widespread. We need to be able to automate collecting a potentially large number of statements across an entire environment, and then try to eliminate false positives without manual intervention.

Read on to see how you can take what Aaron wrote last time and make it scalable.

Comments closed

Executing T-SQL with a Proxy Account

Published 2021-08-04 by Kevin Feasel

Tom Collins answers a question:

I have some t-sql code added to a job step on a SQL Server Agent job. The problem is I need to run the code as RUNAS . I though of executing the job with a proxy account – so progressed with the Credential & proxy set up. But I still can’t view the Proxy\Credential in the RunAs list . Is there a way around this problem?

Read on to learn why and for the answer.

Comments closed

Installing Kubernetes on Docker Desktop

Published 2021-08-04 by Kevin Feasel

Joy George Kunjikkur takes another stab at Kubernetes:

This is 3rd or 4th time, I am learning Kubernetes (hereafter mostly refer as K8s short form) hands-on sessions. Every time I learn the kubectl command and its options, I forget as there were no chances to apply in the day job. Another mistake I did all those times was missing to post learning to this blog.
Hope this time I will get a chance to use it in the day job and not miss posting the Kubernetes learning on to this blog.
This post is very basic. The aim is to get started with the Kubernetes development environment using Docker Desktop. Below are the steps at a high level to get started. Detailed steps with videos are available on the internet.

Read on for some installation tips, including how to install the Kubernetes dashboard for a UI experience.

Comments closed

An Overview of Azure Purview

Published 2021-08-04 by Kevin Feasel

Angela Henry gives us an overview of Azure Purview:

Organizations are amassing more data than ever, yet it is getting more difficult for their employees to find that data and use it with confidence. What if there was a solution out there that not only told us what data sources we have, but could tell us how those data sources should be used, and who the stewards/producers of that data are? What if it could allow us to classify our data, and provided us insights into what our entire data estate looked like? It might sound like data nirvana, but it just might be possible with the newest Platform as a Service (PaaS) offering from Microsoft, Azure Purview.

In the first part of the series, Angela covers the basics and pricing, so check it out.

Comments closed

Working with Trees of Data in R

Published 2021-08-03 by Kevin Feasel

Martin Stingl shows off the data.tree package:

Lately I tried to visualize an hierarchy with Tableau Desktop. The problem was that the hierarchy had a variable depth because it was tree-based. Each row had an id and a parent_id. Normally hierarchies in Tableau are defined by pulling some fields together, such as product category, product group and product id.
Handling tree-based hierarchies seems to be a lot more complex. I found a plugin at https://github.com/tableau/extension-hierarchy-navigator-sandboxed but this only works online.
So I asked myself how I can handle this using R. I found the R-package data.tree at https://github.com/gluc/data.tree. I want to describe how I use this package to preprocess my data.

Read on to see how this works and how you can turn a classical data representation of a tree (ID and parent ID) into a flattened structure with a fixed number of levels. H/T R-Bloggers.

Comments closed

Caching Function Results in an R Package

Published 2021-08-03 by Kevin Feasel

Maelle Salmon and Cristophe Dervieux show us ways to cache results of function calls using R:

Caching means that if you call a function several times with the exact same input, the function is only actually run the first time. The result is stored in a cache of some sort (more practical details later!). Every other time the function is called with the same input, the result is retrieved from the cache unless invalidated. You will often think of caching as something valid in only one R session, but we’ll see it can be persistent across sessions via storage on disk.

As a quick note, this makes sense when writing functions, which are expressions without side effects. If you have side effects, caching might not give you what you expect.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Curated SQL Posts