Kevin Feasel – Page 1075

Physical Operators: Apply and Nested Loops

The optimizer’s output may contain both apply and nested loops join physical operations. Both are shown in execution plans as a Nested Loops Join operator, but they have different properties:
Apply
The Nested Loops Join operator has Outer References. These describe parameter values passed from the outer (upper) side of the join to operators on the inner (lower) side of the join. The value of each parameter may change on each iteration of the loop. The join predicate is evaluated (given the current parameter values) by one or more operators on the inner side of the join. The join predicate is not evaluated at the join itself.
Join
The Nested Loops Join operator has a Predicate (unless it is a cross join). It does not have any Outer References. The join predicate is always evaluated at the join operator.

And to make things tricky, APPLY can generate either of these. Read the whole thing.
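
To make the distinction concrete, here is a rough Python sketch of the two shapes. This is only an analogy, not how the SQL Server engine is implemented: the function names, the sample rows, and the `orders_seek` helper (standing in for an index seek on the inner side) are all made up for illustration.

```python
def nested_loops_join(outer, inner, predicate):
    """Join form: no outer references; the predicate is evaluated at the join."""
    inner_rows = list(inner)
    for o in outer:
        for i in inner_rows:
            if predicate(o, i):
                yield {**o, **i}


def nested_loops_apply(outer, inner_for, outer_refs):
    """Apply form: outer references are pushed to the inner side on each
    iteration, and the inner side does the filtering. The join itself
    evaluates no predicate."""
    for o in outer:
        params = {name: o[name] for name in outer_refs}
        for i in inner_for(**params):
            yield {**o, **i}


# Hypothetical sample data.
customers = [{"customer_id": 1, "name": "Ann"}, {"customer_id": 2, "name": "Bob"}]
orders = [
    {"order_id": 10, "cust": 1},
    {"order_id": 11, "cust": 2},
    {"order_id": 12, "cust": 1},
]


def orders_seek(customer_id):
    # Stand-in for an index seek driven by the current outer reference value.
    return (o for o in orders if o["cust"] == customer_id)


joined = list(
    nested_loops_join(customers, orders, lambda c, o: c["customer_id"] == o["cust"])
)
applied = list(nested_loops_apply(customers, orders_seek, ["customer_id"]))
```

Both versions return the same rows. What differs is where the predicate is evaluated, which is the property the quote describes.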

Finding Recently Created Objects

Published 2019-06-10 by Kevin Feasel

Max Vernon has a script to help us find which objects were recently created in our databases:

The code below provides a list of all SQL Server objects created in the past “x” number of days. Dynamic T-SQL is used to construct a query for each database, including system databases. Each query provides the schema, name, and date created for each object listed, along with the object type description.

This looks quite useful for auditing. You might want to filter out tempdb on a real system, though.
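
If you want a feel for the per-database dynamic SQL idea without clicking through, here is a minimal Python sketch that builds the query text. This is not Max's script; the function names and the `skip` parameter are my own, and the only SQL Server pieces assumed are the `sys.objects` and `sys.schemas` catalog views with their `create_date`, `type_desc`, and `schema_id` columns.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def build_recent_objects_query(database, days):
    """Build one SELECT listing objects created in the last `days` days."""
    db = quote_name(database)
    db_literal = database.replace("'", "''")
    return (
        f"SELECT N'{db_literal}' AS database_name, s.name AS schema_name, "
        f"o.name AS object_name, o.type_desc, o.create_date "
        f"FROM {db}.sys.objects AS o "
        f"JOIN {db}.sys.schemas AS s ON s.schema_id = o.schema_id "
        f"WHERE o.create_date >= DATEADD(DAY, -{int(days)}, GETDATE())"
    )


def build_all(databases, days, skip=("tempdb",)):
    """Combine the per-database queries, leaving out anything in `skip`."""
    parts = [
        build_recent_objects_query(d, days)
        for d in databases
        if d.lower() not in skip
    ]
    return "\nUNION ALL\n".join(parts)


# Hypothetical database list; on a real instance it would come from sys.databases.
sql = build_all(["master", "tempdb", "Sales"], days=7)
```

The resulting string would still need to be run against the instance, and the `skip` default reflects the tempdb caveat above.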

Dealing with HADR_SYNC_COMMIT Waits

Published 2019-06-10 by Kevin Feasel

Dmitri Korotkevitch walks us through the HADR_SYNC_COMMIT wait type:

The secondary nodes may be configured using asynchronous or synchronous commit. With asynchronous commit, a transaction is considered committed, and all its locks are released, when the COMMIT log record is hardened on the primary node. SQL Server sends the COMMIT record to the secondary node; however, it does not wait for confirmation that the record has been hardened in the log there.
This behavior changes when you use synchronous commit as shown in Figure 1. In this mode, SQL Server does not consider the transaction committed until it receives confirmation that the COMMIT log record is hardened in the log on the secondary node. The transaction on the primary remains active, with all locks held in place, until this confirmation is received. The session on the primary is suspended with the HADR_SYNC_COMMIT wait type.
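
As a back-of-the-envelope illustration, here is a toy Python model of commit latency in the two modes. It is a sketch under loud assumptions, not a description of SQL Server internals: the function and parameter names are mine, the numbers are invented, and I assume the local log hardening overlaps with shipping the record to the secondary.

```python
def commit_latency_ms(synchronous, local_harden_ms,
                      round_trip_ms=0.0, remote_harden_ms=0.0):
    """Toy estimate of how long a COMMIT takes from the session's viewpoint.

    Assumption: hardening on the primary and shipping to the secondary
    happen in parallel, so a synchronous commit finishes when the slower
    of the two paths finishes.
    """
    if not synchronous:
        # Asynchronous commit: only the primary's log hardening matters.
        return local_harden_ms
    remote_ack_ms = round_trip_ms + remote_harden_ms
    return max(local_harden_ms, remote_ack_ms)


# Invented numbers: 1 ms local write, 2 ms network round trip, 1.5 ms remote write.
async_ms = commit_latency_ms(False, 1.0, 2.0, 1.5)
sync_ms = commit_latency_ms(True, 1.0, 2.0, 1.5)
```

In this model the synchronous commit is gated by the secondary's acknowledgement, and locks stay held for that whole time.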

Click through for the full story.

Using Power Query to Pivot Text

Published 2019-06-10 by Kevin Feasel

Matt Allington shows how you can pivot text data from an Excel spreadsheet using Power Query:

It is very common to need to transform data from one “shape” to another “shape” before it can be used inside Power BI for analysis (although many beginners don’t realise this). One such example is shown below, where the data in the table on the left hand side needs to be transformed into the table on the right hand side. As you can see on the left, column A contains the attribute and column B contains the value of the attribute. Every 4 lines of data is 1 record. This specific problem is very common when your only source of data is an extract (e.g. CSV) from some other system, particularly older systems where you can’t change the format of the data extract.

This is a clever solution.
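
For readers who think better in code, the same reshaping can be sketched in Python. This is not Matt's Power Query approach, just an illustration of the transformation, and the attribute names and values below are invented sample data.

```python
def pivot_records(rows, fields_per_record=4):
    """Turn a flat list of (attribute, value) pairs into one dict per record.

    Each consecutive group of `fields_per_record` rows is treated as one record.
    """
    if len(rows) % fields_per_record != 0:
        raise ValueError("row count is not a multiple of fields_per_record")
    records = []
    for start in range(0, len(rows), fields_per_record):
        chunk = rows[start:start + fields_per_record]
        records.append({attribute: value for attribute, value in chunk})
    return records


# Invented extract: column A is the attribute, column B is the value.
extract = [
    ("Name", "Ann"), ("City", "Perth"), ("Product", "Bike"), ("Qty", "2"),
    ("Name", "Bob"), ("City", "Sydney"), ("Product", "Helmet"), ("Qty", "1"),
]
records = pivot_records(extract)
```

Each dict in `records` corresponds to one row of the pivoted table, with the attributes as column names.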

Arrays in Azure Data Factory

Published 2019-06-10 by Kevin Feasel

Rayis Imayev takes us through arrays in Azure Data Factory:

Currently, there are 3 data types supported in ADF variables: String, Boolean, and Array. The first two are pretty easy to use: Boolean for logical binary results and String for everything else, including the numbers (no wonder there are so many conversion functions in Azure Data Factory that we can use).
I’ve also blogged about using Variables in Azure Data Factory:
– Setting Variables in Azure Data Factory Pipelines
– Append Variable activity in Azure Data Factory: Story of combining things together
– System Variables in Azure Data Factory: Your Everyday Toolbox
– Azure Data Factory: Extracting array first element

Click through for arrays and follow up with those other posts from there.

Azure Data Studio June Release

Published 2019-06-10 by Kevin Feasel

Alan Yu announces the June release of Azure Data Studio:

As our team presented in SQL Server sessions across the country, users in person and on GitHub told us that they couldn’t start using Azure Data Studio in their daily work streams until X feature was implemented. One of the most requested of these features is Central Management Servers support, and we are excited to announce the preview release of the CMS extension.

CMS is quite useful. There are also a couple dozen bugfixes and improvements to SQL notebooks.

Choosing Colors for Visuals

Published 2019-06-07 by Kevin Feasel

Lewis Chou has some advice for choosing color schemes for data visualization:

When making a chart, we should use the same color scheme for the same metrics, and we should avoid distracting the user with excessive color.
For example, in sales analysis we usually analyze the indicators of sales and payment collection. When we visualize the same indicator across different dimensions, we recommend using the same color system for sales and payment collection. That means the sales amount can be shown in yellow-green and the return amount in blue. Once indicator colors are consistent, the user can quickly tell from the color which indicator the current chart is showing.

Color is a pre-attentive attribute: we subconsciously pay attention to it before we consciously observe it. That has advantages, but it also comes with responsibilities.

Tracking Database Changes with DDL Triggers

Published 2019-06-07 by Kevin Feasel

Lori Brown shows how you can use DDL triggers to track database or instance-level changes:

I have been working on some improvements to some of the regular ways we monitor for important changes. We always have to be on the lookout for unexpected changes being made in the SQL instances that we monitor since often times we are not the only team who has sysadmin access to the instance. We are always the best trained to take care of and configure things but we sometimes find that someone makes a change either to the SQL or database configuration without telling us. We want to know when things like this happen!

I’m a big fan of these. Of course, you need to get the code right, as a bad trigger can be devastating, but you can get a lot of useful information out of it and figure out whose hand was in the cookie jar.

Flink’s Network Stack

Published 2019-06-07 by Kevin Feasel

Nico Kruber dives into the internals of Apache Flink’s network stack:

Flink’s network stack is one of the core components that make up the flink-runtime module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows through and it is therefore crucial to the performance of your Flink job for both the throughput as well as latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers which are using RPCs via Akka, the network stack between TaskManagers relies on a much lower-level API using Netty.
This blog post is the first in a series of posts about the network stack. In the sections below, we will first have a high-level look at what abstractions are exposed to the stream operators and then go into detail on the physical implementation and various optimisations Flink did. We will briefly present the result of these optimisations and Flink’s trade-off between throughput and latency. Future blog posts in this series will elaborate more on monitoring and metrics, tuning parameters, and common anti-patterns.

There’s a lot in here and it’s worth reading.

Persistent Memory and SQL Server

Published 2019-06-07 by Kevin Feasel

Ned Otter gives us the rundown on Persistent Memory and how it can make life smoother:

SQL 2017 on Windows Server 2016 behaves the same as SQL 2016 on Windows Server 2016 – “tail of the log” is supported. However, there is no support for PMEM with SQL 2017 on supported Linux distributions (except as a traditional block store). Using PMEM with SQL 2019 on Linux supports what’s known as “enlightenment”, which allows us to place data and log files on DAX formatted volumes, thereby reducing latency considerably. SQL 2019 on Linux also supports “tail of the log”.

This is one of those areas where understanding Linux versus Windows administration really pays off, at least until Windows Server supports something like enlightenment.
