Kevin Feasel – Page 863

Blob Storage Enhancements

Published 2020-05-28 by Kevin Feasel

James Serra is keeping on top of things around Azure Blob Storage:

Account failover: Customer-initiated storage account failover is now generally available, allowing you to determine when to initiate a failover instead of waiting for Microsoft to do so. When you perform a failover, the secondary replica of the storage account becomes the new primary. The DNS records for all storage service endpoints—blob, file, queue, and table—are updated to point to this new primary. Once the failover is complete, clients will automatically begin reading from and writing to data to the storage account in the new primary region, with no code changes. Customer initiated failover is available for GRS, RA-GRS, GZRS and RA-GZRS accounts. To learn more, see Disaster recovery and account failover

Read on for several more improvements.

Comments closed

Setting Default Parameters with Powershell

Published 2020-05-28 by Kevin Feasel

Jess Pomfret takes us through the main idea behind PSDefaultParameterValues:

I first heard about PSDefaultParameterValues from a PSPowerHour session by Chrissy LeMaire in 2018. After rewatching this recently, I realised she even mentioned this exact scenario. However, it took until I recently rediscovered this handy preference variable that it all clicked together.
PSDefaultParameterValues does exactly what the name suggests- it lets you specify default values for parameters. PSDefaultParameterValues can be set as a hash table of parameter names and values that will be used in your session for any function that can use it. A simple example is the verbose parameter. If you wanted to turn on the -Verbose switch for every function you run you could add -Verbose to each function call, or you could set PSDefaultParameterValues.

Click through for a few examples of how you can use this.

Comments closed

Comparing Recurrent and Convolutional Neural Networks

Published 2020-05-27 by Kevin Feasel

Sameer Nigam explains the differences between convolutional and recurrent neural networks:

CNN and RNN are amongst most important algorithm of Neural Network family, also they differ in their network process and solving problems.
So talking about their differences:
CNN are used to solve classification and regression problems and RNN are used to solve sequence information.

Read on to see what Sameer means by this.

Comments closed

Exchange Demand Partitioning and Parallel Queries

Published 2020-05-27 by Kevin Feasel

Joe Obbish takes us through a fun concurrency problem:

Very little has been written about exchange operators with a partitioning type of demand, so I forgive you for not hearing of it before today. There is a brief explanation available here, an example of using demand partitioning to improve some query plans involving partitioned tables, and a Stack Exchange answer for someone comparing round robin and demand partitioning. You have the honor of reading perhaps the fourth blog post about the subject.

Read on for an in-depth look at the problem.

Comments closed

Personalizing Power BI Page Navigation

Published 2020-05-27 by Kevin Feasel

Marc Lelijveld shows how you can create custom tooltips per user in Power BI:

From an end user perspective, it is very user friendly to see that the report is adjusted to your personal preferences. Similar is the case for page navigation. If I am the country manager for Germany, I would prefer the report to be adjusted to my area.
With row level security you can filter the report down to only the specified area you want. But in this case, we want clarity over all countries. So, we do not want to filter the dataset itself, but only customize the end user interaction. For this setup, we will still use row level security, but in a different setup.

Click through to see how to do this.

Comments closed

The Architecture of Columnstore Indexes

Published 2020-05-27 by Kevin Feasel

Ed Pollack has started a series on columnstore indexing:

By storing data grouped by columns, like values can be grouped together and therefore compress very effectively. This compression will often reduce the size of a table by 10x and offers significant improvements over standard SQL Server compression.
For example, if a table with a billion rows has an ID lookup column that has 100 distinct values, then on average each value will be repeated 10 million times. Compressing sequences of the same value is easy and results in a tiny storage footprint.
Just like standard compression, when columnstore data is read into memory, it remains compressed. It is not decompressed until runtime when needed. As a result, less memory is used when processing analytic queries. This allows more data to fit in memory at one time, and the more operations that can be performed in memory, the faster queries can execute.

In scenarios where it makes sense, I absolutely love clustered columnstore indexes.

Comments closed

Misleading Query Plan Estimates from Missing Nodes

Published 2020-05-27 by Kevin Feasel

Hugo Kornelis points out another issue when checking arrow width when reading execution plans:

The issue I will talk about today is caused when a node is missing, due to a post-optimization rewrite. I myself have so far only seen this in execution plans where a bitmap filter (created by a Bitmap operator, by a Batch Hash Table Build operator, or by a Hash Match operator with a BitmapCreator property) was pushed into an Index Scan, Clustered Index Scan, or Columnstore Index Scan operator. So that’s what I’ll focus on here. But do keep in mind that there may be other situations where a post-optimization rewrite removes a node.
To understand the root cause of this (and, credit where credit is due, I only understood this myself after getting it explained by Paul White, whose explanation I am now paraphrasing here), we need to dig deeper into the internals of the optimizer.

This is where you put on the internals mining cap and watch out for falling rocks.

Comments closed

Network Monitoring in Powershell

Published 2020-05-27 by Kevin Feasel

Jeffrey Hicks builds out a quick Powershell script to test network performance:

I hope you’ve been trying your hand at the scripting challenges being posted on the Iron Scripter web site. The challenges are designed for individuals to do on their own to build up their PowerShell scripting skills. A few weeks ago, a challenge was posted to create a network monitoring tool using PowerShell and the Write-Progress cmdlet. I thought I’d share my notes on the challenge and some of the code I came up with.

It’s an interesting challenge and solution.

Comments closed

Preventing Unexpected Failover with AG Patching

Published 2020-05-27 by Kevin Feasel

Josh Darnell dealt with an issue with Availability Groups:

I had a 2-node availability group (AG) + fileshare witness system experience an unexpected failover recently.
The synchronous secondary was being patched, and when it came back up from a reboot, the current primary unexpectedly failed over. We weren’t done with all the patching on the secondary, so this caused a short outage, and we had to fail back to the original primary to finish the patching (which is of course another short interruption in availability).
The root cause was interesting enough that I decided to share the story here, and provide some general advice and debugging tips along the way.

Click through to understand why this happened and how you might be able to avoid the pain Josh experienced.

Comments closed

Two Ways to Create a DataFrame in Spark

Published 2020-05-26 by Kevin Feasel

Kundan Kumarr explains two techniques for creating DataFrames in Apache Spark:

There are two different ways to create a Dataframe in Spark. First, using toDF() and second is using createDataFrame(). In this blog we will see how we can create Dataframe using these two methods and what’s the exact difference between them.

Read on for the explanation and examples.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Author: Kevin Feasel