Curated SQL – Page 912 – A Fine Slice Of SQL Server

Visualizing Decision Trees in R

Published 2020-10-21 by Kevin Feasel

Sebastian Sauer shows us a couple of techniques for plotting decision trees:

The recursive partitioning plot was designed by Grant McDermott. The tree plot is from the rpart.plot package written by Stephen Milborrow.

Click through for code and illustrations.


Self-Service with Azure Synapse Analytics

Published 2020-10-21 by Kevin Feasel

Paul Andrew lays out an interesting idea:

I’ve been playing around with Azure Synapse Analytics for a while now exploring the preview features and trying to find a meaningful use case for the ‘single pane of glass’ capabilities. In this post I’m exploring one possible option/idea for creating a very simple self service approach to dataset ingestion and consumption. Full disclosure, the below is far from technical perfection for lots of reasons, I mainly wanted to put something out there as an idea and use it to maybe start a conversation.

Click through to see Paul’s take on the matter.


Choosing the Right Index and Partition in Dedicated SQL Pools

Published 2020-10-21 by Kevin Feasel

Tsuyoshi Matsuzaki gives us some advice on indexing and partitioning data in Azure Synapse Analytics dedicated SQL pools:

Designing indexes for a table is fundamental and important for better performance.
There’s no “one answer for any case”. You should choose the right index for a table depending on its size, usage, query patterns, and cardinality.
To help you understand the pros and cons of each index, I’ll show pictures illustrating the intuitive structure of the indexes available in Synapse Analytics.

Because dedicated SQL pools aren’t the same as the SQL Server box product, it’s important to go in with the understanding that indexing won’t be exactly the same as on-premises or in Azure SQL Database.


The Raw Facts on Azure SQL DB Serverless

Published 2020-10-21 by Kevin Feasel

Taiob Ali gives us a brief summary of Azure SQL Database Serverless:

Occasionally, load balancing automatically occurs if the machine cannot satisfy resource demand within a few minutes. For example, if the resource demand is 4 vCores, but only 2 vCores are available, it may take up to a few minutes to load balance before 4 vCores are provided. The database remains online during load balancing except for a brief period at the end of the operation when connections are dropped.

Click through for more points along these lines.


Dynamic Format Strings when using Calculation Groups

Published 2020-10-21 by Kevin Feasel

Alberto Ferrari shows off how you can dynamically generate format strings when using calculation groups in Power BI:

Each product in Contoso weighs a certain weight. The weight is stored in two columns: the unit of measure and the actual weight, expressed in that unit of measure. Specifically, Contoso uses three units of measure: ounces, pounds, and grams.
Because the units of measure are different, you cannot aggregate the weight over different products. If you author a simple measure that computes the ordered weight of products by using a simple SUMX, the result is wrong:

Click through to see how you can work through this problem.
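
To make the unit problem concrete, here is a minimal Python sketch (not the DAX from Alberto's post) showing why adding raw weights across units misleads and how converting to a common unit first gives a meaningful total. The product names, weights, and quantities are hypothetical sample data; the conversion factors are the standard definitions of the avoirdupois ounce and pound in grams.

```python
# Grams per unit of measure (standard definitions).
GRAMS_PER_UNIT = {
    "grams": 1.0,
    "ounces": 28.349523125,
    "pounds": 453.59237,
}

# Hypothetical order lines: (product, unit of measure, weight in that unit, quantity).
orders = [
    ("Headphones", "ounces", 8.0, 2),
    ("Laptop", "pounds", 4.0, 1),
    ("USB cable", "grams", 50.0, 3),
]


def naive_total(rows):
    """Sum weight * quantity while ignoring the unit: mixes ounces, pounds, and grams."""
    return sum(weight * qty for _, _, weight, qty in rows)


def total_in_grams(rows):
    """Convert each weight to grams before summing, so the total has a single unit."""
    return sum(weight * GRAMS_PER_UNIT[unit] * qty for _, unit, weight, qty in rows)


if __name__ == "__main__":
    print("Naive total (meaningless unit):", naive_total(orders))
    print("Total in grams:", round(total_in_grams(orders), 2))
```

The naive figure is a number without a unit, which is the same trap a plain SUMX over the weight column falls into. Converting first is one way out; the linked post instead keeps the units separate and changes the format string dynamically.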


SQL Server Management Studio 18.7

Published 2020-10-21 by Kevin Feasel

Drew Skwiers-Koballa announces SQL Server Management Studio version 18.7 is now generally available:

Policy-based management is accessed in SQL Server Management Studio under “Management” in the object explorer as “Policy Management”. Getting started with policy-based management can be accelerated by importing the sample policies available for SQL Server. In September, these policies were added to the open source collection of SQL Server samples to facilitate their use and improvement. You can access these sample policies on the GitHub repository and your contributions to these best practices are welcome. For more information on Policy-Based Management, please check out the documentation.

I think Policy-Based Management is one of the biggest missed opportunities in SQL Server. They came out with a good start in 2008 but the product stagnated after that and it remains under-utilized as a result. Perhaps open-sourcing the policies will help, as the key problem with PBM was how limited it was.


There Is No Instance Named X Error in PSSDiag

Published 2020-10-21 by Kevin Feasel

Eric Cobb troubleshoots an error:

I’m writing this as a reminder to myself for a problem I just had to solve. When running the PSSDIAG tool, if you are running it against SQL 2016 SP2 or higher, you may get the following error:
There is no instance named “YourServer”
Message: The parameter is incorrect

Click through for the solution.


Alternative Ways of Displaying Heatmap Data

Published 2020-10-20 by Kevin Feasel

Cole Nussbaumer Knaflic gives us a couple of alternatives to displaying data in a heatmap:

I often describe heatmaps as a good means for getting an initial view of your data. They can help you start to explore and understand where there might be something interesting to highlight or dig into. But once you’ve identified the noteworthy aspects of your data, should you use heatmaps to communicate them?
As often is the case, it depends.
If you are communicating to an audience who likes to see data in tables—applying heatmap formatting can provide a visual sense of the numbers without fully changing the approach (or having it feel like you’ve taken detail away). If you know your stakeholders will want to look up specific numbers (particularly in the case where different stakeholders will care about different numbers) and then understand them in the context of the broader landscape, a heatmap may also work in this scenario.

Click through for some ideas.


The Spark Starter Guide

Published 2020-10-20 by Kevin Feasel

Landon Robinson has some good news for us:

If you visit hadoopsters.com/spark or thesparkguide.com, you’ll see something new and exciting from us. It’s official: we’ve written and are publishing a comprehensive guide to Apache Spark.
This guide will be completely online and completely free. A book’s worth of content, containing exercises in Python and Scala to teach you Spark, at your fingertips. Again, free.

Landon has posted chapter 1, section 1 already:

This section introduces the concept of data pipelines – how data is processed from one form into another. It’s also the generic term used to describe how data moves from one location or form, and is consumed, altered, transformed, and delivered to another location or form.
You’ll be introduced to Spark functions like join, filter, and aggregate to process data in a variety of forms. You’ll learn it all through interactive Spark exercises in Scala and Python.

This is very early in the process but I’m excited.
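
For readers new to the idea, the join, filter, and aggregate steps named above can be sketched in plain Python. This is not Spark code and not taken from the guide; it is a small stand-in pipeline over in-memory lists, and the table names, columns, and values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical input data standing in for two source tables.
customers = [
    {"customer_id": 1, "region": "East"},
    {"customer_id": 2, "region": "West"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 2, "amount": 5.0},
    {"order_id": 12, "customer_id": 1, "amount": 40.0},
    {"order_id": 13, "customer_id": 3, "amount": 99.0},  # no matching customer
]


def join(order_rows, customer_rows):
    """Inner join on customer_id: orders without a matching customer are dropped."""
    by_id = {c["customer_id"]: c for c in customer_rows}
    return [
        {**o, **by_id[o["customer_id"]]}
        for o in order_rows
        if o["customer_id"] in by_id
    ]


def filter_rows(rows, min_amount):
    """Keep only rows whose amount is at least min_amount."""
    return [r for r in rows if r["amount"] >= min_amount]


def aggregate(rows):
    """Total the amount per region."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def run_pipeline(order_rows, customer_rows, min_amount=10.0):
    """Move data from one form to another: join, then filter, then aggregate."""
    return aggregate(filter_rows(join(order_rows, customer_rows), min_amount))


if __name__ == "__main__":
    print(run_pipeline(orders, customers))
```

Spark expresses the same three steps as DataFrame operations and distributes them across a cluster; the shape of the pipeline is what carries over.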


Azure Site-to-Site VPN Blocking Certain Traffic

Published 2020-10-20 by Kevin Feasel

Denny Cherry diagnoses a network configuration issue:

I ran across an interesting issue a couple of weeks ago when working with a client. The client has several subsidiaries, each with their own vNet. The client had a site-to-site VPN between the Azure vNets. All traffic was successfully crossing the Azure site-to-site VPN as expected. The sticking point was a software licensing server running in one of the subsidiaries’ Azure infrastructure configurations. The licensing software simply wasn’t working.

Click through to learn why.

