Press "Enter" to skip to content

Author: Kevin Feasel

Convert SSAS Tabular Processing Scripts Into Tables

Chris Koester shows how to take an Analysis Services Tabular processing script in TMSL format and turn it into a table using OPENJSON:

The previous post looked at how to process SSAS Tabular models with TMSL. Since SQL Server 2016 adds new JSON capabilities, let’s look at how to convert TMSL JSON to a table with OPENJSON. OPENJSON is a new function in SQL Server 2016 that, per Microsoft:

OPENJSON is a table-valued function that parses JSON text and returns objects and properties from the JSON input as rows and columns.

In short, OPENJSON converts JSON text to a table. Since TMSL is JSON, this function can convert an SSAS Tabular processing script into a table. This could be useful if you wanted to document an SSAS processing schedule.

That’s an interesting use of OPENJSON.
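
A minimal sketch of the idea, assuming SQL Server 2016 or later (this is not Chris’s script; the TMSL payload and column names below are invented for illustration):

-- Shred a hypothetical TMSL refresh command into rows and columns with OPENJSON.
DECLARE @tmsl NVARCHAR(MAX) = N'{
  "refresh": {
    "type": "full",
    "objects": [
      { "database": "AdventureWorks", "table": "DimDate" },
      { "database": "AdventureWorks", "table": "FactInternetSales" }
    ]
  }
}';

SELECT j.[database], j.[table]
FROM OPENJSON(@tmsl, '$.refresh.objects')
WITH
(
    [database] NVARCHAR(128) '$.database',
    [table]    NVARCHAR(128) '$.table'
) AS j;

Each object in the refresh command comes back as one row, which is the piece you would store or join against when documenting a processing schedule.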


Drawing Cubes With SQL Server Spatial

Slava Murygin has entered his cubism phase:

Hey, it is time to level up, and instead of drawing Spirals, Fractals, and other cool stuff I decided to go 3D!

So, my first try will be drawing 3D cubes.
As you know, SQL is not an Object Oriented Programming language, and I can’t simply create an Object “Cube” with certain properties. To create a Cube I need a Stored Procedure:

Click through for a touch of Picasso in your database.
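
A minimal sketch of the general approach, with made-up coordinates rather than Slava’s procedure: three GEOMETRY polygons drawn as the front, top, and right faces of an isometric-style cube, which SSMS renders on the Spatial results tab.

-- Front, top, and right faces of a single "cube" as spatial polygons.
SELECT geometry::STGeomFromText('POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))', 0) AS CubeFace   -- front
UNION ALL
SELECT geometry::STGeomFromText('POLYGON((0 10, 5 15, 15 15, 10 10, 0 10))', 0)            -- top
UNION ALL
SELECT geometry::STGeomFromText('POLYGON((10 0, 15 5, 15 15, 10 10, 10 0))', 0);           -- right

A stored procedure like Slava’s would presumably parameterize the origin and edge length and emit these faces for any number of cubes.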


Filtered Statistics

William Wolf shows us the value of filtered statistics:

Wolf only had 700 complaints, but 166,900 records were estimated for return. He is looking much worse than reality shows.

So, what is happening is that there are 3 possible employee results for complaints. It is rather simple. The CE is taking the total number of records (500,701) and dividing by 3, assuming that all 3 values will have roughly the same number of records. We see that, along with the estimated number of records being the same, the execution plan operators are the same. For such a variation in the number of records, there must be a better way.

I rarely create filtered statistics, in part because I don’t have a good idea of exactly which values people will use when searching.  But one slight change to Wolf’s scenario might help:  having a filter where name = Sunshine and a filter where name <> Sunshine (or name is null).  That might help a case where there’s extreme skew with one value and the rest are much closer to uniformly distributed.
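
A minimal sketch of that idea, using hypothetical table and column names (only the Sunshine value comes from Wolf’s post):

-- Statistics scoped to the heavily skewed value...
CREATE STATISTICS st_Complaints_Sunshine
    ON dbo.Complaints (EmployeeName)
    WHERE EmployeeName = 'Sunshine';

-- ...and separate statistics covering the remaining values.
-- (A third filtered statistic could cover EmployeeName IS NULL.)
CREATE STATISTICS st_Complaints_NotSunshine
    ON dbo.Complaints (EmployeeName)
    WHERE EmployeeName <> 'Sunshine';

With the skewed value isolated, the optimizer gets a histogram for “everything else” that is much closer to uniform.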


Temporal Memory-Optimized Tables

Ned Otter describes how hybrid disk + memory-optimized temporal tables differ from on-disk temporal tables:

As changes are made to rows in the temporal memory-optimized table, before being transferred to the history table on disk, they are first migrated to an internal memory-optimized staging table. That means when you query the “history table”, you could be retrieving rows from both the on-disk history table, and internal staging table. Because no custom indexing was possible on the internal staging table, there could be performance implications when executing queries against historical data. Microsoft addressed these potential performance issues in SQL 2016 SP1 (detailed in this CAT blog post).

The internal staging table only gets flushed to the on-disk history table when it reaches 8% of the size of the temporal table. Given the current capacities of Windows Server 2016 (24TB memory), it’s now possible to have very large memory-optimized tables. 8% of one of those large memory-optimized tables could be quite large, which will affect query performance, if predicates don’t match available indexes.

Read on for some sobering thoughts on the topic.
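
For context, a minimal sketch of what such a hybrid table looks like, with invented names; it assumes SQL Server 2016+ and a database that already has a memory-optimized filegroup:

CREATE TABLE dbo.Account
(
    AccountId INT NOT NULL PRIMARY KEY NONCLUSTERED,
    Balance   DECIMAL(19, 4) NOT NULL,
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH
(
    MEMORY_OPTIMIZED = ON,          -- current rows live in memory
    DURABILITY = SCHEMA_AND_DATA,
    SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.AccountHistory)  -- history rows end up on disk
);

The internal staging table Ned describes sits invisibly between dbo.Account and dbo.AccountHistory, which is why its flush threshold and the indexing of historical queries deserve attention.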


Automating Installation Of SQL Server

Nate Johnson has a script he uses to automate installation of SQL Server on a new server:

We can then use this file in the ConfigurationFile argument of setup.exe from the SQL Server install media.  To put a little more color on that: the .ini file is really just a collection of command-line arguments to setup.exe; you could also list them all out in-line, but that would be tedious and silly.  Here’s a couple of major selling points of creating your own config file:

  1. Slipstream updates (SP’s, CU’s), instead of having it go out to MSFT update servers (or *aghast* sticking with the original RTM bits, you heathen you!)

  2. Specify drive letters / default file locations: sure, this may be considered old-hat if you’re running super slick storage, but I still find it makes management a bit easier if I know where my MDFs, LDFs, TempDB, & backups will always be.

  3. Take advantage of 2016’s better TempDB setup options (# files, size & growth)

Read the whole thing.
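
A minimal sketch of the shape of such a configuration file and how it gets consumed (the parameter values below are illustrative, not Nate’s):

; ConfigurationFile.ini (excerpt)
[OPTIONS]
ACTION="Install"
FEATURES=SQLENGINE
INSTANCENAME="MSSQLSERVER"
; 1) Slipstream SPs/CUs from a local folder instead of Microsoft's update servers
UpdateEnabled="True"
UpdateSource="D:\SQLUpdates"
; 2) Default file locations
SQLUSERDBDIR="E:\Data"
SQLUSERDBLOGDIR="F:\Logs"
SQLBACKUPDIR="G:\Backups"
; 3) SQL Server 2016 tempdb options: file count, initial size (MB), growth (MB)
SQLTEMPDBDIR="T:\TempDB"
SQLTEMPDBFILECOUNT="4"
SQLTEMPDBFILESIZE="1024"
SQLTEMPDBFILEGROWTH="512"

setup.exe then consumes the whole file in one go:

setup.exe /ConfigurationFile=ConfigurationFile.ini /IACCEPTSQLSERVERLICENSETERMS /SQLSYSADMINACCOUNTS="DOMAIN\DBAGroup"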


Unit Testing Kafka Streams

Anuj Saxena shows us how to build mocks for streams in Kafka Streams:

Here, we are using Kafka Streams in our applications. We are done with the implementation, but again, the most important thing left is testing. This blog is about how to test the application we have created. For this, I’ll be taking the sample app I created in my previous blog for both the high-level DSL and the low-level Processor API.

Traditionally, we test our Kafka application with an integration test, for which we need to create a ZooKeeper instance and a real Kafka broker. After that, we need a mock producer and a mock consumer for our app to produce the inputs and receive the outputs. That makes it a big hassle just to test our app. For real scenarios and for an actual integration test, this is needed without a doubt.

Click through for an example.


Sympathy For The Part-Timer

John Mount wants us to think about part-time users:

The second point I think is particularly interesting. It means:

An R user who does not consider themselves an expert programmer could be maintaining code that they understand, but could not be expected to create from scratch.

Or:

Let’s have some sympathy for the part-time R user.

This is the point we will emphasize in our new example.

Read on for a particular example.  I think this is good advice to generalize:  write your code to make it as easy as possible for “part-time” users.  This applies to custom code you write as well: unless you are constantly in a particular part of the code base, you’ll forget the details later and have the same problems that a part-timer working with a different language would have.


Dynamic Date Dimensions In Power BI

Ginger Grant shows how to build a dynamic date dimension in Power BI using either DAX or M:

Recently I needed to create a date dimension for a Power BI model, as there was not one in the source database. There are two different ways that I could do this: using DAX from the Modeling Tab within the Data View, or using M via the Query Editor window.  As a general rule, when it is possible, data manipulation should be done in M, as it offers a greater level of compression.  In this case, though, I am using a function in DAX, which is not the same as creating a calculated column.

Read on to see code examples for each method, as well as Ginger’s analysis.


More Annoying MSDTC Problems

Jeff Mlakar walks through some advanced MSDTC troubleshooting:

Sometimes when a VM is cloned from a template (VMware or Hyper-V) the CID can be duplicated between the two machines. Evidence of this can be seen in the Event Viewer (Event ID 4101). This means the two MSDTC services will not be able to communicate with each other.

A possible error message is:

The Microsoft Distributed Transaction Coordinator (MS DTC) has cancelled the distributed transaction

The MSDTC feature of the Windows operating system requires unique CID values to ensure that MSDTC functionality between computers works correctly. Disk duplicate images of Windows installations must have unique CID values or MSDTC functionality may be impaired. This can occur when using virtual hard disks to deploy an operating system to a virtual machine.

This burned me in the past.  Jeff has several scenarios that he walks us through, so if you’re using the Distributed Transaction Coordinator, definitely check this out.


Limiting Color Usage On Dashboard Charts

Jesse Gorter explains why you shouldn’t overwhelm your dashboard chart users with colors:

In this example we use a signal color for the past too. Do you notice how the use of green distracts from the current week, which is red? This suggests we are doing great overall even though, at this time, we are not doing so great. It is up to you to decide what you want to communicate. If you are a sports team showing your rank during the season, only the current position would be important. In sales, having 30 weeks of outstanding sales above the target and the current week selling slightly under, it would make sense to show the signal color for the past.

Not to mention that limiting colors makes it easier for people with color vision deficiency (CVD) to read your report, something the red-green scheme does not handle well.
