Press "Enter" to skip to content

Author: Kevin Feasel

Creating Minesweeper In Power BI

Philip Seamark has fun with a classic Windows game reimplemented in Power BI:

The latest addition to my recent series of DAX based games is the classic Minesweeper game.  This is the game where you are presented with a 9 x 9 matrix of squares.  There are 10 hidden mines and you can either step on a square or place a flag where you think there might be a mine.  If you are lucky enough not to step on a square that contains a mine, you will get clues that help you identify where the mines are.

Click here if you would like to see the final publish to web version.

Click here if you would like to download the PBIX version to go through the code.

Still better than the Windows 10 version.

Comments closed

Taking Action With Wait Stats

Aaron Bertrand lays out a course of action (or inaction) when dealing with the most common wait types in SQL Server:

I started going a little further than this, mapping out some of the more common wait types, and noting some of the properties they shared. Translated into questions a tuner might have about a wait type they are experiencing:

  • Can the wait type be solved at the query level?
  • Is the core symptom of the wait likely to be affecting other queries?
  • Is it likely you will need more information outside the context of a single query and the wait types it experienced in order to “solve” the problem?

When I set out to write this post, my goal was just to group the most common wait types together, and then start jotting notes about them relating to the above questions. Jason pulled the most common ones from the library, and then I drew some chicken scratch on a whiteboard, which I later tidied up a bit. This initial research led to a talk that Jason gave on the most recent TechOutbound SQL Cruise in Alaska. I’m kind of embarrassed that he put a talk together months before I could finish this post, so let’s just get on with it. Here are the top waits we see (which largely match Paul’s survey from 2014), my answers to the above questions, and some commentary on each:

Read on for the top 10 list.

Comments closed

Table Variable Deferred Compilation: When It Works

Milos Radivojevic gives us a good example of when table variable deferred compilation is a good thing:

As mentioned in the previous article, SQL Server 2019 cardinality estimations for a table variable are based on actual table variable row counts. Therefore, in SQL Server 2019, we should expect better estimations and better plans for queries that use table variables.
Which queries will benefit from this improvement? Generally, queries that use table variables with a lot of rows in them, which are not tuned yet. For table variables with a few rows, there will not be significant changes and you should expect the same execution plan and almost same execution parameters.

Queries whose execution was slow due to underestimation in table variables usually implement logical joins by using Nested Loop Join physical operator where a Hash or Merge Join operators would be more appropriate. In addition to this, underestimation of table variables participating in multiple joins could lead to issues with insufficient memory grants, and thus data spilling to tempdb .

Click through for the example.  The next post in the series will be a case where it doesn’t work very well.

Comments closed

Troubleshooting KSQL Executions

Robin Moffatt shows us some of the tools available for researching problems with KSQL queries executed against a server:

What does any self-respecting application need? Metrics! We need to know how many messages have been processed, when the last message was processed and so on.

The simplest option for gathering these metrics comes from within KSQL itself, using the same DESCRIBE EXTENDED command that we saw before:

ksql> DESCRIBE EXTENDED GOOD_RATINGS;
[...]
Local runtime statistics
------------------------
messages-per-sec:      1.10 total-messages:     2898 last-message: 9/17/18 1:48:47 PM UTC
 failed-messages:         0 failed-messages-per-sec:         0 last-failed: n/a
(Statistics of the local KSQL server interaction with the Kafka topic GOOD_RATINGS)
ksql>

You can get more details, including explain plans, from this.  There are external tools which Robin demonstrates as well, which let you track the streams over time.

Comments closed

Enhancements To Actual Query Plans In SSMS 18

Brent Ozar points out a big enhancement to the way SQL Server Management Studio views actual query plans:

You can see the estimated and actual number of rows right there on the query plan just like live query plans! You no longer have to waste hours of your life hovering over different parts of the query plan in order to see where the estimated row counts veer off from the actual row counts.

This doesn’t require SQL Server 2019, either.

Read on for Brent’s thoughts on the matter.

Comments closed

SQL Server Management Studio 18.0 Released

Dinakar Nethi announces the release of a public preview of SQL Server Management Studio 18.0:

Shell improvements

  • SSMS is based on the new VS 2017 Isolated Shell. This means a modern shell that unlocks all the accessibility features from both SSMS and VS 2017.

  • Smaller download size (~400 MB). This is less than half of what SSMS 17.x is.

  • SSMS can be installed in a custom folder. Currently, this is only available on the command line setup. Pass the extra argument to SSMS-Setup-ENU.exe, SSMSInstallRoot = C:\MyFolder

  • High DPI enabled by default.

  • Better support for multiple monitors to ensure dialogs and windows pop up on the expected monitor.

  • Isolation from SQL Engine. SSMS does not share components with SQL engine anymore. More isolation from SQL engine allows for more frequent updates.

  • Package Ids no longer needed to develop SSMS Extensions.

18.0 will install alongside 17, so you can have both at the same time.

Comments closed

Batch Mode Processing On Rowstore Tables

Dmitry Pilugin shares some thoughts on the expansion of batch mode processing to rowstore tables:

The main advantages of Batch Mode are:

  • Algorithms optimized for the multi-core modern CPUs;
  • Better CPU cache utilization and increased memory throughput;
  • Reduced number of CPU instructions per processed row.

All these features make Batch Mode much faster than Row Mode (typically an order of magnitude, 10x-100x times faster) for analytical queries with CS indexes. One major condition for Batch Mode is a presence of a CS index. If you don’t have a CS index on a table involved in a query, you won’t get Batch Mode.

However, some analytical queries may benefit from Batch Mode without a CS index, or CS cannot be created due to some limitations.

There are a few tricks that allow you to enable Batch Mode on a Rowstore table for example with a dummy filtered CS index (see this post from Itzik Ben-Gan), but SQL Server 2019 may use Batch Mode on Rowstore without any extra efforts from your side.

Dmitry dives into the debugger and teases out the specific circumstances which can help get a query considered for rowstore.  If you want a deep dive into what’s currently available, this is your post.

Comments closed

Deploying An Azure Container Within A Virtual Network

Andrew Pruski shows us that you can now deploy an Azure container running SQL Server within an Azure virtual network:

Up until now Azure Container Instances only had one option to allow us to connect. That was assigning a public IP address that was directly exposed to the internet.

Not really great as exposing SQL Server on port 1433 to the internet is generally a bad idea: –

Now I know there’s a lot of debated about whether or not you should change the port that SQL is listening on to prevent this from happening. My personal opinion is, that if someone wants to get into your SQL instance, changing the port isn’t going to slow them down much. However, a port change will stop opportunistic hacks (such as the above).

But now we have another option. The ability to deploy a ACI within a virtual network in Azure! So let’s run through how to deploy.

Click through for those instructions.

Comments closed

Stateful Services With Kubernetes

Kevin Sookocheff explains some scenarios in which stateful Kubernetes services can work well:

With leader election, you begin with a set of candidates that wish to become the leader and each of these candidates race to see who will be the first to be declared the leader. Once a candidate has been elected the leader, it continually sends a heart beat signal to keep renewing their position as the leader. If that heart beat fails, the other candidates again race to become the new leader. Implementing a leader election algorithm usually requires either deploying software such as ZooKeeper, or etcd and using it to determine consensus, or alternately, implementing a consensus algorithm on your own. Neither of these are ideal: ZooKeeper and etcd are complicated pieces of software that can be difficult to operate, and implementing a consensus algorithm on your own is a road fraught with peril. Thankfully, Kubernetes already runs an etcd cluster that consistently stores Kubernetes cluster state, and we can leverage that cluster to perform leader election simply by leveraging the Kubernetes API server.

Kubernetes already uses the Endpoints resource to represent a replicated set of pods that comprise a service and we can re-use that same object to retrieve all the pods that make up your distributed system. Given this list of pods, we leverage two other properties of the Kubernetes API: ResourceVersions and Annotations. Annotations are arbitrary key/value pairs that can be used by Kubernetes clients, and ResourceVersions mark the unique version of every Kubernetes resource in the cluster. Given these two primitives, we can perform leader election in a fairly straightforward manner: query the Endpoints resource to get the list of all pods running your service, and set Annotations on those resources. Each change to an Annotation also updates the ResourceVersion metadata. Because the Kubernetes API server is backed by etcd, a strongly consistent datastore, you can use Annotations and the ResourceVersion metadata to implement a simple compare-and-swap algorithm.

Google has used this approach to implement leader election as a Kubernetes Service, and you can run that service as a sidecar to your application to perform leader election backed by etc. For more on running a leader election algorithm in Kubernetes, refer to this blog post.

This is one of the parts that container services like Docker are striving to answer, but I don’t think they have it quite nailed down yet.

Comments closed

Multi-Class Classification With vtreat

John Mount has an example of using the vtreat package for multi-class classification in R:

vtreat is a powerful R package for preparing messy real-world data for machine learning. We have further extended the package with a number of features including rquery/rqdatatable integration (allowing vtreat application at scale on Apache Spark or data.table!).

In addition vtreat and can now effectively prepare data for multi-class classification or multinomial modeling.

The two functions needed (mkCrossFrameMExperiment() and the S3 method prepare.multinomial_plan()) are now part of vtreat.

Click through for an example of this in action.

Comments closed