Automatic Tuning In SQL Server

Grant Fritchey is a fan of SQL Server 2017’s automatic tuning feature:

The core of automatic tuning at this point in time (because I’m sure it’s going to evolve) is the ability of the query engine to spot when a query has generated a new plan and that new plan is causing performance to degrade. This is known as a regression in the plan. It comes from bad parameter sniffing, changes in statistics, cumulative updates, or the big notorious one, the cardinality estimator introduced in SQL Server 2014 (it’s been almost four years, I’m not calling it new any more). Spotting a plan regression prior to SQL Server 2016 and the introduction of the Query Store was a major pain. However, now it’s easy. You can spot them by reading the data collected. Further, Microsoft can spot them by reading the data collected, and it does.

If the engine sees that a plan is causing a regression (you have to have Query Store enabled for this), then it will write out a suggestion for fixing it to the new DMV, sys.dm_db_tuning_recommendations. If you further enable automatic tuning on your database, then SQL Server can automatically substitute the last good plan for you, fixing the problem. Further, SQL Server monitors this fix and if, over time, it’s clear that the forced plan is causing problems, it will unforce the plan, again automagically.

Click through for more information, including a query to read from the tuning recommendations DMV.

Unit Testing Spark Streaming DStreams

Anuj Saxena gives an example of using StreamingSuiteBase to build unit tests for DStreams in Spark Streaming:

So what’s the problem? How to execute streaming logic in a test environment.

We can write Integration test cases and provide the actual environment in the integration test. But for unit testing, we need a testing environment which should not depend on any external application.

Click through for the example.

The Need For Multiple Warehouse Architectures

James Serra argues in favor of a data lake approach and a traditional data warehouse:

I think the ultimate question is: Can all the benefits of a traditional relational data warehouse be implemented inside of a Hadoop data lake with interactive querying via Hive LLAP or Spark SQL, or should I use both a data lake and a relational data warehouse in my big data solution?  The short answer is you should use both.  The rest of this post will dig into the reasons why.

I touched on this ultimate question in a blog that is now over a few years old at Hadoop and Data Warehouses so this is a good time to provide an update.  I also touched on this topic in my blogs Use cases of various products for a big data cloud solutionData lake detailsWhy use a data lake?and What is a data lake? and my presentation Big data architectures and the data lake.  

Read on for James’s argument, which is good.  My argument is summed up as follows:  the purpose of a data warehouse is to solve known business problems—that is, to help build reports that people on the business side need based on established requirements.  The purpose of a data lake is to hold all kinds of data and curate it for when people come looking for something they didn’t know they needed.

Failover Groups In Azure SQL Database

Jim Donahoe shows off Failover Groups in Azure SQL Database.  Part 1 involves setting up a Failover Group:

In my former company, we had 22 web applications that all had connections to various databases.  We had all of our databases configured for Geo-Replication already, but still if we had to failover, we had to update each connection string for the web apps along with others which became a tedious process. In came Failover Groups to the rescue!  With a Failover Group, I was able to create two endpoints that stayed the same no matter which server was primary/secondary.  I liked to think of these as my Availability Group Listeners as they kinda serve the same functionality: Route traffic to a node depending on if its read-only or not.  Best part?  It’s configured through the Azure Portal SO EASILY!  You can use PowerShell as well, but for this blog post, I will walk through the creation via the Portal.  I will make a separate post or attach a script at some point for the PowerShell deployment.

Before we start the configuration portion of this though, let’s take a look at how Microsoft defines what a Failover Group is.  I found this definition here:  “Azure SQL Database auto-failover groups (in-preview) is a SQL Database feature designed to automatically manage geo-replication relationship, connectivity, and failover at scale.”  Sounds pretty interesting, right? Let’s make one!

In Part 2, Jim shows how to connect to SQL Server using the Failover Group listener:

Well, now that the easy stuff is out of the way, let’s talk about how you connect to these groups via SSMS.  This is where some of the confusion happens.  When I first configured a Failover Group, the first thing I tried to do was connect to the Primary server via SSMS thinking it will work just like an Always On Listener in traditional SQL Server…NEWP!

If you’re running a production database on Azure SQL Database, you might want to look at Failover Groups.

The Argument For Single-Socket Servers

Joe Chang wants us to think about socket counts:

It might seem that the 2-socket system continues to be a good choice, as two processors with an intermediate number of cores is less expensive than one processor with twice as many cores. An example is the Xeon Gold 6132 14-core versus the Xeon Platinum 8180 28-core processors. In addition, the two-socket system has twice the memory capacity and nominally twice as much memory bandwidth.

So, end of argument, right? Well, no.

Click through for his argument in favor of single-socket machines for OLTP systems.

Power BI Usage Metrics

Gogula Aryalingam shows how to access Power BI usage metrics for a report or dashboard:

Each app workspace gets its own report usage metrics data set, it’s just that you don’t see it when you are in the portal. In order to access it (at least for now) you need to use Power BI Desktop. When you open Power BI Desktop, you need to sign-in with the appropriate login, and then choose Power BI service from Get Data menu item. You then get listed with a set of app workspaces; within each you would find a list of all the datasets that were every published to each of the workspaces. Additionally, Power BI will also give you two more datasets: Report Usage Metrics Model and Dashboard Usage Metrics Model. However, these data models will only show up if you had attempted to view usage metrics at least once on one of the reports of the app workspace.

Read the whole thing.

When The Power BI Work Is Done

Melissa Coates has a great checklist to help you figure out if your Power BI dashboard is done:

Auto time intelligence is enabled by default, and it applies to each individual PBIX file (there’s not a global option). For most datetime columns that exist in the dataset, a hidden date table is created in the model to support time-oriented DAX calculations. This is great functionality for newer users, or if you have a very simple data model. However, if you typically utilize a standard Date table, then you will want to disable the hidden date tables to reduce the file size. (Tip: You can view the hidden date tables if you connect to the PBIX via DAX Studio.)

There are a lot of good things to think about here.

Using The Squint Test

Meagan Longoria gives us the squint test:

While you can definitely perform the Squint Test on your report within Power BI Desktop, I recommend also testing in a browser once the report is deployed to PowerBI.com or to the Power BI Report Server portal since colors and objects may be slightly different there.

The Squint Test is also used in web page design, so web developers have made tools to aid them in this check. While just squinting at the page is perfectly sufficient, using a browser extension or another tool allows you to easily share your findings with others. In the Chrome Browser, there is a free extension called The Squint Test. This extension places an eye icon near the top right of the browser window. Clicking the icon provides a slider that allows you to increase or decrease the amount of blur applied to the page.

Meagan also has an example of applying this test and picks a dashboard where she can make some improvements, so check it out.

Categories

December 2017
MTWTFSS
« Nov Jan »
 123
45678910
11121314151617
18192021222324
25262728293031