Curated SQL – Page 1017 – A Fine Slice Of SQL Server

Waits and Queues has been used as a SQL Server performance tuning methodology since Tom Davidson published the above article as well as the well-known SQL Server 2005 Waits and Queues whitepaper in 2006. When used in combination with resource metrics, waits can be valuable for assessing certain performance characteristics of the workload and aid in steering tuning efforts. Waits data is surfaced by many SQL Server performance monitoring solutions, and I’ve been an advocate of tuning using this methodology since the beginning. The approach was influential in the design of the SQL Sentry performance dashboard, which presents waits flanked by queues (key resource metrics) to deliver a comprehensive view of server performance.
However, some seem to have missed Davidson’s point regarding the importance of resources and rely almost entirely on waits to present a picture of query performance and system health. Waitstats come directly from the SQL Server engine and are easy to consume and categorize. Waiting queries mean waiting applications and users, and no one likes to wait! From a marketing standpoint this is pure gold for a SQL Server monitoring tools vendor – it is easier to evangelize waits analysis as a singular solution for making queries and applications faster than the full story, which is more involved.
Unfortunately, a waits-focused approach to the exclusion of resource analysis can mislead users, and worst-case leave them flying blind. SentryOne team members Kevin Kline and Steve Wright have previously touched on this here and here. In this post I’m going to take a deeper dive into some recent research made possible by Query Store that has shed new light on how deficient waits-focused tuning can truly be.

Interesting research and Greg does a great job of explaining it.

Comments closed

Building Custom Sort Orders in Power BI

Published 2020-01-24 by Kevin Feasel

Reza Rad shows us how to perform custom sorting in Power BI:

I have previously written about how to sort a column by another column, and I used Month Names as an example. However, still, many are unaware that the same technique with slight modifications can be applied to any other columns. You can have a text column in your slicer (product category for example), and sort it based on a different order than the normal alphabetical order. In this post, I am going to show you how to do a custom sort order for a column in Power BI.

There are a few important considerations, so check out Reza’s post.

Comments closed

Creating a UI in Powershell

Published 2020-01-24 by Kevin Feasel

Michael Berthold walks us through a useful example of using POSHGUI’s UI editor:

Some time back, a customer and I were working with the SentryOne PowerShell Module. Our PowerShell Module lets you manage the targets you are monitoring with SentryOne using a script or command line rather than the UI. This is a great time saver when you’re administering performance monitoring for hundreds or thousands of database servers.
The customer and I worked together to type up the commands they wanted for their script. They mentioned how it would be great if there were a GUI for this. This seemed odd initially, because the reason we were doing this in the first place was to automate these actions outside of a GUI. We spoke on it for a bit, and their meaning become clear. They envisioned a simple GUI used to guide in defining the commands for the PowerShell Module. I agreed that this would be helpful in getting a head start on scripting automation. I decided to find a way to fill this need.
This post explores one way to create a GUI using PowerShell. I’m using the SentryOne PowerShell Module for this example, but this method can be used for any PowerShell script.

Click through to see the example.

Comments closed

Concepts in Support Vector Machines

Published 2020-01-23 by Kevin Feasel

Abhijit Telang takes us through the calculations involved in Support Vector Machines and then gives us an example in R:

So, let’s take that out and we are back to old, classical vector algebra. It’s like a person with a bunch of sticks to figure out which one to lay where in a 2-D plane to separate one class of objects from another, provided class definitions are already known.
The problem is which particular shape and length must be chosen to show maximum contrast between classes.
We need to arrive at a function definition, in such a way that the value a given function takes changes drastically (e.g. from a large positive value to a large negative value).

SVM is often great for two-class classification problems, and different variants also work well for multi-class problems.

Comments closed

Log Aggregation with Apache Flink

Published 2020-01-23 by Kevin Feasel

Gyula Fora and Matyas Orhidi have started a series on log aggregation with Apache Flink:

There are several off-the-shelf solutions available on the market for log aggregation, which come with their own stack of components and operational difficulties. For example, notable logging frameworks that are widely used in the industry are ELK stack and Graylog.
Unfortunately, there is no clear cut solution that works for every application, and different logging solutions might be more suitable for certain use cases. The log processing of real-time applications should for instance also happen in real-time, otherwise, we lose timely information that may be required to successfully operate the system.
In this blog post, we dive deep into logging for real-time applications.

This post is mostly understanding and setup, but it leads into processing and visualization.

Comments closed

Migrating Oracle Exadata Workloads to Azure

Published 2020-01-23 by Kevin Feasel

Kellyn Pot’vin-Gorman shows the process of moving from an Exadata system to Oracle on Azure:

An Exadata is an engineered system- database nodes, secondary cell nodes, (also referred to as storage nodes/cell disks), InfiniBand for fast network connectivity between the nodes, specialized cache, along with software features such as Real Application Clusters, (RAC), hybrid columnar compression, (HCC), storage indexes, (indexes in memory) offloading technology that has logic built into it to move object scans and other intensive workloads to cell nodes from the primary database nodes. There are considerable other features, but understanding that Exadata is an ENGINEERED system, not a hardware solution is important and its both a blessing and a curse for those databases supported by one. The database engineer must understand both Exadata architecture and software along with database administration. There is an added tier of performance knowledge, monitoring and patching that is involved, including knowledge of the Cell CLI, the command line interface for the cell nodes. I could go on for hours on more details, but let’s get down to what is required when I am working on a project to migrate an Exadata to Azure.

Click through for the process.

Comments closed

ORDER BY Can Change the Query Plan

Published 2020-01-23 by Kevin Feasel

Erik Darling notes that adding an ORDER BY clause to a query can change the underlying query plan:

Sometimes I think it’s interesting how adding a seemingly useless or harmless thing to a query can change the query plan.
Here’s a quick example using an Order By on an indexed column.

Your mileage may vary on whether that’s a good thing.

Comments closed

Power BI: Visual has Exceeded the Available Resources

Published 2020-01-23 by Kevin Feasel

Chris Webb explains why you might see an error in Power BI:

This visual has exceeded the available resources. Try filtering to decrease the amount of data displayed.Please try again later or contact support. If you contact support, please provide these details.More details Resource Governing: The query exceeded the maximum memory allowed for queries executed in the current workload group (Requested 1048580KB, Limit 1048576KB).
The official Power BI documentation has similar advice to what’s shown in this dialog about what to do here, but what’s really going on?
The information in the “More details” section of the section dialog gives you a clue: in this case it’s resource governance. When you run a DAX query in Power BI it will always use a certain amount of memory; inefficient DAX calculations can cause a query to try to grab a lot of memory. In Power BI Desktop these queries may run successfully but be slow, but the Power BI Service can’t just let a query use as many resources as it wants (if it did, it may affect the performance of other queries being run by other users) so there is a resource governor that will kill queries that are too resource hungry. In the case of the visual above the query behind it tried to use more than 1GB of memory and was killed by the resource governor.

Read on to understand where these limits are and how you can modify them.

Comments closed

Indexes for Memory-Optimized Tables

Published 2020-01-23 by Kevin Feasel

Monica Rathbun takes us through the options available when creating indexes on memory-optimized tables:

Before we dive into this subject it is VERY important to note the biggest differences.
First, ALL memory optimized indexes MUST be created when the table is created or migrated. You cannot add indexes in an existing table without dropping and recreating the table.
Secondly, currently you can only have 8 indexes per table including your primary key. Remember that every table must have a primary key to enforce a secondary copy for a minimum of schema durability This means you can only really add 7 additional indexes so be sure to understand your workloads and plan indexing accordingly.

There are a few other differences as well, which Monica covers before detailing the specific index options.

Comments closed

Solving the Gaps and Islands Set of Problems

Published 2020-01-23 by Kevin Feasel

Ed Pollack continues a series on gap and island analysis:

Gaps and islands analysis supplies a mechanism to group data organically in ways that a standard GROUP BY cannot provide. Once we know how to perform an analysis and group data into islands, we can extend this into the realm of real data.
For all code examples in this article, we will use a set of baseball data that I’ve created and maintained over the years. This data is ideal for analytics as it is large and contains data quality that varies between very accurate and very sloppy. As a result, we are forced to consider data quality in our work, as well as scrutinize boundary conditions for correctness. This data will be used without much introduction as we will only reference two tables, and each is relatively straightforward.

The code in this article gets a bit complex, but Ed shows off some powerful techniques.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28

Curated SQL Posts