Press "Enter" to skip to content

Author: Kevin Feasel

When Date Tables Go Bad

Brent Ozar walks through a scenario in which a calendar table (AKA, date dimension) makes a query perform quite a bit worse:

So why did the date table not perform as well as the old-school way?

SQL Server doesn’t understand the relationship between these two tables. It simply doesn’t know that all of the rows in the Users table will match up with rows in the calendar table. It assumes that the calendar table is doing some kind of filtering, especially given our CAST on the date. It doesn’t expect all of the rows to match.

My reaction was pretty much the same as Koen Verbeeck’s in the comments. Put in clearer terms, calendar tables work best when you’re joining a DATE type to a DATE type. Once you introduce times into the mix, the optimizer has to behave differently, not least because you have to do things like CAST() to coerce data types.

Comments closed

Benefits from Nonclustered Columnstore Indexes

Dave Mason shows off some places where non-clustered columnstore indexes can benefit you:

I tend to work mostly with OLTP environments. Many of them have questionable designs or serve reporting workloads. Not surprisingly, there are a lot of performance-sapping table scans and index scans. I’ve compensated for this somewhat by using row and page compression, which became available on all editions of SQL Server starting with SQL Server 2016 SP1. Could I get even better results with columnstore indexes? Lets look at one example.

Here are four individual query statements from a stored procedure used to get data for a dashboard. If you add up percentages for Estimated Cost (CPU + IO), Estimated CPU Cost, or Estimated IO Cost, you get a total of about 90% (give or take a few percent).

Read on for the queries and to see how adding a non-clustered columnstore index helped in Dave’s case. I haven’t had a great deal of success with non-clustered columnstore indexes, but have greatly enjoyed the use of clustered columnstore indexes for fact tables.

Comments closed

Reviewing the Windows Event Log with Powershell

Jess Pomfret takes us through an improvement to the old Get-EventLog cmdlet:

Recently I was tasked with troubleshooting an incident on a SQL Server at a certain point in the past, the issue being a high CPU alert.  It’s hard (without monitoring solutions set up) to go back in time and determine what the issue is.  However, one thing we can check is the windows event log to see if there was anything happening on the server at that time.

Now, you probably know that my favourite tool of choice is PowerShell, so let’s take a look at how we can use Get-WinEvent to see what was happening in the past.

Get-WinEvent is the newer revamped version of Get-EventLog, and there are two improvements I believe are worth mentioning. Firstly, with the introduction of filter parameters we can now find certain events much easier, which we’ll talk about a little later. Secondly, the performance of Get-WinEvent is much faster than using the legacy command.  I believe this is due to the filtering happening at the event engine instead of within PowerShell.

Those are some rather welcome improvements.

Comments closed

Finding Running SQL Agent Jobs

Garry Bargsley has a quick dbatools script to find running SQL Agent jobs:

Do you run SQL Agent Jobs in your environment?  Do you know what is running at any given time?  Most people do not know what is currently running if you ask them.

There are several times I need to know what job(s) is running on which server. For instance, let’s say on the morning of monthly maintenance you want to check to make sure all your important ETL jobs are complete or that there are currently no database backup jobs running.

Read on to see the cmdlet and how you can use it to query across a broad set of servers.

Comments closed

Calling Stored Procedures with Output Parameters from Entity Framework Core

Erik Ejlskov Jensen shares how to call a stored procedure which includes an output parameter from Entity Framework Core:

In this post I will show how you can call stored procedures with OUTPUT parameters from EF Core. I am using the Northwind database for the sample code.

Erik’s code doesn’t have this problem, but using FromSqlRaw can lead to SQL injection problems if you aren’t careful with sanitizing inputs.

Comments closed

Managing Lakehouse Data

Harsha Gummadavelli gives us an introduction to the Data Lakehouse concept:

“Data Lakehouse” is a new architecture paradigm in the data management space that combines the best characteristics of Data Warehouse and Data Lakes. Once you load the data into a data lake, there is no need to load the data into a warehouse for additional analysis or business intelligence. You can directly query the data residing in cheaper but highly reliable storage, often termed as “Object Stores”, thus reducing the operational overhead on data pipelines.

I will say that I’m not particularly sold on the data lakehouse concept at this point. It’s interesting, in that it reduces the number of systems to maintain by one, but I do wonder about performance issues when trying to replace an existing warehouse. The post turns into a marketing pitch for Informatica, but the first half does give a fair introduction to the concept.

Comments closed

Spark 3.0’s Structured Streaming UI

Genamo Yu, et al, show off the Structured Streaming user interface built into Apache Spark 3.0:

When a developer submits a streaming SQL query, it will be listed in the Structured Streaming tab, which includes both active streaming queries and completed streaming queries. Some basic information for streaming queries will be listed in the result table, including query name, status, ID, run ID, submitted time, query duration, last batch ID as well as the aggregate information, like average input rate and average process rate. There are three types of streaming query status, i.e., RUNNINGFINISHED and FAILED. All FINISHED and FAILED queries are listed in the completed streaming query table. The Error column shows the exception details of a failed query.

Read on to learn more.

Comments closed

Creating a Database Project with Azure Data Studio

Wolfgang Strasser takes the database project extension for a spin:

There is currently one requirement to start your database project development in ADS, it is that you need the Insider build of ADS (that you can download here). After the installation, you’ll need to install the extension. Please search for it in the list of extensions and install it in your ADS instance.

Tom Norman and I talked about it in detail on last night’s episode of Shop Talk (to be posted later today). It’s a good start, but there are still some rough edges and missing functionality. I’d expect that to improve over time, though.

Comments closed

Filtering out Blanks in MEDIANX with DAX Studio

Matt Allington continues a series on blanking out:

This article is a follow on from last week. I recommend you go back and read the article first if you missed it, but in summary, I want to write a measure (not a calculated column) that will return the median sales of products while excluding the products with blanks (no sales). As I showed last week, this is relatively easy with a calculated column. Here it is again.  Remember writing calculated columns first is a great way to visualise the problem you want to solve.  It is not a great way to solve most problems (some yes, most, no).

Read on to see how you can solve the problem using DAX Studio.

Comments closed

Custom Formatting of Visuals using Calculation Groups

Gilbert Quevauvilliers shares some exciting news:

The Power BI team has been doing a lot of incredible work. The most recent update which I got wind of is Custom Formatting of measures is now supported for Visuals.

This has already been deployed to the Power BI Service and if you download the
latest version of Power BI Desktop (Version 2.83.5894.961 as at 03 Aug 2020) it has the new features. This means you can use this TODAY!

Previously this was only supported for tables and matrixes.

Click through to see how it looks in Power BI. It’s easy, and that’s a good thing.

Comments closed