Press "Enter" to skip to content

Curated SQL Posts

To and From Date Filtering in One Slicer

Reza Rad uses a date slicer:

Power BI from and to date slicer

It happens that you might have two fields as From and To date (or Start and End date) in your dataset, and you want a date slicer in the report. The date slicer has to filter records in a way that the FROM and the TO dates are in the range of dates selected in the slicer. There are multiple ways of doing this. In this article and video, I'll explain a simple but effective method for that. I have explained in another article how this can be done using two date slicers; you can read that here.

Click through to see what you need and how you can put one of these in place.
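
Reza's actual implementation lives in Power BI, so click through for that; here is just the filter logic in miniature, sketched in PowerShell with made-up dates and records: keep a row only when both its From and To dates land inside the range selected in the slicer.

# Hypothetical slicer selection and sample records, purely to show the predicate.
$slicerStart = Get-Date '2021-01-01'
$slicerEnd   = Get-Date '2021-06-30'

$records = @(
    [pscustomobject]@{ Name = 'Project A'; From = Get-Date '2021-02-01'; To = Get-Date '2021-03-15' }
    [pscustomobject]@{ Name = 'Project B'; From = Get-Date '2020-11-01'; To = Get-Date '2021-02-01' }
)

# Both endpoints have to fall within the selected range for the row to survive.
$records | Where-Object { $_.From -ge $slicerStart -and $_.To -le $slicerEnd }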

Comments closed

Diagnosing Stalled Dispatchers

Sean Gallardy has another memory dump to investigate:

If you've been following along with the series of demystifying dumps, then you're probably thinking or have thought something along the lines of, "All of these issues seem to be around stalled or non-yielding things…" and you'd be correct. If you revisit the non-yielding scheduler dump post, I explain a little about cooperative scheduling and how things need to play nice; if one piece isn't playing nice, whether by not doing its part in a timely manner or by not giving up time to others, that's going to be a very large problem for the system in general. The stalled dispatcher falls into the same set of issues.

There are various dispatcher pools in SQL Server, though the most widely known one is for Extended Events. Dispatcher pools are items that hold pieces of work to be completed, like a dumping ground for "someone pick this up and do this" types of tasks. If it helps, you can think of them in a similar way as connection pooling, where there are background threads which will do something (like making or updating connections) when needed. A pool can respond to pressure by making new threads if needed or trimming threads when the work slows down. In SQL Server, dispatchers set their own attributes, including when an item is considered stuck, the number of threads to use, etc., which means some may never be considered stalled, as they are able to disable their own timeouts.

Read on to see what might cause a stalled dispatcher and what a memory dump for one can tell us.
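
If you want to see which dispatcher pools exist on an instance before anything goes wrong, the sys.dm_os_dispatcher_pools DMV exposes them. Here's a quick sketch using Invoke-Sqlcmd from the SqlServer module; the instance name is a placeholder and the exact columns you care about may vary a bit by version.

Import-Module SqlServer

# Placeholder instance name; swap in your own.
$query = @'
SELECT name,
       dispatcher_count,
       queue_length
FROM sys.dm_os_dispatcher_pools;
'@

Invoke-Sqlcmd -ServerInstance 'localhost' -Database 'master' -Query $query |
    Format-Table -AutoSize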

Comments closed

PowerShell Equality Operations

Dave Mason is not amused:

When comparing two values in PowerShell, you’ll have to march to the beat of a different drum. The syntax is drastically different:

The short reason why PowerShell uses equality operators like -eq is that Bash uses them. Though the funny thing is that Bash actually uses == for string equality comparison and only uses -eq for numeric equality comparisons. The norm for POSIX is =, adding yet another level of fun.
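
For anyone coming from T-SQL or C#, a few quick lines at a PowerShell prompt show how the operators behave (note that -eq is case-insensitive on strings unless you reach for -ceq):

1 -eq 1            # True  (numeric equality)
'SQL' -eq 'sql'    # True  (string -eq is case-insensitive by default)
'SQL' -ceq 'sql'   # False (-ceq forces a case-sensitive comparison)
5 -ne 3            # True
5 -gt 3            # True  (greater than; a bare > is treated as redirection)

# By contrast, 1 == 1 is a parse error, and 1 = 1 is an invalid assignment.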

Comments closed

The Data Professional Salary Survey

Brent Ozar has re-opened the data professional salary survey:

We’re data people, you and I. We make better decisions when we work off data instead of feelings.

It’s time for our annual salary survey to find out what data professionals make. You fill out the data, we open source the whole thing, and you can analyze the data to spot trends and do a better job of negotiating your own salary:

Click through for the link to the survey. It looks like most of the questions have stayed the same this year, which is good for longer-term analysis.

Comments closed

Updates in Azure Synapse Analytics

Saveen Reddy shows how the Synapse product team has been busy this year:

Previously, Synapse workspaces had a kind of database called a Spark Database. Spark databases had two key characteristics:

– Tables in Spark databases kept their underlying data in Azure Storage accounts (i.e. data lakes)

– Tables in Spark databases could be queried by both Spark pools and by serverless SQL pools.

To help make it clear that these databases are supported by both Spark and SQL and to clarify their relationship to data lakes, we have renamed Spark databases to Lake databases. Lake databases work just like Spark databases did before. They just have a new name.

Okay, this is the kind of change I can do without. That's a really dumb name. "Spark database" tells you what the thing is: a database which lives in Apache Spark. What does a Lake database run on? Apache Spark. But if anything really should be called a Lake database, it'd be a serverless SQL pool's database, because everything in there is built on top of the data lake: it's all external tables pointing to a lake. So calling a Spark database a Lake database brings more confusion than elucidation.

Most of the other changes on that list? Really cool. This one? Not at all.

Comments closed

Variables and Scope in PowerShell

Dave Mason continues a quest into the bowels of PowerShell:

Let's talk a little bit about PowerShell variables and how long they exist within the scopes they're defined in. I've encountered some behavior that, for me, was unexpected. It's made my development efforts unproductive, especially when it comes to debugging.

Just like with notebooks, it's important to remember that the PowerShell prompt has a session and that you aren't running fresh every time. You can also use Dave's solution to the problem, which makes sense as well.
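
Here's a tiny illustration of the kind of leftover Dave is talking about, along with a couple of common guards (not necessarily the solution Dave settles on, so do read his post):

# Run this once, then comment out the first line and run it again in the same
# console: $results still reports 5, because the session kept the variable alive.
$results = 1..5
"Count this run: $($results.Count)"

# Two common ways to keep leftovers from fooling you between runs:
Set-StrictMode -Version Latest                               # complain about uninitialized variables
Remove-Variable -Name results -ErrorAction SilentlyContinue  # clear it explicitly when you're done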

Comments closed

Using the Fail Activity in Azure Data Factory

Rayis Imayev thinks about failure:

Recently, Microsoft introduced a new Fail activity (https://docs.microsoft.com/en-us/azure/data-factory/control-flow-fail-activity) in Azure Data Factory (ADF), and I wondered about a reason to fail a pipeline in ADF when my internal being tries very hard to make the pipelines successful once and for all. Yes, I understand the documented explanation that this activity can help to "customize both its error message and error code," but why?

Click through for Rayis’s take. I’ll just be here cracking jokes about how Fail activities are banned in my code because I expect it to have a positive outlook on life.
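
For reference, the activity itself is pretty small: the linked documentation describes a type of Fail plus a message and an errorCode. Sketched here as a PowerShell hashtable (the activity name and values are made up) just to keep these examples in one language:

# A made-up Fail activity definition; the shape (type plus message/errorCode
# under typeProperties) comes from the linked documentation.
$failActivity = @{
    name           = 'FailWhenNoRowsLoaded'
    type           = 'Fail'
    typeProperties = @{
        message   = 'The staging table received zero rows.'
        errorCode = '500'
    }
}

$failActivity | ConvertTo-Json -Depth 3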

Comments closed

A Heap of Pain

Chad Callihan explains the dislike for heaps in SQL Server:

A table is considered a heap when it is created without a clustered index. Data isn’t in any type of ordered state. Some data is over here, some data is over there.

When you are inserting data into a heap, that data is tossed in wherever. Think of it like your junk drawer. It’s not organized into its own little sections. What do you do when you have something to add such as a pair of scissors or an old pen? You open the drawer, toss it in, and close it up without giving it a second thought.

As Chad mentions, there are uses for heaps. And when you move to Azure Synapse Analytics, there are more uses for heaps. But with on-premises SQL Server, a heap is usually a mistake.
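
If you want to see the distinction for yourself, here's a quick sketch run through Invoke-Sqlcmd (it assumes the SqlServer module and a local instance; the table is a throwaway in tempdb): a table created without a clustered index shows up in sys.indexes with index_id 0 (HEAP), and adding a clustered primary key changes that.

Import-Module SqlServer

# Placeholder instance; the table is dropped at the end.
$query = @'
-- No clustered index, so this is a heap.
CREATE TABLE dbo.JunkDrawer
(
    Id       int          NOT NULL,
    Contents varchar(100) NOT NULL
);

-- index_id 0 = heap.
SELECT index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.JunkDrawer');

-- Add a clustered primary key and the table is no longer a heap.
ALTER TABLE dbo.JunkDrawer
    ADD CONSTRAINT PK_JunkDrawer PRIMARY KEY CLUSTERED (Id);

-- index_id 1 = clustered index.
SELECT index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.JunkDrawer');

DROP TABLE dbo.JunkDrawer;
'@

Invoke-Sqlcmd -ServerInstance 'localhost' -Database 'tempdb' -Query $query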

Comments closed