Press "Enter" to skip to content

Author: Kevin Feasel

Working with CROSS APPLY

Paul Randal takes us through one of my favorite operators:

Introduced by Microsoft in SQL Server 2005, SQL CROSS APPLY allows values to be passed from a table or view into a user-defined function or subquery. This tutorial will cover the incredibly useful and flexible APPLY operator, such as how the CROSS APPLY and OUTER APPLY operators work, how they’re like the INNER and LEFT OUTER JOIN, and give you some examples of both. All the examples use the AdventureWorks example database.

Later in the article, I’ll also discuss a highly pervasive SQL Server performance problem—one I still encounter with customers on a weekly basis. This problem is related to using the APPLY operator against a specific type of user-defined function. It’s so problematic and can overwhelm tempdb, bringing your SQL Server instance to a crawl!

If you’re not too familiar with APPLY in its two forms, read the whole thing.

Comments closed

Powershell Colors and Icons

Jeff Hill has fun with terminals:

If you have followed along with me before, chances are you know I like color. Color helps me zoom in on the information I want faster. Color can help me know if I have the right number of curly braces. Modern browsers and word processors show a red squiggly line when you misspell a word. Your eye is drawn to that bit of color that is different than the rest of the page. Keep reading and I’ll show you how to add some spice to your PowerShell experience.

I heartily endorse the mention of Windows Terminal later on. If you haven’t tried it before, it’s a worthy successor to CMD and the basic Powershell terminal.

Comments closed

query_antipattern_type Extended Event

Aaron Bertrand is intrigued:

But one thing that caught my eye in all the ruckus is a new Extended Event called query_antipattern, which is a lot more up my personal alley. You may have come across my Bad Habits series, or might have seen me rant about some of these in person. I think this is a promising compromise between manually parsing every single query and just not realizing how many little things are collectively bringing SQL Server to its knees.

This event is not yet documented, but it seems like it could be some combination of runtime code analysis and execution plan inspection. Right now, all we have to go on is a set of cryptic values in sys.dm_xe_map_values:

Read on to see those values and Aaron’s speculation.

Comments closed

SQL Server 2022 and Azure Data Lake Storage

I continue a series on data virtualization in SQL Server 2022:

Today’s post is a fairly short one. Let’s say you want to connect to Azure Data Lake Storage Gen2 from SQL Server 2022. We saw in a recent post how to connect to Azure Blob Storage. When it comes to Azure Data Lake Storage, the story is almost the same, though there are a couple pitfalls you will want to avoid.

Read on for that error, which stymied me for a good 10 minutes.

Comments closed

Building a Lakehouse with Azure Synapse Analytics

Arshad Ali does a bit of construction:

Data Lakehouse architecture has become the de facto standard for designing and building data platforms for analytics as it bridges the gap and breaks the silos created by the traditional/modern data warehouse and the data lake. This blog post introduces you to the world of data lakehouse and it goes into details of how to implement it successfully in Azure with Azure Synapse Analytics.

Read the whole thing.

Comments closed

Comparing Performance of Azure SQL Databases

Reitse Eskens begins an interesting series:

The trigger for this feast were a number of questions from my employer. What database tier and SKU (Stock Keeping Unit) should we choose for customer C. Well, the answer depends on their budget and the performance they’re expecting. And we didn’t have a conclusive answer. Microsoft will provide you with their statistics on database performance, but what does that mean in real life. Because no matter what the good people in Redmond can come up with as databases, we as data engineers can do worse. Much worse.

So, I created some horrible query’s to insert, select and delete data and fired them off to a database. My own laptop with nvme disks provided the baseline, after that I gradually worked my way up from the Basic DTU database to the Hyperscale database.

This first post is an introduction to the method and laying out expectations. I’m looking forward to the remaining posts in the series.

Comments closed

Deploying an Arc-Enabled SQL Managed Instance

Warwick Rudd continues a series on Azure Arc-enabled data services:

Now that we have our Azure Arc-enabled Data Controller configured and available, we can now deploy our first Arc-enabled SQL Managed Instance into our environment. As previously mentioned depending on the type of configuration required for your environment with your Arc-enabled Data Controller (Directly connected or Indirectly connected modes) this will dictate the approach available for you to setup / configure your Arc-enabled SQL Managed Instance.

Click through for a step-by-step guide.

Comments closed

Cancelling a Power BI Dataset Refresh

Chris Webb says, never mind:

The thing that got me excited about the Power BI Enhanced Refresh API, and which inspired me to start this series of posts, was the fact that for the first time it gives you a way of cancelling Power BI dataset refreshes. In this last post in the series I’ll show you how you can add an Action to your Power Automate custom connector to cancel a refresh and how you can use it in a Power Automate Flow.

Click through to see how.

Comments closed

Creating a Date Column in Power Query from Year Number and Month Name

Erik Svensen needs to build a date:

Sometimes it’s the little things that can help you minimize the number of steps in your Power Queries.

Here is another example that might help you.

Let’s imagine your data contains two columns with year and the month name and you want to create a date column.

Read on for a clever function modification to generate a date from these two fields.

Comments closed

Using the ShortCircuitOperator in Airflow

Lior Gavish shows off a useful operator in Apache Airflow:

But what happens when Airflow testing doesn’t catch all of your bad data? What if “unknown unknown” data quality issues fall through the cracks and affect your Airflow jobs? 

One helpful but underutilized solution is to leverage the Airflow ShortCircuitOperator to create data circuit breakers to prevent bad data from flowing across your data pipelines.

Data circuit breakers are powerful, but as with most data quality tactics, the nuances of how they are implemented are critical. Otherwise, you can make a bad problem worse.

Read on to learn more about the operator and how you can use it. The code block images are a bit fuzzy but still readable enough. It might be a little clearer on the original post.

Comments closed