Press "Enter" to skip to content

Curated SQL Posts

String Parsing with SQLCLR

Josh Darnell would like you to give CLR a try:

The basic problem is this: given a string of arbitrary characters, return a new string containing only numbers. If there are no numbers, return NULL.

One use case for a function like this would be removing special characters from a phone number, although that’s not the specific goal of this function.

Doing string manipulation directly in SQL Server using T-SQL is somewhat infamously slow. This is stated very nicely in Aaron Bertrand’s (b | t) blog post Performance Surprises and Assumptions : STRING_SPLIT() (note he’s talking about splitting strings, the emphasis is mine to illustrate the broader point):

Throughout, my conclusion has been: STOP DOING THIS IN T-SQL. Use CLR or, better yet, pass structured parameters like DataTables from your application to table-valued parameters (TVPs) in your procedures, avoiding all the string construction and deconstruction altogether – which is really the part of the solution that causes performance problems.

For the life of me, I really don’t get why CLR is supposed to be so scary for DBAs. The best answer I have is that they matched the .NET term “unsafe” (which means unmanaged code) and DBAs, without a .NET background, interpreted that the wrong way.

Leave a Comment

Generating Scripts as a Notebook in SSMS 18.5

Emanuele Meazzo is excited about some new functionality in SQL Server Management Studio 18.5:

Microsoft just dropped SSMS 18.5 after almost 5 long months without any updates; this new release fixes a lot of bugs and introduces a few new features, above them all I’m now showing you the following.

I’m sure that you used the the “Generate Scripts” feature in SSMS quite a few times, you could generate the code for schema and/or data for any (or all) the objects in your DB, especially if you haven’t embraced that sweet, elusive, devops workflow.

Well, good news! Other than file, clipboard and the good ‘o new query window, you can now export directly to a new Notebook.

Read on to see what you have to do and what the output looks like.

Leave a Comment

All About Table Expressions

Itzik Ben-Gan has started a series on table expressions:

Perhaps this will come as a surprise to some, but I actually do find the use of the term table in common table expression as very appropriate. In fact, I find the use of the term table expression as appropriate. To me, the best way to describe what a CTE is in T-SQL, it’s a named table expression. The same applies to what T-SQL calls derived tables (the specific language construct as opposed to the general idea), views and inline TVFs. They are all named table expressions.

If you can bear with me a bit, I’ll provide the reasoning for my view of things in this article. It occurred to me that both the naming confusion, and the confusion around whether there’s a persistency aspect to table expressions, can be cleared with a better understanding of the fundamentals of our field of relational database management systems. Those fundamentals being, relational theory, how SQL (the standard language) relates to it, and how the T-SQL dialect used in the SQL Server and Azure SQL Database implementations relates to both.

There’s a lot of depth in this post, so I recommend a careful reading.

Leave a Comment

Dataflows vs Datasets in Power BI

Reza Rad disambiguates two Power BI concepts:

I have presented about Power BI dataflow and datasets a lot, and always one of the questions I get is: What is the difference between dataflow and dataset. So I thought better to explain it in a post and help everyone in that understanding. In this post, you will learn what the differences between these two components are, when and where you use each of them, and how they work together besides other components of Power BI.

Read on to learn where each is useful.

Leave a Comment

Decade Two of Hadoop

Arun Murthy takes us through decade two of Hadoop:

By the end of the first decade, we needed a fundamental rethink — not just for the public cloud, but also for on-premises. It’s also helpful to cast an eye on the various technological forces driving Hadoop’s evolution over the next decade:

– Cloud experiences fundamentally changed expectations for easy to use, self-service, on-demand, elastic consumption of software and apps as services.
– Separation of compute and storage is now practical in both public and private clouds, significantly increasing workload performance.
– Containers and kubernetes are ubiquitous as a standard operating environment that is more flexible and agile.
– The integration of streaming, analytics and machine learning — the data lifecycle — is recognized as a prerequisite for nearly every data-driven business use case.

“Core” Hadoop (not including products in the broader Hadoop ecosystem like Spark, Kafka, etc.) hit a major stress point with migration out of data centers running direct attached storage. This is how Cloudera is working to pick up some of that lost momentum.

Leave a Comment

SQL Server Management Studio 18.5 GA

Dinakar Nethi announces that SSMS 18.5 is now generally available:

Today, we’re sharing the release of SQL Server Management Studio (SSMS) 18.5. We have some feature updates as well as important behind the scenes updates.

You can download SQL Server Management Studio 18.5  today and review SSMS Release Notes for full details.

Hugo Kornelis recommends that you update as soon as possible:

And I need all of you to update your version. Now. Yes, right now. Here’s a link to download it. I’ll wait.

Why the rush, you ask? Because hidden in between all the little (and some big) improvements and fixes, there is one true gem. One I wish Microsoft had done … oh, let’s say two decades ago?

Click through to see what has Hugo so excited.

Leave a Comment

Sharing a Dataset in Power BI

Marc Lelijveld shows how you can share a dataset in Power BI:

There are many different use cases to consider where shared datasets can be an advantage. Below I have quickly listed a few advantages, but probably you can think of many more.

– Centrally managed definitions and calculations to avoid different calculations for the same metrics and different versions of the truth.
– One central load from source to Power BI dataset, which lowers the performance impact on the source system.
– Easier to kickstart the data driven analytics experience for the business users and any other self-service analytics purposes.

Sharing here doesn’t mean giving to the broader world; it’s sharing within an organization.

Leave a Comment

Powershell Interactive Debugging in Visual Studio Code

Jess Pomfret shows how to use the interactive debugger in Visual Studio Code to troubleshoot an issue in Powershell code:

So I figured I’d take a look and see what was happening and how we could fix it. Now I’m going to be honest with you, my usual method of debugging involves adding Write-Host 'Hi‘, or piping objects to Out-GridView. I did start down this route, but the Get-DbaRegServer function calls an internal function, and things quickly got complicated.

Luckily, the PowerShell extension for VSCode includes a debugger so we can level up our game and use that to track down our issues.

Click through to see how it works.

Leave a Comment