Press "Enter" to skip to content

Author: Kevin Feasel

Parallel SQL Agent Jobs: The Next Step

Aaron Bertrand wraps up a series on parallel SQL Agent jobs:

Imagine you’ve split up this chunk of work so that, collectively, it finishes much faster. When all of the threads are done, you want to kick off the next round of work. For example, if the threads are writing data to flat files, when they’re all done, you might want to then kick off a process that zips them, or moves them, or bulk loads them into a different system, or all of the above.

Read on for some of the challenges around this, as well as what you can do about them. Also, Aaron mentions SQL Agent tokens en passant, and that’s something that will probably be new to most people.

Comments closed

Concatenating Strings and (N)VARCHAR Truncation

Vlad Drumea troubleshoots a common problem:

The code in this case is the GetStatsInfoForWholeDB.sql script that’s part of PSBlitz’s resources.
This script is used for, you wouldn’t believe by the name alone, getting statistics information for a specific database.

Due to the fact that it might be ran on Azure or on older versions of SQL Server, as well as on databases with incremental statistics, the best option for it was to use dynamic SQL.

In this case it uses a variable @SQL defined as NVARCHAR(MAX) to store the query that’s built at runtime and execute it via EXEC.

Read on for one of the most common issues you may run into around generating dynamic SQL.

Comments closed

Module.Versions in Power Query

Chris Webb gives Internet sleuths something to chew on:

The ever-vigilant folks on the internet have spotted that there’s a new M function in the latest versions of Power BI and Excel: Module.Versions. This function, at the time of writing, returns a record with a single field in that contains the version number of the Power Query engine currently in use.

Click through for an example of calling the function and what it returns as of right now.

Comments closed

Building a Power Query Template

Nikola Ilic looks at Power Query templates:

In this “ocean” of innovations, there are certain features that don’t get the deserved limelight – as they somehow go under the radar. Some of them, I really consider “hidden gems” – you might not use them in each and every solution, but in some scenarios, they can be of immense help.

A hidden gem that I’m introducing today is called Power Query Template. As of today, this feature is still in preview (the same as many others in Microsoft Fabric), but this doesn’t minimize its potential.

Read on to see why Nikola likes capability this so much.

Comments closed

Splitting a Number into Component Digits in R

Steven Sanderson does a bit of splitting:

Splitting numbers into individual digits can be a handy trick in data analysis and manipulation. Today, we’ll explore how to achieve this using base R functions, specifically gsub() and strsplit(). Let’s walk through the process step by step, explain the syntax of each function, and provide some examples for clarity.

Click through for a pair of examples.

Comments closed

An Overview of Classification Algorithms

Matthew Mayo explains several algorithms:

Classification algorithms are at the heart of data science, helping us categorize and organize data into pre-defined classes. These algorithms are used in a wide array of applications, from spam detection and medical diagnosis to image recognition and customer profiling. It is for this reason that those new to data science must know about and understand these algorithms: they lay foundations for more advanced techniques and provide insight into how those data-driven decisions are made.

Let’s take a look at 5 essential classification algorithms, explained intuitively. We will include resources for each to learn more if interested.

Click through for five algorithms and a couple of paragraphs describing how the algorithm works. For a little bit of self-promotion on my end, I have a series on YouTube running right now on the topic of classification where I cover a variety of algorithms. As a spoiler, 4 of the 5 on Matthew’s list will have their own videos, and there are several other algorithms to boot.

Comments closed

Time Travel in the Microsoft Fabric Warehouse

Reza Rad hops in the Delorean:

Data changes throughout time, especially in the world of BI and data warehousing systems; the data gets updated through ETL processes frequently. This means that the data you see in the warehouse today might differ from yesterday and the day before, and so on. Some parts of this data can be retrieved on a timely basis. You can, for example, query the sales amount from the sales table where the date has been the 2nd of April. That would give you the sales amount for the 2nd of April, even if you are querying it on the 23rd of May.

However, what if some of the sales transactions on the 2nd of April got updated? The sales amount you see would likely be the updated amount, but not the original amount. It is sometimes useful to be able to see what was that original amount, or in other words, travel in time and see what that value was.

Click through for a combination video and article. The syntax isn’t quite the same as with temporal tables in SQL Server, though it’s close enough to follow along if that’s your relevant experience.

Comments closed

Contained Availability Groups in SQL Server 2022

Rich Benner talks about a new feature:

SQL Server 2022 introduced of a number of new features, and one of the more interesting of these is Contained Availability Groups. These are very close to normal Availability Groups, but they have some very important differences. In this post, I’ll compare the new features included and some of the pitfalls that we’ve come across with their implementation.

Read on to learn more about the functionality and several harsh limitations.

Comments closed

Building an ERD for Existing Databases

Josephine Bush creates a diagram:

There are several tools out there to make your life easier by creating an ERD for your existing db. Everything works pretty well when you have a small number of tables with FKs mapped, but when the number gets bigger, the diagram naturally gets a lot messier. Here are some of the ones I tried.

Click through to see the full list. I haven’t found any that are particularly good at the job, especially not in the free or relatively inexpensive tiers. My problem is that the tools tend to get goofy when you update an existing model based on database changes: all that time you spent reorganizing entity locations so you don’t have a spaghetti mess of lines criss-crossing all of your entities gets wasted the next time you perform an update, because the tools tend to shuffle things around once again.

Comments closed

Finding Free Space in SQL Server Database Files

Greg Robidoux checks for free space:

One of the functions of a DBA is to keep an eye on free space within the database and database files. The auto grow feature is great as a last resort, but proactively managing the database files is a better approach. Also, this information can be used to shrink data files as needed if there is a lot of free space in the files.

In this tip we take a look at a few ways to find the current free space within a database, so you can better manage your database files.

Click through for several techniques.

Comments closed