Press "Enter" to skip to content

Curated SQL Posts

Conditional Merges in Power Query

Ed Hansberry shows how to perform a conditional join in Power Query:

I am going to use two tables for this example, Fact.Sales and Dim.Customer. I only want my Fact Sales table to have customer info for the Buying Group Tailspin Toys. That info is in the Dim Customer table. I can do this several ways.

1. Do the merge, expand the Buying Group column from the Customer table, then filter to only show those rows.
2. Pre-filter the Customer table for Tailspin Toys in the Buying Group column, then do the merge. It would need to be an Inner Join, otherwise you will get nulls in the Sales table when you expand, and then you have to filter those out. Which you can do, but it is more steps.
3. You can bypass all of that and do it in one step.

I’m not going to bore you with how to do methods 1 and 2. Let’s do method 3, a conditional join!

Read on for the approach, which also can take advantage of query folding when possible.

Comments closed

Automating a Permissions Audit with dbatools

Garry Bargsley walks us through the process of a permissions audit made easier:

My post for T-SQL Tuesday #130 is going to be how I automated my companies SQL Server Permissions Audit process. Even though this only occurs every quarter it is something that I as the DBA never look forward to and the business partners never really enjoy. In an effort to streamline the process from start to finish I built automation using PowerShell, dbatools and ImportExcel. The process now builds the permissions audit file, extracts individual users from Active Directory and e-mails the information to the appropriate team(s) for review.

Read on for details about the process.

Comments closed

General Principles for Automation

Eitan Blumin thinks through a series of general principles around when to automate things:

What I’m gonna do is try to look at the topic of automation and productivity in more general terms, in the hopes of conveying to you my overall methodology while working on a given task.

I will try to formulate a list of “Laws” or “Maxims“, that together will try to represent my personal approach to handling tasks, with the main goal in mind to maximize productivity.

Click through for the list.

Comments closed

Automating Azure Resource Deletion

Drew Skwiers-Koballa has put together a runbook to remove Azure resources tagged in a certain way:

Microsoft puts a lot of effort into making it easy for you to log on to portal.azure.com and spin up a VM, SQL Database, Function, or other instance. The Visual Studio Enterprise subscription with $150 credit per month can go quickly if you’re not careful to stop or remove big ticket items as soon as you’re done with them. Even if you have a virtually unlimited Azure account connected to a credit card, expense account, or a trust fund – you probably don’t want to accidentally leave something running longer than it is needed.

With an Azure Automation account and a Powershell workflow runbook I’m able to use resource tags to set resources for autodeletion by date or immediately that evening.

Click through for that runbook.

Comments closed

R: Avoid apply() with Large Datasets

George Pipis walks us through the performance cost of using the apply() function against a large dataset in R:

When we are dealing with large datasets and there is a need to calculate some values like the row/column min/max/rank/mean etc we should avoid the apply function because it takes a lot of time. Instead, we can use the matrixStats package and its corresponding functions. Let’s provide some comparisons.

Click through for a demonstration of how much slower it can be in certain cases. H/T R-Bloggers.

Comments closed

Ensuring Trace Flag Consistency Across Instances

Taiob Ali creates a process to track ensure specific trace flags are in use:

Recently during a SQL Server upgrade, I noticed some inconsistency with trace flags between production and non-production instances. Investigating further also found a similar inconsistency between Always On Availability Group replicas.

My team members fixed this issue by leveraging startup stored procedures.

Click through for the explanation.

Comments closed

Restoring a Database in an Availability Group

Rajendra Gupta walks us through the process of restoring a database which is currently in an Availability Group:

You might think a question here– We can take production database backup and restore it on the development database. What difference does it make in a standalone database restore or availability group database restore?

Database restore works with the standalone database, but if the database is configured in the availability group, we cannot directly restore the database. It requires additional steps because of the AG configurations. Our database should be in the same state (AG synchronized) after the database restores as well.

In this article, let’s cover the steps to restore an existing availability group database in the SQL Server Always On Availability Group.

Read on for the answer.

Comments closed

Generating Scripts in SSMS and ADS

Rob Farley walks us through automation options in SQL Server Management Studio and Azure Data Studio:

User interfaces are great, but I simply don’t want to have to remember to do everything the same way each time.

To that end, I want to wax lyrical for a moment about the Script button on most dialog boxes in SQL Server Management Studio (SSMS), and make a quick mention of what things (currently) look like in Azure Data Studio (ADS). (I say “currently” because ADS is still under very active development and could have changed even by the time I publish this.)

The Script button is one of the best features of Management Studio. The UI lets you know what can be done, and the Script button lets you know the process. It’s also a nice reminder that this functionality is not unique to the UI. Management Studio’s implementation of the Script button isn’t perfect, but it is good.

Comments closed

Filtering with dplyr Using Strings as Expressions

Kenneth Tay shows how to build arbitrary expressions to execute in dplyr’s filter function:

This took me a while to figure out and so I thought I would post this as future reference. Let’s say I have the mtcars data and I want to filter for just the rows with cyl == 6. I would do something like this:

library(tidyverse)
data(mtcars)
mtcars %>% filter(cyl == 6)

What if I had the filter condition as a string instead?

Read on to see how you can do this. Given the answer, I wonder if there’s any chance that could be turned into an injection vulnerability. H/T R-Bloggers

Comments closed

The State of R Packages Locally

Maelle Salmon and Gabor Csardi walk us through some details about installed packages on a system:

Now how do you know where any of your installed packages was installed? You can use find.package() and path.package()!

To check whether a package is installed, it is better to use find.package() than installed.packages() because the latter, as its docs state, can be slow on some systems. In both cases, it does not mean the package is usable, for that you’d need to use library() or require().

Read on for several tips around where packages are located, what their contents look like, and learning a bit more about the actual code in packages.

Comments closed