Press "Enter" to skip to content

Curated SQL Posts

Using Runbooks in Azure Automation

Rod Edwards has a process for that:

Nobody likes to do the same monontonous task over and over again. Well, saying that, maybe some out there do in order to look and feel busy…but I don’t, as I nearly always have something else more pressing or fun or interesting to do. By automating those repeatable tasks, it reduces boredom, chance of errors, and stress if you’re already a busy bunny.

This is where Automation comes into play, and in Azure we have a few options. This post focuses in Azure automation.

Read on to see how Azure Automation works and how to build a Powershell runbook in it.

Leave a Comment

Calculating Percentages in T-SQL

Edwin Sanchez shows a variety of methods to calculate percentages of the whole in T-SQL:

Calculating percentages in SQL Server is like slicing a pie. You need to know the total size (the denominator) and the size of the slice you want (the numerator). To get a percentage, you divide the slice size by the total size and multiply by 100. 

Read on for a variety of methods to calculate this. I wouldn’t use all of the methods myself, as I have certain predilections against subqueries in the SELECT clause, but they do get the job done.

Leave a Comment

Database Watcher for Azure SQL

Dimitry Furman has a new announcement:

Reliable, in-depth, and at-scale monitoring of database performance has been a long-standing top priority for SQL customers. Today, we are pleased to announce the public preview of database watcher for Azure SQL, a managed database monitoring solution to help our customers use Azure SQL reliably and efficiently.

Click through to see what it offers and what’s on the roadmap for this product.

Leave a Comment

Using Key Vault in SQL Server on Linux

Aravind Mahadevan shares information on a new bit of functionality:

We’re excited to announce that Extensible Key Management (EKM) using Azure Key Vault in SQL Server on Linux is now generally available from SQL Server 2022 CU12 onwards, which allows you to manage encryption keys outside of SQL Server using Azure Key Vaults.

In this blog post, we’ll explore how to leverage Azure Key Vault as an EKM provider for SQL Server on Linux.

Read on to see how to set this up.

Leave a Comment

Duplicating Rows in R

Steven Sanderson repeats the punch line a few times:

Are you working with a dataset where you need to duplicate certain rows multiple times? Perhaps you want to create synthetic data by replicating existing observations, or you need to handle imbalanced data by oversampling minority classes. Whatever the reason, replicating rows in a data frame is a handy skill to have in your R programming toolkit.

In this post, we’ll explore how to replicate rows in a data frame using base R functions. We’ll cover replicating each row the same number of times, as well as replicating rows a different number of times based on a specified pattern.

Click through to replicate data without copy-paste.

Leave a Comment

Learning about GitHub Actions

I have a new video:

In this video, we dig into GitHub’s process for executing code: GitHub Actions workflows. We’ll learn what Actions and workflows are, how we can create them from scratch, and how to incorporate Actions from the GitHub Marketplace into our own workflows.

Along the way, I describe what GitHub Actions workflows are and we build a simple one. I’ll have more videos coming up that expand on GitHub Actions and show you more of what you can do with them.

Leave a Comment

Exposing Kafka Data in Iceberg using Tableflow

Marc Selwan announces a new product:

We’re excited to talk about our vision for Tableflow, which makes it push-button simple to take Apache Kafka® data and feed it directly into your data lake, warehouse, or analytics engine as Apache Iceberg® tables. Making operational data accessible to the analytical world is traditionally a complex, expensive, and brittle process and we believe we can do better to unify the operational and analytical estates.

Tableflow removes all this erroneous, duplicative work and helps convert Kafka topics and associated schemas to Iceberg tables in one click. This is central to our Confluent’s vision to build the world’s leading data streaming platform that fuels any operational and analytical workload with real-time data products. 

It looks like this is currently in early access, but you can see where Confluent intends to take the product.

Leave a Comment

Changing the Timeout of a Spark Session in Microsoft Fabric

Koen Verbeeck doesn’t have time to wait:

You might know the feeling: you’re writing code in a Notebook in Microsoft Fabric and suddenly you have to leave your workstation for a while. Someone ran the doorbell (you’re working from home and you get some parcels delivered), or you took a coffee break with some colleagues. When you return to your notebook, the Spark session has timed out and when you run a cell, you have to wait for the damn thing to restart again. The agony, waiting for 2-3 minutes for the session to start, and only after that the actual code can start running.

Read on to see how you can set the timeout to a custom value, assuming you’re okay with paying for the Spark cluster to sit around until it times out.

Leave a Comment

Dropping Objects in SQL Server and Snowflake

Kevin Wilkie gets the drop on us:

When you’re working between SQL Server and Snowflake, there can be a lot of crossover that may make you forget what system you’re working in. Sometimes it’s close, but not close enough.

Today, let’s go over something that should be rather simple – removing old objects that we shouldn’t need any longer.

Read on to see how the two data platform technologies differ in this regard.

Leave a Comment