Curated SQL Posts

Azure DevOps and Data Factory

Helge Rege Gardsvoll has a three-part series for us on using Azure DevOps to deploy Data Factories. Part 1 is all about environment setup:

Shared Data Factory
The shared Data Factory is there for one use: self-hosted integration runtimes. This is the component you will use to connect to on-premises or other sources that have restrictions on access, such as IP restrictions or other firewall rules. Migrating a self-hosted integration runtime is not supported, but you can share the same integration runtime across different Data Factories. You can find a description of how to do this in this article.
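For reference, sharing works by creating a linked self-hosted integration runtime in each consuming factory that points back at the runtime in the shared factory. A rough sketch of what that linked runtime's JSON definition looks like (the factory and runtime names here are made up, and the resource ID is abbreviated):

    {
      "name": "LinkedSelfHostedIR",
      "properties": {
        "type": "SelfHosted",
        "typeProperties": {
          "linkedInfo": {
            "resourceId": "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.DataFactory/factories/SharedDataFactory/integrationRuntimes/SelfHostedIR",
            "authorizationType": "Rbac"
          }
        }
      }
    }

The consuming factory's managed identity also needs to be granted permission on the shared runtime before this link will authorize.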

Part 2 covers Git branching, linked services, and development:

Create datasets and pipeline
For this demo I create two datasets, one for source and one for target, and a simple pipeline that copies the data. Datasets have names that point to the data lake, like ADLS_datahelgeadls2_Brreg_MainUnits, but do not include environment information.

Part 3 covers the release process:

The release process will have these steps:
1. Stop any active triggers. We do not want any pipelines to start as we are changing things (and you should wait until running pipelines finish before publishing).
2. Release from development to the target environment.
3. Clean up the target environment by removing objects that are not present in dev, then restart the triggers.

This is a great series of posts and also includes a bonus tidbit if you’re using Databricks.

Finding Columns and Measures Used in SSAS Tabular

Kasper de Jonge uses Power BI to figure out which columns are being used in an SSAS Tabular model:

I got an interesting question about being able to figure out which measures and columns are being used for all tabular models on a server, either by known reports or by self-service users using Power BI. To solve this, I decided to use Power BI :).

To get the required information, I decided to capture the queries being sent to AAS over a period of time and parse the results in Power BI Desktop. In this post I describe how to do it.

Check it out, and also read the comments for an additional tip from Bill Anton.

Recovering Deleted Indexes

Eric Cobb has an enhancement to the SQL Server Metrics Pack:

I recently had a case at work where a database was restored, and several important indexes were accidentally removed. But because we are using SQL Server Metrics Pack to track the indexes on that server, we were easily able to recover all of the deleted indexes.

In order to spotlight the feature a little, I wanted to answer a few questions and provide some queries to help explain how to use it.
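As a hedged illustration of the underlying idea (this is not the Metrics Pack's actual code, and its tracking tables have their own schema): if you snapshot index definitions from the system catalog on a schedule, you can rebuild a CREATE INDEX statement for anything that later disappears. A simplified sketch, covering key columns only:

    -- Generate CREATE INDEX statements from catalog metadata. Persisting this
    -- output to a history table on a schedule is what makes recovery possible
    -- after an index has been dropped. (Omits INCLUDE columns, filters, and options.)
    SELECT N'CREATE '
         + CASE WHEN i.is_unique = 1 THEN N'UNIQUE ' ELSE N'' END
         + N'NONCLUSTERED INDEX ' + QUOTENAME(i.name)
         + N' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(i.object_id))
         + N'.' + QUOTENAME(OBJECT_NAME(i.object_id))
         + N' (' + STRING_AGG(QUOTENAME(c.name), N', ')
                   WITHIN GROUP (ORDER BY ic.key_ordinal) + N');' AS CreateStatement
    FROM sys.indexes AS i
        JOIN sys.index_columns AS ic
            ON ic.object_id = i.object_id
            AND ic.index_id = i.index_id
            AND ic.is_included_column = 0
        JOIN sys.columns AS c
            ON c.object_id = ic.object_id
            AND c.column_id = ic.column_id
    WHERE i.type = 2  -- nonclustered indexes
      AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
    GROUP BY i.object_id, i.index_id, i.name, i.is_unique;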

Read on to learn more, and check out the GitHub repo as well.

Using Schemas with DBI and SQL Server

Thomas Roh takes us through an oddity in R’s DBI library:

I ran into an issue the other day where I was trying to write a new table to a SQL Server database with a non-default schema. I did end up spending a bit of time debugging and researching, so I wanted to share for anyone else who runs into the issue. Using the DBI::Id function allows you to specify the schema when you are trying to write a table to a SQL Server database.

Click through for the end result. I will say that the more I work with DBI, the more I’m tempted to keep using RODBC, at least when working with SQL Server. H/T R-Bloggers.

Spark Transformations and Actions

Divyansh Jain differentiates the key sets of functions in Spark:

Now there is a point to be noted here: when you apply a transformation to any RDD, it will not perform the operation immediately. It will create a DAG (Directed Acyclic Graph) using the applied operation, the source RDD, and the function used for the transformation, and it will keep building this graph using the references until you apply an action operation on the last lined-up RDD. That is why transformations in Spark are lazy.

Read on for more details.

Making SQL Agent Jobs AG-Aware

Stuart Moore shows how you can use dbatools to make SQL Agent jobs Availability Group-aware:

What do I mean by Availability Group aware? When running on an Availability Group, one SQL Server instance ‘owns’ the database at any point in time, but the SQL Agent jobs have to be replicated across all of the instances in the cluster. So you want to make sure that your SQL Server Agent jobs only do work on the instance that currently owns the Availability Group.

Doing this is pretty simple. Below is a piece of T-SQL that checks if the current SQL Server instance is the primary instance in the AG. If it isn’t, then we exit with an error.
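Stuart's exact script is in the post, but a minimal sketch of that kind of guard step might look like this (the database name is a placeholder for any database in the Availability Group):

    -- Exit the job step with an error unless this instance currently hosts
    -- the primary replica of the AG database. The function returns NULL when
    -- the database is not in an AG, so treat that as 'not primary' too.
    IF ISNULL(sys.fn_hadr_is_primary_replica(N'YourAGDatabase'), 0) <> 1
    BEGIN
        RAISERROR(N'Not the primary replica; skipping this job run.', 16, 1);
    END

Making this the first job step, with its on-failure action set to quit (reporting success, if you want to avoid noisy alerts on secondaries), keeps the rest of the job from running anywhere but the primary.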

Read on to see how, and how you can use dbatools to automate this work.

When Extended Event Loss Occurs

Jonathan Kehayias explains when Extended Events will discard an event:

There are three specific session options that determine how large an event an event session can actually collect, and one that controls how events are dropped when the buffer memory for the event session is full or under pressure. All four of these matter when we are talking about collecting events that could generate a large event payload and we want to minimize the chance of dropping an event.
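For concreteness, here is a hedged sketch of where those options live in an event session definition, with illustrative values (the event and target are arbitrary choices for the example):

    CREATE EVENT SESSION [LargePayloadCapture] ON SERVER
    ADD EVENT sqlserver.sql_batch_completed
    ADD TARGET package0.event_file (SET filename = N'LargePayloadCapture.xel')
    WITH
    (
        MAX_MEMORY = 16384 KB,               -- buffer memory available to the session
        MAX_EVENT_SIZE = 0 KB,               -- cap on a single event; 0 defers to MAX_MEMORY
        MEMORY_PARTITION_MODE = NONE,        -- how buffers are split across CPUs/NUMA nodes
        EVENT_RETENTION_MODE = NO_EVENT_LOSS -- block event firing rather than drop events
    );

NO_EVENT_LOSS guarantees nothing is dropped but can slow the workload down while it waits for buffer space, so it is a deliberate trade-off rather than a sensible default.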

Read on to see these settings in action.

PowerApps Security

Jason Bonello gives us some tips on PowerApps security:

Depending on how the backend is set up, the tables holding this sensitive data might be in the same database. For example, ERP solutions can have Company Accounts data, Customer-related data, and Inventory-related data all in the same database, maybe under different schemas, but still part of the same database.

Now let’s say we are about to create a PowerApps solution to maintain Customer information. However, as part of the organization’s policy, this information should not be shared across departments beyond the intended users.
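One schema-level way to enforce that at the database layer (all names here are hypothetical) is to grant the account the app connects with permissions on the Customer schema and nothing else:

    -- Scope the PowerApps connection's account to the Customer schema only.
    CREATE ROLE CustomerAppRole;
    GRANT SELECT, INSERT, UPDATE ON SCHEMA::Customer TO CustomerAppRole;
    ALTER ROLE CustomerAppRole ADD MEMBER PowerAppsServiceUser;
    -- Nothing is granted on the Accounts or Inventory schemas, so the
    -- connection cannot see that data even though it shares the database.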

Read on for some ideas of how to limit the risk of data exposure.

Azure Data Factory Switch Activity

Rayis Imayev explains what the Switch activity does in Azure Data Factory:

Developing conditional logic for your Azure Data Factory control flow has been simplified with the introduction of the Switch activity – https://docs.microsoft.com/en-us/azure/data-factory/control-flow-switch-activity. The official documentation states that this new Data Factory activity “provides the same functionality that a switch statement provides in programming languages”. I would also offer a more simplified definition of the Switch activity in Azure Data Factory: it is a container (or wrapper) for multiple IF conditions.
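As a rough sketch of the shape (the pipeline parameter and case values are invented for illustration, and the per-case activity lists are left empty), a Switch activity evaluates an expression once and runs the activities of the matching case, falling back to defaultActivities when nothing matches:

    {
        "name": "SwitchOnEnvironment",
        "type": "Switch",
        "typeProperties": {
            "on": {
                "value": "@pipeline().parameters.Environment",
                "type": "Expression"
            },
            "cases": [
                { "value": "dev", "activities": [] },
                { "value": "prod", "activities": [] }
            ],
            "defaultActivities": []
        }
    }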

Click through for an example.

Accelerated Database Recovery

Andy Mallon explains the concept of Accelerated Database Recovery:

Accelerated Database Recovery (ADR) is a new feature intended to speed up the recovery process, which can be very slow, particularly when there are long-running, large transactions. ADR is not just for recovery after a crash; it also helps in other scenarios where the transaction log needs to be recovered, including Availability Group secondary redo and Failover Cluster Instance failovers.
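If you want to try it out, enabling the feature is a single statement (the database name is a placeholder):

    -- Requires SQL Server 2019 or later; ADR is already on by default
    -- (and cannot be disabled) in Azure SQL Database.
    ALTER DATABASE [YourDatabase] SET ACCELERATED_DATABASE_RECOVERY = ON;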

This is one of the most interesting new features in SQL Server 2019.
