Press "Enter" to skip to content

Category: Cloud

Options for Running Jobs against Azure SQL DB

Anthony Norwood replaces on-prem SQL Agent jobs:

Both SQL Server on Azure VM and Azure SQL Managed Instance provide you with SQL Server Agent and therefore the capability to run scheduled tasks against your databases, so when we’re talking about being able to run jobs we’re only considering Azure SQL Database as needing guidance – some of the suggestions  in the following paragraphs can also apply to all these options of SQL Server, but perhaps not as necessary.

We’re going to provide you with four options for how you might be able to still run your favourite SQL Agent Jobs against an Azure SQL Database, each of which come with their own advantages and disadvantages – one not mentioned is Data Factory, sometimes referred to as SSIS in the cloud, and this is because we’re trying to focus on some options that may be more comfortable to people who have never built an SSIS package before.

Read on for the four options Anthony has for us.

Comments closed

Metadata-Driven Pipelines for Azure Data Factory Loads

Marc Bushong doesn’t want to copy and paste:

Developing ETLs/ELTs can be a complex process when you add in business logic, large amounts of data, and the high volume of table data that needs to be moved from source to target. This is especially true in analytical workloads involving Azure SQL when there is a need to either fully reload a table or incrementally update a table. In order to handle the logic to incrementally update a table or fully reload a table in Azure SQL (or Azure Synapse), we will need to create the following assets:

  • Metadata table in Azure SQL
    • This will contain the configurations needed to load each table end to end
  • Metadata driven pipelines
    • Parent and child pipeline templates that will orchestrate and execute the ETL/ELT end to end
  • Custom SQL logic for incremental processing
    • Dynamic SQL to perform the delete and insert based on criteria the user provides in the metadata table

Read on for the demonstration, which reads from one Azure SQL DB into another.

Comments closed

Using DVC to Store Data Science Artifacts in Azure

I have a new video up:

In this video, we introduce DVC, a tool for version control management of data science and machine learning artifacts. We learn why Git isn’t the best place to store those large data files, how DVC integrates with Git, and how you can save your files in Azure Blob Storage.

Click through for the video, as well as a variety of links which helped me put it together.

Comments closed

Upgrading SQL Server Cloud VMs

Brent Ozar recommends you check your cloud provider’s VM listings:

If you’ve been in Azure or Amazon for a few years, you’re probably on old, slow hardware.

In the last 3 weeks, I’ve had two clients who’d both been early cloud adopters. When they’d migrated to the cloud, they both used Azure Ev3 VMs – at the time, a good choice for SQL Server due to its relatively high amount of memory. When the Ev3 VM types were announced in 2017, they were hosted on Intel Broadwell and Haswell processors with 2.3-2.4GHz processing speed.

Also, even if you’re locked into a 1-year or 3-year deal, I know that at least Azure is usually willing to switch your VM class registration if you contact your support person. I’m not positive if AWS does the same but it wouldn’t shock me.

Comments closed

Azure AD (or Entra ID) Authentication for SQL Server VMs

Deepthi Goguri enables Azure Entra ID security on a SQL Server VM in Azure:

To enable the SQL Server 2022 on a virtual machine to use Azure AD, we need below things:

Deepthi then includes the list of requirements, starting with a managed identity and moving on to permissions and other configuration. It looks like a fair number of steps, but it’s not that onerous a task once you get to it.

And this also gives me a chance to rant about Microsoft marketing a bit, as they are in the process of switching the name Azure Active Directory to Azure Entra ID. Granted, Azure Active Directory isn’t Active Directory (although it does perform a very similar task in a fairly similar way). So to remove confusion that I don’t think really existed, they changed the name to Entra ID. Except that most of the Microsoft documentation still says Azure Active Directory, and we have about a decade’s worth of blog content talking about Azure Active Directory, so when you go searching for the resolution to a problem, you’ll have to search for Azure Entra ID as well as its former name, which means people will still link the product to Azure Active Directory—at least, until the point when there’s enough blog content and documentation in place to replace the large majority of those existing blog posts—and so you renamed a product for no reason. Plus, they picked an ambiguous name that people will pronounce multiple ways: is the “ent” in Entra like “enter the dungeon” or Entra like “a delicious entrée”?

But then again, considering how many pronunciations of Azure there are, maybe this is the plan…

Comments closed

Azure Blob Storage Operating System Error 86

Jose Manuel Jurado Diaz 86’d that option:

Today, I worked on a service request that our customer got the following error message: Cannot open backup device ‘https://XXX.blob.core.windows.net/NNN/YYY.bak‘. Operating system error 86(The specified network password is not correct.). RESTORE HEADERONLY is terminating abnormally. (Microsoft SQL Server, Error: 3201). Following I would like to share with you some details why this issue and the activities done to resolve it. 

Read on to get a better understanding of what this error actually means and how you can fix it.

Comments closed

Power BI Shareable Cloud Connections and Multiple Connections

Chris Webb shows off an interesting aspect of a new feature:

A few weeks ago an important new feature for managing connections to data sources in the Power BI Service was released: Shareable Cloud Connections. You can read the blog post announcing them here. I won’t describe their functionality because the post already does that perfectly well; I want to focus on one thing in particular that is important for anyone using Power BI with Snowflake (and, I believe BigQuery and probably several other non-Microsoft sources): Shareable Cloud Connections allow you to have multiple connections to the same data source in the Power BI Service, each using different credentials.

Read on to see what Chris means and how you can take advantage of it.

Comments closed

CI/CD for Synapse Serverless SQL Pool with SqlPackage and Azure DevOps

Rui Cunha has a tutorial for us:

Azure Synapse Analytics Serverless SQL is a query service mostly used over the data in your data lake, for data discovery, transformation, and exploration purposes. It is, therefore, normal to find in a Synapse Serverless SQL pool many objects referencing external locations,  using disparate external data sources, authentication mechanisms, file formats, etc. In the context of CICD,  where automated processes are responsible for propagating the database code across environments, one can take advantage of database oriented tools like SSDT and SqlPackage CLI , ensuring that this code is conformed with the targeted resources.

In this article I will demonstrate how you can take advantage of thee tools when implementing the CICD for the Azure Synapse Serverless SQL engine. We will leverage SQL projects in SSDT to define our objects and implement deploy-time variables (SQLCMD variables).  Through CICD pipelines, we will build the SQL project to a dacpac artifact, which enables us to deploy the database objects one or many times with automation.

Click through for the demonstration.

Comments closed

Trying out Azure Geo-Replication

Etienne Lopes continues a series on Azure SQL DB HA/DR:

So, first of all, what is Active Geo-Replication?

Active geo-replication is a feature that lets you create a continuously synchronized readable secondary database for a primary database. The readable secondary database may be in the same Azure region as the primary, or, more commonly, in a different region. This kind of readable secondary database is also known as a geo-secondary or geo-replica.“

Read on to learn more about the topic, including how to set it up and ways to try it out.

Comments closed