Cloud – Page 26 – Curated SQL

Managing Azure Data Factory IP Ranges for Azure Firewalls

Published 2024-01-03 by Kevin Feasel

While a private endpoint and vNets are preferred, sometimes we need to configure Azure SQL Database or Azure Storage to allow use of public endpoints. In that case, an IP-based firewall is used to prevent traffic from unauthorized locations. But Azure Data Factory’s Azure Integration Runtimes do not have a single static IP. So how do we keep the firewall updated so that ADF can access these resources?

One option is to run everything through a self-hosted integration runtime, which can have a static public IP. But that also means you have to install (and keep updated) a Java SDK if you are converting data to Parquet or ORC files.

Another option is to obtain the IP range list published by Microsoft and update the firewall. That is what I did for a recent project.

Read on to learn more about the latter option, including a Powershell script to do the work.

Comments closed

Fabric Data Pipeline for Blob Storage CSV into Azure SQL DB

Published 2024-01-02 by Kevin Feasel

Andy Leonard loads some data:

In November 2023, I shared how to start learning Microsoft Fabric in a post titled Start a Fabric Free Trial. In December 2023, I shared how to Create a Workspace in Fabric. In this post, I document one way to create a pipeline to load data from a CSV file stored in Azure Blob Storage to Azure SQL Database in your new Fabric workspace.

Click through for some key assumptions, as well as the process.

Comments closed

Using KQL in Azure SQL DB Audits

Published 2023-12-27 by Kevin Feasel

Josephine Bush tracks what’s happening on that Azure SQL Database:

According to Microsoft, “Kusto Query Language (KQL) is a powerful tool to explore your data and discover patterns, identify anomalies and outliers, create statistical modeling, and more. The query uses schema entities that are organized in a hierarchy similar to SQLs: databases, tables, and columns.”

Note: KQL is case-sensitive for everything. Also, remember to refrain from querying everything just like you wouldn’t with SQL — don’t do the equivalent of SELECT * from gianttable.

Microsoft also has a lot of documentation with best practices and a quick reference guide to the Kusto commands. This blog post covers the ones I use the most.

Read on for a primer on the language, specifically some of the things you can do when reading Azure SQL Database audit information.

Comments closed

Batch Scoring in the Azure ML Designer

Published 2023-12-20 by Kevin Feasel

I have a new video:

In this video, I answer a viewer question about how to perform batch processing from the Azure Machine Learning Designer.

This video wraps up my work on Azure ML for now. I’m going to kick off a brand new series starting next week.

Comments closed

Concatenating Many Files in Azure Blob Storage

Published 2023-12-20 by Kevin Feasel

Drew Furgiuele concatenates a lot of files:

Lately, I’ve found myself with a few requests from friends and users that have a particular problem: they’ve got themselves a data lake in Azure, and they can read and write files just fine to it. The problem, though, is that sometimes they need to take a series of files and mash them all together, or as the cool kids call it: concatenate them. And when it comes to third party tools and methods that can do the trick, you’re spoiled for choice: Azure Data Factory, Spark via Databricks, or even PowerShell.

Case in point: I was working with someone who had tens of thousands of CSV files that they needed to merge together into one big file, but they were already out in their Azure storage account. That doesn’t sound so bad, does it?

Drew explains why it is, but also why it isn’t. So click through and check that out.

Comments closed

Troubleshooting the Automation of a Process

Published 2023-12-19 by Kevin Feasel

Reitse Eskens tells a tale of woe:

When I got the job to restore a bacpac file, I didn’t think much of it. Because it’s quite easy. Get the bacpac and restore it with Azure Data Studio or Sql Server Management Studio. Get coffee, done. Then I got these requirements:

The bacpac needs to be downloaded daily from an sFTP server, without certificate

The bacpac file is in a zip file

The zip file is password protected

Everything must be done serverless (Azure Automation, Logic App, Function App, Data Factory and/or Synapse Analytics)

It’s a daily process that needs to be done without human supervision

Welcome to cloud development: Part A is easy, Part B is easy, Part C is mildly challenging, and combining A with B and C is a total nightmare because it turns out that A and B aren’t compatible, so by the end, you’re dealing with A” and b and D (because C, C’, C”, etc. wouldn’t work and c and c’ would work but had severe limitations preventing you from using it in this scenario).

Comments closed

Resource Governor and Azure SQL Managed Instance

Published 2023-12-18 by Kevin Feasel

Kendra Little has a note for us:

In Azure SQL Managed Instance, you get to use Resource Governor, even in the General Purpose tier. This is awesome.

Just make sure you execute commands in the context of the master database, or you’ll get error 40510: Statement 'ALTER RESOURCE GOVERNOR' is not supported in this version of SQL Server.

Read on for more information about this error and how to circumvent it.

Comments closed

Batching Text Analytics with Azure AI Services

Published 2023-12-15 by Kevin Feasel

Matt Eland tries out the TextAnalytics client:

We’ll talk about each one of these capabilities briefly as we cover the results, but at a high level what we want to do is:

Perform sentiment analysis to determine if the text is positive, negative, neutral, or mixed.

Summarize the text using abstractive summarization which summarizes the text with new text generated by a large language model (LLM).

Summarize the text using extractive summarization which summarizes the text by extracting key sentences or parts of sentences to convey the overall meaning.

Extract key phrases of interest from the text document.

Perform entity recognition and linked entity recognition to determine the major objects, places, people, and concepts the document discusses.

Recognize any personally identifiable information (PII) present in the document for potential redaction.

Analyze the text for healthcare specific topics such as treatment plans or medications.

Read on to see how a certain passage of text fares.

Comments closed

Saving Money on Azure SQL Database

Published 2023-12-15 by Kevin Feasel

Andy Leonard’s speaking my language:

Regular readers of this blog know I am all about saving money. I’ve blogged about ways to set up inexpensive instances of Azure SQL DB. I recently learned a couple new ways to save even more money working with Azure SQL. Both methods begin the same:

Read on for a pair of tips. Serverless mode does work really well for dev environments and cases where you don’t need it up that much (and can wait the 3-5 minutes for the service to spin up the first time you use it that day). But if you’re using it 12+ hours a day, it’s actually more expensive than using the standard model.

Comments closed

What’s Missing in Azure SQL Managed Instance?

Published 2023-12-15 by Kevin Feasel

Kendra Little gives us the lowdown:

Spoiler: a large amount of features from SQL Server 2022 are missing from Azure SQL Managed Instance. Some major features are missing that were introduced in SQL Server 2019– and here we are just a few weeks away from 2024.

But Microsoft’s top-line marketing claims about Azure SQL Managed Instance remain that “it’s always up to date with the latest SQL features and functionality.”

Let’s dig into some of the documented highlights on missing features, so you can decide for yourself what to think of that statement.

Read on for Kendra’s take. I agree with the overall argument: Managed Instance is an expensive service intended to serve as a SQL Server replacement. But there’s still a lot missing from the product and it still feels like it’s smarter simply to build a VM running SQL Server rather than play the Managed Instance game. Azure SQL DB I extend a bit more grace toward, as its intent is a bit different from the SQL Server box product or Managed Instance.

Comments closed

Category: Cloud