Cloud – Page 74 – Curated SQL

Combining Azure DevOps and Databricks

Published 2022-01-07 by Kevin Feasel

Anna Wykes continues a series on DevOps for Databricks:

An Environment Variable is a variable stored outside of the Python script; in our instance it will be stored on the DevOps Agent running the DevOps Pipelines. Consequently, it is accessible to other scripts/programs running on the DevOps Agent. We will not cover DevOps Agents in this blog specifically; the simplest description is that they are the compute that runs your pipeline, normally a VM (Virtual Machine) or Docker Container.
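As a rough illustration of the idea in the excerpt, here is a minimal sketch of a Python script reading a value that lives outside the script. The helper names and the variable name `EXAMPLE_WORKSPACE_URL` are hypothetical and not taken from Anna's post; on a real pipeline the agent (or a pipeline variable) would set the value before the script runs.

```python
import os


def read_setting(name, default=None):
    """Return the environment variable's value, or `default` if it is not set."""
    return os.environ.get(name, default)


def require_setting(name):
    """Return the environment variable's value, failing loudly if it is missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


if __name__ == "__main__":
    # Hypothetical variable name, set here only so the sketch runs on its own.
    os.environ.setdefault("EXAMPLE_WORKSPACE_URL", "https://example.invalid")
    print(read_setting("EXAMPLE_WORKSPACE_URL"))
```

Failing fast on a missing variable tends to be easier to debug in a pipeline than silently falling back to a default, though which behavior is right depends on the setting.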

Read the whole thing.

Lessons Learned Troubleshooting High CPU in Azure SQL DB

Published 2022-01-07 by Kevin Feasel

Kendra Little has an after-action report:

I’ve just had the pleasure of publishing my first new article in the Microsoft Docs, Diagnose and troubleshoot high CPU on Azure SQL Database.
This article isn’t really “mine” – anyone in the community can create a Pull Request to suggest changes, or others at Microsoft may take it in a different direction. But I got to handle the outlining, drafting, and incorporation of suggested changes for the initial publication.
It was a ton of fun, and I learned a lot about Azure SQL Database in the process.

Click through for what Kendra learned specific to Azure SQL Database, and also read the article itself.

Flexible File Components with SSIS

Published 2022-01-07 by Kevin Feasel

Bill Fellows hides SSIS DNA in a can of Barbasol shave cream:

The Azure Feature Pack for SSIS is something I had not worked with before today. I have a client that wants to use the Flexible File Task/Flexible File Source/Flexible File Destination but they were having issues. The Flexible File tools allow you to work with Azure Blob storage. We were dealing with ADLS Gen2 but the feature pack can work with classic blob storage as well. In my hubris, I said no problem, I know SSIS. Dear reader, I did not know as much as I thought I did…

Click through for a whopper of a story. But be sure to read to the very end, as you don’t want to stop at using TLS 1.0.

Low-Code Machine Learning: Create an Azure ML Workspace

Published 2022-01-06 by Kevin Feasel

I have started a new series on low-code machine learning with Azure ML:

The first thing that we need to do is create a machine learning workspace. In the Azure portal, search for “machine learning” and choose the Machine learning option.

This is mostly image-driven step-by-step guidance, though I do go a bit deeper than “here’s what to click” on several topics in the series.

Azure Data Factory Activity Queue Times

Published 2022-01-06 by Kevin Feasel

Meagan Longoria waits in line:

I’ve been working on a project to populate an Operational Data Store using Azure Data Factory (ADF). We have been seeking to tune our pipelines so we can import data every 15 minutes. After tuning the queries and adding useful indexes to target databases, we turned our attention to the ADF activity durations and queue times.
Data Factory places the pipeline activities into a queue, where they wait until they can be executed. If your queue time is long, it can mean that the Integration Runtime on which the activity is executing is waiting on resources (CPU, memory, networking, or otherwise), or that you need to increase the concurrent job limit.

Click through to see how you can calculate queue times across activities, pipelines, and data factories.

Log Analytics and Power BI

Published 2022-01-06 by Kevin Feasel

Chris Webb has started a new series:

As a Power BI administrator you want to see what’s happening in your tenant right now: who’s running queries, which datasets are refreshing and so on. That way if a user calls you to complain that their report is slow or their dataset hasn’t refreshed yet you can start troubleshooting immediately. Power BI’s integration with Log Analytics (currently in preview with some limitations) is a great source of information for this kind of troubleshooting: it gives you the ability to send various useful Analysis Services engine events, events that give you detailed information about queries and refreshes among other things, to Log Analytics with a latency of only a few minutes. Once you’ve done that you can write KQL queries to understand what’s going on, but writing queries is time consuming – what you want, of course, is a Power BI report.

Click through to see how to use Power BI to access KQL data in Log Analytics, which you’re using to monitor Power BI behavior.

Addressable Disk Space and File Counts in SQL MI General Purpose

Published 2022-01-06 by Kevin Feasel

Niko Neugebauer has been busy:

In the previous blog posts in the SQL MI How-Tos we have already touched on the aspect of SQL MI reserved and available Disk Space, but as in everything – there are so many things to add and expand. In this post we shall focus on the General Purpose service tier and the remote disk storage that is used in this service tier. Besides the explicit limits of the addressable space that are connected to the number of CPU vCores, there are important aspects of the remote storage that will limit the number of database files that can be located there.
If you are interested in other posts on how to discover different aspects of SQL MI, please visit http://aka.ms/sqlmi-howto, which serves as a placeholder for the series.

Click through to see how it all fits together with Managed Instances.

Using a Kafka Client with Azure Event Hubs

Published 2022-01-04 by Kevin Feasel

Niels Berglund takes us through one way to work with Azure Event Hubs:

This blog post came about by accident, lol. A couple of weeks ago, I started to prepare for a webinar: Analyze Billions of Rows of Data in Real-Time Using Azure Data Explorer. One of the demos in that webinar is about ingesting data from Apache Kafka into Azure Data Explorer. When prepping, I noticed that my Confluent Cloud Kafka cluster didn’t exist anymore, so I had to come up with a workaround. That workaround was to use Azure Event Hubs instead of Kafka.
Since I already had the code to publish data to Kafka and knew that you could use the Kafka client to publish to Event Hubs, I thought I’d test it out. I did run into some minor snags along the way, so I thought I’d write a blog post about it. Then, at least, I have something to go back to. This post also looks at how to set up an Event Hubs cluster.
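To give a feel for what pointing a Kafka client at Event Hubs involves, here is a minimal sketch that builds the client settings. It is not Niels's code: the function name and the placeholder namespace are hypothetical, and the property names assume a librdkafka-based client such as `confluent-kafka`. The key points are that the Kafka endpoint of an Event Hubs namespace listens on port 9093, uses SASL over TLS, and takes the literal username `$ConnectionString` with the namespace connection string as the password.

```python
def event_hubs_kafka_config(namespace, connection_string):
    """Build Kafka client settings for an Event Hubs namespace.

    `namespace` is the Event Hubs namespace name (hypothetical placeholder here);
    `connection_string` is that namespace's connection string.
    """
    return {
        # The Kafka endpoint of an Event Hubs namespace listens on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        # Event Hubs requires TLS plus SASL PLAIN authentication.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is this literal string, not an account name.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }


if __name__ == "__main__":
    # Placeholder values only; a real connection string comes from the Azure portal.
    config = event_hubs_kafka_config("my-namespace", "Endpoint=sb://...")
    print(config["bootstrap.servers"])
```

With the `confluent-kafka` package installed, a dictionary like this could be passed to `confluent_kafka.Producer(config)`; the event hub name then plays the role of the Kafka topic.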

Click through to see it in action.

Getting Started with KQL

Published 2022-01-04 by Kevin Feasel

Steve Jones starts learning about the Kusto Query Language:

I saw an episode of Data Exposed with my good friend, Hamish Watson. He talked about KQL (Kusto Query Language) being the next query language you need to learn. I was skeptical of the title, but I decided to give this a try.
In the episode, Hamish points out a cheat sheet from Microsoft, which I thought was a good resource. However, while watching the video, I browsed over to the demo site Microsoft has at https://aka.ms/lademo. You need an Azure account to log in, but this is a demo site where you can query some Log Analytics data. The new query window below is what appears when you go here:

If you’re already familiar with the way Splunk’s filtering language works, KQL follows from it. It’s a worthwhile language for Azure-based administrators to know, as it’s the most powerful way to get data out of Log Analytics.

Hierarchical Partition Keys in Cosmos DB

Published 2021-12-30 by Kevin Feasel

Hasan Savran looks at partition keys:

Selecting a partition key for your Cosmos DB is one of the most important choices you need to make for your Cosmos DB project. You really need to take your time and have a plan for your project. Where will this application be in 1 year? 5 years? How much data are you planning to store? If your application becomes popular and you start to have users all over the country or world, do you think your partition key can handle growth like this? These are some of the questions you need to ask yourself. Selecting a partition key is like selecting a life partner for your project. You need a good one that will grow together with your project.
Sometimes, it does not matter how much time you spend trying to find a good partition key. Your document simply does not have a good one. In those cases, usually the best thing you can do is combine multiple properties to generate a unique custom property called a synthetic key.
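The synthetic key idea in the excerpt can be sketched in a few lines. This is only an illustration: the property names (`tenantId`, `deviceId`, `partitionKey`), the separator, and the helper function are all hypothetical choices, not something taken from Hasan's post or from the Cosmos DB SDK.

```python
def add_synthetic_key(document, properties, key_name="partitionKey", separator="-"):
    """Return a copy of `document` with a synthetic key built from `properties`.

    The key is the listed property values joined with `separator`, in order.
    """
    parts = [str(document[name]) for name in properties]
    enriched = dict(document)  # leave the caller's document unchanged
    enriched[key_name] = separator.join(parts)
    return enriched


if __name__ == "__main__":
    # Hypothetical document shape, for illustration only.
    reading = {"id": "1", "tenantId": "contoso", "deviceId": "sensor-42"}
    print(add_synthetic_key(reading, ["tenantId", "deviceId"]))
```

The application has to compute the same combined value on every write and supply it on point reads, which is part of the maintenance burden that motivates looking for an alternative.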

Read on for a better solution to the problem than a synthetic key.
