Category: Cloud

Kusto Queries in Azure Data Studio Notebooks

Julie Koesmarno shows off the Kusto Query Language magic in Azure Data Studio notebooks:

To do this, you’ll need to ensure that you have Kqlmagic installed. See Install and set up Kqlmagic in a notebook. Then in a notebook, you can load Kqlmagic with %reload_ext Kqlmagic in a code cell.

The next step is, in a new code cell, to start connecting to a Log Analytics workspace. There are (roughly – I'm also still learning in this space) three ways to do so:

1. Using Azure Active Directory device login authentication
2. Using Az CLI login
3. Using a client secret

Read on for one example using Azure AD authentication.
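
As a rough sketch of that device-login path (the workspace ID and alias are placeholders, and the connection-string details may vary by Kqlmagic version), the notebook cells might look like this:

    # Load the Kqlmagic extension in a code cell
    %reload_ext Kqlmagic

    # Azure AD device login: Kqlmagic prints a code and a URL to
    # complete sign-in in the browser. Workspace ID and alias are
    # placeholders.
    %kql loganalytics://code;workspace='<workspace-id>';alias='myworkspace'

    # Once connected, query the workspace with KQL
    %kql Heartbeat | summarize count() by Computer | take 10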

Azure Data Factory Integration Runtimes

Tino Zishiri takes us through the concept of the Integration Runtime:

An Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory to provide data integration capabilities such as Data Flows and Data Movement. It has access to resources in either public networks or hybrid scenarios (public and private networks).

Read on to learn more about what they do and the variety of Integration Runtimes available to you.
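
Tino's post is the place for the details; purely as a hedged illustration of where IRs sit in the ADF object model, registering a self-hosted IR with the azure-mgmt-datafactory Python package might look roughly like this (all names are placeholders, and the SDK surface can shift between versions):

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        IntegrationRuntimeResource,
        SelfHostedIntegrationRuntime,
    )

    # All names below are placeholders.
    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Register a self-hosted IR; you then install the IR node on your
    # own machine(s) so ADF can reach resources on private networks.
    ir = client.integration_runtimes.create_or_update(
        resource_group_name="<resource-group>",
        factory_name="<data-factory>",
        integration_runtime_name="SelfHostedIR",
        integration_runtime=IntegrationRuntimeResource(
            properties=SelfHostedIntegrationRuntime(
                description="IR for hybrid (public + private) scenarios"
            )
        ),
    )
    print(ir.name)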

AzureTableStor: Table Storage in R

Hong Ooi announces a new package on CRAN:

I’m pleased to announce that the AzureTableStor package, providing a simple yet powerful interface to the Azure table storage service, is now on CRAN. This is something that many people have requested since the initial release of the AzureR packages nearly two years ago.

Azure table storage is a service that stores structured NoSQL data in the cloud, providing a key/attribute store with a schemaless design. Because table storage is schemaless, it’s easy to adapt your data as the needs of your application evolve. Access to table storage data is fast and cost-effective for many types of applications, and is typically lower in cost than traditional SQL for similar volumes of data.
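
The package is R, but to make the key/attribute model concrete, here is the same schemaless idea sketched against the service with Azure's Python SDK (azure-data-tables); the table name and entity fields are invented for illustration:

    from azure.data.tables import TableServiceClient

    # The connection string is a placeholder.
    service = TableServiceClient.from_connection_string("<connection-string>")
    table = service.create_table_if_not_exists("inventory")

    # Schemaless: entities in the same table can carry different
    # attributes, as long as each has a PartitionKey and a RowKey.
    table.create_entity({"PartitionKey": "widgets", "RowKey": "001", "qty": 42})
    table.create_entity({"PartitionKey": "widgets", "RowKey": "002", "color": "red"})

    for entity in table.query_entities("PartitionKey eq 'widgets'"):
        print(dict(entity))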

If that sounds like a fit for you, check out the package.

Tracking Cosmos DB Re-Indexing Progress

Hasan Savran wants information:

Indexes let your queries run faster. When you need to adjust your indexing policies, the database engine re-indexes your data according to your changes. In Cosmos DB, when you change your indexing policies, the database engine truncates all your indexes and starts to rebuild them from scratch. You do not want to change your indexing policies when your application is busy: because your queries cannot use the dropped indexes, they will take longer and cost more Request Units. Also, your queries might not return all the data they're supposed to. You can read my older post about indexes in Cosmos DB.

You may want to monitor re-indexing progress: you may want to disable your application until indexing is completed, or warn your team about the progress. You can check the re-indexing progress only from the SDK, which means you need to write your own code to accomplish this. I have the following code, which checks the progress every second. If progress is at 100% it quits; otherwise, it continues to check every second until it receives 100 as the result.

Hasan has provided us with a script, so check that out.
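
Hasan's code is the authoritative version; as a hedged sketch of the same polling idea in the Python SDK (azure-cosmos), where the progress arrives in a response header, it might look something like this (endpoint, key, and names are placeholders):

    import time

    from azure.cosmos import CosmosClient

    # Endpoint, key, and names are placeholders.
    client = CosmosClient("<account-endpoint>", credential="<account-key>")
    container = client.get_database_client("<db>").get_container_client("<container>")

    PROGRESS_HEADER = "x-ms-documentdb-collection-index-transformation-progress"

    while True:
        # Reading the container with quota info populates the index
        # transformation progress response header.
        container.read(populate_quota_info=True)
        headers = container.client_connection.last_response_headers
        progress = int(headers.get(PROGRESS_HEADER, -1))
        print(f"Re-indexing progress: {progress}%")
        if progress == 100:
            break
        time.sleep(1)  # check once per second, as in Hasan's script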

Target Groups in Elastic Jobs

Reitse Eskens shares some more information about elastic jobs in Azure:

In one of my previous blogs, I wrote about how to create an elastic job agent when you need the SQL Agent functionality on Azure. You can read that one here.

This morning, I needed a job to update the stats on a database, but on just one database within the "instance" on Azure. But my first target group contained all the databases, the Ola Hallengren scripts aren't available on all of them, and the credential I'm using to execute the jobs doesn't have access to all of them.

Read on to learn how Reitse solved the problem.
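
For flavor, the building blocks are the jobs.sp_* procedures in the job agent's job database; here is a minimal sketch (not Reitse's exact solution – server, database, and group names are placeholders) of a target group scoped to a single database:

    import pyodbc

    # Connect to the elastic job agent's job database (placeholder DSN).
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=<job-server>.database.windows.net;"
        "DATABASE=<job-database>;UID=<user>;PWD=<password>"
    )
    cur = conn.cursor()

    # A target group scoped to the one database the job should touch;
    # @membership_type = N'Exclude' can instead carve databases out of
    # a broader server-level target.
    cur.execute("EXEC jobs.sp_add_target_group @target_group_name = N'StatsTarget'")
    cur.execute("""
        EXEC jobs.sp_add_target_group_member
             @target_group_name = N'StatsTarget',
             @membership_type   = N'Include',
             @target_type       = N'SqlDatabase',
             @server_name       = N'<server>.database.windows.net',
             @database_name     = N'<database>'
    """)
    conn.commit()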

Check if an ADF Pipeline is Already Running

Paul Andrew has a scenario for us:

Scenario: I want to trigger a Data Factory pipeline, but when I do I want the pipeline to know if it’s already running. If it is already running, stop the new run.

Sounds simple enough right?

Wrong!

But, now it's simple for you, because I've done it for you, yay! 🙂

I thought it was simple, but it wasn’t simple, but now it’s simple, but is it really simple? Click through to find out.
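
As a hedged sketch of the core question – "is this pipeline already in progress?" – asked from outside the pipeline via the Python management SDK (this is not Paul's implementation; all names are placeholders):

    from datetime import datetime, timedelta, timezone

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import RunFilterParameters, RunQueryFilter

    # All names are placeholders.
    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    now = datetime.now(timezone.utc)
    runs = client.pipeline_runs.query_by_factory(
        resource_group_name="<resource-group>",
        factory_name="<data-factory>",
        filter_parameters=RunFilterParameters(
            last_updated_after=now - timedelta(days=1),
            last_updated_before=now,
            filters=[
                RunQueryFilter(operand="PipelineName", operator="Equals", values=["<pipeline>"]),
                RunQueryFilter(operand="Status", operator="Equals", values=["InProgress"]),
            ],
        ),
    )

    # Anything returned means a run is already in flight: stop the new one.
    if runs.value:
        print("Pipeline is already running; stopping this run.")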

Infrastructure Notes for RDBMS on Azure VMs

Kellyn Pot’vin-Gorman takes a look at some of the hardware choices you have in Azure, focusing on what works for relational database management systems:

The truth is, it's often a combination of database and infrastructure issues that are the cause. Although many of you may want me to dig into database performance data, I'm actually going to focus first on infrastructure, as it's the area that most aren't privy to for Oracle or, for that matter, any database on Azure IaaS.

The topic of infrastructure is an essential one for any database running in IaaS, and even more so for VMs on Linux, which can be a bit foreign for the Microsoft data specialist. Yes, it may be intimidating to make the shift to Linux and learn some of the nuances of running a database on it, but understanding the infrastructure is key to ruling it out of the scenario. Hopefully these tips will assist you, whether you're running Oracle, MySQL, PostgreSQL, or SQL Server on Linux VMs on Azure IaaS.

Click through for some guidance on the topic.

Alternatives to an Agentless Azure SQL DB

Reitse Eskens gives us a few alternatives to use when we need something like SQL Agent but are running in Azure SQL Database:

What I got into was the following: for a project, we're loading an Azure SQL database (serverless) with a lot of data (think billions of rows) that has to come from an on-premises Oracle server. We're using a VPN connection with network peering to connect to the on-premises server and a VM with a third-party tool to load the data.

Normally we’re delta-loading the database but because it’s a new project we need to perform an initial load. Nothing really weird, just a huge number of records that needs to pass through. And every now and then the application freezes and refuses to thaw. Because it’s hard to find out when the freezing will start, we want to monitor some processes on the database.

Now, on a normal SQL Server, I'd create a job in the Agent and be done with that part. But not on Azure, because the Agent doesn't exist there. In SSMS you'll see a huge empty space where the Agent ought to be.

Reitse lists five separate options. A sixth would be to spin up SQL Server in a VM and use its agent for scheduling. And there are a few more alternatives as well in the ‘outside scheduler’ realm.

Auto-Checking Azure Data Factory Setup

Paul Andrew is at it again:

Building on the work done and detailed in my previous blog post (Best Practices for Implementing Azure Data Factory) I was tasked by my delightful boss to turn this content into a simple checklist of what/why that others could use… I slightly reluctantly did so. However, I wanted to do something better than simply transcribe the previous blog post into a checklist. I therefore decided to break out the Shell of Power and attempt to automate said checklist.

Sure, a checklist could be picked up and used by anyone – with answers manually provided by the person doing the inspection of a given ADF resource. But what if there were a way to have the results given to you on a plate, inferring things that aren't always easy to spot via the Data Factory UI?

Paul uses an ARM template rather than hitting your Data Factory directly, so there's a little bit more work for you, the user, but Paul explains why it's both necessary and proper.
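
The shape of the idea, sketched very roughly in Python over an exported ARM template – with one illustrative check (pipeline descriptions) that is my own example rather than one taken from Paul's list:

    import json

    # Path to an exported ADF ARM template (placeholder).
    with open("arm_template.json", encoding="utf-8") as f:
        template = json.load(f)

    # ADF resources carry their kind in 'type', e.g.
    # 'Microsoft.DataFactory/factories/pipelines'.
    for resource in template.get("resources", []):
        if resource.get("type", "").endswith("/pipelines"):
            name = resource.get("name", "<unnamed>")
            props = resource.get("properties", {})
            # Illustrative best-practice check: every pipeline is documented.
            if not props.get("description"):
                print(f"Check failed: pipeline {name} has no description.")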

Querying Data Lake Files in Power BI through Synapse Analytics

Wolfgang Strasser shows us how to integrate Azure Synapse Analytics and Power BI:

Sometimes, however, wouldn't it be nice to access the data lake in DirectQuery mode – to get the most up-to-date information for every report view? I would say: yes… but how can you achieve this? The options natively provided by ADLS Gen2 and Power BI are not sufficient to meet this requirement. But there are options to achieve it and, in this post, I would like to show you the possibilities using Azure Synapse Analytics to build a query layer on top of an ADLS Gen2 storage account.

Click through for a step-by-step walkthrough.
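
As a hedged sketch of what that query layer looks like – a Synapse serverless SQL query over Parquet files in ADLS Gen2, issued here via pyodbc; the endpoint, container, and path are all placeholders:

    import pyodbc

    # Synapse serverless (on-demand) SQL endpoint -- all names are placeholders.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"
        "DATABASE=<database>;UID=<user>;PWD=<password>"
    )

    # OPENROWSET reads Parquet straight out of ADLS Gen2; wrap a query
    # like this in a view and Power BI can hit it in DirectQuery mode.
    sql = """
    SELECT TOP 10 *
    FROM OPENROWSET(
        BULK 'https://<storageaccount>.dfs.core.windows.net/<container>/<folder>/*.parquet',
        FORMAT = 'PARQUET'
    ) AS [result]
    """
    for row in conn.cursor().execute(sql):
        print(row)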
