
Category: Cloud

So You’ve Hit the Limits of ADF Concurrency

Paul Andrew shows what happens when you break the ADF concurrency barrier:

Firstly, understanding how these limits apply to your Data Factory pipelines takes a little thought, because you need to understand the difference between an internal and an external activity. You then need to factor in the caveats that the limits apply per subscription and, importantly, per Azure Integration Runtime region.

Assuming you know that, and you’ve hit these limits!

Click through to see what happens. It’s not pretty.

Comments closed

Working with Self-Hosted Integration Runtimes

Craig Porteous walks us through some of the planning necessary for self-hosted integration runtimes:

If your Data Factory contains a self-hosted Integration runtime, you will need to do some planning work before everything will work nicely with CI/CD pipelines. Unlike all other resources in your Data Factory, runtimes won’t deploy cleanly between environments, primarily as you connect the installed runtime directly to a single Data Factory. (We can add more runtime nodes to a single Data Factory but we cannot share a single node between many data factories*). An excerpt from Microsoft’s docs on Continuous integration and delivery in Azure Data Factory mentions this caveat.

Read on for the consequences and two options available to you.


dbachecks Against Azure SQL Databases

Jess Pomfret takes us through running dbachecks on an Azure SQL Database:

Last week I gave a presentation at Data South West on dbachecks and dbatools. One of the questions I got was whether you could run dbachecks against Azure SQL Databases, and I had no idea. I always try to be prepared for potential questions that might come up, but I had only been thinking about on-premises environments and hadn’t even considered the cloud. The benefit is this gives me a great topic for a blog post.

Click through for the answer.


Kusto Queries in Azure Data Studio Notebooks

Julie Koesmarno shows off the Kusto Query Language magic in Azure Data Studio notebooks:

To do this, you’ll need to ensure that you have Kqlmagic installed. See Install and set up Kqlmagic in a notebook. Then in a notebook, you can load Kqlmagic with %reload_ext Kqlmagic in a code cell.

The next step is to connect to a Log Analytics workspace from a new code cell. There are three ways to do so (roughly, as I’m still learning in this space):

1. Using Azure Active Directory Device Login authentication.
2. Using Az CLI login
3. Using Client Secret

Read on for one example using Azure AD authentication.
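
As a rough illustration of the first and third options in that list, here is a minimal Python sketch that assembles the connection string you would place after %kql in a notebook cell. The loganalytics:// format and the key names (code, tenant, clientid, clientsecret, workspace) are written from memory of the Kqlmagic documentation, so treat them as assumptions and check them there; the helper function itself is hypothetical and not part of Kqlmagic. The Az CLI option is left out because its exact form is not given in the excerpt.

```python
def loganalytics_connection(workspace_id, tenant=None, client_id=None,
                            client_secret=None):
    """Build a Kqlmagic-style Log Analytics connection string.

    Hypothetical helper: the string layout is an assumption based on the
    Kqlmagic docs and should be verified before use.
    """
    secret_parts = (tenant, client_id, client_secret)
    if any(secret_parts):
        # Client secret login needs all three values.
        if not all(secret_parts):
            raise ValueError("tenant, client_id and client_secret go together")
        parts = [
            f"tenant='{tenant}'",
            f"clientid='{client_id}'",
            f"clientsecret='{client_secret}'",
        ]
    else:
        # Azure AD device login: Kqlmagic is expected to prompt with a code.
        parts = ["code"]
    parts.append(f"workspace='{workspace_id}'")
    return "loganalytics://" + ";".join(parts)


if __name__ == "__main__":
    # Placeholder workspace ID; paste the printed string after %kql.
    print(loganalytics_connection("<workspace-id>"))
```

In a notebook you would run %reload_ext Kqlmagic first, then use the resulting string on a %kql line.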


Azure Data Factory Integration Runtimes

Tino Zishiri takes us through the concept of the Integration Runtime:

An Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory to provide data integration capabilities such as Data Flows and Data Movement. It has access to resources in either public networks, or hybrid scenarios (public and private networks).

Read on to learn more about what they do and the variety of Integration Runtimes available to you.


AzureTableStor: Table Storage in R

Hong Ooi announces a new package on CRAN:

I’m pleased to announce that the AzureTableStor package, providing a simple yet powerful interface to the Azure table storage service, is now on CRAN. This is something that many people have requested since the initial release of the AzureR packages nearly two years ago.

Azure table storage is a service that stores structured NoSQL data in the cloud, providing a key/attribute store with a schemaless design. Because table storage is schemaless, it’s easy to adapt your data as the needs of your application evolve. Access to table storage data is fast and cost-effective for many types of applications, and is typically lower in cost than traditional SQL for similar volumes of data.

If that sounds like a fit for you, check out the package.


Tracking Cosmos DB Re-Indexing Progress

Hasan Savran wants information:

Indexes let your queries run faster. When you adjust your indexing policies, the database engine re-indexes your data to reflect your changes. In Cosmos DB, when you change your indexing policies, the database engine truncates all your indexes and starts to rebuild them from scratch. You do not want to change your indexing policies when your application is busy: because your queries cannot use the dropped indexes, they will take longer and cost more Request Units. Also, your queries might not return all the data they are supposed to. You can read my older post about indexes in Cosmos DB.

You may want to monitor re-indexing progress; you may want to disable your application until indexing is completed or warn your team about the re-indexing progress. You can check the re-indexing progress only from the SDK, which means you need to write your own code to accomplish this. I have the following code, which checks the progress every second. If progress is at 100%, it quits; otherwise it continues to check every second until it receives 100 as the result.

Hasan has provided us with a script, so check that out.
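
Hasan's script is not reproduced here. As a rough sketch of the polling loop he describes, the Python below checks a progress value once per second and stops when it reaches 100. The read_progress callable is a hypothetical stand-in for the SDK call: with the azure-cosmos Python package it would probably read the container with quota info populated and return the index transformation progress response header, but that is an assumption to verify against the SDK documentation.

```python
import time


def wait_for_reindex(read_progress, poll_seconds=1.0, sleep=time.sleep):
    """Poll read_progress() until it reports 100 and return every value seen.

    read_progress is a hypothetical callable returning the re-indexing
    progress as a number (or numeric string) between 0 and 100.
    """
    seen = []
    while True:
        progress = int(read_progress())
        seen.append(progress)
        if progress >= 100:
            # Re-indexing has finished, so stop checking.
            return seen
        # Not done yet: wait before asking again.
        sleep(poll_seconds)


if __name__ == "__main__":
    # Fake progress values stand in for real SDK responses.
    fake_values = iter([0, 40, 100])
    history = wait_for_reindex(lambda: next(fake_values), poll_seconds=0.01)
    print(history)  # prints [0, 40, 100]
```

Passing the reader in as a function keeps the loop independent of any particular SDK version, and makes it easy to try out with fake values as shown.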


Target Groups in Elastic Jobs

Reitse Eskens shares some more information about elastic jobs in Azure:

In one of my previous blogs, I wrote about how to create an elastic job agent when you need the SQL Agent functionality on Azure. You can read that one here.

This morning, I needed a job to update the stats on a database, but on just one database within the “instance” on Azure. But my first group contained all the databases, and the Ola Hallengren script isn’t available on all databases and the credential I’m using to execute the jobs doesn’t have access to all the databases.

Read on to learn how Reitse solved the problem.


Check if an ADF Pipeline is Already Running

Paul Andrew has a scenario for us:

Scenario: I want to trigger a Data Factory pipeline, but when I do I want the pipeline to know if it’s already running. If it is already running, stop the new run.

Sounds simple enough right?

Wrong!

But, now simple for you, because I’ve done it for you, yay! 🙂

I thought it was simple, but it wasn’t simple, but now it’s simple, but is it really simple? Click through to find out.
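
To make the scenario concrete, here is a minimal Python sketch of the decision at its core: given a list of pipeline run records, is another run of the same pipeline still active? This is not Paul's implementation, which lives in the linked post. The field names (runId, pipelineName, status) and the status values InProgress and Queued follow my understanding of what the Data Factory pipeline runs query returns, and should be treated as assumptions; the function names are hypothetical.

```python
# Run states treated as "still running" (assumed status values).
ACTIVE_STATUSES = {"InProgress", "Queued"}


def other_active_runs(runs, pipeline_name, current_run_id):
    """Return the active runs of pipeline_name other than the current run."""
    return [
        run for run in runs
        if run.get("pipelineName") == pipeline_name
        and run.get("status") in ACTIVE_STATUSES
        and run.get("runId") != current_run_id
    ]


def should_stop_new_run(runs, pipeline_name, current_run_id):
    """True when the new run should stop because another one is active."""
    return bool(other_active_runs(runs, pipeline_name, current_run_id))


if __name__ == "__main__":
    # Made-up run records for illustration only.
    sample_runs = [
        {"runId": "a1", "pipelineName": "LoadSales", "status": "InProgress"},
        {"runId": "b2", "pipelineName": "LoadSales", "status": "InProgress"},
        {"runId": "c3", "pipelineName": "LoadStock", "status": "Succeeded"},
    ]
    print(should_stop_new_run(sample_runs, "LoadSales", "b2"))  # prints True
```

Excluding the current run's own ID matters, because the new run is itself in progress at the moment it performs the check.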


Infrastructure Notes for RDBMS on Azure VMs

Kellyn Pot’vin-Gorman takes a look at some of the hardware choices you have in Azure, focusing on what works for relational database management systems:

The truth is, it’s often a combination of database and infrastructure issues that are the cause. Although many of you may want me to dig into database performance data, I’m actually going to first focus on infrastructure, as it’s the area that most aren’t privy to for Oracle, or for that matter, any database on Azure IaaS.

The topic of infrastructure is an essential one for any database running in IaaS, and even more so for VMs on Linux, which can be a bit foreign for the Microsoft data specialist. Yes, shifting to Linux and learning some of the nuances of running a database there may be intimidating, but understanding the infrastructure is key to ruling it out as the cause. Hopefully these tips will assist you, whether you’re running Oracle, MySQL, PostgreSQL or SQL Server on Linux VMs on Azure IaaS.

Click through for some guidance on the topic.
