Press "Enter" to skip to content

Category: Cloud

Internal and External Azure Data Factory Pipeline Activities

Paul Andrew differentiates two form of pipeline activity:

Firstly, you might be thinking, why do I need to know this? Well, in my opinion, there are three main reasons for having an understanding of internal vs external activities:

1. Microsoft cryptically charges you a different rate of execution hours depending on the activity category when the pipeline is triggered. See the Azure Price Calculator.

2. Different resource limitations are enforced per subscription and region (not per Data Factory instance) depending on the activity category. See Azure Data Factory Resource Limitations.

3. I would suggest that understanding what compute is used for a given pipeline is good practice when building out complex control flows. For example, this relates to things like Hosted IR job concurrency, what resources can connect to what infrastructure and when activities may might become queued.

Paul warns that this is a dry topic, but these are important reasons to know the difference.

Comments closed

Building an Azure Function in R

David Smith has a demo for us:

It’s important to note that the model prediction is not being generated by the Shiny app: rather, it’s being generated by an Azure Function running R in the cloud. That means you could integrate the model estimate into any application written in any language: a mobile app, or an IoT service, or anything that can call an HTTP endpoint. Furthermore, you don’t need to worry how many apps are running or how often estimates will be requested by the app: Azure Functions will automatically scale to meet the demand as needed.

Read the whole thing. Given that R isn’t naturally supported by Azure Functions, I think this is quite interesting.

Comments closed

ADF Switch Activities

Nick Edwards shows off the Switch activity in Azure Data Factory:

Prior to the switch statement I could achieve this using 4 ‘IF’ activities connected to a lookup activity as shown in the snip below using my ‘Wait’ example pipeline.

However a neater solution is to use the ‘Switch’ activity to do this work instead. I’ll now jump straight into a worked example to show you how I achieved this.

Click through for the demo.

Comments closed

Cosmos DB Custom App Logic in Functions

Hasan Savran shows off how we can use user-defined functions to add custom application logic to operations in Cosmos DB:

There are couple of things you need to know about them. First, User-defined functions are only for reading data. User-defined functions will always require more Request Units than regular SQL queries. You should always try to solve your application logic problem with regular queries first. Get familiar with system functions, be sure that you are not trying to write a user-defined function when there is already a system function solving the same problem. System functions will always use less Request Units than your custom user-defines function. Just like SQL Server, User-defined functions will cause a table-scan in Cosmos DB. That is why they cost more than regular queries. If you want to use the User-defined function in a where clause. Try to filter by other properties too. Other properties might hit to indexes and that will help you with request units.

Click through to see an example of them in action.

Comments closed

Cross-Cluster and Cross-Service Kusto Queries in ADS

Julie Koesmarno shows off some new functionality in Azure Data Studio:

This blog post covers examples of cross-cluster and cross-service querying, including handy syntax, code snippets and notebooks that you can use in Azure Data Studio.

As some of you may already know, Kusto (KQL) extension is available in Azure Data Studio, which allows you to explore Azure Data Explorer (ADX) more natively. ADX also supports cross-cluster and cross-service queries between ADX, Azure AppInsights and Azure Log Analytics. This cross- service query preview feature is documented in Query data in Azure Monitor using Azure Data Explorer.

Click through for the demos.

Comments closed

Using the Develop Hub in Azure Synapse Analytics

Charles Feddersen shows off one of the Azure Synapse Analytics hubs:

The Develop Hub in Azure Synapse Analytics enables you to write code and define business logic using a combination of notebooks, SQL scripts, and data flows. This gives us a development experience which provides the capability to query, analyze, and model data in multiple languages, along with giving us Intellisense support for these languages. This provides a rich interface for authoring code and in this post, we will see how we can use the Knowledge Center to jump-start our development experience.

Click through to see two demos, one of notebooks and one of SQL scripts.

Comments closed

New Azure Announcements

Eitan Blumin has a roundup of Azure-related announcements:

On the week of December 7th (especially on December 9th), Microsoft has sent us a whole bag of goodies, announcing the general availability of new features that were only in preview until now, and even newer features that have just entered public preview.

There’s quite a lot to cover here, so let’s try to break it down by categories and provide links for more details. 

Click through for the list.

Comments closed

On-Premises SQL Server is Still Relevant

John Morehouse does not abide by Betteridge’s Law of Headlines:

While I’m a firm believer that the cloud is not a fad and is not going away, it’s just an extension of a tool that we are already familiar with.  The Microsoft marketing slogan is “It’s just SQL” and for the most part that is indeed true.  However, that does not mean that every workload will benefit from being in the cloud.  There are scenarios where it does not make sense to move things to the cloud so let’s take a look at a few of them.

Read on for several reasons why the cloud might not be right for you.

Comments closed

Setting up Azure Purview for Power BI

Soheil Bakhshi has a great step-by-step walkthrough for setting up Azure Purview:

Microsoft newly announced a piece of very exciting news that Azure Purview now supports Power BI. This is massive news from a data governance point of view. Azure Purview is the next generation of Azure Data Catalog with more metadata discovery power and the ability to use sensitivity labels. After reading the news, I immediately decided to set up my test environment and give it a go. I followed the steps mentioned in this article on the Microsoft documentation website but I faced some difficulties to get it to work. And here we are, another blog post to help you to set up the Azure Purview for Power BI.

Click through for a detailed walkthrough.

Comments closed

Clouds as Single Points of Failure

Denny Cherry argues that you should not consider a cloud provider as a single point of failure:

Having a two cloud providers isn’t going to save you from an outage. The public cloud providers (Microsoft Azure, Amazon AWS, Google GCP, etc.) have specifically designed their networks so that an outage at one region doesn’t impact other regions.

The day before US Thanksgiving (November 25, 2020), AWS had a major outage where the east-us facility suffered an outage for several hours. But you’ll notice something very interesting about this outage. No other AWS region was impacted by this outage. This is a very important distinction, as it shows that having multiple regions within AWS would give a solid Disaster Recovery strategy a great fail-over experience.

I’m mostly in agreement with Denny on this, but then I’d also have to point out the Azure AD issue which crippled Azure work across the globe, or the Azure DevOps service going down for a period of time (because everything was hosted in one data center and there was an issue). Depending on just how important uptime is, it can still make sense to be multi-cloud, especially if we use a broad enough definition which includes on-premises as a “local cloud.” In extreme cases—say, you lose millions of dollars per hour of downtime—the cost of a belt + suspenders approach is well below the expected loss from an outage.

Comments closed