Press "Enter" to skip to content

Category: Cloud

Retrieving Azure Log Analytics Data using Azure Data Factory

Meagan Longoria needs to move some log data around:

For this project, we have several Azure SQL Databases configured to send logs and metrics to a Log Analytics workspace. You can execute KQL queries against the workspace in the Log Analytics user interface in the Azure Portal, a notebook in Azure Data Studio, or directly through the API. The resulting format of the data downloaded from the API leaves something to be desired (it’s like someone shoved a CSV inside a JSON document), but it’s usable after a bit of parsing based upon column position. Just be sure your KQL query actually states the columns and their order (this can be done using the Project operator).

Click through for an example of moving this resultant data into Azure Storage.
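To give a feel for that "CSV shoved inside a JSON document" format, here is a rough Python sketch of querying the Log Analytics API and flattening the result by column position. The workspace ID, token, and AzureDiagnostics query are placeholders, not anything from Meagan's post:

```python
import requests

# Hypothetical workspace ID and AAD bearer token -- substitute your own.
WORKSPACE_ID = "00000000-0000-0000-0000-000000000000"
TOKEN = "<aad-access-token>"

# Pin down the columns and their order with the project operator so
# positional parsing of the response stays predictable.
KQL = """
AzureDiagnostics
| where TimeGenerated > ago(1d)
| project TimeGenerated, ResourceGroup, Resource, Category
"""

resp = requests.post(
    f"https://api.loganalytics.io/v1/workspaces/{WORKSPACE_ID}/query",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"query": KQL},
)
resp.raise_for_status()
payload = resp.json()

# The response is a list of tables, each with a columns array (names/types)
# and a rows array of positional values -- the "CSV inside JSON" shape.
table = payload["tables"][0]
column_names = [c["name"] for c in table["columns"]]
records = [dict(zip(column_names, row)) for row in table["rows"]]

for record in records[:5]:
    print(record)
```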

Using the Cosmos DB Analytics Storage Engine

Hasan Savran explains the purpose of the Cosmos DB Analytics Storage Engine:

Analytics storage uses Column Store format to save your data. This means data is written to disk column by column rather than row by row. This makes all aggregation functions run fast because the disk does not need to work hard to find data row by row anymore. Cosmos DB takes responsibility for moving data from the Transactional Store to the Analytical Store too. You do not need to write any ETL packages to accomplish this. That means you do not need to figure out which data needs to be updated or which data should be deleted. Azure Cosmos DB figures all of that out for you and syncs the data between these two storage engines. This gives us the isolation we have been looking for between transactional and analytical environments. Data written to transactional storage will be available in Analytical Storage in less than 5 minutes. In my experience, it really depends on the size of the database; if you have a smaller database, data usually becomes available in Analytical Storage in less than a minute.

This makes the data easy to ingest into Azure Synapse Analytics, for example.
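As a minimal sketch of that ingestion path, a Synapse Spark notebook can read the analytical store directly over Synapse Link. The linked service and container names below are assumptions, and `spark` is the session a Synapse notebook provides:

```python
# Read the Cosmos DB analytical store (column store) via Synapse Link.
# "CosmosDbLinkedService" and "Orders" are placeholder names.
df = (
    spark.read.format("cosmos.olap")
        .option("spark.synapse.linkedService", "CosmosDbLinkedService")
        .option("spark.cosmos.container", "Orders")
        .load()
)

# Aggregations run against the analytical copy, not the transactional store.
df.groupBy("orderStatus").count().show()
```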

Internal and External Azure Data Factory Pipeline Activities

Paul Andrew differentiates two forms of pipeline activity:

Firstly, you might be thinking, why do I need to know this? Well, in my opinion, there are three main reasons for having an understanding of internal vs external activities:

1. Microsoft cryptically charges you a different rate of execution hours depending on the activity category when the pipeline is triggered. See the Azure Price Calculator.

2. Different resource limitations are enforced per subscription and region (not per Data Factory instance) depending on the activity category. See Azure Data Factory Resource Limitations.

3. I would suggest that understanding what compute is used for a given pipeline is good practice when building out complex control flows. For example, this relates to things like Hosted IR job concurrency, what resources can connect to what infrastructure, and when activities might become queued.

Paul warns that this is a dry topic, but these are important reasons to know the difference.

Building an Azure Function in R

David Smith has a demo for us:

It’s important to note that the model prediction is not being generated by the Shiny app: rather, it’s being generated by an Azure Function running R in the cloud. That means you could integrate the model estimate into any application written in any language: a mobile app, or an IoT service, or anything that can call an HTTP endpoint. Furthermore, you don’t need to worry how many apps are running or how often estimates will be requested by the app: Azure Functions will automatically scale to meet the demand as needed.

Read the whole thing. Given that R isn’t naturally supported by Azure Functions, I think this is quite interesting.
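David's point that any HTTP-capable client can consume the model is easy to picture. Here is a hedged sketch of calling such a function from Python; the function URL, key, and input payload are invented for illustration:

```python
import requests

# Hypothetical function endpoint, key, and model inputs -- the real names
# depend on how the R function app and its scoring model are set up.
FUNCTION_URL = "https://my-r-function-app.azurewebsites.net/api/predict"
FUNCTION_KEY = "<function-key>"

payload = {"hp": 120, "wt": 2.8}  # example model features

# Azure Functions accepts the function key as the "code" query parameter.
resp = requests.post(FUNCTION_URL, params={"code": FUNCTION_KEY}, json=payload)
resp.raise_for_status()
print(resp.json())  # e.g. {"prediction": 23.4}
```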

ADF Switch Activities

Nick Edwards shows off the Switch activity in Azure Data Factory:

Prior to the switch statement, I could achieve this using 4 ‘IF’ activities connected to a lookup activity, as shown in the snip below using my ‘Wait’ example pipeline.

However, a neater solution is to use the ‘Switch’ activity to do this work instead. I’ll now jump straight into a worked example to show you how I achieved this.

Click through for the demo.
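To give a feel for the shape of that activity, here is a rough sketch of a Switch definition expressed as a Python dict mirroring the ADF pipeline JSON. The lookup name, output column, and wait durations are all made up for illustration:

```python
import json

# Sketch of an ADF Switch activity keyed off a Lookup activity's output.
# "LookupConfig", "WaitGroup", and the case values are hypothetical.
switch_activity = {
    "name": "SwitchOnLookupValue",
    "type": "Switch",
    "dependsOn": [{"activity": "LookupConfig", "dependencyConditions": ["Succeeded"]}],
    "typeProperties": {
        # The expression is evaluated once and matched against the cases below.
        "on": {
            "value": "@string(activity('LookupConfig').output.firstRow.WaitGroup)",
            "type": "Expression",
        },
        "cases": [
            {"value": "1", "activities": [{"name": "Wait1", "type": "Wait",
                                           "typeProperties": {"waitTimeInSeconds": 10}}]},
            {"value": "2", "activities": [{"name": "Wait2", "type": "Wait",
                                           "typeProperties": {"waitTimeInSeconds": 30}}]},
        ],
        # Runs when no case matches -- the catch-all the four IF activities lacked.
        "defaultActivities": [{"name": "WaitDefault", "type": "Wait",
                               "typeProperties": {"waitTimeInSeconds": 5}}],
    },
}

print(json.dumps(switch_activity, indent=2))
```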

Cosmos DB Custom App Logic in Functions

Hasan Savran shows off how we can use user-defined functions to add custom application logic to operations in Cosmos DB:

There are a couple of things you need to know about them. First, user-defined functions are only for reading data. User-defined functions will always require more Request Units than regular SQL queries. You should always try to solve your application logic problem with regular queries first. Get familiar with system functions, and be sure that you are not trying to write a user-defined function when there is already a system function solving the same problem. System functions will always use fewer Request Units than your custom user-defined function. Just like in SQL Server, user-defined functions will cause a table scan in Cosmos DB. That is why they cost more than regular queries. If you want to use a user-defined function in a where clause, try to filter by other properties too. Other properties might hit indexes, and that will help you with Request Units.

Click through to see an example of them in action.
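As a sketch of the "filter by other properties too" advice, here is what registering and calling a UDF might look like with the Python SDK. The account, database, container, and taxAmount function are placeholders:

```python
from azure.cosmos import CosmosClient

# Hypothetical account, database, and container names.
client = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("Store").get_container_client("Products")

# Register a simple read-only UDF; the body is JavaScript stored server-side.
container.scripts.create_user_defined_function(
    body={"id": "taxAmount", "body": "function(price) { return price * 0.08; }"}
)

# Pair the UDF with a regular property filter so the query can still use the
# index on c.category instead of scanning every document for the UDF alone.
items = container.query_items(
    query="SELECT c.id, c.price FROM c "
          "WHERE c.category = @cat AND udf.taxAmount(c.price) > 5",
    parameters=[{"name": "@cat", "value": "books"}],
    enable_cross_partition_query=True,
)
for item in items:
    print(item)
```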

Cross-Cluster and Cross-Service Kusto Queries in ADS

Julie Koesmarno shows off some new functionality in Azure Data Studio:

This blog post covers examples of cross-cluster and cross-service querying, including handy syntax, code snippets and notebooks that you can use in Azure Data Studio.

As some of you may already know, the Kusto (KQL) extension is available in Azure Data Studio, which allows you to explore Azure Data Explorer (ADX) more natively. ADX also supports cross-cluster and cross-service queries between ADX, Azure AppInsights and Azure Log Analytics. This cross-service query preview feature is documented in Query data in Azure Monitor using Azure Data Explorer.

Click through for the demos.
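To make the syntax a bit more concrete, here is a hedged sketch using the azure-kusto-data Python client. The cluster, database, table, and workspace names are placeholders, and the cross-service URI follows the documented ade.loganalytics.io pattern:

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

# Hypothetical ADX cluster; authenticate with whatever method fits your setup.
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://mycluster.westus.kusto.windows.net"
)
client = KustoClient(kcsb)

# Cross-cluster query: reference another ADX cluster via cluster()/database().
cross_cluster_query = """
MyTable
| join kind=inner (
    cluster('othercluster.westus.kusto.windows.net').database('OtherDb').OtherTable
  ) on CommonKey
| take 10
"""

# Cross-service query: reference a Log Analytics workspace from ADX.
cross_service_query = """
cluster('https://ade.loganalytics.io/subscriptions/<sub-id>/resourcegroups/<rg>/providers/microsoft.operationalinsights/workspaces/<workspace>').database('<workspace>').AzureDiagnostics
| take 10
"""

response = client.execute("MyDatabase", cross_cluster_query)
for row in response.primary_results[0]:
    print(row)
```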

Using the Develop Hub in Azure Synapse Analytics

Charles Feddersen shows off one of the Azure Synapse Analytics hubs:

The Develop Hub in Azure Synapse Analytics enables you to write code and define business logic using a combination of notebooks, SQL scripts, and data flows. This gives us a development experience which provides the capability to query, analyze, and model data in multiple languages, along with giving us Intellisense support for these languages. This provides a rich interface for authoring code and in this post, we will see how we can use the Knowledge Center to jump-start our development experience.

Click through to see two demos, one of notebooks and one of SQL scripts.
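For a sense of what a notebook authored in the Develop hub might contain, here is a minimal PySpark cell. The storage account and path are placeholders, and `spark` is the session the Synapse notebook provides:

```python
# Read data from the workspace's data lake; account and path are placeholders.
df = spark.read.parquet(
    "abfss://data@mydatalake.dfs.core.windows.net/sales/2020/*.parquet"
)
df.createOrReplaceTempView("sales")

# The same data can then be explored with Spark SQL in a follow-up cell.
spark.sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region").show()
```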

New Azure Announcements

Eitan Blumin has a roundup of Azure-related announcements:

During the week of December 7th (especially on December 9th), Microsoft sent us a whole bag of goodies, announcing the general availability of new features that were only in preview until now, and even newer features that have just entered public preview.

There’s quite a lot to cover here, so let’s try to break it down by categories and provide links for more details. 

Click through for the list.

On-Premises SQL Server is Still Relevant

John Morehouse does not abide by Betteridge’s Law of Headlines:

I’m a firm believer that the cloud is not a fad and is not going away; it’s just an extension of a tool that we are already familiar with. The Microsoft marketing slogan is “It’s just SQL” and for the most part that is indeed true. However, that does not mean that every workload will benefit from being in the cloud. There are scenarios where it does not make sense to move things to the cloud, so let’s take a look at a few of them.

Read on for several reasons why the cloud might not be right for you.
