Press "Enter" to skip to content


Immutable Blobs in Azure Storage

Khushbu Gandhi won’t change:

A lot of my customers have a business requirement that –

  • once a document is written in a storage account, nobody shall be able to modify or delete it, including the administrators
    AND / OR
  • audit data can be written only once, and nobody shall be able to alter them.

These kinds of business requirements are called WORM (Write Once, Read Many) and are common for ISVs and for industries such as financial services or healthcare. Have you run into these mission-critical requirements? If yes, then immutable storage would be the solution.

Read on to see how this works, using retention policies. But as Khushbu notes, if you set an absurd retention policy (up to 400 years), you can’t change the data while that policy is active.
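
If you want a rough idea of what this looks like in code (this is not from Khushbu's post), here's a minimal sketch that applies a time-based immutability policy with the azure-mgmt-storage Python SDK. The subscription, resource group, account, and container names are placeholders, and the 365-day period is an arbitrary choice.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

# Placeholder names -- substitute your own subscription, resource group,
# storage account, and container.
subscription_id = "<subscription-id>"
client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

# Time-based retention (WORM) policy: blobs in this container cannot be
# modified or deleted until they are 365 days old.
policy = client.blob_containers.create_or_update_immutability_policy(
    resource_group_name="rg-demo",
    account_name="stdemoworm",
    container_name="audit-logs",
    parameters={
        "immutability_period_since_creation_in_days": 365,
        # Lets audit logs be appended to, but never rewritten or deleted.
        "allow_protected_append_writes": True,
    },
)
print(policy.state)  # "Unlocked" until you explicitly lock the policy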

```

While the policy is unlocked you can still remove it; once you lock it, the retention interval becomes irreversible, which is exactly the trap with very long retention periods that Khushbu warns about.

A Primer on Databricks Unity Catalog

Beginner’s Hadoop gives us an overview:

The Databricks Unity Catalog is a feature provided by the Databricks Unified Data Analytics Platform that allows you to organize and manage metadata about your data assets, such as tables, databases, and views. It provides a centralized metadata repository that enables users to discover, understand, and collaborate on data assets within a Databricks environment. The Unity Catalog integrates with various data sources and supports different metadata management capabilities.

Read on for an overview of what it does.
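
For a quick taste of the three-level namespace (catalog.schema.table) that Unity Catalog introduces, here's a small PySpark sketch. The catalog and schema names are invented, and it assumes a Databricks workspace that is already attached to a Unity Catalog metastore.

```python
# Runs in a Databricks notebook or job, where `spark` is already defined and
# the workspace is attached to a Unity Catalog metastore.
spark.sql("CREATE CATALOG IF NOT EXISTS demo_catalog")
spark.sql("CREATE SCHEMA IF NOT EXISTS demo_catalog.sales")

spark.sql("""
    CREATE TABLE IF NOT EXISTS demo_catalog.sales.orders (
        order_id BIGINT,
        amount   DECIMAL(10, 2)
    )
""")

# Three-level naming: <catalog>.<schema>.<table>
spark.table("demo_catalog.sales.orders").printSchema()
```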


Listing Available Properties in Azure Data Factory

Andy Leonard builds a list:

Did you know Azure Data Factory (ADF) will actually list available properties? It will. One of the things I cover in my ADF training titled Master the Fundamentals of Azure Data Factory is this handy troubleshooting tip.

Read on to see how, though I’d personally prefer something a bit faster than waiting for the pipeline to execute just to find out what my choices are.


Bring Fabric to the Data Lakehouse

Ust Oldfield ties together Databricks and Microsoft Fabric:

We’ve built countless Lakehouses for our customers and influenced the design of many more. With the advent of Fabric, many organisations with existing lakehouse implementations in Azure are wondering what changes Fabric will herald for them. Do they continue with their existing lakehouse implementation and design, or do they migrate entirely to Fabric?

For many, the answer will be to continue as-is. They’ve invested a lot of time and money in establishing a Lakehouse – to migrate now to a slightly different technology stack would be a very costly exercise! There also isn’t a need to migrate from a lakehouse implementation in Databricks to one in Fabric as there aren’t concrete benefits to be realised.

For those using Power BI as their semantic and reporting layers, as well as using Databricks SQL or Synapse Serverless as the serving layer, Fabric provides a perfect opportunity to rationalise the architecture and to bring about substantial performance gains through the Direct Lake connectivity and V-Order compression in Fabric.

Read on to see what Ust means, using a couple of architecture diagrams along the way.
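
To make the "keep Databricks for engineering, serve through Fabric" idea a bit more concrete, here's a small, hypothetical PySpark sketch of the Databricks side: it simply writes a curated Delta table to an ADLS Gen2 path, which a Fabric lakehouse could then surface (for example, via a OneLake shortcut) for Direct Lake consumption. The storage path and table names are placeholders, not anything from Ust's post.

```python
# Databricks side of the architecture: engineer the data and land it as Delta
# in ADLS Gen2. Table name and path are placeholders.
silver = spark.read.table("demo_catalog.sales.orders")

gold = (
    silver.groupBy("order_id")
          .sum("amount")
          .withColumnRenamed("sum(amount)", "total_amount")
)

# A Fabric lakehouse shortcut can point at this location and read the Delta
# files in place, so nothing needs to be copied or migrated.
gold.write.format("delta").mode("overwrite").save(
    "abfss://gold@stdemolake.dfs.core.windows.net/sales/orders_summary"
)
```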


Preventing Accidental Azure Changes with Resource Locks

Khushbu Gandhi puts a padlock on it:

Resource locks are just locks that we can associate with different scopes in Azure, allowing us to override permissions at that resource scope and down. When we talk about the scope of the resource lock, we can lock subscriptions, we can lock resource groups and individual resources, and the lock restrictions based on the type of lock we select will apply to all users and roles that have access to that resource. Also, it’s worth noting that locks are inherited by child resources. So, if we apply a lock on a subscription, it is inherited by all the resource groups that have been created under that subscription, along with the resources that will be created under the resource groups.

Resource locks come with their own considerations, and Khushbu dives into those. This is a concept I like more in theory than in practice, save for pretty stable systems where you keep things running 24/7.
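
For a rough idea of what applying one of these looks like outside the portal (this sketch isn't taken from Khushbu's walkthrough), here's a hedged example using the azure-mgmt-resource Python SDK to put a CanNotDelete lock on a resource group. The subscription, resource group, and lock names are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ManagementLockClient

# Placeholder subscription ID and resource group name.
client = ManagementLockClient(DefaultAzureCredential(), "<subscription-id>")

# CanNotDelete still allows reads and modifications but blocks deletion;
# ReadOnly would block modifications as well. The lock is inherited by every
# resource inside the group.
client.management_locks.create_or_update_at_resource_group_level(
    resource_group_name="rg-production",
    lock_name="do-not-delete",
    parameters={
        "level": "CanNotDelete",
        "notes": "Protects the production resource group from accidental deletion.",
    },
)
```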


Cache Recommendations for Azure Data Explorer

Guy Reginiano notes an update:

A new generation of cache recommendations for Azure Data Explorer is now available in the Azure portal! 
This update introduces significant improvements, including enhanced logic, additional statistics for end users, an improved user interface, and a streamlined process for reviewing and applying recommendations. In this blog post, we will explore the new features and benefits offered by this latest update. 

Read on to see where you can find these cache recommendations, as well as the types of recommendations you’re liable to receive.


A Complex Example of ADF Pipeline Return Value

Andy Leonard goes beyond the simple example:

In this post, I demonstrate one way to create a child pipeline that returns the SubscriptionId for a data factory. I then call the child pipeline from a parent pipeline.

To build this demonstration, follow the instructions below.

This is definitely more complicated than Andy’s simple example, but there are plenty of screenshots to take you through the process.


Trying the Azure OpenAI Playground

Obaro Alordiah gives us a primer:

The Azure OpenAI Service has been a trending topic in the tech world this year as it combines the power of OpenAI’s advanced generative AI models with the comprehensive suite of services available on the Azure cloud. It has given developers the opportunity to create and embed high-performing AI models into the Azure environment to deliver more efficient, insightful & innovative solutions. In this blog, we will take a high-level look at some of the key features within the Azure OpenAI playground and how we can get the best out of it.

Generative AI via OpenAI is an area in which Microsoft is putting an inordinate amount of focus.
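
If you'd rather call the service from code than from the playground, here's a minimal sketch using the openai Python package's Azure client (v1.x). The endpoint, API key, API version, and deployment name are all placeholders you'd replace with your own resource's values.

```python
import os
from openai import AzureOpenAI

# Placeholder configuration, taken from your Azure OpenAI resource.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

# "model" is the *deployment* name you created for the model in Azure,
# not the base model name.
response = client.chat.completions.create(
    model="gpt-4o-deployment",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what Azure OpenAI Service is."},
    ],
)
print(response.choices[0].message.content)
```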


A Simple Example of ADF Pipeline Return Value

Andy Leonard starts easy:

I want to develop an Azure Data Factory (ADF) design pattern for calling focused, unit-of-work, function-y ADF pipelines that perform focused tasks. Some of these “worker pipelines” will need to return values to the calling pipeline.

In this example, I started by reading Mark Kromer’s (excellent) article titled You can now customize the return value from your pipeline! I then crafted the simple example shown in this post to make sure I understood the principles involved before using pipeline return value (preview) functionality in more robust ADF patterns.

Follow the steps I outline below to build a simple example for an ADF pipeline that returns a value!

Click through to follow those steps.
