Press "Enter" to skip to content

Category: Cloud

Azure AD Credential Passthrough and Databricks

Anna Shrestinian, et al, explain how Azure Databricks enables Azure Active Directory credential passthrough when working with Azure Data Lake Storage Gen2:

Azure Data Lake Storage (ADLS) Gen2, which became generally available earlier this year, is quickly becoming the standard for data storage in Azure for analytics consumption. ADLS Gen2 enables a hierarchical file system that extends Azure Blob Storage capabilities and provides enhanced manageability, security and performance.

The hierarchical file system provides granular access control to ADLS Gen2. Role-based access control (RBAC) could be used to grant role assignments to top-level resources and POSIX compliant access control lists  (ACLs) allow for finer grain permissions at the folder and file level. These features allow users to securely access their data within Azure Databricks using the Azure Blob File System (ABFS) driver, which is built into the Databricks Runtime.

There are some tradeoffs involved, particularly around using High Concurrency clusters (or limiting yourself to one user account), but it’s a nice bit of added value when you’re a heavy Azure user.

Comments closed

Ordering in Cosmos DB Queries

Hasan Savran shows how you can order data in Cosmos DB queries:

If you need to use multiple properties in your ORDER BY then you need to define COMPOSITE INDEXES.For example when I try to run the following query and try to order the objects by CreatedOn and Score, I end up with an error because I do not have a COMPOSITE INDEX to use with this ORDER BY.

Many parts of Cosmos DB’s SQL syntax are similar to T-SQL, but some of the underlying assumptions—such as, what you need to order data—are quite different.

Comments closed

Deploying a Container Instance in Azure

Anibal Kolker takes us through container deployment in Azure:

As derived from the title, the objective of this post is to help you deploy a container instance inside Azure.

However, we’ll extend the typical scenario and make a slightly more extensive use of networking capabilities, by placing the container group inside a private subnet.

Note: For this example, and for simplicity only, we’ll use NGINX as our container of choice. Of course, you’re welcome to try with any other image.

There are a few pieces in play, but Anibal does a good job putting it all together.

Comments closed

Azure DevOps and Data Factory

Helge Rege Gardsvoll has a three-part series for us on using Azure DevOps to deploy Data Factories. Part 1 is all about environment setup:

Shared Data Factory
The shared Data Factory is there for one use; self-hosted integration runtimes. This is the component you will use to connect to on-premise or other sources that have restrictions on access such as IP restriction or other firewall rules. Migrating a Self Hosted Integration Runtime is not supported, but you can share the same Integration Runtime across different Data Factories. You can find a description for how to do this in this article.

Part 2 covers Git branching, linked services, and development:

Create datasets and pipeline
For this demo I create two datasets; one for source and one for target, and a simple pipeline that copies the data. Datasets have name that point to the data lake, like ADLS_datahelgeadls2_Brreg_MainUnits, but does not include environment information.

Part 3 covers the release process:

The release process will have these steps;
1. Stop any active triggers. We do not want any pipelines to start as we are changing things (and you should wait until running pipelines finish before publishing)
2. Release from development to target environment
3. Clean up target environment by removing objects that are not present in dev. Also start triggers

This is a great series of posts and also includes a bonus tidbit if you’re using Databricks.

Comments closed

PowerApps Security

Jason Bonello gives us some tips on PowerApps security:

Depending on how the backend is set up, the tables having these sensitive data might be in the same database. For example, ERP solutions can have Company Accounts data, Customer related data and Inventory related data all in the same database, maybe under different schemas – but still part of the same database.

Now let’s say we are about to create a PowerApps solution to maintain Customer information. However, as part of the organization policy, this information should not be shared across other departments apart from the intended users.

Read on for some ideas of how to limit the risk of data exposure.

Comments closed

Azure Data Factory Switch Activity

Rayis Imayev explains what the Switch activity does in Azure Data Factory:

Developing conditional logic of your Azure Data Factory control flow has been simplified with introducing of the Switch activity – https://docs.microsoft.com/en-us/azure/data-factory/control-flow-switch-activity. Official documentation resource states, this new data factory activity “provides the same functionality that a switch statement provides in programming languages“. I would also add a more simplified definition of the Switch activity in Azure Data Factory: it is a container (or wrapper) for multiple IF conditions.

Click through for an example.

Comments closed

Running Big Data Clusters on VS Subscriptions

Kevin Chant has a few tips for people wanting to try out Big Data Clusters with their Visual Studio subscriptions to Azure:

In order to present the right results for various outcomes I attempted to deploy Big Data Clusters multiple times.

When I say multiple times, I mean the number of deployments easily went into double figures. Because I was testing deploying various virtual machine sizes in multiple regions.

Hence, I spent many hours testing and verifying the results in order to present them properly.

Read on to see Kevin’s notes and recommendations.

Comments closed

Backing Up SQL Server on Azure VMs

Arun Sirpal looks at three techniques for backing up SQL Server running on Azure virtual machines:

In the previous blog post I did a quick overview building a SQL VM (imaged) in Azure. It is now time to clarify some backup techniques because it can get confusing.

At a high level there are 3 techniques.
– Automated backup.
– Azure backup for SQL VM (that’s what MS call it).
– Manual backup, for example backup to URL.

Read on to learn more about each.

Comments closed

Running Oracle on Azure

Kellyn Pot’vin-Gorman takes us through various options on running Oracle in Azure:

Running Oracle on Azure VM environments aren’t that different from running Oracle on VMs in your on-premises for a DBA.  The DBAs and developers that I work with still have their jobs, still work with their favorite tools and also get the chance to learn new skills such as cloud administration.

Click through for more, including a setup script.

Comments closed