Press "Enter" to skip to content

Category: Cloud

Summarizing Data Mesh in Azure

Paul Andrew wraps up a series:

When we consider this in the context of what I’ve already established in part 1 of the series, I focused on our data products and ownership. Now I want to re-introduce our data domains as a level above our data products. We can even consider this a hierarchy.

– Data Domains

– Data Products

Why?

Read on for that answer.

Comments closed

Data Virtualization with Azure SQL Managed Instance

Mladen Andzic announces data virtualization in Azure SQL Managed Instance:

Data virtualization capabilities, now in preview in Azure SQL Managed Instance, enable you to execute Transact-SQL (T-SQL) queries against data from files stored in Azure Data Lake Storage Gen2 or Azure Blob Storage and combine it with relational data stored locally in the managed instance using logical joins. This way you can transparently access external data while keeping it in its original format and location. There is no data duplication or need to run and maintain ETL processes, which means that you can extract and deliver insights faster. Currently supported file formats are Parquet, CSV, and JSON.

I’m going to start calling it PolyBase Duck Typing: it’s not actually PolyBase but the syntax is the same and the outcome is the same and the method to enable it is the same and “PolyBase” is a lot easier to say than “data virtualization.” So even though it’s not PolyBase, I’m going to call it PolyBase until there’s a meaningful split.

Comments closed

Stringing Azure Data Factory between VNets

Ahmed Mahmoud performs networking wizardry:

Customer wants to connect Azure Data Factory on one subscription to an Azure SQL Server on Virtual Machine (SQL VM) on another subscription. check out the architecture diagram below for more clarification.

Click through for that diagram as well as the process. And between VNet peering and Private Link, I believe (but could be wrong in saying) the traffic would never leave Azure-hosted machines even when it transits between subscriptions.

Comments closed

Executing SQL Statements in Azure Data Factory

Abhishek Narain announces a pretty nice improvement to Azure Data Factory and Synapse Pipelines:

We are introducing a Script activity in pipelines that provide the ability to execute single or multiple SQL statements.  

Using the script activity, you can execute common operations with Data Manipulation Language (DML), and Data Definition Language (DDL). DML statements like SELECT, UPDATE, and INSERT let users retrieve, store, modify, delete, insert and update data in the database. DDL statements like CREATE, ALTER, and DROP allow a database manager to create, modify, and remove database objects such as tables, indexes, and users.

Be sure to read the limitations at the bottom, however.

Comments closed

Deleting an RDS Instance

Chad Callihan takes out the trash:

We’ve created an AWS RDS instance and logged into it successfully. One thing to remember when creating test instances is when to them when you’re finished. While a lot of test instances I’ve created have been free tier, it’s still good to clean up rather than leave instances lingering. Today, let’s clean up a test instance.

Click through for the step-by-step on how to do this.

Comments closed

Form Recognizer Updates

Vinod Kurpad shares some news:

Form Recognizer continues to improve product capabilities with improved models, support for additional document types and containerized solutions that run in the cloud or on premises either connected or fully disconnected for scenarios where containers need to run in an isolated environment. Recent updates to pricing include commitment tiers for customers who have a predictable volume of documents. Starting February 15th, the pricing for Invoices and General Document API will drop to $10 per 1000 pages, an 80% reduction, making it possible for customers to use invoices and the general document APIs for high volume scenarios to significantly lower cost while providing additional value.

That’s a pretty big improvement.

Comments closed

Authenticating with s5cmd

Anthony Nocentino has the need for speed. And authentication:

At work, I get to work with some fantastic tech that pushes the boundaries of performance. I needed to do some performance testing from a Windows server into a FlashBlade using s3. I reached out to a colleague of mine, Joshua Robinson, who told me about s5cmds5cmd is a very fast, parallel s3 compatible command-line client.

Check out Joshua’s post for some performance numbers. Here’s a direct quote from his post.

But it doesn’t matter how fast it is if you can’t connect, so Anthony shows us how to do just that.

Comments closed

Deploying SQL Scripts via Azure Release Pipelines

Meagan Longoria solves a problem:

We chose release pipelines over the YAML pipelines because it was easier to manage for the client and pretty quick to set up. While I had done this before, I had a couple of new challenges:

– I was deploying to an Azure SQL managed instance that had no public endpoint.

– There were multiple databases for which there may or may not be a change script to execute in each release.

This took a bit longer than I expected, and I enlisted my coworker Bill to help me work through the final details.

Read on to see how Meagan and Bill solved the problem.

Comments closed

Mission-Critical Azure Architectures

Ben Brauer has some reference guidelines:

The AlwaysOn project strives to address the challenges of building mission-critical applications by providing organizations with a prescriptive architectural approach for the Microsoft Cloud.

It leverages lessons from numerous customer applications and first-party solutions, and applies Well-Architected best practices to provide actionable and authoritative guidance for building and operating a highly reliable solutions on Azure at-scale.

I guess marketing is calling it “AlwaysOn” again. Until they call it Always On. Or AlWaYsOn.

Comments closed