Press "Enter" to skip to content

Category: Architecture

Using Proprietary Database Features

Lee Markum does some thinking:

I was recently thinking of SQL Server temporal tables and how there is a perspective that you shouldn’t use proprietary features of a product because it locks you into that product. I want to be your guide on this matter.

I agree completely with Lee’s take on this. I can count on one hand the number of platform migrations I’ve been a part of through the years. These are painful enough experiences even if you use 100% “portable” code because optimization patterns change. And then you get to the utterly absurd: indexes in Oracle and SQL Server behave differently (e.g., clustered indexes are a practical must in SQL Server and the equivalent to clustered indexes in Oracle is so rarely used that it’s not even worth talking about to most Oracle people), so are you going to avoid indexes so you can have “truly” portable code? If the answer to that question is “yes,” you are wrong.

I do understand that there are companies which create code bases which need to be installable on multiple platforms. In that case, I do understand trying to keep things as common as possible, especially for fairly simple apps. But this is also part of why vendor databases are, as a general rule, awful.

Comments closed

The Evolution of Cloud Architecture

Ben Brauer has a two-parter looking at how architecture is changing. Part 1 looks at containers and machine learning:

Let’s start describe containers at a high level.  A container is a packaging and distribution mechanism that abstracts and resolves many of the installer issues that result from ‘unique’ environments.  We’ve all heard developers exclaim “well, it works on my machine,” after pushing an application to a new environment only to realize its broken.  Containers strive to address this problem by creating a hard boundary between the infrastructure and the software stack used by an application. External dependencies are not necessarily added to the container, but all your internal dependencies (frameworks, runtimes, etc.) are there.  This makes the deployment of the application to a new environment significantly more predictable as the compute environment is consistent as its part of the container.

Part two looks at serverless compute and low-code/no-code development:

Low-code (or no-code) development for applications is not a new concept. It strives to democratize development in a similar way as decades ago Visual Basic expanded the number of developers from thousands of C++ developers to hundreds of thousands of developers creating Windows-based solutions. Low-code takes this concept to non-technical professionals. Although this notion is great for productivity and usability, the maintenance and performance of these apps can be daunting to say the least. Now non-technical application authors need to learn about application management, documentation and, application deployment.  Without a clear understanding of these considerations, the environment can quickly become chaotic.  The good news is that platforms and tools have come a long way since Visual Basic. For example, Microsoft’s Power Apps platform provides many of the platform services needed to maintain a healthy application lifecycle and governance paradigm.

These are good concepts to know about, regardless of your particular cloud platform.

Comments closed

Secondary and Tertiary Data Mesh Interfaces in Azure

Paul Andrew continues a series on implementing data mesh with Azure:

When thinking about our node edges in part 2 I also made the statement about a primary set of node interfaces. In my initial drawings I alluded to this then capturing what I’ve called the PaaS Plane, suggesting the Azure Resource type used.

Building on this understanding I want to cover off the remaining edge use cases by exploring the other interface types we will typically need for the nodes of our data mesh architecture.

This has been a rather informative series on a topic I knew very little about coming in.

Comments closed

Building Data Mesh Edges in Azure

Paul Andrew continues a series on data mesh in Azure:

Anyway, moving on. In part 2 of this blog series, keeping the same focus from part 1, with the first data mesh principal. Let’s take our nodes and start thinking about the edges. The data mesh – data product interfaces… Enter my Azure Resource Group with arms/antenna type things, seen on the right 

Caveat: as you may have already gathered, I’m going to use the terms edge and interface a lot in this post. The meaning in the context of the data mesh is the same. Nodes with edges, nodes with interfaces.

Click through for more detail.

Comments closed

Building a Data Mesh in Azure

Paul Andrew starts a new series:

The concepts and principals of a data mesh architecture have been around for a while now and I’ve yet to see anyone else apply/deliver such a solution in Azure. I’m wondering if the concepts are so abstract that it’s hard to translate the principals into real world requirements, and maybe even harder to think about what technology you might actually need to deploy in your Azure environment.

Given this context (and certainly no fear of going first with an idea and being wrong ) here’s what I think we could do to build a data mesh architecture in the Microsoft cloud platform – Azure.

Click through for Paul’s take on the first data mesh principle.

Comments closed

Organizing Synapse Workspaces and Lakehouses

Jovan Popovic confirms that Microsoft is using the term “Lakehouse” like Databricks does:

The lakehouse pattern enables you to keep a large amount of your data in Data Lake and to get the analytic capabilities without a need to move your data to some data warehouse to start an analysis. A lakehouse represents a good trade-off between query performance and the ability to access the latest version of data without the need to wait for data to be reloaded.

Azure Synapse Analytics workspace enables you to implement the Lakehouse pattern on top of Azure Data Lake storage.

When you think about your lakehouse solution, be aware that there are two options for creating databases over the lake:

Lake databases that are created using Spark or database template

SQL databases that are created using serverless SQL pools on top of data lake.

Although you might use different tools and languages to create these types of databases, the principles described in this article apply to both types. I will use the term “lakehouse” whenever i reference Spak Lake database or SQL database created using the serverless SQL pools.

Click through for Jovan’s guidance.

Comments closed

A Case for EAV?

Erik Darling makes the case:

EAV styled tables can be excellent for certain data design patterns, particularly ones with a variable number of entries.

Some examples of when I recommend it are when users are allowed to specify multiple things, like:

I’m not sure I agree on the examples. When there are specific known things with expected shapes, I’d rather have a separate entity to model each. Even if each table is a single string, I’d still like the separation for logical modeling purposes.

That said, there are cases when EAV ends up being the best approach (unfortunately), particularly when you don’t even know the types of things a customer would wish to include. Just try to fight back hard when the inevitable request comes in to pivot all of that data.

Comments closed

Azure Synapse Database Templates

Aaron Merrill announces database templates for Azure Synapse Analytics:

The Synapse database template for Agriculture is a comprehensive data model that addresses the typical data requirements of organizations engaged in growing crops, raising livestock, and producing dairy products, including field and pasture management and satellite and drone data.

The Synapse database template for Energy & Commodity Trading is a comprehensive data model that addresses the typical data requirements of organizations engaged in trading energy, commodities, and/or carbon credits, whether as a primary trading business or in support of their supply chains, operating businesses, and hedging activities.

You may remember Microsoft buying ADRM Software a while back. This is why.

Comments closed

Azure Synapse Analytics Database Templates

Santosh Balasubramanian shows off database templates in Azure Synapse Analytics:

Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. It gives you the freedom to query data on your terms, using either serverless or dedicated resources—at scale. Azure Synapse brings these worlds together with a unified experience to ingest, explore, prepare, manage, and serve data for immediate BI and machine learning needs.

One of the challenges that users in key industry areas face is how to describe and shape the mass of data that they are gathering. Most of this data is currently stored in data lakes or in application-specific data silos. The challenge is to bring all this data together in a standardized format enabling it to be more easily analyzed and understood and for ML and AI to be applied to it.

Azure Synapse solves this problem by introducing industry-specific templates for your data, providing a standardized way to store and shape data. These templates provide schemas for predefined business areas, enabling data to be loaded into a database in a structured way.

Read on to see what they can do, and try them out in a Synapse workspace.

Comments closed

Event-Driven Programming

Gigi Sayfan explains the idea of event-driven programming:

Event-driven programming is a great approach for building complex systems. It embodies the divide-and-conquer principle while allowing you to continue using other approaches like synchronous calls.

When discussing event-based systems, several different terms often refer to the same concept. For simplicity, we’ll primarily use the terms listed below in bold:

Event, message, and notification

Producer, publisher, sender, and event source

Consumer, receiver, subscriber, handler, and event sink

Message queue and event queue

I really like the ideas behind event-driven programming. It’s most commonly useful when working with cloud services, but also in solving some difficult T-SQL problems.

Comments closed