Press "Enter" to skip to content

Category: Architecture

ORMs and Mapping Requirements

Mark Seemann is not a big fan of Entity Framework:

When I evaluate whether or not to use an ORM in situations like these, the core application logic is my main design driver. As I describe in Code That Fits in Your Head, I usually develop (vertical) feature slices one at a time, utilising an outside-in TDD process, during which I also figure out how to save or retrieve data from persistent storage.

Thus, in systems like these, storage implementation is an artefact of the software architecture. If a relational database is involved, the schema must adhere to the needs of the code; not the other way around.

To be clear, then, this article doesn’t discuss typical CRUD-heavy applications that are mostly forms over relational data, with little or no application logic. If you’re working with such a code base, an ORM might be useful. I can’t really tell, since I last worked with such systems at a time when ORMs didn’t exist.

Read on for a thoughtful argument. The only critique I have is I’d prefer stored procedures over saving SQL queries in the code.

1 Comment

The Medallion Architecture in Data Modeling

Nikola Ilic gets the gold:

The most common pattern for modeling the data in the lakehouse is called a medallion. I love this name – it’s really easy to remember. But, why medallion? Tag along and you’ll soon find out why.

The same as for the lakehouse concept, credits for being pioneers in the medallion approach goes to Databricks.

What I’ve found interesting is the number of people who have taken to disliking the medallion architecture terms because Databricks pushed it so hard that their clients automatically assumed “medallion = using Databricks.”

Comments closed

The Basics of Fact-Dimensional Modeling

Nikola Ilic gives us a primer on Kimball-style fact and dimensional modeling:

Before we come up to explain why dimensional modelling is named like that – dimensional, let’s first take a brief tour through some history lessons. In 1996, a man called Ralph Kimball published a book “The Data Warehouse Toolkit”, which is still considered a dimensional modelling “Bible”. In his book, Kimball introduced a completely new approach to modelling data for analytical workloads, the so-called “bottom-up” approach. The focus is on identifying key business processes within the organization and modelling these first, before introducing additional business processes.

This is a really good overview of the topic, though I’m saddened that “dimensional bus matrix” didn’t make the cut of things to discuss. Mostly because I like the name “dimensional bus matrix.”

Comments closed

Microsoft Fabric Architectural Icons

Marc Lelijveld imports some icons:

In the past, I’ve made a draw.io file for Power BI to help you using the right icons to design your solutions and make architectural diagrams. With Fabric, a bunch of new services and icons have been introduced. This asks for a new draw.io file.

With this blog, I will provide the draw.io file for all new icons and elements of Fabric.

Click through for that link. Also note that you might be more familiar with the new name of draw.io, diagrams.net.

Comments closed

Contrasting Kafka and Pulsar

Tessa Burk perform a comparson:

Apache Kafka® and Apache Pulsar™ are 2 popular message broker software options. Although they share certain similarities, there are big differences between them that impact their suitability for various projects.  

In this comparison guide, we will explore the functionality of Kafka and Pulsar, explain the differences between the software, who would use them, and why.  

Click through for that comparison. I haven’t used Pulsar before, so it’s interesting to get this sort of a functionality and community comparison.

Comments closed

Elastic Pools for Azure SQL DB Hyperscale

Arvind Shyamsundar announces a new preview:

We are very excited to announce the preview of elastic pools for Hyperscale service tier for Azure SQL Database!

For many years now, developers have selected the Hyperscale service tier in a “single database” resource model to power a wide variety of traditional and modern applications. Azure SQL Hyperscale is based on a cloud native architecture providing independently scalable compute and storage, and with limits which substantially exceed the resources available in the General Purpose and Business Critical tiers.

Click through to learn more about what’s on offer.

Comments closed

The Current Status of the Lakehouse Architecture

Paul Turley is happy:

When I first started attending conference and user group sessions about Lakehouse architecture, I didn’t get it at first, but I do now; and it checks all the boxes. As a Consulting Services Director in a practice with over 200 BI developers and data warehouse engineers, I see first-hand how our customers – large and small – are adopting the Lakehouse for BI, Data science and operational reporting.

Read on for Paul’s thoughts. My main concern with the strategy has always been performance, with the expectation that it’d take a few years for lakehouse systems to be ready for prime time. We’re getting close to that few years (back in 2020, I believe I estimated 2024-2025).

Comments closed

Data Mesh Q&A

Jean-Georges Perrin hosts another Q&A:

As part of the Data Mesh Learning Community, Eric Broda invited Laveena KewlaniKruthika Potlapally, and me to discuss the implementation of Data Mesh at PayPal. As expected, the session went longer than scheduled, and some questions remained open. As with the previous Q&A sessions ([#1] and [#2]), here is an attempt to answer them.

Click through for the questions, as well as the answers.

Comments closed

PayPal’s Data Contract Template Open Sourced

Jean-Georges Perrin makes an announcement:

A data contract is a binding agreement between the consumers and producers of data. You can see it as a data schema on steroids or data schema++. The goal of the contract is to set expectations between the parties. It can be built as fit-for-purpose where the consumers and producer agree on what it should contain or can serve as a brochure for any consumer willing to access the data offered by this (data) product.

Click through to learn more about data contracts and then check out the contract template itself on PayPal’s GitHub repo.

Comments closed

Portfolio Management for Creating a Technology Strategy

Kevin Sookocheff busts out the 2×2 matrix:

Application Portfolio Management (APM) draws inspiration from financial portfolio management, which has been around since at least the 1970s. By looking at all applications and services in the organization and analyzing their costs and benefits, you can determine the most effective way to manage them as part of a larger overall strategy. This allows the architect or engineering leader to take a more strategic approach to managing their application portfolio backed by data. Portfolio management is crucial for creating a holistic view of your team’s technology landscape and making sure that it aligns with business goals.

This is for C-levels and VPs rather than individual contributors, but acts as a good way of thinking about a portfolio of applications and what to do with each.

Comments closed