Press "Enter" to skip to content

Category: Architecture

Data Pipelines and Data Mesh

Jean-Georges Perrin answers a burning question:

I keep having questions about data pipelines. Data pipelines in Data Mesh is a topic I should tackle. So… Is the data pipeline the root of all evil?

Jean-Georges’s answer is quite in line with one of my favorite phrases: “Short answer: no, with an ‘if’; long answer: yes, with a ‘but.’” Read on for some thoughts on data pipelines and what the data mesh concept does to minimize harm.


Building a Dimension and Measure Matrix for Power BI

Olivier Van Steenlandt does some documentation:

In this blog post, I will guide you through all the required steps to get a Data Model Relationship Matrix in Power BI.

If you don’t know what I mean: I would like to have a straightforward overview where I can see which attribute groups and measure groups I can combine from my Tabular Model in (SQL Server) Analysis Services.

The first thing I thought of was “this is very much like a bus matrix in the Kimball model.” It’s a little different, though, as the rows on one axis pertain to measure groups rather than business units.
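
To make the idea concrete outside of Power BI, here is a minimal sketch of building such a matrix in Python with pandas, assuming you have already exported the model’s relationships (for instance, from the $SYSTEM.TMSCHEMA_RELATIONSHIPS DMV). The table names below are hypothetical, and this is an illustration of the concept rather than Olivier’s implementation:

```python
import pandas as pd

# Hypothetical relationship metadata exported from a Tabular model:
# (measure group / fact table, dimension) pairs.
relationships = [
    ("FactSales", "DimDate"),
    ("FactSales", "DimProduct"),
    ("FactSales", "DimCustomer"),
    ("FactInventory", "DimDate"),
    ("FactInventory", "DimProduct"),
]

df = pd.DataFrame(relationships, columns=["MeasureGroup", "Dimension"])

# Cross-tabulate: a 1 marks a measure group that can be sliced by a dimension.
matrix = pd.crosstab(df["MeasureGroup"], df["Dimension"])
print(matrix)
```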


Architectural Erosion and Technical Debt

Uli Homann and Eric Charran (via Ben Brauer) talk about the concept of architectural erosion:

The way Eric thinks about architectural erosion: architects and engineers work together to construct a system or solution and launch it into production. It performs well for a period of time, and then change happens, whether a change in requirements, a change in infrastructure, or a change in customer habits surfaced through DevOps signals showing that people are using one feature and not another. At that moment, we focus on velocity: how do we implement this change and make the application experience do what the customer or end user wants as quickly as possible? Then there’s the strategic picture of managing things like technical debt: if I do something tactical, I’m probably going to do it fast and cheap and not necessarily the “right way.” That tactical work accrues against the architectural patterns, longevity, scalability, and all those other qualities, and it goes into my pile of technical debt.

Read on to learn more about the topic and what we, as technical professionals, can do to mitigate this risk.


Data Mesh Q&A Round 2

Jean-Georges Perrin didn’t hear no bell:

How does the Data Mesh concept differ from similar efforts in the past, like EDM (Enterprise Data Model) or MDM (Master Data Model)?
Data Mesh will help us achieve those goals more quickly as those EDM and MDM projects are usually slow, and the ROI starts showing only after deployment. The product approach of Data Mesh for its data products enables a product lifecycle mentality that will help get from a current state to an (end?) state like EDM through versioning. It also allows EDM to be versioned more efficiently and reduces time to market.

Read on for a series of questions and answers around the topic of data mesh architecture.


Data Mesh Q&A

Jean-Georges Perrin answers some questions:

How about data virtualization? If you have different Data Hubs with different data models, how do you integrate them?

As illustrated in the next figure, you can use data virtualization pointing to various physical data stores. Your onboarding pipeline can be “virtual,” or at least leverage virtualized data stores. You will gain in data freshness by reducing latency, but you may be limited in the number of data transformations you can perform toward your interoperable model.

Read on for the full set of questions and answers.


Sun Modeling and SunBeam

Shannon Bloye takes us through a new analytics systems modeling technique:

Sun Modelling is a technique initially developed and taught by Mark Whitehorn as a professor of analytics at the University of Dundee, which is where our own Terry McCann encountered the approach whilst studying for his MSc. Terry does a great talk on the topic in this video.

A core aim of the method is to offer a simplicity that makes it accessible to end users as well as the usual technical professionals. The approach is a high-level visual means to model data around a business process.

This feels a bit like a Kimball model, but one where you’re explicitly diagramming hierarchies and common slicers.


Combining On-Demand and Spot VMs in AKS

Prakash P covers a topic near and dear to my heart—saving money by using spot instances:

While it’s possible to run the Kubernetes nodes in on-demand or spot node pools separately, we can optimize the application cost without compromising reliability by placing the pods unevenly on spot and on-demand VMs using topology spread constraints. With a baseline number of pods deployed in the on-demand node pool offering reliability, we can scale the spot node pool based on the load at a lower cost.

I like this idea a lot, as spot instances trade off saving a lot of money (up to 90%) for unreliability: you lose the spot instance as soon as someone else comes in willing to pay more. This gives you the best of both worlds with AKS: emphasize spot instances for the money savings but include the ability to use on-demand pricing for VMs when spot isn’t available. If I’m understanding the post correctly, this also reduces the downside risk of service instability when spot instances are bought out from under you, as Kubernetes will automatically reschedule pods on the remaining nodes to keep a consistent number of instances available to users.
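
To see roughly what that looks like, here is a minimal sketch (my own, not from the post) using the official Kubernetes Python client. It defines a Deployment that tolerates the taint AKS puts on spot nodes and spreads replicas across the spot/on-demand boundary via the kubernetes.azure.com/scalesetpriority node label; it assumes the on-demand pool also carries that label, and the app name, image, and replica count are hypothetical:

```python
from kubernetes import client, config

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "my-app"},
    "spec": {
        "replicas": 6,
        "selector": {"matchLabels": {"app": "my-app"}},
        "template": {
            "metadata": {"labels": {"app": "my-app"}},
            "spec": {
                # Allow scheduling onto spot nodes, which AKS taints by default.
                "tolerations": [{
                    "key": "kubernetes.azure.com/scalesetpriority",
                    "operator": "Equal",
                    "value": "spot",
                    "effect": "NoSchedule",
                }],
                # Spread pods across the spot/on-demand boundary; maxSkew
                # controls how uneven the split is allowed to be.
                "topologySpreadConstraints": [{
                    "maxSkew": 2,
                    "topologyKey": "kubernetes.azure.com/scalesetpriority",
                    "whenUnsatisfiable": "ScheduleAnyway",
                    "labelSelector": {"matchLabels": {"app": "my-app"}},
                }],
                "containers": [{"name": "my-app", "image": "nginx"}],
            },
        },
    },
}

config.load_kube_config()  # assumes a local kubeconfig pointed at the cluster
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```

With ScheduleAnyway, the constraint is a soft preference, so pods can still land entirely on on-demand nodes if the spot pool is reclaimed.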


Kafka Control and Data Planes

Sanjay Garde explains how the architecture of Apache Kafka solutions has expanded over time:

With the advent of service mesh and containerized applications, the idea of the control and data plane has become popular. A part of your application infrastructure, such as a proxy or sidecar, is dedicated to aspects such as controlling traffic, access, governance, security, and monitoring; this is referred to as the control plane. Another part of your application infrastructure, used purely for processing your business transactions, is referred to as the data plane.

Read on to see how the concept works at an architectural level.
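
One way to make the distinction concrete is in client code. In this sketch (my illustration, not from the article) using the confluent-kafka Python package, the AdminClient handles a control-plane concern (creating and configuring a topic) while the Producer carries data-plane traffic; the broker address and topic name are hypothetical:

```python
from confluent_kafka import Producer
from confluent_kafka.admin import AdminClient, NewTopic

conf = {"bootstrap.servers": "localhost:9092"}  # hypothetical broker

# Control plane: define the topic (layout, replication) before any data flows.
admin = AdminClient(conf)
futures = admin.create_topics([NewTopic("orders", num_partitions=3, replication_factor=1)])
for topic, future in futures.items():
    future.result()  # block until the broker confirms creation

# Data plane: the business traffic itself.
producer = Producer(conf)
producer.produce("orders", key="order-42", value=b'{"amount": 19.99}')
producer.flush()
```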


Recommendations for Dedicated SQL Pool Data Modeling

Bhaskar Sharma has some advice:

In this article, I will discuss how to physically model an Azure Synapse Analytics data warehouse while migrating from an existing on-premises MPP (Massively Parallel Processing) data warehouse solution like Teradata or Netezza. The approach and methodologies discussed in this article are purely based on the knowledge and insight I have gained while migrating these data warehouses to Azure Synapse dedicated SQL pool.

Dedicated SQL pools are close enough to regular SQL Server that we make a lot of assumptions about them, some of which may be wrong.


The Power of Metadata-Driven Development

Koen Verbeeck lays out a recommendation:

In this blog post I’ll talk about another of those rules/mantras/patterns/maxims:

build once, add metadata

I’m not sure if I’m using the right words; I heard something similar in a session by Spark enthusiast Simon Whiteley. He said you should only write code once, but make it flexible and parameterized, so you can add functionality just by adding metadata somewhere. A good example of this pattern can be found in Azure Data Factory: by using parameterized datasets, you can build one flexible pipeline that can copy, for example, any flat file, no matter which columns it has. I have blogged about this:

Click through to learn more about the concept, as well as some tips on how you’d do that in various data movement products (e.g., SSIS, ADF, Logic Apps).
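
As a toy illustration of the “build once, add metadata” pattern (my sketch, not Koen’s or Simon’s code), here is one generic copy routine in Python driven entirely by metadata rows, so supporting a new feed means adding a row rather than writing new code. The file paths and delimiters are hypothetical:

```python
import csv
from pathlib import Path

# The metadata: in a real system this would live in a control table.
COPY_JOBS = [
    {"source": "landing/customers.csv", "target": "staging/customers.csv", "delimiter": ","},
    {"source": "landing/sales.txt",     "target": "staging/sales.csv",     "delimiter": "|"},
]

def copy_flat_file(job: dict) -> None:
    """Generic copy: reads any flat file, whatever its columns,
    and writes it to the target as standard comma-separated CSV."""
    with open(job["source"], newline="") as src:
        rows = list(csv.reader(src, delimiter=job["delimiter"]))
    Path(job["target"]).parent.mkdir(parents=True, exist_ok=True)
    with open(job["target"], "w", newline="") as dst:
        csv.writer(dst).writerows(rows)

for job in COPY_JOBS:
    copy_flat_file(job)
```

The same shape carries over to something like ADF: the parameterized dataset and pipeline play the role of the generic routine, and the metadata typically lives in a control table.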
