Press "Enter" to skip to content

Category: Architecture

An Overview of Function-as-a-Service

Grace Ol’Halloran lays out the basics of serverless computing in cloud platforms:

The term serverless computing can be misleading; how can you compute things without a server? Well, the answer is that you don’t. The term “serverless” comes from the idea that the server is abstracted from the developer, and is totally maintained by the cloud provider. In other words, the developer doesn’t really care what environment their code is run in; they just need it hosted somewhere where it can be executed. This removes the responsibility of infrastructure configuration and maintenance from the developer, but naturally gives them less flexibility and control over the environment.

It took me watching several presentations before I really understood the value behind serverless compute.

Comments closed

Where Databases Fit in the Always-Valid Domain Model

Vladimir Khorikov asks an important question:

Today, we’ll talk about an important question: how does the application database fit into the concept of Always-Valid Domain Model?

In other words, is the database part of the always-valid boundary or should you consider it an external system and validate all data coming from it?

Pre-read, my answer was no, databases are part of the external world and your domain model needs to validate every time because who knows what weirdo did something to your data while it slept.

Post-read, well, you’ll have to read to find out.

Comments closed

A Primer on Apache Cassandra Reads and Writes

Utkarsh Upadhyay explains some of the internals of reading and writing with Apache Cassandra:

Apache Cassandra is a type of No-SQL database. It handles large amounts of data across many commodity servers. Being a highly scalable and high-performance distributed database, it provides high availability with no single point of failure. Here in this blog, mainly I focused on Reads and writes in Cassandra. And For Cassandra architecture, you can refer to this blog Apache Casandra: Back to Basics. So let’s get started with this blog on Apache Cassandra: Reads and Writes.

Click through for a comparison between Cassandra and MySQL, followed by a high-level architectural explanation of read and write operations in Cassandra. Though one thing which raises my eyebrow is the statement that reads in Cassandra are O(1). I don’t know that not to be the case, but I’m inclined to say it doesn’t sound right.

1 Comment

Multi-Cloud Pros and Cons

James Serra lays out some of the benefits and drawbacks of using multiple cloud providers:

A discussion I have seen many companies have is if they should be single-cloud (using only one cloud company) or multi-cloud (using more than one cloud company). The three major Cloud Service Providers (CSPs) that companies use for nearly all use cases are Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP).

Without spoiling it too much, James is not really sold on the idea.

Comments closed

The Basics of Event-Driven Architecture

The Aiven team has a nice primer on event-driven architecture:

What happens when one link in the chain goes down? Requests that are waiting for a response don’t receive one at all. They continue to wait, or they time out. The entire application is blocked. What’s more, as the number of services increases, the number of synchronous interactions between them increases as well. In such a situation, a single system’s downtime affects the availability of other systems as well.

An alternative approach is building a microservices application on an event-driven architecture (EDA). Event-driven architecture is made up of decoupled components — producers and consumers — which process events asynchronously, often working through an intermediary, called a broker. That might feel like a mouthful. Don’t worry — we’re going to walk through these concepts one step at a time. In this article, we’re going to look at the components that make up event-driven architecture, why you would use this paradigm, and how to implement it.

Read on to see what makes it so interesting.

Comments closed

Data Mesh and Ownership Strategies

James Serra aims to clear up some confusion:

I have done a ton of research lately on Data Mesh (see the excellent Building a successful Data Mesh – More than just a technology initiative for more details), and have some concerns about the paradigm shift it requires. My last blog tackled the one about Centralized vs decentralized data architecture. In this one I want to talk about centralized ownership vs decentralized ownership, along with another paradigm shift (or core principle) closely related to it, siloed data engineering teams vs cross-functional data domain teams.

First I wanted to mention there is a Data Mesh Learning slack channel that I have spent a lot of time reading and what is apparent is there is a lot of confusion on exactly what a data mesh is and how to build it. I see this as a major problem as the more difficult it is to explain a concept the more difficult it will be for companies to successfully build that concept, so the promise of a data mesh improving the failure rates for big data projects will be difficult to achieve if we can’t all agree exactly what a data mesh is. What’s more is the core principles of the data mesh sound great in theory but will have challenges in implementing them, hence my thoughts in this blog on centralized ownership vs decentralized ownership.

Read on for James’s take on the matter.

Comments closed

Centralized and Decentralized Data Architectures

James Serra looks at a pattern:

A centralized data architecture means the data from each domain/subject (i.e. payroll, operations, finance) is copied to one location (i.e. a data lake under one storage account), and that the data from the multiple domains/subjects are combined to create centralized data models and unified views. It also means centralized ownership of the data (usually IT). This is the approach used by a Data Fabric.

A decentralized distributed data architecture means the data from each domain is not copied but rather kept within the domain (each domain/subject has its own data lake under one storage account) and each domain has its own data models. It also means distributed ownership of the data, with each domain having its own owner.

So is decentralized better than centralized?

Read on for James’s answer, and allow me to include a Dilbert cartoon so old, the boss didn’t even have pointy hair yet.

How Decentralized Organizations Can be Effective | The Fourth Revolution Blog

Comments closed

Centralized Data Modeling via Power BI Templates

Haroon Ashraf aims to square the circle:

Data modeling is the way you can arrange and link your organizational data (typically in the form of tables) for reporting and analysis.

In other words, it is the strategy of lining tables with each other to get useful information by following the standard practices and domain knowledge of the organization.

Traditionally, it stands for implementing the star or snowflake schema from the perspective of the data warehouse BI solution.

What is Centralized Data Modeling?

Centralized data modeling means a generic data model consisting of some commonly used tables, relationships, and hierarchies that are shared across the organization. These elements the starting point for Power BI report development to anyone eligible, interested, and capable to do so.

With that in mind, read on to learn how you can use Power BI templates to bring this about. I joke about squaring the circle here because if you treat Power BI as a self-service business intelligence tool, the users may not be totally familiar with what you’re doing and could end up accidentally undermining your plans. That said, it’s a good approach to solving this common problem.

Comments closed

Superkeys

Kevin Wilkie knows that not all keys wear capes:

Well, that’s because we sometimes need different ways to describe what we’ve got going on. Of the four different types of keys we’ve discussed so far, they are all different enough that we need to differentiate them and be able to explain what the differences are. For the rest of the keys that we’ll go through today, the same idea exists. They are close to the others but different enough that there is a need for another name for that type of key.

There is a super key.

Read on to learn what a superkey is. That will put you one quarter of the way to understanding Boyce-Codd Normal Form: a relational variable is in Boyce-Codd Normal Form if and only if all functional dependencies have superkey determinants.

Comments closed