Press "Enter" to skip to content

Category: Architecture

The Importance of Star Schemas in Power BI

Paul Turley lays out facts (and dimensions):

There is no secret about this. If you do any legitimate research about Power BI (reading blogs, books or training from reliable sources), you will quickly learn that a lot of basic functionality requires a dimensional model, aka “Star Schema”. This is a hard fact that every expert promotes, and self-taught data analysts either have learned or will learn through experience. So, if everyone agrees on this point, why do so many resist this advice?

Perspective is everything. I didn’t understand why getting to the star schema was so out of reach so often until I was able to see it from another perspective. There are a few common scenarios that pull source data in directions other than an ideal dimensional model.

Read on for Paul’s take on the subject.


Reviewing Database Usage Trends

Brendan Tierney looks at the data:

Getting back to the topic of this post, I’ve gathered some data and obtained some league tables from some sites. These will help us have a closer look at what is really happening in the database market throughout 2022. Two popular sites constantly monitor the wider internet and judge how popular databases are globally: DB-Engines and the TOPDB Top Database index. Both are well known and are frequently cited. Both sites give some details of how they calculate their scores, with one focused mainly on how often a database appears in searches across different search engines, while the other one, in addition to search engine results/searches, also looks across different websites, discussion forums, social media, job vacancies, etc.

I don’t necessarily believe that these are totally accurate, though on the whole, I do expect the results to be directionally accurate. I’ve used DB-Engines data several times in the past and like to point out that, for any given year, 7 or 8 of the top 10 database engines are relational.


Consistency Levels in Cassandra

Dmytro Kostenko enumerates some options:

In Cassandra, a consistency level is the number of replicas that must respond before a reply is returned to the user. Consistency in Cassandra is tunable, meaning each client can decide what balance of consistency and availability to choose. Moreover, it is assigned at the query level and can be configured for different service components. Users can choose different consistency levels for each operation, both for reads and writes. When choosing the consistency level for an operation, you should understand each level’s tradeoff between consistency and availability. Cassandra’s consistency can be strong or weak, depending on your chosen level.

Read on to learn more about strong vs weak consistency in the context of Cassandra, as well as the consistency level options available to us.
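
To make the per-query aspect concrete, here’s a minimal sketch using the DataStax Python driver (my own illustration, not code from Dmytro’s post); the cluster address, keyspace, and orders table are all hypothetical:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Hypothetical local cluster, keyspace, and orders table; adjust for your environment.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("shop")

# Write at QUORUM: a majority of replicas must acknowledge before the driver returns.
insert = SimpleStatement(
    "INSERT INTO orders (order_id, status) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(insert, (42, "shipped"))

# Read at ONE: any single replica may answer, favoring availability and latency
# over the guarantee of seeing the very latest write.
select = SimpleStatement(
    "SELECT status FROM orders WHERE order_id = %s",
    consistency_level=ConsistencyLevel.ONE,
)
row = session.execute(select, (42,)).one()
```

Pairing QUORUM writes with ONE reads favors availability; using QUORUM on both sides guarantees that a read overlaps the replicas that acknowledged the latest write, which is where the strong-consistency guarantee comes from.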


Use Cases for Multiple Data Lakes

James Serra explains why you might want multiple data lakes in an organization:

A question I get asked frequently by customers when discussing data lake architecture is “Should I use one data lake for all my data, or multiple lakes?”. Ideally, you would use just one data lake, but I have seen many valid use cases where customers are using multiple data lakes. Here are some of those reasons:

I’d quibble with a couple of these (and given James’s intro, I’m not sure he’s fully on board with all of the reasons) but this is a good list of reasons why you might see several data lakes in an organization.


Well-Architected Framework for Oracle in Azure

Kellyn Pot’vin-Gorman has a new tool for us:

This invaluable framework provides clear guidance on the recommended practices to assess, architect, and migrate Oracle workloads to the Azure cloud. This should be the first place to look for success with Oracle on Azure!

A special thanks to my teammate, Jessica Haessler, for working so hard to help me get this to the finish line, as I would never have been able to get this done on my own!

Click through for a link to the guide. There isn’t a Well-Architected Framework assessment for this yet, but the WAF articles themselves have quite a bit of detail to them.


Storing Semi-Additive Facts as Timespans

Timo Zishiri gives a new spin to a common warehousing problem:

In these cases, the measure may be aggregated across dates by averaging over the number of periods, e.g., average daily inventory levels. Measures can also be aggregated across dates by taking the maximum/minimum for the time interval.

More specifically, this blog focuses on an alternative approach to providing end users with the ability to do point-in-time analysis, so-called trend analysis.

Click through to see how a timespan table would work.
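
As a rough illustration of the idea (not Timo’s code), here’s a small pandas sketch of a timespan table with a made-up inventory example: each row carries the quantity on hand and the interval during which that value was valid, so point-in-time analysis becomes a simple interval filter.

```python
import pandas as pd

# Hypothetical inventory timespan table: one row per item per period during
# which the on-hand quantity was constant (valid_to is far-future for current rows).
spans = pd.DataFrame(
    {
        "item": ["widget", "widget", "gadget"],
        "qty_on_hand": [100, 80, 25],
        "valid_from": pd.to_datetime(["2022-01-01", "2022-01-15", "2022-01-10"]),
        "valid_to": pd.to_datetime(["2022-01-15", "2099-12-31", "2099-12-31"]),
    }
)

def on_hand_as_of(df: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Point-in-time lookup: keep rows whose timespan covers the requested date."""
    ts = pd.Timestamp(as_of)
    return df[(df["valid_from"] <= ts) & (ts < df["valid_to"])]

print(on_hand_as_of(spans, "2022-01-20"))  # widget: 80, gadget: 25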


The Importance of Proper Data Modeling in Power BI

Paul Turley avoids “big, wide tables”:

Power BI is architected to consume data in a dimensional model, with narrow fact tables and related dimensions. Introducing a big, wide table in a tabular model is extremely inefficient. It takes up space and memory resources, impacts performance, and complicates measure coding. Flattening records into a flat table is one of the worst things you can do in Power BI and a common mistake made by novice Power BI users.

This is a conversation I’ve had with many customers. We want our cake, and we want to eat it too. We want all the analytic capabilities, interactivity, and high performance, but we also want the ability to drill down to a lot of detail. What if we have a legitimate need to report on transaction details and/or a large table with many columns? It is well known that the ideal shape is a star schema, but what if we need to shape data for detail reporting? The answer is that you can have it both ways, just not in one table.

Read on for a better model design (hint: the Kimball style) as well as several tips and tricks.
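
For a rough sense of what that split looks like before the data ever reaches Power BI, here’s a small pandas sketch with invented column names: a wide export gets separated into a narrow fact table plus a customer dimension with a surrogate key, which is essentially what a star schema asks of you.

```python
import pandas as pd

# Hypothetical wide export: every sale row repeats all the customer attributes.
wide = pd.DataFrame(
    {
        "sale_id": [1, 2, 3],
        "sale_date": ["2022-11-01", "2022-11-01", "2022-11-02"],
        "amount": [10.0, 25.0, 5.0],
        "customer_name": ["Ada", "Ada", "Grace"],
        "customer_city": ["Dublin", "Dublin", "Cork"],
    }
)

# Dimension: one row per distinct customer, with a surrogate key.
dim_customer = (
    wide[["customer_name", "customer_city"]]
    .drop_duplicates()
    .reset_index(drop=True)
)
dim_customer["customer_key"] = dim_customer.index + 1

# Fact: narrow table keeping only the surrogate key and the measures.
fact_sales = wide.merge(dim_customer, on=["customer_name", "customer_city"])[
    ["sale_id", "sale_date", "customer_key", "amount"]
]
```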


Automating Archive Table Creation

Aaron Bertrand doesn’t want to archive things by himself every month:

Earlier in this series (part 1 | part 2), I wrote at a high level about how to solve issues with ever-growing log tables without large delete operations or data movement to a secondary archive table. In this tip, I’ll share a few code snippets you can use to automate the generation of objects to help make these solutions hands-free.

Read on for the tips.
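
Aaron’s snippets are T-SQL, but just to sketch the shape of the automation, here’s a hypothetical Python helper that emits the statements you would otherwise write by hand each month; the table and column names are made up, and @Cutoff would be declared by whatever script wraps the generated batch.

```python
# Hypothetical illustration of the "generate the objects" idea; Aaron's tip
# builds this in T-SQL against the real table metadata.
def archive_ddl(schema: str, table: str, cutoff_column: str) -> str:
    src = f"[{schema}].[{table}]"
    dst = f"[{schema}].[{table}_Archive]"
    return "\n".join(
        [
            f"SELECT * INTO {dst} FROM {src} WHERE 1 = 0;",  # empty copy of the structure
            f"INSERT INTO {dst} SELECT * FROM {src} WHERE {cutoff_column} < @Cutoff;",
            f"DELETE FROM {src} WHERE {cutoff_column} < @Cutoff;",
        ]
    )

print(archive_ddl("dbo", "ChatMessages", "CreatedDate"))
```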


Tips for Large Table Data Archival

Aaron Bertrand follows up on a prior post:

As soon as you realize your growth rates are higher than expected, you need to plan to buy or allocate more disk space. There is no way around this—more data means more disk. You can delay the inevitable for a little bit with better compression, but this is not a long-term fix, and it can impact query performance in different ways (trading CPU for I/O).

Once more disk is in place, you can plan your growth better.

Click through for some guidance on how to plan that growth.
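
As a back-of-envelope example of that planning arithmetic (my numbers, not Aaron’s), projecting a year of growth for a log table might look like this:

```python
# Rough growth projection with invented numbers: a log table adding about
# 2 million rows a month at ~350 bytes per row (data plus index overhead).
rows_per_month = 2_000_000
bytes_per_row = 350
months = 12

projected_gb = rows_per_month * bytes_per_row * months / 1024**3
print(f"~{projected_gb:.1f} GB of additional space over {months} months")
```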


Archival Tables in SQL Server

Aaron Bertrand starts a new series:

We all have one: the table that grows forever. Maybe it contains chat messages, post comments, or simple web traffic. Eventually, the table gets large enough that it becomes problematic – for example, users will notice that searches or updates take longer and longer as this massive, ever-growing table is scanned.

People often deal with this by archiving older data into a separate table. In this tip series, I’ll describe an archive table, explain why that solution carries its own set of problems, and show other potential ways to deal with data that grows indefinitely.

This is where we say, “Ah, if only Stretch DB had been priced approximately 1/100th of what it really was.” Stretch DB also had its own problems—especially if you ever needed to change the large table’s schema—but stay tuned for Aaron’s answers.
