Day: March 8, 2023

Data Mesh Q&A Round 2

Jean-Georges Perrin didn’t hear no bell:

How does the Data Mesh concept differ from similar efforts in the past, like EDM (Enterprise Data Model) or MDM (Master Data Management)?
Data Mesh will help us achieve those goals more quickly as those EDM and MDM projects are usually slow, and the ROI starts showing only after deployment. The product approach of Data Mesh for its data products enables a product lifecycle mentality that will help get from a current state to an (end?) state like EDM through versioning. It also allows EDM to be versioned more efficiently and reduces time to market.

Read on for a series of questions and answers around the topic of data mesh architecture.

Deploying a Database via Azure DevOps Pipeline

Olivier Van Steenlandt deploys a database:

After we successfully introduced a database development strategy in my previous blog post series, Getting Started With Database Projects & Azure DevOps, we can look at how to introduce a database deployment automation strategy using Database Projects and Azure DevOps Pipelines.

As a starter, we will first be implementing a build automation process and in future blog posts, we will go through the different ways of deployment to different environments. On top of that, we will also discuss the differences between SQL Server and Azure SQL Database deployments.

Read on for the full story.

February 2023 Updates for Azure Synapse Analytics

Ryan Majidimehr has a new round-up for us:

Azure Synapse Runtime for Apache Spark 3.3 has been in Public Preview since November 2022. We are excited to announce that after notable improvements in performance and stability, Azure Synapse Runtime for Apache Spark 3.3 now becomes Generally Available and ready for production workloads.

The essential changes include features that come from upgrading Apache Spark to version 3.3.1, Delta Lake to version 2.2.0, and Python to 3.10.

This month’s set of changes isn’t quite as big as some prior months, though there are a couple items of great importance to make up for it.

The Importance of Monitoring Tools

Louis Davidson talks turkey about tooling:

When I was a DBA involved with the management of a large number of database servers, I didn’t have many third-party tools to help me do my job. For the most part, I relied on scripts that I found or wrote. I enjoyed writing scripts to manage the servers, as it taught me a lot about the internals of SQL Server. Many of these scripts were eventually automated using SQL Server Agent to run and save data on the different servers so we could review the results, looking for issues.

Some of these tools written over 20 years ago still run to this day. We captured tons of data about everything we wanted to know about the server in case there were issues. Loads and loads of data. We had some processes that would scan that data and send emails when obvious errors occurred, but it was hard to keep synchronized over many different servers.

Click through for Louis’s thoughts. I believe good tools can make a DBA’s life a lot easier, though mediocre tools might make it worse: you become the proverbial drunk looking for his keys under a streetlamp because that’s where the light is.

Combining CSV Files via PowerShell

Chad Callihan smooshes files together:

I recently had a handful of CSV files that needed to be reviewed. Each CSV file was the same format, and while I could have opened them each individually to sort and review, I thought it would be much easier to combine them into one file. It was time to turn to PowerShell. Let’s look at a few examples of how PowerShell can be used to combine multiple CSV files into a single file.

A core assumption here is that the structure of each file—particularly the number of columns but also the semantic meaning of each column—is the same.
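
To make that assumption concrete, here is a minimal sketch of the same idea in Python. This is not Chad's PowerShell approach; the function name `combine_csv_files` and its parameters are hypothetical, chosen only for illustration. It refuses to merge a file whose header row differs from the first file's header, which covers the column count and column names. Whether the columns mean the same thing is something only a human can confirm.

```python
import csv


def combine_csv_files(input_paths, output_path):
    """Concatenate CSV files that share one header row.

    Returns the number of data rows written. Raises ValueError as soon as
    a file's header differs from the first header seen; rows from earlier
    files may already be in the output at that point.
    """
    expected_header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # completely empty file: nothing to merge
                if expected_header is None:
                    # The first non-empty file defines the structure.
                    expected_header = header
                    writer.writerow(header)
                elif header != expected_header:
                    raise ValueError(
                        f"{path}: header {header} does not match {expected_header}"
                    )
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call might look like `combine_csv_files(["jan.csv", "feb.csv"], "combined.csv")`, where the file names are placeholders. Writing the output somewhere other than the input folder avoids picking it up as an input on a later run.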
