Day: June 7, 2022

Request-Response and CQRS in Kafka

Kai Waehner compares two message exchange patterns:

How can I do request-response communication with Apache Kafka? That’s one of the most common questions I get regularly. This blog post explores when (not) to use this message exchange pattern, the differences between synchronous and asynchronous communication, the pros and cons compared to CQRS and event sourcing, and how to implement request-response within the data streaming infrastructure.

Read on to learn more.
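
As a rough illustration of the pattern in question, here is a minimal sketch of request-response over two message channels, using a correlation ID to match replies to requests. It uses in-memory Python queues as stand-ins for a request topic and a reply topic, so none of this is Kafka client code, and the names `request_topic`, `reply_topic`, `responder`, and `send_request` are hypothetical. It is not necessarily the approach Kai's post recommends.

```python
import queue
import threading
import uuid

# In-memory stand-ins for a request topic and a reply topic (assumption:
# a real system would use Kafka topics and a Kafka client instead).
request_topic = queue.Queue()
reply_topic = queue.Queue()


def responder():
    """Consume one request and publish a reply carrying the same correlation ID."""
    message = request_topic.get()
    reply_topic.put({
        "correlation_id": message["correlation_id"],
        "payload": message["payload"].upper(),
    })


def send_request(payload, timeout=2.0):
    """Publish a request, then wait for the reply with the matching correlation ID."""
    correlation_id = str(uuid.uuid4())
    request_topic.put({"correlation_id": correlation_id, "payload": payload})
    while True:
        # Raises queue.Empty if no reply arrives within the timeout.
        reply = reply_topic.get(timeout=timeout)
        if reply["correlation_id"] == correlation_id:
            return reply["payload"]
        # Replies for other requests are ignored in this simplified sketch.


def demo():
    """Run one request-response round trip against a background responder."""
    worker = threading.Thread(target=responder, daemon=True)
    worker.start()
    return send_request("ping")
```

The caller blocks until the reply arrives, which is the synchronous behavior layered on top of asynchronous messaging that makes this pattern a trade-off rather than a default.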

The Value of MLOps

Tori Tompkins explains what MLOps is and why it’s valuable:

An ML project will typically begin in an ‘Explore Phase’ where a data scientist or team of data scientists will explore the data they currently have and experiment with models, algorithms, parameters and features. MLOps at this stage is responsible for supplying data scientists with the environment they need to achieve this. One way this can be done is by leveraging a feature store.

A feature store is a tool for storing commonly used features. As data scientists create new features, they can log these into feature stores such as Feast and Databricks Feature Store and reuse them across teams and projects. This benefits teams in multiple ways: reducing compute times for both training and inference, providing consistency in common features, and reducing the effort of creating complex logic.

Read on for information about all six phases.


Marco Russo and Alberto Ferrari explain how the KEEPFILTERS function works:

KEEPFILTERS is a CALCULATE modifier used to change the way CALCULATE merges new filters with the outer filter context. Indeed, the default behavior of CALCULATE is to override existing filters. By using KEEPFILTERS you ask CALCULATE to add the new filter to the outer filter context, instead of overriding the outer filter.

Read on for the explanation and a demo.
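
To make the override-versus-add distinction concrete, here is a minimal Python sketch that models a filter context as a mapping from column name to a set of allowed values. This is a deliberate simplification of how DAX filter contexts work, and `merge_filters` and its `keep_filters` flag are hypothetical names for illustration, not anything in DAX itself.

```python
def merge_filters(outer, new, keep_filters=False):
    """Merge a new filter into an outer filter context.

    Both arguments map a column name to the set of values allowed for it.
    Without keep_filters, a new filter replaces the outer filter on the
    same column. With keep_filters, the two are intersected.
    """
    merged = {column: set(values) for column, values in outer.items()}
    for column, values in new.items():
        if keep_filters and column in outer:
            # Add the new filter to the existing one: only values allowed
            # by both survive.
            merged[column] = set(outer[column]) & set(values)
        else:
            # Default behavior: the new filter overrides the outer filter.
            merged[column] = set(values)
    return merged


outer_context = {"Color": {"Red", "Blue"}}
new_filter = {"Color": {"Blue", "Green"}}

overridden = merge_filters(outer_context, new_filter)
kept = merge_filters(outer_context, new_filter, keep_filters=True)
```

In this model, `overridden` allows Blue and Green regardless of the outer selection, while `kept` allows only Blue, the one value both filters agree on.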

Reviewing Power BI Datamarts

Teo Lachev looks at Power BI Datamarts:

As Microsoft announced here, Power BI datamarts are upon us. I can almost see an important enterprise client demanding “self-service datamarts me now or else… “, thus inspiring an opportunity for another premium feature, spearheaded with great vision and effort, but questionable practical value. In a nutshell, a Power BI datamart is a combo of Power BI Premium and a Microsoft-hosted Azure SQL Database aiming to simplify the implementation of a departmental datamart.

This take is a bit more negative than most of the others I’ve seen, so it’s worth a read in comparison to what others have written.

Reconciling Tag Names across Azure

Anthony Watherston has an interesting script:

During a recent cost optimization workshop with a customer, they mentioned that although they had some tagging policies in place there was no consistency of tag names across the Azure environment. This post introduces a script to remediate this and remove some confusion from your tagging strategy.

The customer was trying to ensure that all resources were being tagged with a cost centre tag. Some of this was automatic and some was done manually by people. While there was a policy in place to control this in the future, they needed a way to remediate the existing resources.

This is really useful if you have enough information to create a to-and-from mapping. It won’t automatically understand anything, so you’ll need to do the digging, but it will do the renaming.
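
The to-and-from mapping idea can be sketched in a few lines of Python. This is not Anthony's script and it makes no Azure calls; it only shows the renaming logic on a plain dictionary of tags. The mapping contents, the tag names, and the function `reconcile_tags` are all hypothetical.

```python
# Hypothetical mapping from inconsistent tag names to the agreed name.
TAG_NAME_MAP = {
    "costcenter": "CostCentre",
    "Cost Center": "CostCentre",
    "cost-centre": "CostCentre",
}


def reconcile_tags(tags, name_map):
    """Return a copy of tags with names rewritten according to name_map.

    Tag names missing from the mapping are left unchanged. If two source
    names map to the same target, the first value encountered is kept.
    """
    reconciled = {}
    for name, value in tags.items():
        target = name_map.get(name, name)
        reconciled.setdefault(target, value)
    return reconciled


resource_tags = {"costcenter": "1234", "env": "prod"}
fixed_tags = reconcile_tags(resource_tags, TAG_NAME_MAP)
```

Keeping the first value on a collision is an arbitrary choice for the sketch; in practice you would probably want to flag conflicting values for review rather than silently pick one.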

Testing Azure Synapse Link for SQL Server 2022

Kevin Chant gives Synapse Link for SQL Server a try:

Azure Synapse Link for SQL Server 2022 allows you to replicate your data from a SQL Server 2022 database to an Azure Synapse Analytics dedicated SQL Pool.

It is one of the options for the new Azure Synapse Link for SQL feature that was announced during Microsoft Build. You can read more about this in the Microsoft post which also announced the Public Preview of Azure Synapse Link for SQL.

Click through to see what Kevin has found so far. I think by the time this rolls out GA, it should be pretty good.

Downplaying Logical Reads

Erik Darling lays out an argument:

I decided to expand on some scripts to look at how queries use CPU and perform reads, and found some really interesting stuff. I’ll talk through some results and how I’d approach tuning them afterwards.

Interestingly, I just dealt with a mini-consulting engagement in which I saw the opposite: CPU sitting there twiddling its thumbs because of I/O insanity—and not even slow disks. In that case, the advice generally was “add this obviously missing index from this rather large table and stop scanning when you get 1 row on these really busy queries.” There was a little more nuance than that—and in fairness, physical reads were bad as well—but that’s why we investigate systematically.

Also, I generally accede to Erik’s point: for most busy environments, logical reads are unlikely to be the constraining factor and there are plenty of times where I choose the query form with more logical reads because it reduces CPU and memory requirements.

Comparing Properties between SQL Server Instances

Eitan Blumin plays duck-duck-goose:

A few years ago, I created a couple of T-SQL scripts that can be used for comparing instance-level and database-level properties between two HA/DR replicas. Originally, this supported comparing only two servers. But recently, following a fan request, I upgraded the script to support an unlimited number of servers that you can compare to each other.

So, I figured, if one person found this useful, there must be more out there that would need this, right?

Read on for the script itself, how to use it, and some limitations you’ll need to know about.
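
The core idea of comparing properties across any number of servers can be sketched in Python. Eitan's scripts are T-SQL and this is not a port of them; the function `find_mismatches`, the server names, and the property values below are hypothetical, and gathering the properties from real instances is left out entirely.

```python
def find_mismatches(properties_by_server):
    """Return the properties whose values are not identical on every server.

    properties_by_server maps a server name to a dict of property name to
    value. A property missing on a server is treated as None, so it shows
    up as a mismatch against servers that have it.
    """
    all_names = set().union(
        *(props.keys() for props in properties_by_server.values())
    )
    mismatches = {}
    for name in sorted(all_names):
        values = {
            server: props.get(name)
            for server, props in properties_by_server.items()
        }
        if len(set(values.values())) > 1:
            mismatches[name] = values
    return mismatches


servers = {
    "SQL01": {"max degree of parallelism": 8, "collation": "Latin1_General_CI_AS"},
    "SQL02": {"max degree of parallelism": 4, "collation": "Latin1_General_CI_AS"},
    "SQL03": {"max degree of parallelism": 8, "collation": "Latin1_General_CI_AS"},
}
differences = find_mismatches(servers)
```

Here `differences` contains only the parallelism setting, along with each server's value, since the collation matches everywhere.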
