Press "Enter" to skip to content

Month: September 2023

Attacks on Row-Level Security

Ben Johnston continues a series on row-level security in SQL Server:

As mentioned in previous sections, RLS is an addition to security and should not be used as the primary method to limit access to data. It is a supplementary layer, useful in specific scenarios. There are also instances where RLS can be defeated by an unauthorized user. The attacks listed below are broken down into direct attacks, indirect attacks, and side-channel attacks. The categorizations could be changed, but the important part of each is the vulnerability discussed.

The one scenario I’m a bit surprised about is the divide by zero attack, as I had figured the filter predicate would apply before the computation leading to a divide by zero scenario.

Comments closed

Using DAX to Find Products Missing Sales

Marco Russo and Alberto Ferrari observe the dog that didn’t bark:

What products did not sell in a specific area, store, or time period? This may be an important analysis for several businesses. There are multiple ways to obtain the desired result. Some specific implementations might be needed because of the user or model requirements, whereas developers can choose any formula in several cases. Or you might just find a solution on the web and blindly implement it without questioning whether there is a better way to achieve what you want.

It turns out that different formulas perform very differently. Choosing the right one in your scenario can make a slow report fast. This article analyzes the performance of different formulations of one same algorithm. 

It’s interesting to see the performance profile here: most are reasonably close together, although you can still get a 2x gain from using the fastest approach versus the second-slowest. And then there’s the slowpoke.

Comments closed

Kafka Message Compression

Lucia Cerchie and Dave Troiano give us the rundown on compression of individual messages in Apache Kafka:

Topic partitions are the main “unit of parallelism” in Kafka. What’s a unit of parallelism? It’s like having multiple cashiers in the same store instead of one. Multiple purchases can be made at once, which increases the overall amount of purchases made in the same amount of time (this is like throughput). In this case, the cashier is the unit of parallelism. 

In Kafka, each partition leader can live on a different broker in a cluster, and a producer can send multiple messages, each with a different destination topic partition; that is, a producer can send them in parallel. While this is the main reason Kafka enables high throughput, compression can also be a tool to help improve throughput and efficiency by reducing network traffic due to smaller messages. A well-executed compression strategy also means better disk utilization in Kafka, since stored messages on disk are smaller. 

Click through for the various options and some guidance on using each.

Comments closed

Oracle: RMAN and Non-Synchronizing Standby Database

David Fitzjarrell proffers advice on recovering from a non-synchronizing standby database:

Occasionally the unthinkable can occur and the DBA can be left with a standby database that is no longer synchronizing with the primary. A plethora of “advice”will soon follow that discovery, most of it much like this:

“Well, ya gotta rebuild it.”

Of course the question to ask is “how far out of synch is the standby>” That question is key in determining how to attack this situation. Let’s go through the two most common occurrences of this and see how to address them.

Read on to see David’s advice.

Comments closed

Deployment Pipelines for Microsoft Fabric

Reitse Eskens crosses a line:

It’s a bit of a challenge to keep up with all the changes, updates and all the new stuff coming out for Fabric. As I’m not really invested in the PowerBI part of the data platform (yay pie charts ;)), some things that are very common for the PowerBI community are very new to me. I have it on good authority that this blog covers a feature that is well know within PowerBI but quite new in the data engineering part. When I say that, I need to add that at the time of writing, only the PowerBI side of things are fully supported but I have very good hopes that pipelines and notebooks will be supported as well.

Supporting pie charts are fightin’ words here. Nonetheless, read on to see how deployment pipelines work in Microsoft Fabric.

2 Comments

Finding SSAS Tabular Dimensions in Excel

Olivier Van Steenlandt has lost a few dimensions in the couch cushions:

A colleague reached out last week while connecting to one of our SQL Server Analysis Services models in Excel. He couldn’t find the expected Attribute folders in the model. He was looking for the following dimensions:

Of particular interest was that this colleague could not see them but Olivier could. The answer ends up being a bit surprising.

Comments closed

Query Execution Concepts and SQL Server

Erik Darling answers the question, why is it so hard to figure out why my query sometimes sucks:

Sometimes people will ask me penetrating questions like “why does SQL Server choose a bad execution plan?” or “why is this query sometimes slow?”

Like many things in databases, it’s an endless spiral of multiverses (and turtles) in which many choose your own adventure games are played and, well, sometimes you get eaten by a Grue.

In this post, I’m going to talk at a high level about potential reasons for both.

Read on for a smorgasbord of factors to consider based on the steps SQL Server takes.

Comments closed

Grouped Scatter Plots in R

Steven Sanderson builds a scatter plot:

Data visualization is a powerful tool for gaining insights from your data. Scatter plots, in particular, are excellent for visualizing relationships between two continuous variables. But what if you want to compare multiple groups within your data? In this blog post, we’ll explore how to create engaging scatter plots by group in R. We’ll walk through the process step by step, providing several examples and explaining the code blocks in simple terms. So, whether you’re a data scientist, analyst, or just curious about R, let’s dive in and discover how to make your data come to life!

Click through for several examples of plot generation.

Comments closed