Press "Enter" to skip to content

Curated SQL Posts

Thoughts on Fabric Data Wrangler

Gilbert Quevauvilliers tries out a tool:

I was going through my twitter feed and I came across this tweet where they spoke about the Data Wrangler Calling all #Python users! Have you tried Data Wrangler in #MicrosoftFabric?

I thought I would give this a try and that was the idea for my blog post. I honestly had no idea that firstly was this possible, but second that it is so easy for data wrangler to do all the hard work for me

I am going to demonstrate 2 transformations in this blog post, the first will be changing the d_date from date to datetime and then using the columns from examples I am going to create a new column where it concatenates 2 columns delimited with a double pipe command.

Read on for Gilbert’s thoughts.

Comments closed

Monitoring Query Store State Changes

Jose Manuel Jurado Diaz wants to know when the Query Store state changes:

This morning, I have been working on a support case where our client was not able to see certain queries when querying the Query Data Store. We have observed that the cause is due to the Query Data Store changing to a read-only mode due to the volume of data and the limitation our client had on the QDS database space. Therefore, I would like to share the following PowerShell script that can be executed at regular intervals to check and retrieve when the state of QDS has changed. Unfortunately, in Azure SQL Database, we cannot use the Extended Event ‘qds.query_store_db_diagnostics’.

Click through to see Jose’s alternative solution.

Comments closed

Protecting Kubernetes Services

Boemo Mmopelwa gives us an idea of Kubernetes service types and how to secure them:

A Kubernetes service is a logical abstraction that enables communication between different components in Kubernetes. Services provide a consistent way to access and communicate with the application’s underlying components, regardless of where those components are located.

In Kubernetes the default type is ClusterIP. Services abstract a group of pods with the same functions. Services expose pods and clusters. Services are crucial for connecting the backend and front-end of your applications.

This is different from your containerized applications that you can deploy on Kubernetes

Comments closed

Apache Doris and Data Colocation

Frank Z takes us through a use case for Apache Doris:

In data analytics, fast query performance is more of a result than a guarantee. What’s more important than the result itself is the architectural design and mechanism that enables quick performance. This is exactly what this post is about. I will put you into context with a typical use case of Apache Doris, an open-source MPP-based analytic database.

The user, in this case, is an all-category Q&A website. As a billion-dollar listed company, they have their own data management platform. What Doris does is support the data filtering, packaging, analyzing, and monitoring workloads of that platform. Based on their huge data size, the user demands quick data loading and quick response to queries. 

This sounds a lot like sharding of the data, where you segregate data for a particular customer/entity into its own database (and possibly instance), with the exception that queries are expected to go over a number of shards rather than focus on a single one.

Comments closed

Bring Fabric to the Data Lakehouse

Ust Oldfield ties together Databricks and Microsoft Fabric:

We’ve built countless Lakehouses for our customers and influenced the design of many more. With the advent of Fabric, many organisations with existing lakehouse implementations in Azure are wondering what changes Fabric will herald for them. Do they continue with their existing lakehouse implementation and design, or do they migrate entirely to Fabric?

For many, the answer will be to continue as-is. They’ve invested a lot of time and money in establishing a Lakehouse – to migrate now to a slightly different technology stack would be a very costly exercise! There also isn’t a need to migrate from a lakehouse implementation in Databricks to one in Fabric as there aren’t concrete benefits to be realised.

For those using Power BI as their semantic and reporting layers, as well as using Databricks SQL or Synapse Serverless as the serving layer, Fabric provides a perfect opportunity to rationalise the architecture and to bring about substantial performance gains through the Direct Lake connectivity and V-Order compression in Fabric.

Read on to see what Ust means, using a couple of architecture diagrams along the way.

Comments closed

Balancing Governance and Collaboration with Fabric

Marc Lelijveld makes it sound like I can’t just say “No!” to everything as a Microsfot Fabric administrator:

Frequently, I am approached by curious individuals who inquire about my job and how I contribute to the success of our customers, especially since I am not directly involved in building solutions for each and every one of them. These questions have made me realize that it might be interesting to share insights into my role as a Fabric Administrator, or as some may refer to it, a Power BI Administrator.

In this blog post, I aim to shed light on the essence of daily activities of a Fabric Administrator, the meaningful conversations people in this role engage in, and the additional value they bring to the table.

Read on to see what people like Marc do all day.

Comments closed

Running a Background Job in Powershell

Patrick Gruenauer does two things at once:

In this blog post, I’d like to give you a few examples related to PowerShell Background Jobs to build upon. Let’s jump in.

Let’s say I want to ping a few computers. This consumes time. So I want that this task runs in the background as a PowerShell background job.

Outside of practicing for a certification, I don’t remember the last time I willingly chose to run something as a background job, either in Powershell or bash. The concept is still useful (especially if I’m on an SSH connection or have direct terminal access), though in a UI-driven world, I’d just open a new terminal tab.

Comments closed

Query Hints: Ad Hoc vs Query Store

Grant Fritchey sets up a showdown:

I recently presented a session on the Query Store at Data Saturday Rhineland and the question came up: If there’s already a query hint on a query, what happens when you try to force a similar query hint?

Yeah, OK, that is a weird one. I don’t know the answer, but I’m about to find out.

Click through for a very interesting demo. To be honest, I expected the opposite result, so this was surprising.

Comments closed

Dynamic Data Masking and Formatted Text

Ben Johnston attacks masked strings in different formats:

Clearly this is NOT a suggestion for how you might break the text but is far more of an exercise to show you how a bad actor may attempt to look at your data in ways that would generally not cause red flags.

It is especially important to reinforce the sentiment that Dynamic Data Masking less of a security tool to prevent attacks, but more to hide data from general viewing, and as a tool for building applications where the data still is accessible in some scenarios and not others.

Click through for several examples. As I like to say (over and over), dynamic data masking only works until users get access to write arbitrary queries against a system. If they’re accessing data through an app or via stored procedure calls only, then it be a reasonable part of a broader security posture.

Comments closed

Decluttering a Dual-Axis Chart

Amy Esselman only needs one Y axis:

You may be confused and overwhelmed at first. Dual-axis graphs like this are inherently challenging. Whether you call them dual-axis graphs, combo charts, or secondary y-axis graphs, they always demand extra effort from a reader to figure out which data series to read against which vertical axis. 

Click through for a variety of ways to improve a busy dual-axis chart.

Comments closed