Press "Enter" to skip to content

Author: Kevin Feasel

Item History in the Microsoft Fabric Capacity Metrics App

Ope Aladekomo announces a new feature:

We’re thrilled to announce the Preview of the Item History page in the latest version of the Microsoft Fabric Capacity Metrics App. The Item History page provides a 30-day compute usage analysis through dynamic visuals and slicers, enabling users to explore both high-level consumption trends and granular item-level metrics. This page helps you understand how individual items and operations contribute to overall capacity usage.

Click through to see a picture of the page, as well as some of the information you can glean from it.

Diagnosing Classification Model Failures

Ivan Palomares Carrascosa looks into a poorly-fitting model:

In classification models, failure occurs when the model assigns the wrong class to a new data observation; that is, when its classification accuracy is not high enough over a certain number of predictions. It also manifests when a trained classifier fails to generalize well to new data that differs from the examples it was trained on. While model failure typically presents itself in several forms, including the aforementioned ones, the root causes can sometimes be more diverse and subtle.

This article explores some common reasons why classification models may underperform and outlines how to detect, diagnose, and mitigate these issues.

The explanations are fairly high-level and focus mostly on two-class rather than multi-class classification, but there is some good guidance in here.
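
Ivan's article stays at the conceptual level, and none of the code below comes from it. As a minimal sketch of the kind of diagnostic the quote describes (synthetic data, scikit-learn, a deliberately overfit-prone model), you might check for a train/test accuracy gap and then inspect the confusion matrix to see which class is failing:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic two-class data stands in for a real dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# An unconstrained decision tree tends to memorize the training set.
model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")

# A large train/test gap is the generalization failure described above;
# the confusion matrix shows *which* class the model is getting wrong.
print(confusion_matrix(y_test, model.predict(X_test)))
```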

Multi- and Single-Line Regular Expression Processing in SQL Server

Louis Davidson continues a series on regular expressions in SQL Server:

There are currently only 4 flags that SQL Server supports, and they are used to change some of the fundamental ways that the expressions are applied. These flags are:

  • i – case insensitive
  • c – case sensitive
  • m – multi-line: ^ and $ match at the start and end of each line, not just of the entire string
  • s – single line: . (dot) also matches newline characters

In Part 6, I covered i and c; now let’s do m and s. These flags are not ones I expect to use all that often, but they are definitely useful to know.

Read on to see how they work, as well as some of the issues Louis ran into along the way.
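
Louis's examples are in T-SQL, but the m and s flags behave the same way in most regex engines. As a quick illustration of the semantics (using Python's re module, where the equivalents are re.MULTILINE and re.DOTALL):

```python
import re

text = "first line\nsecond line"

# Default: ^ and $ anchor to the start and end of the whole string.
print(re.findall(r"^\w+", text))                # ['first']
# m (re.MULTILINE): ^ and $ match at every line boundary.
print(re.findall(r"^\w+", text, re.MULTILINE))  # ['first', 'second']

# Default: . matches any character except a newline.
print(re.search(r"line.second", text))             # None
# s (re.DOTALL): . also matches the newline, so the match spans lines.
print(re.search(r"line.second", text, re.DOTALL))  # matches 'line\nsecond'
```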

Auditing Specific Data Access in SQL Server

Andreas Wolter wants to focus in on specific database objects:

In this article I want to share a targeted approach to audit access to specific objects within a database in Microsoft SQL Server.

In my last article, Evading Data Access Auditing in Microsoft SQL Server – and how to close the gaps, I showed multiple approaches to gain access to a chunk of sensitive data using the statistics object in SQL Server. The hardest one to capture is access to data that is exposed via the dynamic management function (DMF) dm_db_stats_histogram. This requires an additional Audit Specification in the master database for this system object. In the end we required 3 different Audit Action Groups to cover all the methods used to read data from our example table.

Read on to see what you can do as of SQL Server 2022.
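
Andreas's article has the full walkthrough; none of the names below come from it. Purely as a hedged sketch of the general shape of a targeted object audit (hypothetical server, database, table, and file path; driven from Python via pyodbc for illustration), it looks something like this:

```python
import pyodbc

# Hypothetical connection; creating audits requires elevated permissions
# (e.g., ALTER ANY SERVER AUDIT / ALTER ANY DATABASE AUDIT).
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost;"
    "DATABASE=SalesDb;Trusted_Connection=yes;TrustServerCertificate=yes;",
    autocommit=True,
)
cur = conn.cursor()

# A server-level audit defines where the audit events land.
cur.execute("CREATE SERVER AUDIT SensitiveDataAudit "
            "TO FILE (FILEPATH = N'C:\\Audit\\');")
cur.execute("ALTER SERVER AUDIT SensitiveDataAudit WITH (STATE = ON);")

# A database audit specification narrows it to one securable: here,
# SELECT against a single sensitive table, for every principal.
cur.execute("""
    CREATE DATABASE AUDIT SPECIFICATION SensitiveTableSpec
    FOR SERVER AUDIT SensitiveDataAudit
    ADD (SELECT ON OBJECT::dbo.SensitiveTable BY public)
    WITH (STATE = ON);
""")
```

As the quote notes, covering indirect reads (such as the statistics DMF) takes additional audit specifications beyond a single object-level entry like this one.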

Scheduling Copy Jobs in Microsoft Fabric

Ye Xu can run more than once:

Copy job is the go-to solution in Microsoft Fabric Data Factory for simplified data movement. With native support for multiple delivery styles, including bulk copy, incremental copy, and change data capture (CDC) replication, Copy job offers the flexibility to handle a wide range of scenarios—all through an intuitive, easy-to-use experience.

In this update, we’re excited to announce a powerful new enhancement: multiple scheduler support. This gives you even greater control over when your data moves.

Click through for a screenshot showing how you can set up multiple schedules for a specific copy job. Based on the screenshot, it seems that there is a limit to the number of schedules you can create, though that number (20) is large enough that I couldn’t imagine it being a major impediment for most people.

Power BI Dataflow Gen1 and Connecting to SQL DB

Koen Verbeeck lays out a warning:

I’m in the process of migrating some legacy stuff at a client, and in their Power BI environment there are still quite a few Gen1 Power BI dataflows. I had migrated an Azure Synapse Dedicated SQL Pool to an Azure SQL DB (much cheaper for their volume of data), and in the dev/test environment all dataflows were switched correctly to the new database.

However, in production, the dataflows only wanted to connect to the Azure SQL DB production database through a gateway. Weird, right? 

Click through for a rundown of the issue, as well as another one Koen ran into regarding Azure Data Lake Storage.

Logging in PostgreSQL

Elizabeth Christensen saves some information:

A modern-day Postgres instance creates robust and comprehensive logs for nearly every facet of database and query behavior. While Postgres logs are the go-to place for finding and debugging critical errors, they are also a key tool in application performance monitoring.

Today let’s get set up with logging for Postgres – starting with the basics of what to log, how to log what you want, and as a reward for your hard work – how to use these to monitor and improve performance. The Postgres docs on logs are excellent, so please consult those for the most up-to-date and comprehensive configurations. This blog reads between the lines a bit beyond the docs to offer some practical advice and settings. As always, your mileage may vary.

Click through for several tips and a lot of information on the topic of logging.
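
The settings themselves live in postgresql.conf, but they can also be changed at runtime with ALTER SYSTEM. As a hedged sketch (hypothetical connection details, illustrative values rather than recommendations), here is how a few commonly tuned logging knobs might be flipped from Python with psycopg:

```python
import psycopg

# ALTER SYSTEM cannot run inside a transaction block, hence autocommit.
conn = psycopg.connect(
    "host=localhost dbname=postgres user=postgres", autocommit=True
)

# Illustrative values: log statements slower than 500 ms, log lock waits,
# and record temp-file spills over 1 MB. Tune these for your own workload.
conn.execute("ALTER SYSTEM SET log_min_duration_statement = '500ms'")
conn.execute("ALTER SYSTEM SET log_lock_waits = on")
conn.execute("ALTER SYSTEM SET log_temp_files = '1MB'")

# These particular settings are reloadable; no server restart required.
conn.execute("SELECT pg_reload_conf()")
```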

Using the DAX FILTER Function

Ben Richardson digs into a function:

If you’ve ever tried to build a measure that needed more filtering power than a basic slicer, you’ve probably hit a wall.

That’s where DAX’s FILTER function comes in.

While visual filters and slicers work great for basic scenarios, FILTER gives you row-level control to create sophisticated calculations that respond dynamically to your business logic.

Click through for an explanation of the function, as well as several examples of how it works.

A Primer on Bayesian Modeling

Hristo Hristov is speaking my language:

Multivariate analysis in data science is a type of analysis that tackles multiple input/predictor and output/predicted variables. This tip explores the problem of predicting air pollution, measured as particulate matter (PM) concentration, based on ambient temperature, humidity, and pressure using a Bayesian model.

Click through for a detailed code sample and explanation.
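
Hristo's tip contains the full worked example; the sketch below is not his code. Assuming a simple linear likelihood (the article's actual priors and model may well differ), a minimal PyMC version of that kind of regression looks roughly like this:

```python
import arviz as az
import numpy as np
import pymc as pm

# Synthetic stand-ins for standardized temperature, humidity, and pressure.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y_obs = 10 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=1.0, size=200)

with pm.Model():
    # Weakly informative priors on the intercept, slopes, and noise.
    intercept = pm.Normal("intercept", mu=0, sigma=10)
    beta = pm.Normal("beta", mu=0, sigma=5, shape=3)
    sigma = pm.HalfNormal("sigma", sigma=5)

    # Linear model for the expected PM concentration.
    mu = intercept + pm.math.dot(X, beta)
    pm.Normal("pm_obs", mu=mu, sigma=sigma, observed=y_obs)

    # Sample the posterior; credible intervals come from this trace.
    idata = pm.sample(1000, tune=1000, random_seed=0)

print(az.summary(idata, var_names=["intercept", "beta", "sigma"]))
```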

A Primer on Join Operators

Andy Brownsword takes a peek at the three most common types of join operators, plus a bonus:

When reviewing our execution plans we’ll see joins executed using different operators. The type of operator is chosen based on the data that’s available to join and how the optimiser wants to execute it.

In this post we’ll take a look at what the operators are, when they are used, and how they work. These are the operators we’ll cover:

  • Nested Loop Joins
  • Merge Joins
  • Hash Match Joins
  • (Bonus) Adaptive Joins

Read on for a quick overview of which works best when.
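
The post is about SQL Server's operators, but the underlying algorithms are engine-agnostic. As a language-neutral sketch (Python stand-ins, emphatically not SQL Server internals), the three classic equi-join algorithms reduce to roughly this:

```python
from collections import defaultdict

left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right = [(2, "x"), (3, "y"), (4, "z")]

def nested_loop_join(outer, inner):
    # For each outer row, scan the inner input; cheap to start, and best
    # when the outer side is small (or the inner side has a useful index).
    return [(ok, ov, iv) for ok, ov in outer for ik, iv in inner if ok == ik]

def merge_join(left_rows, right_rows):
    # Both inputs sorted on the join key, walked forward in step.
    # Simplified: assumes keys on the right side are unique.
    left_rows, right_rows = sorted(left_rows), sorted(right_rows)
    out, j = [], 0
    for lk, lv in left_rows:
        while j < len(right_rows) and right_rows[j][0] < lk:
            j += 1
        if j < len(right_rows) and right_rows[j][0] == lk:
            out.append((lk, lv, right_rows[j][1]))
    return out

def hash_join(build, probe):
    # Build a hash table on one input, then probe it with the other;
    # strong for large, unsorted, unindexed inputs.
    table = defaultdict(list)
    for k, v in build:
        table[k].append(v)
    return [(pk, bv, pv) for pk, pv in probe for bv in table[pk]]

for join in (nested_loop_join, merge_join, hash_join):
    print(join.__name__, join(left, right))  # each yields the same 3 rows
```

An adaptive join, per Andy's bonus item, defers the choice between the hash and nested-loop strategies until runtime, once the operator sees how many rows actually arrive.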
