Month: June 2025

Fronting Fabric APIs with Azure API Management

Ed Lima combines expensive with expensive:

Integrating Azure API Management (APIM) with Microsoft Fabric’s API for GraphQL can significantly enhance your API’s capabilities by providing robust scalability and security features such as identity management, rate limiting, and caching. This post will guide you through the process of setting up and configuring these features.

API Management is a really neat service, though it’s rather costly. That’s my biggest complaint about it, though it is a doozy.

Custom Libraries in Microsoft Fabric Data Engineering

Gerhard Brueckl isn’t content with the defaults:

When working with Spark or data engineering in general in Microsoft Fabric, you will sooner or later come to the point where you need to reuse some of the code that you have already written in another notebook. Best practice is to put these code pieces into a central place from where it can be referenced and reused. This way you can make sure all notebooks always use the very same code and it is also easy to develop, update and test the common functions.

As Gerhard mentions, having common notebooks with utilities is fine for when you’re getting started with development, but being able to centralize functions in proper libraries can make that code a lot more useful, not just in the context of the single notebook.

I believe that this does allow for arbitrary code execution, so someone with sufficient permissions to create a notebook and import code from arbitrary locations would be able to execute that code. I think there are ways of limiting this risk (such as not allowing your Fabric hosts to connect to any remote servers other than ones you explicitly allow), but it’s something I’d have to puzzle through.

Optimizing a Snowflake Data Warehouse

Harshavardhan Yedla gives us some guidance:

Optimizing a Snowflake data warehouse (DWH) is crucial for ensuring high performance, cost-efficiency, and long-term effectiveness in data processing and analytics. The following outlines the key reasons optimization is essential:

Read on for some tips around optimizing Snowflake warehouses. A lot of this stays at a pretty high level and doesn’t provide detailed guidance, but it’s a good checklist for thinking about your own situation.

Power BI Model Analysis via INFO Functions in DAX

Reza Rad is leading this interrogation:

There are many DAX functions for covering day-to-day business-related calculations using measures and calculated columns. However, there is also a set of functions that can be helpful to the BI team and developers in gaining insights from the Power BI model itself. The insights can include things such as the number of both-directional relationships, the dependency of the calculations, the list of columns in tables, etc. These functions are in the category of INFO functions in DAX. Let’s see what they are and how they work.

Click through for a list, as well as how you can make use of them.

Refreshing SQL Analytics Endpoint Metadata in Fabric

Ancy Philip makes an announcement:

We’re excited to announce that the long-awaited refresh SQL analytics endpoint metadata REST API is now available in preview. You can now programmatically trigger a refresh of your SQL analytics endpoint to keep tables in sync with any changes made in the parent artifact, ensuring that you can keep your data up to date as needed.

Click through to see how it works.
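
As a rough sketch of what "programmatically trigger a refresh" could look like from Python, here is a helper that builds the request without sending it. The base URL, the route, and the preview query string are my assumptions about the preview API, and `build_refresh_request` is a hypothetical name, so check the official reference before relying on any of it.

```python
import json
import urllib.request

# Assumption: base URL and route shape for the preview API.
# Verify both against the official Fabric REST API reference.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_refresh_request(
    workspace_id: str, sql_endpoint_id: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a metadata refresh."""
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/sqlEndpoints/{sql_endpoint_id}/refreshMetadata?preview=true"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # empty JSON body
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would actually send it. You would need a valid Microsoft Entra access token for the Fabric API, which this sketch leaves to the caller.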

Verifying SQL Server Backups via SMO

Stephen Planck does some testing:

Regularly restoring test copies of your databases is the gold-standard proof that your backups work. Between those tests, however, RESTORE VERIFYONLY offers a fast way to confirm that a backup file is readable, that its page checksums are valid, and that the media set is complete. In this post you will see how to run that command from PowerShell by invoking SQL Server Management Objects (SMO), turning a one-off verification into a repeatable step you can schedule across all your servers.

Click through for the script and explanation. I also like dbatools’ Test-DbaLastBackup command, which can run RESTORE VERIFYONLY but goes further, restoring the backup and then running DBCC CHECKDB against its contents.
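
Stephen's script uses PowerShell and SMO. Purely to show the shape of the statement that ends up running, here is a small Python sketch that builds the T-SQL text for one backup file. The function name and the sample path are hypothetical, and this is not Stephen's method.

```python
def build_verify_statement(backup_path: str, with_checksum: bool = True) -> str:
    """Return a RESTORE VERIFYONLY statement for a single backup file."""
    # Double any single quotes so the path stays inside the string literal.
    escaped = backup_path.replace("'", "''")
    statement = f"RESTORE VERIFYONLY FROM DISK = N'{escaped}'"
    if with_checksum:
        # CHECKSUM asks SQL Server to validate the backup's checksums.
        # The command fails if the backup was taken without them.
        statement += " WITH CHECKSUM"
    return statement + ";"


# Hypothetical path, for illustration only.
print(build_verify_statement(r"D:\Backups\Sales.bak"))
```

You would still need a driver or tool of your choice to run the statement against each server, and a scheduler to make it repeatable.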

Track those SQL Agent Jobs

Kevin Hill satisfies Betteridge’s Law of Headlines:

Too many IT teams treat SQL Server Agent jobs like a coffee timer: “Set it and forget it!”

Unfortunately, that mindset only works if everything else is perfect forever. Whether it’s backup jobs failing silently, index maintenance running on the wrong replica, or nobody getting alerts when things break, unattended SQL Agent jobs are one of the sneakiest ways to rack up technical debt. Let’s dig into what DBAs and non-DBAs alike need to keep an eye on to avoid job-related headaches.

Kevin includes some good tips on monitoring SQL Agent jobs. If you’re feeling paranoid or have a particularly important job to watch, it may also make sense to set up some monitoring alerts around the end results, tracking things like the latest load date (for an ETL job) or some other indicator of doneness, and have a monitoring solution independently verify this.
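
To make the latest-load-date idea a bit more concrete, here is a minimal Python sketch of the check an independent monitor might run. The function name and the six-hour threshold are hypothetical, and I'm assuming the monitor has already fetched the newest load timestamp from the target table.

```python
from datetime import datetime, timedelta, timezone


def is_stale(latest_load: datetime, now: datetime, max_age: timedelta) -> bool:
    """Return True when the newest loaded row is older than the allowed age."""
    return now - latest_load > max_age


# Hypothetical values: in practice latest_load would come from a query
# against the table the ETL job writes to.
latest_load = datetime(2025, 6, 1, 2, 0, tzinfo=timezone.utc)
now = datetime(2025, 6, 1, 12, 0, tzinfo=timezone.utc)

if is_stale(latest_load, now, max_age=timedelta(hours=6)):
    print("ETL output looks stale; raise an alert.")
```

The point of checking the data instead of the job history is that it still catches a job that reports success without loading anything.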

The Role of Padding in Power BI Reports

Elena Drakulevska explains why padding inside visuals is so important in Power BI reports:

Now that we’ve all learned to love rounded corners, let’s talk about another quiet champion of good design: padding.

You know, that tiny bit of space inside your visuals that keeps content from being awkwardly pressed right up against the border, with no room to breathe. Yeah. That.

The ideal here is to have densely informative visuals that have sufficient padding to make it easy for a viewer to move between them.

Spark Streaming plus Drools

Ram Ghadiyaram builds a tool:

Near real-time decision-making systems are critical for modern business applications. Integrating Apache Spark (Streaming) and Drools provides scalability and flexibility, enabling efficient handling of rule-based decision-making at scale. This article showcases their integration through a loan approval system, demonstrating its architecture, implementation, and advantages.

Click through for a bit of sample code.

Vector Search from Scratch

Kanwal Mehreen does a bit of searching:

In this article, I’ll walk you through every step from generating vector representations to searching using cosine similarity, and we’ll even visualize what’s happening behind the scenes. By the end, you’ll not only understand how vector search works but also have a working implementation you can build on. So, let’s get started.

It’s kind of funny how simple this is, but it is. A lot of the complexity is around data quality operations, as well as optimizing the search process.
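
To show just how simple the core is, here is a minimal brute-force sketch in plain Python. It is not the article's implementation: the tiny hand-written vectors stand in for real embeddings, and the names `cosine_similarity` and `search` are my own.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=3):
    """Score every document against the query and return the best matches."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up 2-D vectors; real embeddings have hundreds of dimensions.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(search([1.0, 0.0], docs, top_k=2))
```

Scanning every vector like this is fine for small collections. The optimization work mentioned above is mostly about avoiding that full scan once the collection gets large.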
