
Author: Kevin Feasel

Read Local Files in SQL Server with PolyBase and MinIO

I have a new video:

In this video, I demonstrate how we can use PolyBase and MinIO to read files on a local machine in SQL Server.

This is one of the reasons I’m really happy that SQL Server introduced access to AWS S3 and S3-compatible storage with PolyBase in SQL Server 2022. The results are definitely slower than if you had direct file access, but it is possible.

Comments closed

Data Retention for Data in the Microsoft Fabric Lakehouse

Kenneth Omorodion clears out some data:

More than before, organizations now aim for a well-defined approach to manage their data storage effectively. Some reasons for this include operational efficiency, cost management, regulatory compliance, and strategic decision-making. In this article, I will describe an approach to data retention management for Lakehouse files, to manage data storage when the data exists as files in the Fabric Lakehouse.

There’s nothing built in but Kenneth makes it easy.
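
To make the retention idea concrete, here is a minimal Python sketch of an age-based cleanup. It is not Kenneth's code: it assumes the Lakehouse files are reachable through a local or mounted file path, and the function name `purge_old_files` and its parameters are hypothetical names made up for this illustration.

```python
import time
from pathlib import Path


def purge_old_files(root, max_age_days, now=None, dry_run=True):
    """Return files under root older than max_age_days; delete them unless dry_run.

    Hypothetical helper for illustration. Age is judged by last-modified time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    expired = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            expired.append(path)
            if not dry_run:
                path.unlink()
    return expired
```

The sketch defaults to a dry run so that the list of candidate files can be reviewed before anything is deleted.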


Microsoft Fabric Item Ownership Takeover

Sakshi Jain has an announcement:

Today, when an item owner leaves the company, their credentials expire, or they lose access, many Fabric items cease to function. For example, Lakehouses and their SQL Endpoints become inoperative, and Pipelines fail to execute due to user access errors. In these situations, enabling another user to assume ownership would ensure business continuity.

We are pleased to announce that Fabric users with the right permissions can now take ownership of Fabric items.

This is a big deal. For the same reason that we don’t want individual users to own databases in SQL Server, having individual users own objects in Fabric lakehouses and endpoints was always a risky play. At least now, there’s a way to handle when that person leaves the company.


Natively Compiled Stored Procedures in SQL Server

Yvonne Vanslageren covers a point of frustration for me:

Modern applications often demand lightning-fast performance from their databases, whether they’re handling large transactional workloads or complex analytical queries. SQL Server’s in-memory OLTP feature addresses these needs by using memory-optimized tables and natively compiled stored procedures to boost throughput and reduce latency. This post provides an overview of natively compiled stored procedures, how to create them, and best practices for performance monitoring and maintenance.

My point of frustration is pretty simple: these things work really, really well. But they’re also so limited that I have never been able to use one in production. Memory-optimized tables are already so limited in good use cases, and natively compiled stored procedures have even more limitations, like using an awful collation (from the standpoint of humans working with the data) for string data.


Using the Azure SQL DB Query Editor

Josephine Bush writes a query:

I keep losing track of this wondering where it went. You have to access it at the database level. Adding this post to remind me for later. This came in very handy when my home internet went down and I couldn’t auth on my phone hotspot without timeouts in Azure Data Studio.

You can log in with SQL Server auth or Entra.

Read on for some notes about limitations. It is definitely a helpful tool for occasional queries or having a simpler way to access data without having to set up a VPN and a whole bunch of tools.


ALL vs ALLCROSSFILTERED in DAX

Marco Russo and Alberto Ferrari disambiguate a pair of operators:

Have you ever wondered what the subtle difference between ALL and ALLCROSSFILTERED might be? The family of ALL functions and modifiers includes some common functions, like ALL and ALLSELECTED, and some fancier and less frequently-used functions, like ALLNOBLANKROW and ALLCROSSFILTERED. This article discusses what ALLCROSSFILTERED is, why it is there in DAX, and when and how developers should use it.

Read on for that answer, along with several helpful demos.


Security Baselines for Azure SQL Workloads

Mika Sutinen builds a baseline:

I’ve recently had to work a bit more with Microsoft Defender and the vulnerability assessment in Azure. Following those efforts, it dawned on me that the topic of security baselines is sometimes slightly misunderstood. So, in this post, we’ll look into what a security baseline should cover (and what it probably shouldn’t).

But first things first. Security baselines are provided by the Microsoft Defender for Cloud service, which I always recommend enabling for Azure workloads (unless there’s a 3rd party solution for it already). If you don’t have anything of the sort enabled for your databases and servers, I highly recommend you go and turn Defender on. Seriously. Do it now.

Read on to learn more about why having a security baseline is so important and where to draw the cut-off between security and functionality.


Emitting Data to a Single CSV in Spark

Chen Hirsh wants to consolidate:

To write and read data faster, Spark splits the work between nodes in a cluster, each reading or writing part of the data. That’s why, in the screenshot above, there are 3 CSV files (the files starting with “Part”, with a CSV extension) instead of 1. Note that this can also occur when working with a single-node cluster, since Spark splits the work into tasks.

This behavior is great if you intend to keep working with the CSV files in Databricks since reading will be faster. But if you want to share this file with someone outside of Databricks, this may be inconvenient.

Read on for two ways of doing this, as well as the price you pay to get it done.
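
For a sense of what consolidation involves, here is a minimal Python sketch that stitches the part files together after Spark has written them. This is not necessarily either of Chen's two approaches. It assumes every part file was written with a header row and that the file names match `part-*.csv`; the function name `merge_part_files` is hypothetical.

```python
import csv
from pathlib import Path


def merge_part_files(part_dir, out_file, has_header=True):
    """Concatenate part-*.csv files into one CSV, writing the header only once.

    Returns the number of data rows written. Hypothetical helper for illustration.
    """
    parts = sorted(Path(part_dir).glob("part-*.csv"))
    rows_written = 0
    header_written = False
    with open(out_file, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for part in parts:
            with open(part, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                if has_header:
                    # Each part repeats the header; keep the first one only.
                    header = next(reader, None)
                    if header is not None and not header_written:
                        writer.writerow(header)
                        header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

The cost shows up here as well: the merge runs on a single machine, one file at a time, so the parallelism that made the original write fast is given up for the final step.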


Dropping a Role in PostgreSQL

Josephine Bush drops a role:

You can’t just exec DROP ROLE your_role_name; if it’s granted perms or other roles are granted to it. I had to go fishing to find all the grants to revoke them. Note: if you are worried about re-granting later, you can always fiddle with this to output the grants for these perms as a rollback.

Read on for a few scripts to help out with finding what that role owns, revoking rights, and reassigning ownership.
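
As a rough outline of the order of operations, here is a small Python sketch that builds the statements for a common PostgreSQL pattern: reassign what the role owns, drop its remaining privileges, then drop the role. These are not Josephine's scripts, and the helper names are hypothetical. It assumes you have a role ready to receive ownership, and that the first two statements get run in every database where the role owns objects or holds privileges.

```python
def quote_ident(name):
    """Double-quote a PostgreSQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_role_statements(role, new_owner):
    """Build the statements, in order, for removing a role (illustrative only)."""
    r, o = quote_ident(role), quote_ident(new_owner)
    return [
        f"REASSIGN OWNED BY {r} TO {o};",  # hand owned objects to new_owner
        f"DROP OWNED BY {r};",             # revoke privileges granted to the role
        f"DROP ROLE {r};",                 # succeeds once nothing depends on the role
    ]
```

Note that quoting makes the role names case-sensitive, so pass them exactly as they are stored in the catalog.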


Truncating All Tables while Preserving Foreign Keys in T-SQL

Ronald Kraijesteijn builds a script:

When testing a data warehouse, a common challenge is managing large datasets effectively. Often, you need to reset tables to a clean state, ensuring consistent testing environments. The most efficient way to clear a table is using the SQL command TRUNCATE TABLE. However, this command is not straightforward when foreign key constraints are present. In this article, we’ll explore a solution that temporarily disables constraints, allows truncation, and then restores the constraints—keeping your data model intact.

Click through for the script, which saves a record of all of the foreign key constraints, truncates each table, and then re-creates the foreign keys.
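
The ordering is the important part, and a small Python sketch that generates the T-SQL can show it. This is not Ronald's script. The `ForeignKey` class and function names are hypothetical, the key definitions are assumed to have been captured already, and the sketch ignores options such as cascade actions that a real script would need to preserve. The keys are dropped rather than disabled because SQL Server refuses to truncate a table that a foreign key references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    """A saved foreign key definition (hypothetical shape for illustration)."""
    name: str
    child: tuple        # (schema, table) that holds the FK columns
    child_cols: tuple
    parent: tuple       # (schema, table) being referenced
    parent_cols: tuple


def q(name):
    """Bracket-quote a T-SQL identifier, escaping closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def qt(table):
    schema, name = table
    return f"{q(schema)}.{q(name)}"


def build_truncate_script(tables, foreign_keys):
    """Return statements in order: drop all FKs, truncate, re-create FKs."""
    drops = [
        f"ALTER TABLE {qt(fk.child)} DROP CONSTRAINT {q(fk.name)};"
        for fk in foreign_keys
    ]
    truncates = [f"TRUNCATE TABLE {qt(t)};" for t in tables]
    adds = [
        f"ALTER TABLE {qt(fk.child)} WITH CHECK ADD CONSTRAINT {q(fk.name)} "
        f"FOREIGN KEY ({', '.join(q(c) for c in fk.child_cols)}) "
        f"REFERENCES {qt(fk.parent)} ({', '.join(q(c) for c in fk.parent_cols)});"
        for fk in foreign_keys
    ]
    return drops + truncates + adds
```

Because every key is dropped before any table is truncated, the order of the tables themselves no longer matters.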
