Press "Enter" to skip to content

Category: Storage

Dealing with Long-Running I/O Requests in SQL Server

Rebecca Lewis has a two-parter. First up is finding instances of long-running I/O Requests:

When diagnosing storage or latency issues, one SQL Server message matters more than most:

“SQL Server has encountered X occurrence(s) of I/O requests taking longer than 15 seconds to complete on file…”

Where X might be 1, 5, or 50, and the message could name a file from any one of your databases. When you see this, the next good questions are when it happened and where.

And then the question is, what do you do about it? Rebecca provides some guidance:

In a previous post, I shared a script to detect the “I/O requests taking longer than 15 seconds” warning across your SQL Server inventory. Now let’s talk about what to do when you find it.

Here are five of the most common causes, with some tips for investigating each.

The neat part is that it’s not always due to slow storage or bad hardware.
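
The warning lands in the SQL Server error log, so one quick way to hunt for occurrences is to search the log directly. Here is a minimal sketch in Python (not Rebecca’s script; it assumes pyodbc, ODBC Driver 18, and a hypothetical server name) that filters the current error log for the message:

```python
# Minimal sketch: search the current SQL Server error log for the
# 15-second I/O warning. Server name and authentication are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sql01;Trusted_Connection=yes;TrustServerCertificate=yes;"
)
cursor = conn.cursor()

# xp_readerrorlog arguments: log number (0 = current), log type
# (1 = SQL Server error log), and a string to filter on.
cursor.execute("EXEC sys.xp_readerrorlog 0, 1, N'longer than 15 seconds'")

for log_date, _process_info, text in cursor.fetchall():
    print(log_date, text)  # when it happened, and which file it names
```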

Thoughts on On-Disk Rowstore in SQL Server

Hugo Kornelis starts a series on storage structures:

When a query is slow, it is often caused by inefficient access to the data. So our tuning work very frequently comes down to figuring out how data was read, and then massaging our queries or database structures to get SQL Server to access the data in a more efficient way.

So we look at scans, seeks, and lookups. We know that scans are good when we access most of the data. Or, in the case of an ordered scan, to prevent having to sort the data. We know that seeks are preferred when there is a filter in the query. And we know that lookups represent a good tradeoff between better performance and too many indexes, but only if the filter is highly selective.

All of the above is true. And all of it is highly generalized. And hence, often, not true enough to be actually useful.

Read on for an overview of the most common storage option.

Cloud Storage Archival via Parquet Files

Joey D’Antoni builds a tool:

What I’m writing about today has nothing to do with analytics, per se. It has everything to do with cloud storage, and the way operations there are priced. Specifically, metadata operations–in the demo code I’ve shared, we’re going from five files to one, but you can imagine going from a much larger number of files to a much smaller number of files. You may ask, “Joey, that sounds dumb. Why are you reinventing zip and iso files?” Well, the main reason is that many cloud operations are priced on the number of objects–for example, if you had to calculate a checksum across a number of files on S3 (for files/objects that were created before S3 automatically did checksums).

Click through for more information on how it works, as well as a link to the GitHub repo.
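
As a rough illustration of the many-objects-into-one idea (not Joey’s implementation; his repo has the real thing), here is a Python sketch that packs a folder of files into a single Parquet object as a name/bytes table. All paths are made up:

```python
# Rough sketch of packing many files into one Parquet object to cut down
# on per-object operations. Not Joey's tool; paths are hypothetical.
from pathlib import Path
import pyarrow as pa
import pyarrow.parquet as pq

files = sorted(Path("input").glob("*"))

table = pa.table({
    "name":    [f.name for f in files],          # original file name
    "content": [f.read_bytes() for f in files],  # raw bytes, binary column
})

pq.write_table(table, "archive.parquet")  # one object instead of many
```

Getting a single file back out is then just a filtered read on the name column, which is one reason Parquet is a reasonable container here.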

Access S3 Buckets in VPCs in Fabric via Entra Integration

Premal Shah announces new functionality in preview:

When we first introduced Amazon S3 shortcut integration with Microsoft Entra ID, customers gained a powerful new way to connect S3 data to Microsoft Fabric — without storing or rotating AWS access keys. Using OpenID Connect (OIDC), Fabric authenticates directly with AWS Identity and Access Management (IAM), enabling secure, identity-based access to cloud storage.

However, many enterprises keep their S3 buckets locked down inside Virtual Private Clouds (VPCs) or behind corporate firewalls. In these environments, Entra OIDC can authenticate identities, but it cannot provide network access — so Fabric still cannot reach the S3 endpoint. That changes today.

Read on to see what has changed, how you can enable this functionality, and current limitations.

What’s New in OneLake

Kim Manis shares an update:

In this blog post, I’ll highlight the new zero-ETL, zero-copy sources in OneLake, deeper interoperability between OneLake and Microsoft Foundry, and new tools to help admins manage capacity, security, and governance at scale. Together, these updates further cement Fabric as the ideal data platform for your mission-critical workloads—open, integrated, secure, and built to connect every part of your data estate to the intelligence your business needs. 

Read on to see some of the latest from Ignite.

OneLake Diagnostics Now GA

Tom Peplow makes an announcement:

Alongside Workspace monitoring and user activity tracking accessible through Microsoft Purview, these capabilities make federated data governance a reality at enterprise scale.

Enable diagnostics at the workspace level, and OneLake streams diagnostic events as JSON into a Lakehouse you choose—within the same capacity. You can use these events to unlock usage insights, provide operational visibility, and support compliance reporting.

It does seem a bit odd that this data goes into a Lakehouse rather than into an Eventhouse. But click through to see how things work, what sorts of events this captures, and what you can do with it.
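
If you want to poke at those events once they land, a Fabric notebook cell like the sketch below would do it. Note that the folder path and the operationName column are assumptions on my part, so check what the feature actually writes into your Lakehouse:

```python
# Fabric notebook sketch: read OneLake diagnostic events landed as JSON.
# The folder path and column name below are assumptions, not documented values.
events = spark.read.json("Files/OneLakeDiagnostics/")  # hypothetical path
events.printSchema()  # see what the events actually contain

# Example rollup, assuming an operationName field exists:
events.groupBy("operationName").count().orderBy("count", ascending=False).show()
```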

API Interaction with OneLake Tables

Matthew Hicks makes an announcement:

Microsoft OneLake is the unified data lake for your entire organization, built into Microsoft Fabric. It provides a single, open, and secure foundation for all your analytics workloads – eliminating data silos and simplifying data management across domains.

Announcing the preview of Microsoft OneLake Table APIs, a new way to programmatically manage and interact with your data tables in OneLake! These APIs open the door for developers and data engineers to integrate OneLake seamlessly into their workflows, enabling powerful automation and interoperability with open table formats.

Read on to see what’s available in the initial preview. It’s interesting that they started with Iceberg rather than Delta Lake.
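
The general call pattern is a bearer token plus a REST request against the OneLake endpoint. The sketch below shows only that shape; the route is a placeholder I made up, not the documented Table API, so consult the preview docs for the real paths:

```python
# Sketch of the general call shape only: token + REST request. The URL below
# is a placeholder, NOT the documented Table API route; check the preview docs.
import requests
from azure.identity import DefaultAzureCredential

cred = DefaultAzureCredential()
token = cred.get_token("https://storage.azure.com/.default").token

resp = requests.get(
    "https://onelake.dfs.fabric.microsoft.com/<workspace>/<item>/Tables",  # placeholder
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
print(resp.json())
```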

S3-Compatible Object Storage in SQL Server 2025

Anthony Nocentino updates a guide for SQL Server 2025:

In this blog post, I’ve implemented two example environments for using SQL Server’s S3 object integration. One for backup and restore to S3-compatible object storage and the other for data virtualization using PolyBase connectivity to S3-compatible object storage. This work aims to get you up and running as quickly as possible to work with these new features. I implemented this in Docker Compose since that handles all the implementation and configuration steps for you. The complete code for this is available on my GitHub repo. I’m walking you through the implementation here in this post.

Click through to see the updates Anthony has to his scripts.
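
For a taste of what the backup side looks like once the environment is up, here is a minimal Python sketch. The endpoint, bucket, credential values, and database name are all hypothetical stand-ins; Anthony’s Docker Compose setup wires up the real ones:

```python
# Sketch: back up a database to S3-compatible object storage. All names,
# endpoints, and secrets are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost,1433;"
    "UID=sa;PWD=<password>;TrustServerCertificate=yes;",
    autocommit=True,  # BACKUP must run outside a user transaction
)
cur = conn.cursor()

# The credential name must match the s3:// URL prefix of the backup target.
cur.execute(
    "CREATE CREDENTIAL [s3://s3.example.com:9000/sqlbackups] "
    "WITH IDENTITY = 'S3 Access Key', "
    "SECRET = '<access_key_id>:<secret_access_key>'"
)
cur.execute(
    "BACKUP DATABASE [TestDB1] "
    "TO URL = 's3://s3.example.com:9000/sqlbackups/TestDB1.bak' "
    "WITH COMPRESSION, FORMAT"
)
```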

Building Storage Tiers with Pure Storage in PowerShell

Anthony Nocentino creates a medallion storage layout:

In modern IT environments, not all workloads require the same level of storage performance, protection, or cost. Some applications need high performance with aggressive data protection, while others are perfectly fine with lower performance in exchange for cost savings. This tiered approach to storage service delivery is fundamental to efficient infrastructure management.

In my previous post on Fusion, I took an application-centric approach, showing how to deploy SQL Servers using Fusion. Let’s switch gears now and learn how to define a storage service catalog. In this post, I’ll demonstrate how to build a complete storage service catalog using Pure Storage Fusion Presets, offering Bronze, Silver, and Gold tiers with optional replication. We’ll see how to leverage different array types (FlashArray //X and FlashArray //C) to optimize both performance and cost across your fleet.

Read on for a link to the code, as well as more information on how it works.
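
The post itself is PowerShell against Pure Storage Fusion, but the shape of a tiered service catalog is easy to picture. Here is a toy Python sketch of the Bronze/Silver/Gold mapping, with all attribute names invented for illustration:

```python
# Toy illustration of a tiered storage service catalog, loosely mirroring the
# Bronze/Silver/Gold presets in the post. All attributes here are invented.
from dataclasses import dataclass

@dataclass(frozen=True)
class StorageTier:
    name: str
    array_family: str   # which Pure array family backs the tier
    replicated: bool    # whether the preset layers on replication

CATALOG = {
    "bronze": StorageTier("bronze", "FlashArray //C", replicated=False),
    "silver": StorageTier("silver", "FlashArray //X", replicated=False),
    "gold":   StorageTier("gold",   "FlashArray //X", replicated=True),
}

def preset_for(tier: str) -> StorageTier:
    """Resolve a requested tier name to its catalog entry."""
    return CATALOG[tier.lower()]

print(preset_for("Gold"))
```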

Loading Data from Network-Protected Storage Accounts into OneLake

Matt Basile grabs some data:

AzCopy is a powerful and performant tool for copying data between Azure Storage and Microsoft OneLake, and is the preferred tool for large-scale data movement due to its ease of use and built-in performance optimizations. AzCopy now supports copying data from firewall-enabled Azure Storage accounts into OneLake using trusted workspace access. Now you can use AzCopy to load data from even network-protected storage accounts, letting you effortlessly load data into OneLake without compromising on security or performance.

Click through for an explanation of trusted workspace access, followed by the steps to try it out for yourself.
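
As a sketch of the mechanics from Python: the URLs and names below are hypothetical, azcopy must be installed and authenticated, and OneLake’s domain gets passed as a trusted suffix so AzCopy is willing to send its Entra token there:

```python
# Sketch: drive AzCopy from Python to copy from a firewalled storage account
# into OneLake. All URLs and names are hypothetical placeholders.
import subprocess

src = "https://mystorageacct.blob.core.windows.net/landing/*"
dst = ("https://onelake.blob.fabric.microsoft.com/"
       "MyWorkspace/MyLakehouse.Lakehouse/Files/landing/")

subprocess.run(
    [
        "azcopy", "copy", src, dst,
        "--recursive",
        # Allow AzCopy to send its Entra token to the OneLake domain:
        "--trusted-microsoft-suffixes=onelake.blob.fabric.microsoft.com",
    ],
    check=True,  # raise if azcopy exits nonzero
)
```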
