Press "Enter" to skip to content

Category: Storage

Viewing Storage Consumption in Microsoft Fabric

Gilbert Quevauvilliers wants to know about storage utilization in Microsoft Fabric:

This blog post will show you how to understand what is consuming your Fabric Storage.

If you want to know how I got this data, please read my previous blog post, "View all your Storage consumed in Microsoft Fabric – Lakehouse Files, Tables and Warehouses," at FourMoo.

With this Semantic model below, I could also create alerts based on certain thresholds. For example, if the total storage in a single App workspace is more than 100GB, send me an alert (this could be done using Power Automate). Alerts could also fire on too many files being stored, or on Parquet file sizes: if the files are too small, they would need to be optimised (for better performance).

Click through for the report.

Viewing Total Storage Consumption in Microsoft Fabric

Gilbert Quevauvilliers builds a report:

One of the things I have found when working with my customers in Microsoft Fabric is that there is currently no way to easily view the total storage for the entire tenant.

Not only that, but it would also be time-consuming and quite a challenge to find out what is consuming the storage. Could it be large files, tables, or warehouse tables?

In this blog post, I will show you how, using a Notebook, you can get details of the storage across your Microsoft Fabric tenant.

Click through for an image of the Power BI report and how you can get there.
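
Here's a minimal sketch of the notebook idea, assuming the mssparkutils file system helpers available in Fabric Spark notebooks; the workspace and lakehouse names are placeholders, and Gilbert's post has the real implementation:

```python
# Minimal sketch: recursively total file sizes under a lakehouse's
# OneLake paths from inside a Fabric notebook.
from notebookutils import mssparkutils  # preinstalled in Fabric notebooks

def total_bytes(path: str) -> int:
    """Sum file sizes under an ABFS/OneLake path, recursing into folders."""
    total = 0
    for item in mssparkutils.fs.ls(path):
        total += total_bytes(item.path) if item.isDir else item.size
    return total

# Placeholder workspace and lakehouse names -- substitute your own.
root = "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/MyLakehouse.Lakehouse"
for section in ("Files", "Tables"):
    gb = total_bytes(f"{root}/{section}") / 1024**3
    print(f"{section}: {gb:,.2f} GB")
```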

Ingesting Blob Storage Data into SQL Server

Andy Brownsword brings in some data:

We may associate consuming data from Azure Storage with tools like Data Factory or even SSIS, as we saw recently. We don’t always need the middleman, though.

Here we’ll demonstrate how to use an External Data Source to perform the ingestion directly into SQL Server.

Click through for the solution. As a quick note, the TYPE attribute that Andy uses in CREATE EXTERNAL DATA SOURCE was necessary from SQL Server 2016 through SQL Server 2019, but no longer exists for SQL Server 2022. Instead, for SQL Server 2022, you’d switch the LOCATION to start with abs:// for Azure Blob Storage and PolyBase would infer the type from the protocol.
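
Here's a hedged sketch of the SQL Server 2022 form, run from Python via pyodbc; the data source, credential, container, and storage account names are all placeholders:

```python
# Sketch: create a SQL Server 2022 external data source over Azure Blob
# Storage. Assumes a database-scoped credential (here, BlobSasCredential)
# holding a SAS token already exists.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost;"
    "DATABASE=Sandbox;Trusted_Connection=yes;TrustServerCertificate=yes;",
    autocommit=True,  # DDL runs outside an explicit transaction
)

conn.execute("""
-- SQL Server 2022: no TYPE attribute; the abs:// prefix tells the engine
-- this is Azure Blob Storage.
CREATE EXTERNAL DATA SOURCE AzureBlobImport
WITH (
    LOCATION = 'abs://mycontainer@mystorageaccount.blob.core.windows.net',
    CREDENTIAL = BlobSasCredential
);
""")
```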

Reading Parquet Files in R with nanoparquet

Stephen Turner reads some data:

In these slides I also learned about the nanoparquet package — a zero dependency package for reading and writing parquet files in R. Besides all the benefits noted above, parquet is much faster to read and write. And, as opposed to saving as .rds, parquet can easily be passed back and forth between R, Python, and other frameworks.

Let’s take a look at how reading and writing parquet files compares with CSV, either with base R or readr.

Stephen shows one of the best-case scenarios for Parquet: lots of data (100 million rows), relatively few columns, no long strings, etc. That leads to a massive improvement over using CSVs, even if you ignore the metadata and formatting benefits. I wouldn’t expect the benefits to be nearly as significant with wide text columns and very little value overlap, but that’s also pretty uncommon for the type of dataset we’re analyzing in R.
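
For a rough sense of the gap, here's an analogous benchmark sketched in Python (Stephen's post uses R with nanoparquet); pandas with the pyarrow engine stands in for the R tooling, and the row count is scaled down from his 100 million:

```python
# Sketch: compare CSV vs. Parquet write/read times on a tall, narrow,
# numeric DataFrame -- the shape where Parquet shines.
import time
import numpy as np
import pandas as pd  # requires pyarrow (or fastparquet) for Parquet I/O

n = 1_000_000
df = pd.DataFrame({
    "id": np.arange(n),
    "x": np.random.rand(n),
    "y": np.random.rand(n),
})

for fmt, write, read in [
    ("csv", lambda: df.to_csv("demo.csv", index=False),
     lambda: pd.read_csv("demo.csv")),
    ("parquet", lambda: df.to_parquet("demo.parquet"),
     lambda: pd.read_parquet("demo.parquet")),
]:
    t0 = time.perf_counter(); write()
    t1 = time.perf_counter(); read()
    t2 = time.perf_counter()
    print(f"{fmt}: write {t1 - t0:.2f}s, read {t2 - t1:.2f}s")
```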

Reading Data from Azure Blob Storage in Snowflake

Arun Sirpal explains a common architectural pattern:

Let’s go back to data platforms today. I want to talk about a very common integration I see nowadays: Azure Blob Storage linked to Snowflake via a storage integration, through which we can access semi-structured files via external tables. It is a good combination of technology, I have to say.

Click through for an architecture diagram and example of the code you’d need.
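
If you want the flavor of the pattern before clicking through, here's a rough sketch via the Snowflake Python connector; every account, tenant, container, and object name is a placeholder, and Arun's post has the authoritative code:

```python
# Sketch: Azure Blob Storage -> Snowflake storage integration -> stage ->
# external table over semi-structured (JSON) files.
import snowflake.connector

conn = snowflake.connector.connect(
    account="myaccount", user="me", password="...", role="ACCOUNTADMIN"
)
cur = conn.cursor()

# 1. The storage integration delegates authentication to an Azure
#    service principal that Snowflake creates for you.
cur.execute("""
CREATE STORAGE INTEGRATION azure_blob_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  AZURE_TENANT_ID = '<tenant-guid>'
  STORAGE_ALLOWED_LOCATIONS = ('azure://mystorageaccount.blob.core.windows.net/mycontainer/')
""")

# 2. A stage points at the container through that integration.
cur.execute("""
CREATE STAGE blob_stage
  STORAGE_INTEGRATION = azure_blob_int
  URL = 'azure://mystorageaccount.blob.core.windows.net/mycontainer/'
  FILE_FORMAT = (TYPE = 'JSON')
""")

# 3. An external table exposes the files for querying; AUTO_REFRESH is
#    off here to keep the sketch free of event notification setup.
cur.execute("""
CREATE EXTERNAL TABLE blob_events
  WITH LOCATION = @blob_stage
  AUTO_REFRESH = FALSE
  FILE_FORMAT = (TYPE = 'JSON')
""")
```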

Connecting to Azure Storage from SSIS

Andy Brownsword makes a connection:

Migrating to the cloud can be disruptive to existing processes. Moving storage to Azure isn’t a simple configuration change for SSIS packages.

SSIS doesn’t have native connections for Azure. That doesn’t mean we need to completely re-engineer the process or change technology though.

How can we take the simple package below and move to using Azure storage?

Read on for the answer. Also, I am 100% on Team SAS Token. They are easy to create and give you a lot of control over who gets access to what.
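
Here's what the SAS token side of that can look like, sketched with the azure-storage-blob SDK (account name, container, and key are placeholders): a short-lived, read-and-list-only token scoped to a single container.

```python
# Sketch: mint a container-scoped SAS token that can only read and list
# blobs, expiring in eight hours.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import ContainerSasPermissions, generate_container_sas

sas = generate_container_sas(
    account_name="mystorageaccount",        # placeholder
    container_name="ssis-input",            # placeholder
    account_key="<storage-account-key>",    # placeholder
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=8),
)

# Append the token to the container URL for whatever needs access.
print(f"https://mystorageaccount.blob.core.windows.net/ssis-input?{sas}")
```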

Exploring Azure Data Storage Options

Anurag K talks about data storage:

Choosing the right data storage option in Azure is critical for ensuring your applications run efficiently, securely, and cost-effectively. In this blog post, we’ll dive deeper into three key Azure storage services: Blob Storage, File Storage, and Disk Storage. We’ll explore their features, use cases, and provide specific examples to help you understand how to best leverage these services in your cloud environment.

I would note here that within the Blob storage section, you could carve out an explanation of Data Lake Storage Gen2, which starts from Blob storage and then adds hierarchical namespaces (i.e., folders).
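
To illustrate the distinction, here's a small sketch using the azure-storage-file-datalake SDK (account, container, and credential are placeholders): with hierarchical namespaces enabled, directories become real objects rather than "/"-delimited name prefixes on flat blobs.

```python
# Sketch: treat folders as first-class objects in ADLS Gen2.
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://mydatalake.dfs.core.windows.net",  # placeholder
    credential="<account-key>",                             # placeholder
)
fs = service.get_file_system_client("raw")

# Create a nested directory in one call.
fs.create_directory("sales/2025/01")

# Renaming (or deleting) a folder is a single atomic metadata operation,
# not a copy-and-delete of every blob underneath it.
fs.get_directory_client("sales/2025/01").rename_directory(
    new_name="raw/sales/2025/january"  # new_name includes the file system
)
```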

Migrating SQL Server Database Files between Storage Subsystems

Andy Yun does a bit of shuffling:

In my role at Pure Storage, I often engage with customers who wish to migrate their SQL Server databases off of their prior storage onto our hardware. And after some digging around for prior-published material, I was surprised to find that there really wasn’t much that was comprehensive. After all, one doesn’t change SANs too often. But when it does happen, it is nice to have some reference material from others who have. So I decided to try and give a good overview of how I’d approach the challenge.

This is meant to be a “food for thought” kind of post. I’m going to keep things somewhat high level, but will provide links to other blogs and material that can help you continue down whatever path you choose. And for simplicity, I’m going to limit this scope to a single SQL Server.

Read on for a few questions you should answer, followed by some notes and preferences. Andy’s filegroups tip is also a really good one.
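
For one common shape of the actual move (a sketch only; Andy weighs several approaches, and the names and paths here are placeholders): repoint the file metadata, take the database offline, move the physical file, and bring it back online.

```python
# Sketch: relocate a data file to a new volume with ALTER DATABASE.
import shutil
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost;"
    "DATABASE=master;Trusted_Connection=yes;TrustServerCertificate=yes;",
    autocommit=True,
)

# Point the catalog at the file's future location.
conn.execute("""
ALTER DATABASE Sales
MODIFY FILE (NAME = Sales_Data, FILENAME = 'P:\\NewSAN\\Sales_Data.mdf');
""")

# Take the database offline, move the file at the OS level, bring it back.
conn.execute("ALTER DATABASE Sales SET OFFLINE WITH ROLLBACK IMMEDIATE;")
shutil.move(r"D:\OldSAN\Sales_Data.mdf", r"P:\NewSAN\Sales_Data.mdf")
conn.execute("ALTER DATABASE Sales SET ONLINE;")
```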

Reclaiming Space after a DELETE Operation

Andy Yun checks disk usage:

In my current role at Pure Storage, I have the privilege of working with two amazingly smart, awesome SQL Server nerds: Andrew Pruski (b) and Anthony Nocentino (b). We often find ourselves facing interesting questions about SQL Server and storage, and today was no exception.

Andrew had a customer who wanted to know what happens on our FlashArray, from a space usage perspective, when they first delete a large volume of data in a database’s data file, then subsequently shrink the database’s data file.

Read on for that answer, which applies to other storage solutions as well.

Storing Images in Kusto and Visualizing in Power BI or Data Explorer

Hauke Mallow shares what is probably a bad idea:

Kusto is a fast and scalable database designed to ingest, store, and analyze large volumes of structured and semi-structured data. For unstructured data like images, Azure Storage is typically the best choice. Databases can reference image data on storage via a URL, meaning images are not directly stored in Kusto. However, there are scenarios where storing image data in Kusto is beneficial. In this blog post, we will explore when it makes sense to store images in Kusto, how to store them, and how to visualize this data using Azure Data Explorer dashboards or Power BI.

I suppose the main benefit would be displaying images in Azure Data Explorer, as that tool might not support loading in external images from a storage account or other sane location. But this feels more like a neat parlor trick than something I’d actively recommend.
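
If you do want to try the parlor trick, the gist looks something like this sketch with the azure-kusto Python packages (cluster, database, and table names are placeholders); keep in mind that Kusto string values are capped at roughly 1 MB by default, so this only suits small images.

```python
# Sketch: base64-encode an image and queue it for ingestion into a
# Kusto table with a string column holding the encoded bytes.
import base64
import pandas as pd
from azure.kusto.data import KustoConnectionStringBuilder
from azure.kusto.data.data_format import DataFormat
from azure.kusto.ingest import IngestionProperties, QueuedIngestClient

with open("thumbnail.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

df = pd.DataFrame([{"ImageName": "thumbnail.png", "ImageB64": encoded}])

kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(
    "https://ingest-mycluster.westeurope.kusto.windows.net"  # placeholder
)
QueuedIngestClient(kcsb).ingest_from_dataframe(
    df,
    IngestionProperties(database="Images", table="Thumbnails",
                        data_format=DataFormat.CSV),
)
```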
