
Category: Storage

Migrating SQL Server Database Files between Storage Subsystems

Andy Yun does a bit of shuffling:

In my role at Pure Storage, I often engage with customers who wish to migrate their SQL Server databases off of their prior storage onto our hardware. And after some digging around for previously published material, I was surprised to find that there really wasn’t much that was comprehensive. After all, one doesn’t change SANs too often. But when it does happen, it is nice to have some reference material from others who have been through it. So I decided to try to give a good overview of how I’d approach the challenge.

This is meant to be a “food for thought” kind of post. I’m going to keep things somewhat high level, but will provide links to other blogs and material that can help you continue down whatever path you choose. And for simplicity, I’m going to limit this scope to a single SQL Server.

Read on for a few questions you should answer, followed by some notes and preferences. Andy’s filegroups tip is also a really good one.


Reclaiming Space after a DELETE Operation

Andy Yun checks disk usage:

In my current role at Pure Storage, I have the privilege of working with two amazingly smart, awesome SQL Server nerds: Andrew Pruski and Anthony Nocentino. We often find ourselves facing interesting questions about SQL Server and storage, and today was no exception.

Andrew had a customer who wanted to know what happens on our FlashArray, from a space usage perspective, when they first delete a large volume of data in a database’s data file, then subsequently shrink the database’s data file.

Read on for the answer, which applies to other storage solutions as well.


Storing Images in Kusto and Visualizing in Power BI or Data Explorer

Hauke Mallow shares what is probably a bad idea:

Kusto is a fast and scalable database designed to ingest, store, and analyze large volumes of structured and semi-structured data. For non-structured data like images, Azure Storage is typically the best choice. Databases can reference image data on storage via a URL, meaning images are not directly stored in Kusto. However, there are scenarios where storing image data in Kusto is beneficial. In this blog post, we will explore when it makes sense to store images in Kusto, how to store them, and how to visualize this data using Azure Data Explorer dashboards or Power BI.

I suppose the main benefit would be displaying images in Azure Data Explorer, as that tool might not support loading in external images from a storage account or other sane location. But this feels more like a neat parlor trick than something I’d actively recommend.
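If you want to try the parlor trick anyway, the usual approach is to base64-encode the image into a string column. Here is a minimal, hypothetical sketch using the azure-kusto-data package; the cluster URL, database, and an Images (Name: string, Data: string) table are all placeholder assumptions, not anything from Hauke's post:

```python
import base64

from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

# Hypothetical cluster and database names for illustration.
CLUSTER = "https://mycluster.westeurope.kusto.windows.net"
DATABASE = "MyDatabase"

# Base64-encode the image so it fits in a Kusto string column.
with open("logo.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(CLUSTER)
client = KustoClient(kcsb)

# .ingest inline is a Kusto management command. Fine for a small test image,
# but string values are capped at roughly 1 MB by default, so this is not
# a bulk-loading strategy.
command = f".ingest inline into table Images <| logo.png,{encoded}"
client.execute_mgmt(DATABASE, command)
```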


Blob Storage Account Lifecycle Maintenance

Andy Brownsword deletes some files but wants to keep other files:

A hierarchy of directories which contain files: that’s how we typically think about file storage. It doesn’t work quite the same way everywhere, though. In Blob Storage, a file can appear to be in a directory, but when the file is removed, so is the directory.

This can occur when using Lifecycle Management to help purge legacy blobs, which can be unexpected. Let’s look at a way we can help remediate this.

One important thing to remember about Azure blob storage accounts and S3 buckets is that there’s really no concept of a directory structure. It’s all keys, where your key might be dir1/dir2/dir3/file.txt. This is a bit different for Azure Data Lake Storage Gen2 and its notion of hierarchical namespaces (i.e., folders). But Andy does walk through some of the consequences of this and how to work with lifecycle maintenance policies to delete only certain sets of files.
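You can see the keys-not-folders behavior for yourself with the azure-storage-blob package. A short sketch, with placeholder connection string and container name:

```python
from azure.storage.blob import BlobServiceClient

# Placeholder connection string and container name.
service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("mycontainer")

# "dir1/dir2/" is not a directory, just a shared prefix on the blob's key.
container.upload_blob("dir1/dir2/file.txt", b"hello", overwrite=True)

# Listing by prefix is how "folders" are emulated. Delete this one blob
# and nothing named dir1/dir2/ remains anywhere.
for blob in container.list_blobs(name_starts_with="dir1/"):
    print(blob.name)
```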


Filesystem Access for Database Restoration via dbatools

Andy Levy shares a lesson learned:

While performing an instance migration this spring, I happened upon something I didn’t expect in [dbatools](https://dbatools.io/). It should have been a simple backup/restore copy of the databases, with the backup files residing on a fileshare on the destination server after being copied there. I kept getting a warning that the backup files I was attempting to restore couldn’t be read, and the restores (via Restore-DbaDatabase) wouldn’t execute.

I checked permissions on the server over and over again. Both on the filesystem and for the share that I was attempting to read from. Even more curious, if I executed the restore database statements directly from within Management Studio, the databases restored without issue.

After doing quite a bit of digging, I managed to find the reason.

Read on to learn more about necessary permissions, the issue Andy hit, and the solution.


Parquet Files in Pandas

Chris LaGreca works with Parquet files:

Apache Parquet has become one of the de facto standards in modern data architecture. This open source, columnar data format serves as the backbone of many high-powered analytics and machine learning pipelines, supported by many of the world’s most sophisticated platforms and services. AWS, Azure, and Google Cloud all offer built-in support for Parquet, while big data tools like Hadoop, Spark, Hive, and Databricks natively support it, allowing seamless data processing and analytics. Parquet is also foundational in data lakehouse formats like Delta Lake, Iceberg, and Hudi, where its features are further enhanced.

Parquet is efficient and has broad industry support. In this post, I will showcase a few simple techniques to demonstrate working with Parquet and leveraging its special features using Pandas.

Pandas does make this rather easy, as Chris shows.
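As a quick illustration of just how easy, here is a minimal round trip (the file, columns, and data are invented for the example): write a DataFrame to Parquet, then read back only the columns and rows you need.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "region": ["east", "west", "east"],
        "sales": [100, 250, 175],
    }
)

# Write a Parquet file; pandas delegates to pyarrow or fastparquet.
df.to_parquet("sales.parquet", index=False)

# Column pruning: read only the columns you need.
sales_only = pd.read_parquet("sales.parquet", columns=["sales"])

# Predicate pushdown via filters (requires the pyarrow engine).
east = pd.read_parquet("sales.parquet", filters=[("region", "==", "east")])
```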


RIP Stretch DB

Debbi Lyons calls it:

Ever since Microsoft introduced SQL Server Stretch Database in 2016, our guiding principles for such hybrid data storage solutions have always been affordability, security, and native Azure integration. Customers have indicated that they want to reduce maintenance and storage costs for on-premises data, with options to scale up or down as needed, greater peace of mind from advanced security features such as Always Encrypted and row-level security, and they seek to unlock value from warm and cold data stretched to the cloud using Microsoft Azure analytics services.     

During recent years, Azure has undergone significant evolution, marked by groundbreaking innovations like Microsoft Fabric and Azure Data Lake Storage. As we continue this journey, it remains imperative to keep evolving our approach on hybrid data storage, ensuring optimal empowerment for our SQL Server customers in leveraging the best from Azure.

This is not surprising at all, considering that the premise of Stretch DB was that you could off-load old and less-important data from your local SQL Server instances and expensive local disk into Azure, querying it when you need that data. The problem was, you couldn’t use cheap storage and pay a few cents per gigabyte of data per month. Instead, you were effectively spinning up Azure Synapse Analytics and paying a marked premium for your least important data. The price alone made this an untenable idea, but there were other holes in the plan as well that doomed it as a product.


Parallel Download in Oracle Object Storage

Brendan Tierney continues a series on Oracle Object Storage:

In previous posts, I’ve given example Python code (and functions) for processing files into and out of OCI Object and Bucket Storage. One of these previous posts includes code and a demonstration of uploading files to an OCI Bucket using the multiprocessing package in Python.

Building upon these previous examples, the code below will download the contents of a Bucket using parallel processing. Like my last example, this code is based on the example code I gave in an earlier post on functions within a Jupyter Notebook.

Click through for the code.
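To give a rough idea of the shape of such code, here is a hedged sketch (not Brendan's implementation) using the oci SDK and a process pool. The namespace, bucket, and paths are placeholders, and for brevity it only handles the first page of list_objects results:

```python
import multiprocessing
import os

import oci

# Placeholder names; assumes a standard ~/.oci/config profile.
NAMESPACE = "my_namespace"
BUCKET = "my_bucket"
TARGET_DIR = "downloads"


def download_object(object_name: str) -> str:
    # Each worker builds its own client; clients aren't shared across processes.
    client = oci.object_storage.ObjectStorageClient(oci.config.from_file())
    response = client.get_object(NAMESPACE, BUCKET, object_name)
    path = os.path.join(TARGET_DIR, object_name.replace("/", "_"))
    with open(path, "wb") as f:
        for chunk in response.data.raw.stream(1024 * 1024, decode_content=False):
            f.write(chunk)
    return object_name


if __name__ == "__main__":
    os.makedirs(TARGET_DIR, exist_ok=True)
    client = oci.object_storage.ObjectStorageClient(oci.config.from_file())
    names = [o.name for o in client.list_objects(NAMESPACE, BUCKET).data.objects]
    with multiprocessing.Pool(processes=4) as pool:
        for done in pool.imap_unordered(download_object, names):
            print(f"downloaded {done}")
```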


Export Azure SQL DB to Blob Storage

Josephine Bush runs an import-export business and wants a database to “fall off a truck”:

After a data migration, we needed to decommission the old Azure SQL DBs, but we wanted to keep a copy in case we needed anything later. Enter exporting an Azure SQL DB to storage!

Click through for an example of how it works. Given that we’re getting bacpac files out, I wonder what it would look like with a really large database.
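The same export can also be kicked off programmatically. Here is a hedged sketch using the azure-mgmt-sql package's begin_export operation; all names and secrets are placeholders, and the exact model class may vary by SDK version, so check against your installed package:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient
from azure.mgmt.sql.models import ExportDatabaseDefinition

# All names and secrets below are placeholders.
client = SqlManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = client.databases.begin_export(
    resource_group_name="my-rg",
    server_name="my-server",
    database_name="my-db",
    parameters=ExportDatabaseDefinition(
        storage_key_type="StorageAccessKey",
        storage_key="<storage-account-key>",
        storage_uri="https://myaccount.blob.core.windows.net/backups/my-db.bacpac",
        administrator_login="sqladmin",
        administrator_login_password="<password>",
    ),
)
result = poller.result()  # Blocks until the bacpac export finishes.
```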
