Storage – Page 8 – Curated SQL

Parallel Scans and Blob Storage Slowness

Published 2021-12-23 by Kevin Feasel

Joe Obbish goes beyond the obvious reason:

Upon reading the title, you may be thinking that of course parallel scans will be slow in the cloud. Cloud storage simply isn’t very fast. I would argue that there’s a bit more to it.

Click through for a deep dive with some advice on what you might do to fix a specific (but common) scenario.

Comments closed

Understanding SQL Server I/O Size

Published 2021-12-16 by Kevin Feasel

Anthony Nocentino dives into a topic:

A Master File Table (MFT) is the data structure that describes files and directories on NTFS. In Figure 1, you can see an MFT record has several sections describing the metadata about the file and pointers to blocks that make up the file. A block, also referred to as a cluster in Windows, is an abstraction over one or more physical structures (sectors or pages depending on the media) presented by the underlying disk. A block/cluster is also the atomic allocation unit from a file system and has a configurable size. On NTFS, this is referred to as the NTFS Allocation Unit Size and is a configurable attribute of the file system. By default, it is 4KB and can be as large as 2MB. Since a block is a unit of allocation, if a file is between 1 byte and the file system’s allocation unit size, it will take up exactly one block/cluster on the file system. As the file grows, more blocks/clusters are allocated to represent the file. The MFT data structure tracks which blocks make up a file. The block allocator of the file system will try to ensure blocks are physically adjacent on the disk and groups them together in runs.
A 4KB NTFS Allocation Unit Size is considered best practice on general-purpose file systems. And 64KB is considered best practice for SQL Server…but why? Let’s keep digging…
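
The allocation arithmetic in the excerpt can be sketched in a few lines of Python. This is only an illustration of the simplified model described above (one cluster for any file from 1 byte up to the allocation unit size, more clusters as the file grows), not a description of everything NTFS does. The function names `clusters_allocated` and `space_on_disk` are made up for this sketch.

```python
KB = 1024


def clusters_allocated(file_size_bytes: int, allocation_unit_bytes: int = 4 * KB) -> int:
    """Clusters needed for a file under the simplified model in the excerpt."""
    if file_size_bytes < 0 or allocation_unit_bytes <= 0:
        raise ValueError("sizes must be non-negative and the allocation unit positive")
    # Ceiling division: a partially filled cluster still counts as a whole cluster.
    return -(-file_size_bytes // allocation_unit_bytes)


def space_on_disk(file_size_bytes: int, allocation_unit_bytes: int = 4 * KB) -> int:
    """Bytes reserved on disk, always a multiple of the allocation unit size."""
    return clusters_allocated(file_size_bytes, allocation_unit_bytes) * allocation_unit_bytes


if __name__ == "__main__":
    # Compare the 4KB default with the 64KB size mentioned for SQL Server.
    for unit in (4 * KB, 64 * KB):
        for size in (1, 4 * KB, 4 * KB + 1, 1024 * KB):
            print(unit, size, clusters_allocated(size, unit), space_on_disk(size, unit))
```

A larger allocation unit means fewer clusters to track for a big file, at the cost of more unused space at the tail of small files.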

I was sitting at a lunch table with Anthony when it all clicked and that was fun to see.

Building a Pipeline for External Data Sharing

Published 2021-12-09 by Kevin Feasel

Hope Foley has data to share:

I worked with a customer recently who had a need to share CSVs for an auditing situation. They had a lot of external customers that they needed to collect CSVs from for the audit process. There were a lot of discussions happening on how to best do it, whether we’d pull data from their environment or have them push them into theirs. Folks weren’t sure on that so I tried to come up with something that would work for both.

Read on for Hope’s solution to the problem.

Managing File Retention in Blob Storage

Published 2021-11-23 by Kevin Feasel

Jeet Kainth shows how to configure a retention strategy in Azure Blob Storage:

This blog is a follow-up to a previous blog I wrote about backing up Azure Analysis Services cubes in Azure; that blog can be found here. This blog shows how to implement a retention policy using PowerShell in Azure Runbooks to remove the backups after a set number of days. To create a new Runbook in the Azure portal, go to the relevant Automation account in the relevant resource group and then select Runbooks from the left-hand pane. Note that you will need to add the Az.Storage module to the automation account to be able to use some of the commands listed in this blog.
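
The core of a retention policy like this is a date comparison: anything last modified before a cutoff is eligible for removal. The sketch below shows that selection step in plain Python. It is not the PowerShell from the linked post and does not call any Azure API; the `BackupFile` type, the `expired_backups` function, and the sample file names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, NamedTuple, Optional


class BackupFile(NamedTuple):
    """Hypothetical stand-in for a blob listing entry: a name and a timestamp."""
    name: str
    last_modified: datetime


def expired_backups(
    files: Iterable[BackupFile],
    retention_days: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the names of files last modified before the retention cutoff."""
    if retention_days < 0:
        raise ValueError("retention_days must be non-negative")
    current = now if now is not None else datetime.now(timezone.utc)
    cutoff = current - timedelta(days=retention_days)
    return [f.name for f in files if f.last_modified < cutoff]


if __name__ == "__main__":
    # Fixed "now" so the example is repeatable; the names are invented.
    now = datetime(2021, 11, 23, tzinfo=timezone.utc)
    listing = [
        BackupFile("cube_old.abf", now - timedelta(days=45)),
        BackupFile("cube_recent.abf", now - timedelta(days=3)),
    ]
    print(expired_backups(listing, retention_days=30, now=now))
```

In a real runbook the listing and the delete call would come from the storage module; keeping the selection logic separate makes it easy to dry-run before anything is removed.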

Click through for the process, including PowerShell code to perform the task.

Storage Testing for Azure SQL Managed Instances

Published 2021-10-22 by Kevin Feasel

Joe Obbish busts out the slide rule:

Lately I’ve been doing some exploratory performance testing on Azure SQL Managed Instances in preparation for a migration to that platform. This blog post documents some storage testing results and may even have practical advice near the end. All testing was done on a gen5 general purpose instance with 8 vCores.

Read on for Joe’s findings. Spoiler alert: there is practical advice at the end.

Long-Term Storage in Gold

Published 2021-10-15 by Kevin Feasel

Alex Woodie discusses a new storage solution:

In the hunt for the optimal long-term storage, media comes and media goes. From cave paintings and daguerreotypes to tape and optical disk, humans continue to search for new ways to preserve information for use by subsequent generations. Now an Upstate New York company is touting a new type of archive that leverages an element with the ultimate in longevity: gold.
Totenpass today announced the launch of its novel storage solution that’s made of gold. Data is written directly to the surface of the gold cartridge (which features some nickel) using a laser etching process. The data takes the form of shrunken down human-readable images or documents, or machine-readable data that can be encrypted and read with a smartphone app (which is still in development).

This makes perfect sense to me. If you write it in metal, that means Ruin can’t read or modify it.

Best Practices for SQL Server on Physical Machines

Published 2021-10-14 by Kevin Feasel

Anthony Nocentino has some practices for us, the best practices:

The intent of this post is to provide a quick reference guide based on the recommendations made on the Pure Storage Support page in the Microsoft Platform Guide. The target audience is SQL Server DBAs, and the post introduces them to the most impactful configurations and settings for running SQL Server on physical machines on Pure Storage.

Click through for a checklist of recommendations.

Power BI Storage Modes and Aggregations

Published 2021-10-11 by Kevin Feasel

Phil Seamark dives into storage modes in Power BI:

How to choose the correct storage mode for Power BI Tables.
This article aims to help explain the different storage modes available when designing an aggregation strategy for a Power BI Report: what each storage mode is and when you would use it. Picking the correct storage mode for each table in your model can significantly affect overall performance.

Click through for the tl;dr version, but stay for the whole thing.

Measuring File Latency in SQL Server

Published 2021-10-07 by Kevin Feasel

Anthony Nocentino has a script and some tips for us:

This post is a reference post for retrieving IO statistics for data and log files in SQL Server. We’ll look at where we can find IO statistics in SQL Server, query it to produce meaningful metrics, and discuss some key points when interpreting this data.
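
As a rough illustration of turning raw IO counters into a latency metric, here is a small Python sketch. It assumes the counters come from something like SQL Server's `sys.dm_io_virtual_file_stats`, whose read and write counts and stall times are cumulative, so it differences two samples to get the average latency over the interval between them. The excerpt does not say which source the linked script uses, and the `FileIoStats` type and the function names are invented for this example.

```python
from typing import NamedTuple, Tuple


class FileIoStats(NamedTuple):
    """One sample of cumulative counters for a single data or log file."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def avg_latency_ms(stall_ms: int, operations: int) -> float:
    """Average milliseconds per IO; 0.0 when no IO happened in the interval."""
    return stall_ms / operations if operations else 0.0


def latency_delta(before: FileIoStats, after: FileIoStats) -> Tuple[float, float]:
    """Average (read, write) latency in ms between two cumulative samples."""
    read_ms = avg_latency_ms(
        after.io_stall_read_ms - before.io_stall_read_ms,
        after.num_of_reads - before.num_of_reads,
    )
    write_ms = avg_latency_ms(
        after.io_stall_write_ms - before.io_stall_write_ms,
        after.num_of_writes - before.num_of_writes,
    )
    return read_ms, write_ms


if __name__ == "__main__":
    # Invented sample values, taken some time apart.
    first = FileIoStats(1000, 5000, 500, 1000)
    second = FileIoStats(1500, 9000, 700, 1400)
    print(latency_delta(first, second))
```

Differencing two samples matters because a single cumulative average can hide a recent slowdown behind a long history of fast IO.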

Click through for the script, and then a bulleted list of things to keep in mind as you’re reviewing the data.

Pure Storage FlashArray Snapshot Torture Test

Published 2021-09-22 by Kevin Feasel

Argenis Fernandez puts SQL Server snapshots on a Pure Storage FlashArray to the test:

Look, I’m not here to fight your religious war about how snapshots should not be called backups. I’m just gonna call them fast-as-fast restores(*) and be done with it. Because let’s be honest, with Pure Storage there’s absolutely nothing faster than a storage snapshot to recover a volume. Or volume(s). You get the idea. It’s about how fast you recover, every time.
Yes, I do understand that there are a million considerations for something to be called a “backup”. We’ll get to those little by little – don’t expect a thorough post on that debate right now. Today I want to focus on one question: Are Pure Storage FlashArray snapshots stable, trustworthy enough that I can take them without pausing I/O against my database? Can I trust that the database will come online every time from a snapshot?

Read on for the Answer. For additional fun, read the whole article with your mental voice sounding like Argenis.

Category: Storage