Press "Enter" to skip to content

Category: Storage

Searching for Files in a Blob Storage Container

Andy Brownsword hits one of my bugbears:

Shifting from handling data on premises to Azure has been a real change of mindset. Whilst what I want to build may be similar, the how part is completely different. There’s a learning curve not just to the tooling but how you use it too.

This is one of those instances.

I had a storage container with files which had a date in their name. I wanted to perform a wildcard search to select some of them. That sounds straightforward, right?

This is unnecessarily painful, especially if you’re trying to find the right full backup in a container filled with full and transaction log backup files. Andy’s solution does work but also requires a full scan of keys. And I don’t think there’s a better way to do it.
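
If you want to see what that looks like in practice, here's a minimal sketch of the list-and-filter approach using the Python azure-storage-blob SDK. This isn't Andy's code, and the connection string, container name, and file naming convention are all made up; the point is that the service only filters by prefix, so anything resembling a wildcard means pulling the names back and matching them yourself.

```python
# A rough sketch of prefix listing plus client-side filtering with azure-storage-blob.
# The connection string, container name, and naming pattern are assumptions.
import re
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    conn_str="<storage-connection-string>",  # hypothetical
    container_name="sqlbackups",             # hypothetical
)

# The service only supports prefix filtering, so "full backups from 2024-03-15"
# means listing everything under a prefix and filtering the names ourselves.
pattern = re.compile(r"_FULL_20240315_\d{6}\.bak$")  # assumed naming convention

matches = [
    blob.name
    for blob in container.list_blobs(name_starts_with="AdventureWorks")
    if pattern.search(blob.name)
]

for name in matches:
    print(name)
```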


Saving Money on Azure Storage

Rahul Miglani claws back some cash:

In today’s digital landscape, businesses are increasingly turning to cloud storage solutions to manage their data effectively. Microsoft Azure offers a wide range of storage options tailored to meet diverse business needs while optimizing costs. In this blog post, we’ll explore how organizations can leverage Azure storage options to achieve significant cost savings without compromising performance or reliability.

Read on for ten tips. A lot of it boils down to keeping just enough data and putting it in the right tier, but there’s a bit more to the story.
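
As a taste of the tiering side of that advice, here's a quick sketch (mine, not from Rahul's post; the container name and 90-day cutoff are arbitrary assumptions) that demotes older blobs from Hot to Cool with the Python SDK. In practice, a lifecycle management policy is usually the better way to automate this.

```python
# Sketch: demote blobs older than 90 days from Hot to Cool storage.
# Connection string, container, and the 90-day cutoff are assumptions.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import ContainerClient, StandardBlobTier

container = ContainerClient.from_connection_string(
    conn_str="<storage-connection-string>",  # hypothetical
    container_name="archive",                # hypothetical
)

cutoff = datetime.now(timezone.utc) - timedelta(days=90)

for blob in container.list_blobs():
    if blob.last_modified < cutoff and blob.blob_tier == "Hot":
        container.get_blob_client(blob.name).set_standard_blob_tier(StandardBlobTier.COOL)
        print(f"Moved {blob.name} to Cool")
```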


Measuring Write Speeds in SQL Server

Vlad Drumea performs a test:

In this post I cover a script I’ve put together for measuring storage write speeds in SQL Server, namely against database data files.

This is meant to help get an idea of how the underlying storage performs when SQL Server is writing 1GB of data to a database.

At this point, you might be asking yourself: “Why not use CrystalDiskMark instead?”.
The answer is simple: you might not always be able to install/run additional software in an environment. Even more so if you work with external customers or you’re a consultant. It’s a lot simpler to ask a customer to run a script and send you the output, than it is to ask them to install and run some 3rd party software.

Click through for the script, what it does, and how to run it, as well as a note on limitations and an example based on three drives.
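
To be clear, the following is not Vlad's script, just a rough sketch of the underlying idea: push roughly 1GB into a throwaway table, force a checkpoint, and time the whole thing. The connection string is an assumption, and you'd only want to run this against a test database.

```python
# Rough sketch of timing ~1GB of writes to a data file via pyodbc.
# The connection string is an assumption; run this only against a test database.
import time
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};Server=localhost;Database=WriteTest;"
    "Trusted_Connection=yes;TrustServerCertificate=yes",
    autocommit=True,
)
cur = conn.cursor()
cur.execute("DROP TABLE IF EXISTS dbo.WriteSpeedTest;")
cur.execute("CREATE TABLE dbo.WriteSpeedTest (id INT IDENTITY PRIMARY KEY, payload CHAR(8000) NOT NULL);")

start = time.perf_counter()
# ~131,072 rows x ~8KB per row is roughly 1GB of data pages.
cur.execute("""
    INSERT INTO dbo.WriteSpeedTest (payload)
    SELECT TOP (131072) REPLICATE('x', 8000)
    FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b;
""")
cur.execute("CHECKPOINT;")  # force the dirty pages out to the data file
elapsed = time.perf_counter() - start

print(f"Wrote ~1GB in {elapsed:.1f}s (~{1024 / elapsed:.0f} MB/s)")
cur.execute("DROP TABLE dbo.WriteSpeedTest;")
```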


The Importance of Data Retention Policies

Ed Pollack shares some great advice:

It is always an afterthought. New objects are created that start off small and current. New feature development takes over and the recently architected data structures become old news. Over time, data grows and suddenly a previously small table contains millions or billions of rows.

Is all that data necessary? How long should it be retained for? If there is no answer to this question, then the actuality may be “Forever”, or more honestly “No one knows for sure.”

Retention takes on many forms and this article dives into ways in which data can be managed over time to ensure that it is fast, accurate, and readily available.

We don’t tend to think about data retention in the development phase, but it’s an important consideration and thinking about it up-front might save you disk space headaches later.
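
One of the simpler retention patterns is purging expired rows in small batches, sketched below with pyodbc. The table, column, and 36-month window are entirely made up, and this is one approach among many rather than anything specific from Ed's article.

```python
# Sketch: purge rows older than a retention window in small batches.
# Connection string, table, column, and retention period are all assumptions.
import pyodbc

RETENTION_MONTHS = 36   # hypothetical policy
BATCH_SIZE = 5000

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};Server=localhost;Database=Sales;"
    "Trusted_Connection=yes;TrustServerCertificate=yes",
    autocommit=True,
)
cur = conn.cursor()

while True:
    # Each DELETE is its own small transaction, which keeps log usage and
    # lock durations modest compared to one giant DELETE.
    cur.execute(
        """
        DELETE TOP (?) FROM dbo.OrderHistory
        WHERE CreatedDate < DATEADD(MONTH, -?, SYSUTCDATETIME());
        """,
        BATCH_SIZE, RETENTION_MONTHS,
    )
    if cur.rowcount < BATCH_SIZE:
        break
```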


Concatenating Many Files in Azure Blob Storage

Drew Furgiuele concatenates a lot of files:

Lately, I’ve found myself with a few requests from friends and users that have a particular problem: they’ve got themselves a data lake in Azure, and they can read and write files just fine to it. The problem, though, is that sometimes they need to take a series of files and mash them all together, or as the cool kids call it: concatenate them. And when it comes to third party tools and methods that can do the trick, you’re spoiled for choice: Azure Data Factory, Spark via Databricks, or even PowerShell.

Case in point: I was working with someone who had tens of thousands of CSV files that they needed to merge together into one big file, but they were already out in their Azure storage account. That doesn’t sound so bad, does it? 

Drew explains why it is, but also why it isn’t. So click through and check that out.
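
To give a flavor of why "just merge them" is more awkward than it sounds, here's a bare-bones sketch (mine, not Drew's approach; the container, prefix, and output name are assumptions) that stitches CSV blobs into one append blob with the Python SDK. Every byte has to flow through whatever machine runs it, which is exactly where tens of thousands of files starts to hurt.

```python
# Sketch: concatenate many CSV blobs into one append blob, client-side.
# Connection string, container, prefix, and output name are assumptions.
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    conn_str="<storage-connection-string>",  # hypothetical
    container_name="datalake",               # hypothetical
)

output = container.get_blob_client("merged/all_rows.csv")
output.create_append_blob()  # append blobs allow incremental appends

first = True
for blob in container.list_blobs(name_starts_with="raw/2024/"):
    # Each appended chunk must stay under the 4 MiB append-block limit;
    # larger files would need to be appended in pieces.
    data = container.get_blob_client(blob.name).download_blob().readall()
    if not first and b"\n" in data:
        # Skip the header row on everything after the first file (assumes headers).
        data = data.split(b"\n", 1)[1]
    if data:
        output.append_block(data)
    first = False
```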


Differentiating Physical and Logical Reads in SQL Server

Jose Manuel Jurado Diaz explains a concept:

In the realm of Azure SQL Database, query performance is a paramount concern for database administrators and developers alike. A critical aspect of this performance is understanding how SQL Server interacts with data, particularly through physical and logical reads. This article delves into these two fundamental concepts, providing insights into their impact on database performance and a practical lab to observe these metrics in action.

Read on for the difference, as well as a demonstration. With slow disks and insufficient RAM, it’s really important to know this difference. But as you get more RAM and move to formats like NVMe for storage, I’d argue that it’s less of an issue. The additional RAM, in particular, matters because frequently accessed data will remain in the buffer pool longer, so you’re more likely to see logical reads in action. Of course, poor indexing and bad decisions can ruin that idea, so don’t do that, okay?
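
If you want to watch the difference yourself, one quick way, sketched below, is to empty the buffer pool, run the same query twice, and compare the two executions in sys.dm_exec_query_stats. This is my sketch rather than the article's lab; the connection string and table are assumptions, and DBCC DROPCLEANBUFFERS needs elevated permissions and a non-production server.

```python
# Sketch: cold run (physical + logical reads) vs. warm run (logical reads only).
# Connection string and dbo.SomeLargeTable are assumptions; test servers only.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};Server=localhost;Database=TestDb;"
    "Trusted_Connection=yes;TrustServerCertificate=yes",
    autocommit=True,
)
cur = conn.cursor()

cur.execute("DBCC DROPCLEANBUFFERS;")  # empty the buffer pool so run 1 hits disk

query = "SELECT COUNT(*) FROM dbo.SomeLargeTable;"  # hypothetical table
for run in (1, 2):
    cur.execute(query)
    cur.fetchall()
    row = cur.execute(
        """
        SELECT TOP (1) qs.last_physical_reads, qs.last_logical_reads
        FROM sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
        WHERE st.text LIKE '%SomeLargeTable%'
          AND st.text NOT LIKE '%dm_exec_query_stats%'
        ORDER BY qs.last_execution_time DESC;
        """
    ).fetchone()
    print(f"Run {run}: physical reads = {row.last_physical_reads}, "
          f"logical reads = {row.last_logical_reads}")
```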


Whitepapers for Oracle and SQL Server in Azure

Kellyn Gorman has been busy:

I’ve been pretty busy with work and travel, but I finally got an official Silk Github repository to publish a couple new white papers and sizing assessment worksheets for customer access.  These are primarily Oracle and SQL Server to Azure focused white papers, but I will be publishing ones on GCP next, to be followed by AI and other database platforms soon.

Click through for links to the documents.


Common SQL Server Mistakes: Default Auto-Growth

Hemantgiri Goswami takes a look at auto-growth:

Auto Growth is a feature that allows database files (primary, secondary, and log) to expand when the database file becomes full – without manual intervention.

The Auto Growth feature is handy when we do not want to increase the size of database files manually. There are two ways you can set auto growth: using SQL Server Management Studio (SSMS hereafter) or T-SQL. Auto growth can be configured in percent or in megabytes.

Auto-growth isn’t a problem on its own, though the default growth sizes, especially in older versions of SQL Server, were far too low for medium- and large-sized databases.

I don’t particularly like the 2.5 MB example Hemantgiri shows. I have a quick rule of thumb: 64MB for small databases, 256-512MB for medium-sized databases, and 1GB for large databases (assuming my underlying disk is fast). This limits the number of auto-growth events and, for log files in particular, keeps virtual log file counts more reasonable.
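
To put that rule of thumb into code, here's a small sketch (the database and logical file names are made up) that sets fixed-size growth increments rather than a percentage and then checks the result.

```python
# Sketch: set fixed-size auto-growth increments instead of a percentage.
# Database name and logical file names are assumptions; adjust to taste.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};Server=localhost;Database=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes",
    autocommit=True,
)
cur = conn.cursor()

# A medium-sized database: 256MB growth for the data file and the log file.
cur.execute("ALTER DATABASE [Sales] MODIFY FILE (NAME = N'Sales',     FILEGROWTH = 256MB);")
cur.execute("ALTER DATABASE [Sales] MODIFY FILE (NAME = N'Sales_log', FILEGROWTH = 256MB);")

# Check the result (growth is reported in 8KB pages unless is_percent_growth = 1).
for row in cur.execute("SELECT name, growth, is_percent_growth FROM [Sales].sys.database_files;"):
    print(row.name, row.growth, row.is_percent_growth)
```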
