
Category: Storage

How Writing to Parquet Works

Dmitry Tolpeko walks us through the algorithm for writing to Parquet format:

After writing the first 100 values for a column (i.e., for 100 rows), the Parquet writer checks whether the column content for those 100 values exceeds the specified page size (the default is 1 MB).

If the raw data size for the column does not exceed the page size threshold, the interval to the next page size check is adjusted based on the actual column size, so the check happens neither after every column value nor after every 100 values. Thus the page size is not a strict limit.

If the raw data size exceeds the page size, the column content is compressed (if compression is specified for the Parquet file) and flushed into the page store for the column.

This is a nice explanation of the process.
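If you want to poke at those thresholds yourself, here is a minimal sketch (mine, not Dmitry's) using the pyarrow library, where data_page_size sets the roughly 1 MB threshold the writer checks against and compression controls how pages are compressed before being flushed:

import pyarrow as pa
import pyarrow.parquet as pq

# Build a simple table with one million rows in a single column.
table = pa.table({"value": pa.array(range(1_000_000), type=pa.int64())})

# Write the table to Parquet. data_page_size is the threshold the writer's
# periodic size checks compare against; pages are compressed before flushing.
pq.write_table(
    table,
    "values.parquet",
    data_page_size=1024 * 1024,  # ~1 MB, which is also the default
    compression="snappy",
)

# Inspect the resulting column chunk to see compressed vs. uncompressed sizes.
metadata = pq.ParquetFile("values.parquet").metadata
print(metadata.row_group(0).column(0))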


Blob Storage Enhancements

James Serra is keeping on top of things around Azure Blob Storage:

Account failover: Customer-initiated storage account failover is now generally available, allowing you to determine when to initiate a failover instead of waiting for Microsoft to do so. When you perform a failover, the secondary replica of the storage account becomes the new primary. The DNS records for all storage service endpoints (blob, file, queue, and table) are updated to point to this new primary. Once the failover is complete, clients will automatically begin reading from and writing to the storage account in the new primary region, with no code changes. Customer-initiated failover is available for GRS, RA-GRS, GZRS and RA-GZRS accounts. To learn more, see Disaster recovery and account failover.

Read on for several more improvements.
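As a rough illustration of the customer-initiated part, here is a hedged sketch using the Python azure-mgmt-storage management SDK; the subscription, resource group, and account names are placeholders, and I'm assuming the SDK's begin_failover long-running operation:

from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

# Placeholder values -- substitute your own subscription, resource group, and account.
subscription_id = "<subscription-id>"
resource_group = "my-resource-group"
account_name = "mygrsaccount"  # must be GRS, RA-GRS, GZRS, or RA-GZRS

client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

# Kick off the failover; the secondary replica becomes the new primary.
poller = client.storage_accounts.begin_failover(resource_group, account_name)
poller.result()  # long-running operation; blocks until the failover completes
print("Failover complete for", account_name)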


Query Acceleration for Blob Storage and Data Lake Gen2

James Serra takes us through Query Acceleration for Azure Blob Storage and Azure Data Lake Storage Gen2:

Just announced is Query Acceleration for Azure Data Lake Storage Gen2 (ADLS) as well as Blob Storage. This is a new capability for ADLS that enables applications and analytics frameworks to dramatically optimize data processing by retrieving only the data that they require to perform a given operation from storage. This reduces the time and processing power that is required to query stored data.

For example, if an application will execute a SELECT statement that filters columns and rows from a csv file, instead of pulling the entire csv file over the network into the application and then filtering the data, it will do the filtering at the time the data is read from the disk, so that only the filtered data is transferred over the network to the application. So if you have a csv file with 50 columns and 1 million rows, but the filters limit the data to 5 columns and 1000 rows, then only the 5 columns and 1000 rows will be retrieved from the disk and sent over the network to the application.

Click through to learn more, including current libraries which support this and information on the additional cost. I’d really like to see PolyBase support this, as it would alleviate one of the problems with using Blob Storage + PolyBase: the need to pull all of that data down to your SQL Server instance before doing any filtering.
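For a sense of what that looks like from client code, here is a small sketch using the azure-storage-blob package's query_blob call; the connection string, container, blob, and column names are all placeholders:

from azure.storage.blob import BlobClient, DelimitedTextDialect

# Placeholder connection details -- substitute your own account, container, and blob.
blob = BlobClient.from_connection_string(
    "<connection-string>", container_name="data", blob_name="big-file.csv"
)

# The filter is pushed down to storage, so only the selected columns and
# matching rows travel over the network rather than the whole csv file.
reader = blob.query_blob(
    "SELECT Col1, Col2, Col3, Col4, Col5 FROM BlobStorage WHERE Col1 > 100",
    blob_format=DelimitedTextDialect(has_header=True),
)
print(reader.readall()[:200])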


Querying Database and Log File Sizes with T-SQL

Allen White takes us through an easy technique to check database and log file sizes:

As a consultant, I have to be able to quickly spot problems, and one of the problems I frequently find is transaction log files that are incorrectly sized.

There are two catalog views in the master database which make this easy to do – sys.master_files and sys.databases. The sys.master_files view contains the database and individual file names, and the data_space_id column always has a value of 0 for the log file. The size column returns the value in 8KB pages, so we have to multiply the column by 8, then divide by 1024 to get the size in megabytes (MB).

Click through for the demo.
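The arithmetic is easy to sketch. Here is an illustrative version (not Allen's script) that runs the catalog view query from Python via pyodbc; the connection string and driver are assumptions you would adjust:

import pyodbc

# Assumes a local SQL Server instance and ODBC Driver 17; adjust as needed.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;Trusted_Connection=yes;"
)

# size is reported in 8 KB pages: multiply by 8 for KB, then divide by 1024 for MB.
# data_space_id = 0 identifies the log file for each database.
query = """
SELECT d.name AS database_name,
       mf.name AS file_name,
       CASE WHEN mf.data_space_id = 0 THEN 'LOG' ELSE 'DATA' END AS file_type,
       CAST(mf.size * 8 / 1024.0 AS decimal(18, 2)) AS size_mb
FROM sys.master_files AS mf
JOIN sys.databases AS d ON d.database_id = mf.database_id
ORDER BY d.name, mf.data_space_id;
"""

for row in conn.cursor().execute(query):
    print(f"{row.database_name}\t{row.file_name}\t{row.file_type}\t{row.size_mb} MB")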


Bulk Migration of Data and Log Files in SQL Server

David Fowler shows how you can change file paths for all of your databases in one fell swoop:

You’ve got a SQL Server with a few hundred databases on it (to be honest it doesn’t even need to be quite that many) and you need to move all the data and log files to a new location. Perhaps you’re going to be migrating onto a new, shiny SAN or maybe your disks are just about full and you need to shift a bunch of the files off somewhere else.

The first thing that you’re going to need to do is change the paths of the files in SQL. That’s easy enough to do with an ALTER DATABASE statement.

ALTER DATABASE SQLUndercover MODIFY FILE (NAME = SQLUndercover_Log, FILENAME = 'F:\SQLLogs\SQLUndercover_Log.ldf');

But that’s going to get very tedious very quickly if you’ve got to do that for a whole lotta databases. So to help out, I thought I’d share a little script that I’ve been using for a while (or a variation on it at least) to make the process far easier and generate all the ALTER statements for you.

Click through for the script as well as a bit of advice around the actual moving of the files.
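David's post has the real script in T-SQL; purely as a sketch of the same idea, here is a Python version that reads sys.master_files and prints one ALTER DATABASE ... MODIFY FILE statement per file. The target paths and connection string are placeholders:

import ntpath
import pyodbc

# Hypothetical new locations -- point these at your new SAN or drives.
NEW_DATA_PATH = r"G:\SQLData"
NEW_LOG_PATH = r"G:\SQLLogs"

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;Trusted_Connection=yes;"
)

rows = conn.cursor().execute("""
    SELECT d.name AS database_name, mf.name AS logical_name,
           mf.physical_name, mf.data_space_id
    FROM sys.master_files AS mf
    JOIN sys.databases AS d ON d.database_id = mf.database_id;
""")

# Generate one ALTER statement per file, keeping the original file name but
# swapping in the new directory (data_space_id = 0 means it's a log file).
for row in rows:
    new_dir = NEW_LOG_PATH if row.data_space_id == 0 else NEW_DATA_PATH
    new_path = new_dir + "\\" + ntpath.basename(row.physical_name)
    print(
        f"ALTER DATABASE [{row.database_name}] "
        f"MODIFY FILE (NAME = [{row.logical_name}], FILENAME = '{new_path}');"
    )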


Loading Data into Delta Lake

Prakash Chockalingam takes us through auto-loading Delta Lake from various sources:

Auto Loader is an optimized file source that overcomes all the above limitations and provides a seamless way for data teams to load the raw data at low cost and latency with minimal DevOps effort. You just need to provide a source directory path and start a streaming job. The new structured streaming source, called “cloudFiles”, will automatically set up file notification services that subscribe to file events from the input directory and process new files as they arrive, with the option of also processing existing files in that directory.

This does look interesting.
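For a flavor of the setup, here is a minimal Auto Loader sketch in PySpark; it assumes a Databricks environment (the cloudFiles source is Databricks-specific), and the paths and schema are made up for illustration:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.getOrCreate()

# Hypothetical schema for the incoming JSON files.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_time", TimestampType()),
    StructField("payload", StringType()),
])

# The "cloudFiles" source watches the input directory and picks up new files
# as they arrive, via the file notification services described above.
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .schema(schema)
    .load("/mnt/raw/events/")
)

# Stream the raw records into a Delta table.
(
    raw.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/delta/events/_checkpoints")
    .start("/mnt/delta/events")
)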


Storing Power BI Audit Logs in Blob Storage

Gilbert Quevauvilliers works around a built-in constraint with Power BI Audit Logs:

With the new Power BI Get-PowerBIActivityEvent cmdlet, I wanted to find a way to automate the entire process so that it all runs in the cloud.

One of the current challenges with the Audit logs is that they only store 90 days, so if you want to do analysis for longer than 90 days the log files have to be stored somewhere. Why not use Azure Blob Storage?

Whilst these steps might appear to be rather technical, if you follow them and you have access to an Azure subscription, you can do this too.

Gilbert warns us up-front that this will be a lengthy post and that is quite true. But if you need to hold those audit logs more than 90 days, this is a great way of doing so.
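Gilbert's solution is PowerShell running in the cloud; purely as a sketch of the storage half of the idea, here is how a daily activity-event export might be pushed into a container with the Python azure-storage-blob package. The connection string, container name, and file name are placeholders, and the JSON export itself is assumed to already exist on disk:

from datetime import date
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("powerbi-audit-logs")

# One blob per day keeps the activity events well past the 90-day window.
log_file = f"activity-events-{date.today():%Y-%m-%d}.json"
with open(log_file, "rb") as data:
    container.upload_blob(name=log_file, data=data, overwrite=True)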


Finding the Right Disk and Data Node Sizes in HDFS

Lokesh Jain has some advice when it comes to disk and data node size:

There are two factors to keep in mind when choosing node capacity. These will be discussed in detail in the next sections.

1. Large Disks – total node capacity being the same, using more disks is better as it yields higher aggregate IO bandwidth.
2. Dense Nodes – as nodes get denser, recovery after node failure takes longer.

These factors are not HDFS-specific and will impact any distributed storage service that replicates data for redundancy and serves live workloads.

Click through for specific advice on maximum disk and node sizes.
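To see why density matters, here is a back-of-the-envelope calculation. Every number below is an assumption, but the shape of the result is the point: re-replication time after a node failure grows with the amount of data stored on that node.

# Illustrative figures only -- substitute your own cluster's numbers.
node_capacity_tb = 100        # data stored on the failed node
remaining_nodes = 50          # nodes available to re-replicate that data
per_node_recovery_gbps = 1.0  # bandwidth each node can spare for recovery

data_gigabits = node_capacity_tb * 1000 * 8              # TB -> gigabits
aggregate_gbps = remaining_nodes * per_node_recovery_gbps
recovery_hours = data_gigabits / aggregate_gbps / 3600

print(f"Estimated re-replication time: {recovery_hours:.1f} hours")
# Doubling the node's density to 200 TB roughly doubles that recovery window.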


Tips for Using Azure Storage

James Serra takes us through Azure Data Lake Store Gen2 and Azure Blob Storage:

Azure Data Lake Store (ADLS) Gen2 should be used instead of Azure Blob Storage unless there is a needed feature that is not yet GA’d in ADLS Gen2.

The major features that are missing from ADLS Gen2 are premium tier, soft delete, page blobs, append blobs, and snapshots. The major features that are in preview are archive tier, lifecycle management, and diagnostic logs. Check out all the missing features at Known issues with Azure Data Lake Storage Gen2.

Note that underneath the covers, ADLS Gen2 uses Azure Blob Storage and is simply a layer over blob storage providing additional features (i.e. hierarchical file system, better performance, enhanced security, Hadoop compatible access).

Click through for a bullet point list of useful information.
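To make the hierarchical file system point concrete, here is a small sketch using the azure-storage-file-datalake package, where directories are real objects rather than a naming convention over a flat namespace. The account details are placeholders and the filesystem (container) is assumed to already exist:

from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<account>.dfs.core.windows.net",
    credential="<account-key>",
)

# Assumes a filesystem named "raw" already exists in the account.
filesystem = service.get_file_system_client("raw")

# Directories are first-class objects, so operations like ACLs and renames
# apply to a whole folder rather than to a prefix of flat blob names.
directory = filesystem.create_directory("landing/2020/06")

payload = b'{"hello": "adls gen2"}'
file_client = directory.create_file("events.json")
file_client.append_data(payload, offset=0, length=len(payload))
file_client.flush_data(len(payload))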


Creating a Gen-2 Azure Data Lake Store

Cecilia Brusatori shares how to build a generation-2 data lake in Azure:

Finally, you’ve decided that Data Lake Gen 2 is good for your Data Analytics Scenario and you’ve started the journey, went to the Azure Portal and searched for it. Mhh you don’t see it in the options to create it, let’s try the search bar [typing Data Lake Gen2….] Nothing… Ok maybe you’ve missed something…. nope!
So what is, in fact, a Data Lake Gen 2? It is a blob storage account, optimized for Data Analytics.
Let’s take a look at how you are able to create it!

If you’re used to the first generation, where Azure Data Lake Storage was its own thing, it might take a minute to realize where it went.
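Cecilia works through the portal; the same thing can be sketched from code, since a Gen2 account is just a StorageV2 account with the hierarchical namespace flag switched on. Here is an illustrative example with the azure-mgmt-storage SDK, where the names, region, and SKU are placeholders:

from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Create a StorageV2 account with the hierarchical namespace enabled --
# that flag is what makes it Data Lake Storage Gen2.
poller = client.storage_accounts.begin_create(
    "my-resource-group",
    "mydatalakegen2",
    StorageAccountCreateParameters(
        sku=Sku(name="Standard_LRS"),
        kind="StorageV2",
        location="westeurope",
        is_hns_enabled=True,
    ),
)
account = poller.result()
print(account.primary_endpoints.dfs)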
