MI GP uses Azure Premium Storage to store database files for all databases, except for the tempdb database. From the perspective of the database engine, this storage type is remote, i.e. it is accessed over the network, using Azure network infrastructure. To use Azure Premium Storage, MI GP takes advantage of SQL Server native capability to use database files directly in Azure Blob Storage. This means that there is not a disk or a network share that hosts database files; instead, file path is an HTTPS URL, and each database file is a page blob in Azure Blob Storage.
Since Azure Premium Storage is used, its performance characteristics, limits, and scalability goals fully apply to MI GP. The High-performance Premium Storage and managed disks for VMs documentation article includes a section describing Premium Storage disk limits. While the topic is written in the context of VMs and Azure disks, which is the most common usage scenario for Azure Premium Storage, the documented limits are also applicable to blobs. As shown in the limits table in the documentation, the size of the blob determines the maximum IOPS and throughput that can be achieved against the blob. For MI GP, this means that the size of a database file determines the maximum IOPS and throughput that is achievable against the file.
The disk/blob size shown in the limits table is the maximum size for which the corresponding limit applies. For example, a blob that is > 64 GB and <= 128 GB (equivalent to a P10 disk) can achieve up to 500 IOPS and up to 100 MB/second throughput.
Read the whole thing if you’re looking at Managed Instances, but there are some tips for SQL Server in Azure IaaS.
Storing large amounts of data in volatile memory (RAM) is expensive. Modern disk technologies based on SSD are providing fast access to data but at a much lower cost when compared to RAM. Hence, we wanted to move part of the data out of memory without sacrificing availability or performance. The cost to store 1 TB of data on SSD is much lower than storing the same amount in RAM.
We observed during experimentation that RAM random read latencies were rarely higher than 1 microsecond whereas typical SSD random read speeds are between 100–500 microseconds. For EVCache our typical SLA (Service Level Agreement) is around 1 millisecond with a default timeout of 20 milliseconds while serving around 100K RPS. During our testing using the storage optimized EC2 instances (I3.2xlarge) we noticed that we were able to perform over 200K IOPS of 1K byte items thus meeting our throughput goals with latency rarely exceeding 1 millisecond. This meant that by using SSD (NVMe) we were able to meet our SLA and throughput requirements at a significantly lower cost.
NVMe isn’t as fast as RAM, but we are well beyond the days of spinning disk hard drives.
Two of the products I use extensively for this purpose are Amazon Glacier and, more recently, Microsoft Azure Blob Storage Archive Tier. As happy as I’ve been with Amazon Glacier since its introduction in 2012, I always hoped Microsoft would offer a similar service. My wish came true in Fall of 2017 when an archive tier of Azure Blob Storage was announced. Rather than branding this capability as a new product, Microsoft decided to present it as a new tier of Azure Blob Storage, alongside the existing hot and cool storage tiers.
A noticeable difference from the hot and cool storage tiers is that the archive storage tier is only available on a per-blob basis. While a storage account can be configured to have all blobs placed in either the hot or cool tier by default once they are uploaded, the archive tier is not an option. Once a blob is uploaded, it must explicitly be moved into the archive tier. If one is using the Azure Portal to do this, there’s several clicks involved per blob. The free Azure Storage Explorer client is no better. While I found several third party tools that can upload files to the archive tier, none were free. At this point, I decided to write my own method using Powershell, which I am happy to share below.
Read on for the script. A good use for Azure Blob Storage Archive Tier would be storing old database backups which you have to keep around for compliance purposes but rarely use.
In short, ADLS Gen2 is the combination of the current ADLS (now called Gen1) and Blob storage. Gen2 is built on Blob storage. By GA, ADLS Gen2 will have all the features of both, which means it will have features such as limitless storage capacity, support all Blob tiers (Hot, Cool, and Archive), the new lifecycle management feature, Azure Active Directory integration, hierarchical file system, and read-access geo-redundant storage.
A Gen2 capability is what is called “multi-modal” which means customers can use either Blob object store APIs or the new Gen2 file system APIs. The key here is that both blob and file system semantics are now supported over the same data.
One very interesting thing to me is that Gen2 pricing is half of Gen1.
To identify the allocation unit size for a volume, we can use the fsutil.exe utility. In the output you are looking for “Bytes Per Cluster” which is your allocation unit size. Here is an example to retrieve the information for the G:\ volume.
Ryan also shows how to change the allocation size, should you need to do so.
Of course, PowerShell excels at this. By using the SQL Server module, it’s really easy to:
- Connect to an instance and collect every user database, and
- From each database, collect every table, and
- For each table, collect row counts and space used, and
- If there are any indexes, group them, and sum their usage and report that as well
Here’s the script. Note that I have the server name hard-coded in there as localhost (more on that in a coming paragraph). Go ahead and take a look before we break it down.
Click through for the script, and homework is due next Tuesday on his desk.
This is just great on its own as I get information shown below. It is basically data like this at the Database/FileGroup/File levels
Free of Max size
Size as a percent at the instance level
Free size as a percent at the instance level
Max size as a percent at the instance level
Free of Max size as a percent at the instance level
Read on to see how Jana makes use of this data, as well as where you can get the code.
There are advantages and disadvantages to each approach. The tiered approach has the most flexibility for an operator to tune the performance of the cluster while trading off size of the hot data zone for better performance or smaller resource footprint. The downside of this approach is that, having data on HDFS, resizing the cluster is a slow and tedious process due to HDFS needing to be rebalanced to achieve performance and fault-tolerance expectations. Thus this architecture is generally only used for statically sized clusters with steady, well-known workloads.
The decoupled architecture, on the other hand, enables maximum flexibility for cluster growth and reduction. For example, a cluster could run at 100 nodes during the day to support analytics and reporting and then shrink to 24 nodes overnight to support smaller ETL workloads. Historically, the disadvantage to decoupling is that cloud storage is not local and therefore could drastically affect runtime of the analytical workloads (hence the hybrid approach of tiered storage). However, the advent of LLAP in Hive 2.0 has limited this overhead making the approach far more attractive. The dynamic cache within LLAP also means that we do not need to statically define what data is hot. It can be inferred at query time (i.e., as users access the data, that data will become hot). We will look closer at how LLAP closes the runtime gap in the next section.
Historically, the argument was that you should avoid S3 in part because it’s relatively flaky compared to disks (in terms of performance and in its eventual consistency model). It looks like that’s no longer a pressing concern.
Arguably, the biggest thing about S2D is that the solutions currently have to be certified (see this bit of documentation from MS for more detail). This obviously doesn’t really affect, say, virtualized versions or ones up in the public cloud such as in Azure in a meaningful way, but it’s still technically a requirement much like logoed hardware for Windows Server supportability. Anyone want to point me to the logo stamped on your VMs? Didn’t think so. Now, from a pure FCI perspective none of this is an issue. The way a Windows Server failover cluster (WSFC) is currently designed, it is expecting that all nodes participating in the WSFC are also using/needed S2D. Why am I mentioning this? Disaster recovery.
It shouldn’t stop you from moving forward with S2D, but means you’ll probably have a bit more research on your hands.
What is Diskspd?
Diskspd is a storage testing tool created by Microsoft Windows, Windows Server and Cloud Server Infrastructure Engineering teams. It combines robust and granular IO workload definition with flexible runtime and output options. That makes it a perfect tool for storage performance testing, validation and benchmarking.
Where to find Diskspd?
Diskspd is a free and open source utility. Its source code can be found on GitHub. The repository also hosts other frameworks which use Diskspd. You can find them under ‘Frameworks’ directory. A binary release is hosted by Microsoft at the following location: http://aka.ms/diskspd.
Click through for more details, including an example of a poorly-performing I/O solution.