Today, we’re going to talk about the Databricks File System (DBFS) in Azure Databricks. If you haven’t read the previous posts in this series, Introduction, Cluster Creation and Notebooks, they may provide some useful context. You can find the files from this post in our GitHub Repository. Let’s move on to the core of this post, DBFS.
As we mentioned in the previous post, there are three major concepts for us to understand about Azure Databricks, Clusters, Code and Data. For this post, we’re going to talk about the storage layer underneath Azure Databricks, DBFS. Since Azure Databricks manages Spark clusters, it requires an underlying Hadoop Distributed File System (HDFS). This is exactly what DBFS is. Basically, HDFS is the low cost, fault-tolerant, distributed file system that makes the entire Hadoop ecosystem work. We may dig deeper into HDFS in a later post. For now, you can read more about HDFS here and here.
Click through for more detail on DBFS.
Moving data between two cloud providers can be painful, and require more provider scripting if doing api calls. For this, you can benefit from a tool that abstracts the calls into a seamless synchronization tool.
I’ve used RClone before when needing to deduplicate several terabytes of data in my own Google Drive, so I figured I’d see if it could help me sync up 25GB of json files from Azure to S3.
You’ll have to do a few of the steps on your own, but this looks like a good way of parking data in two clouds.
However, the command output doesn’t include the total size of each drive, making it impossible to determine the percent free space. If you’re in an environment where a separate team monitors disk space, and has alerts set when free space falls below a certain percentage, you may want to ensure you don’t breach those levels. The following script provides “the big picture” for your servers, since it provides total size, free space, available space, and the percent free. It does require the use of the documented and supported
sys.xp_cmdshellsystem extended stored procedure. The code uses the drive letters returned by
sys.xp_fixeddrivesinside a cursor. Inside the cursor, we call the dos command
fsutil volume diskfree C:to get total capacity and free space, etc:
Click through for the script.
SQL 2017 on Windows Server 2016 behaves the same as SQL 2016 on Windows Server 2016 – “tail of the log” is supported. However, there is no support for PMEM with SQL 2017 on supported Linux distributions (except as a traditional block store). Using PMEM with SQL 2019 on Linux supports what’s known as “enlightenment”, which allows us to place data and log files on DAX formatted volumes, thereby reducing latency considerably. SQL 2019 on Linux also support “tail of the log”.
This is one of those areas where understanding Linux versus Windows administration really pays off, at least until Windows Server supports something like enlightenment.
This first step when using T-SQL to read Extended Files that are stored in an Azure Storage Account is to create a database credential. Of course the credential will provide essential security information to connect to the Azure Storage Account. This first data point you will need is the URL to a blog storage container in you storage account. If you look below, you can see where you would place your storage account name and the blob storage container name.
Dave gives us the grand tour of the configuration process, including where things differ between on-prem SQL Server and Azure SQL Database (which is quite a bit)
One of the key principals of Kubernetes is the ephemerality of Pods. No Pod is every redeployed, a completely new Pod is created. If a Pod dies, for whatever reason, a new Pod is created in its place there is no continuity in the state of that Pod. The newly created Pod will go back to the initial state of the container image defined in the Pod’s spec. This is very valuable for stateless workloads, not so much for stateful workloads like SQL Server.
This means that for a stateful workload like SQL Server we need to store both configuration and data externally from the Pod to maintain state through the recreation of a Pod. Kubernetes give us constructs two constructs to do that, environment variables and Persistent Volumes.
Read on for a good bit of background and a few scripts to help you get started.
A question that often crops up is “Can I use local storage”, the answer is “It depends”. Kubernetes is essentially a container scheduler at its most basic and fundamental level. The ‘Pod’ is the unit of scheduling, containers in the same pod share the same life cycle and always run on the same node. For stateless pods life is reasonably simple and straight forward, for state-full pods, life is a bit more nuanced. If for any reason a node fails, the pods that ran on that node have to be rescheduled to run on a working node, and their storage needs to follow them. This involves un-mounting the volume from the failed node and then mounting it on the node the pod(s) are rescheduled to run on. With basic vanilla hyper-converged storage, i.e. storage and compute in the same chassis, this will ultimately lead to scheduling problems. However, software defined solutions exist that enable this kind of infrastructure to be turned into a storage cluster which allows state to follow pods around the cluster. Some people automatically associated HDFS with local storage, the reason for this is probably because “Back in the day”, the most cost efficient way for Google to scale out its infrastructure was via commodity servers with local disks.
Read the whole thing.
The mssqluser named volume is going to be mounted as /var/opt/sqlserver and the mssqlsystem volume is going to be mounted as /var/opt/mssql. This is the key to the databases automatically being attached in the new container, /var/opt/mssql is the location of the system databases.
If we didn’t mount a named volume for the system databases any changes to those databases (particularly for the master database) would not be persisted so the new container would have no record of any user databases created.
Read the whole thing.
– You may need to consider separate storage accounts if you need to segregate access control (RBAC), virtual networks, access keys, and the like. (Note that RBAC can also be set at the container level too, but ACL type permissions only apply to ADLS Gen2 and not to blob storage.)
– If you don’t need the hierarchical namespace whatsoever (for non-analytical use cases), this could mean a separate storage account. The storage cost is the same but transaction costs are higher when the HNS is enabled (discussed in item #8 of this post).
Click through for more details, including several more tips about Azure Storage Accounts, Azure Blob Storage Containers, and the Azure Storage Blobs themselves.
I was contacted last week by someone who was confused about the WRITELOG wait type. They were seeing lots of these waits, with an average wait time of 18ms. The log was stored on a Raid-1 array, using locally-attached spinning disks in the server. They figured that by moving the log to Raid-1 array of SSDs, they’d reduce the WRITELOGwait time and get better workload throughput.
They did so and got better performance, but were very surprised to now see WRITELOG as the most frequent wait type on the server, even though the average wait time was less than 1ms, and asked me to explain.
Read on for Paul’s explanation of why this is not a scary situation, or is it particularly weird. SQL Server performance is a complicated thing and trying to limit it to one measure or one query can lead you down the wrong path.