Building A Kubernetes Cluster With Kubespray

Chris Adkin continues a series on Kubernetes clusters:

In essence Kubespray is a bunch of Ansible playbooks; yaml file that specify what actions should take place against one or more machines specified in a hosts.ini file, this resides in what is known as an inventory. Of all the infrastructure as code tools available at the time of writing, Ansible is the most popular and has the greatest traction. Examples of playbooks produced by Microsoft can be found on GitHub for automating tasks in Azure and deploying SQL Server availability groups on Linux. The good news for anyone into PowerShell is that PowerShell modules can be installed via Ansible and PowerShell commands can be executed via Ansible. Also, there are people already using PowerShell desired state configuration with Ansible. Ansible’s popularity is down to the facts it is easy to pick up and agent-less because it relies on ssh, hence why one of the steps in this post includes the creation of keys for ssh. This free tutorial is highly recommended for anyone wishing to pick up Ansible.

Click through for a step-by-step tutorial.

Attaching Databases Via Dockerfile

Andrew Pruski shares a better technique for attaching database files held outside of a Docker container:

Now this works a treat. It waits ten seconds for the SQL instance to come up within the container and then runs the sqlcmd script below.
The problem with this is, it’s a bit of a hack. The HEALTHCHECK command isn’t designed to run once, it’ll carry on running every 10 seconds once the container comes up…not great.
So, what’s the better way of doing it?

Andrew gives us a clear explanation of what’s going on and gives a shout out to Bob Ward’s SQL Server on Linux book.

Managing Powershell Core On Non-Windows Machines

Max Trinidad shows us how to grab the latest version of Powershell Core if you aren’t using Windows:

So, if PowerShell Core isn’t available in the package repository, with a few steps you can download and install PowerShell. But, the first thing I do is to remove it before installing.


## - When PowerShell Core isn't available in their repository: (download and execute install)
cd Downloads
sudo dpkg -i powershell_6.1.1-1.ubuntu.18.04_amd64.deb
## - When available in Apt/Apt-Get repository:
sudo apt install -y powershell #-> Or, powershell-preview

Click through for demos of CentOS (or any other yum-based system) and MacOS X.

External Memory Pressure In SQL Server 2019 On Linux

Anthony Nocentino walks us through memory pressure in SQL Server on Linux:

Now in SQL Server 2017 with that 7GB program running would cause Linux to need to make room in physical memory for this process. Linux does this by swapping least recently used pages from memory out to disk. So under external memory pressure, let’s look at the SQL Server process’ memory allocations according to Linux. In the output below we see we still have a VmSize of around 10GB, but our VmRSS value has decreased dramatically. In fact, our VmRSS is now only 2.95GB. VmSwap has increased to 5.44GB. Wow, that’s a huge portion of the SQL Server process swapped to disk.

In SQL Server 2019, there’s a different outcome! In the data below we see our 16GB VmSize which won’t change much because of the virtual address space for the process. With that large external process running SQL Server reduced VmRSS from 7.9GB (from Table 1) to 2.8GB only placing about 4.68GB in the swap file. That doesn’t sound much better, does it? I thought SQL Server was going to react to the external memory pressure…let’s keep digging and ask SQL Server what it thinks about this.

Anthony is doing some great work digging into this.  This is an area where you do have to understand the differences between Windows and Linux.

Basic Linux For The SQL DBA

Kellyn Pot’Vin-Gorman continues her series on getting SQL Server DBAs ramped up on Linux:

Let’s begin with discussing WHY it’s not a good idea to be root on a Linux host unless absolutely necessary to perform a specific task. Ask any DBA for DB Owner or SA privileges, and you will most likely receive an absolute “No” for the response. DBAs need to have the same respect for the host their database runs on. Windows hosts have significantly hardened user security by introducing enhancements and unique application users to enforce similar standards at the enterprise server level, and Linux has always been this way. To be perfectly blunt, the Docker image with SQL Server running as root is a choice that shows lacking investigation to what privileges are REQUIRED to run, manage and support an enterprise database. This is not how we’d want to implement it for customer use.

Unlike a Windows OS, the Linux kernel is exposed to the OS layer. There isn’t a registry that requires a reboot or has a safety mechanism to refuse deletion or write to files secured by the registry or library files. Linux ASSUMED if you are root or if you have permissions to a file/directory, you KNOW what you’re doing. Due to this, it’s even more important to have the least amount of privileges to perform any task required.

Proper deployment would have a unique MSSQL Linux login owning the SQL Server installation and a DBAGroup as the group vs. the current configuration of ROOT:ROOT owning everything. With all the enhancements to security, this is one area that as DBAs, we should request to have adhered to. Our databases should run as a unique user owning the bin files and database processes.

Much of this post is walking us through some basics of security, but it also includes helpful built-in commands unrelated to security, like df to view free disk space.

Hadoop + SQL Server In 2019

Travis Wright shows off a big part of what the SQL Server team has been working on the last couple of years:

SQL Server 2019 big data clusters provide a complete AI platform. Data can be easily ingested via Spark Streaming or traditional SQL inserts and stored in HDFS, relational tables, graph, or JSON/XML. Data can be prepared by using either Spark jobs or Transact-SQL (T-SQL) queries and fed into machine learning model training routines in either Spark or the SQL Server master instance using a variety of programming languages, including Java, Python, R, and Scala. The resulting models can then be operationalized in batch scoring jobs in Spark, in T-SQL stored procedures for real-time scoring, or encapsulated in REST API containers hosted in the big data cluster.

SQL Server big data clusters provide all the tools and systems to ingest, store, and prepare data for analysis as well as to train the machine learning models, store the models, and operationalize them.
Data can be ingested using Spark Streaming, by inserting data directly to HDFS through the HDFS API, or by inserting data into SQL Server through standard T-SQL insert queries. The data can be stored in files in HDFS, or partitioned and stored in data pools, or stored in the SQL Server master instance in tables, graph, or JSON/XML. Either T-SQL or Spark can be used to prepare data by running batch jobs to transform the data, aggregate it, or perform other data wrangling tasks.

Data scientists can choose either to use SQL Server Machine Learning Services in the master instance to run R, Python, or Java model training scripts or to use Spark. In either case, the full library of open-source machine learning libraries, such as TensorFlow or Caffe, can be used to train models.

Lastly, once the models are trained, they can be operationalized in the SQL Server master instance using real-time, native scoring via the PREDICT function in a stored procedure in the SQL Server master instance; or you can use batch scoring over the data in HDFS with Spark. Alternatively, using tools provided with the big data cluster, data engineers can easily wrap the model in a REST API and provision the API + model as a container on the big data cluster as a scoring microservice for easy integration into any application.

I’ve wanted Spark integration ever since 2016 and we’re going to get it.

Running SQL Server 2019 In Docker

Andrew Pruski walks us through setting up SQL Server 2019 CTP 2 on Linux with Docker for Windows:

If you’ve been anywhere near social media this week you may have seen that Microsoft has announced SQL Server 2019.

I love it when a new version of SQL is released. There’s always a whole new bunch of features (and improvements to existing ones) that I want to check out. What I’m not too keen on however is installing a preview version of SQL Server on my local machine. It’s not going to be there permanently and I don’t want the hassle of having to uninstall it.

This is where containers come into their own. We can run a copy of SQL Server without it touching our local machine.

Click through for the step-by-step.

SQL Server On Ubuntu 18.04 LTS

Kevin Feasel



Avanish Panchal shows how to get SQL Server on Linux running on Ubuntu 18.04 LTS:

April 26th, 2018 Ubuntu 18.04 released. A 7th long-term support release of the world’s most popular OSOS available in Desktop, Server, Cloud, and Core versions. Tried quite a bit to install SQL Server 2017 on 18.04, but couldn’t get luck. Investigated quite a bit and came across few reasons. Lot of workaround are mentioned across web, however wondering easiest and simplest way, always.

Potential reason till SQL Server for Linux 2017 CU9 package had dependencies on OPENSSL version(1.0.0) but Ubuntu 18.04 LTS comes with OpenSSL 1.1. Microsoft has update the SQL Server 2017 installation package so it can use libssl1.0.0 pkg. However still found issue while installing today (08-Sep-2018). Realised Microsoft SQL Server on Linux 2017 CU10 was updated but still few changes are required at OS level i.e. Ubuntu 18.04.

Read on for screen shots showing how to fix the problem.

Joining Your SQL Server On Linux VM To A Domain

Dylan Gray and Tejas Shah provides some tips on joining a SQL Server on Linux instance to an existing Active Directory domain:

AD authentication is a popular mechanism for login and user authentication. It works very well in many scenarios, especially for enterprise applications. AD authentication is a supported scenario on SQL Server on Linux. Configuring the Linux VM to join with Active Directory (AD) can be a little tricky at sometimes though, especially in a complex enterprise environment.

  • One error message you may see from “realm join” is “realm: Couldn’t join realm: This computer’s host name is not set correctly.” This is due to a generic hostname (e.g. “localhost”), an incorrect domain in your hostname (e.g. “” instead of “”), or a duplicate hostname on the domain. To fix this, edit /etc/hostname to have a unique hostname and reboot the machine. On Ubuntu, it can also be helpful to put the fully qualified domain name in /etc/hostname (e.g. “” instead of “host1”).


They provide in this post some of the low-hanging fruit answers, where the problem is in basic server configuration.

What To Do After Installing SQL Server On Linux

Manoj Pandey has a few tips for what to do after installing SQL Server on Linux:

Here are some of the best practices post installing SQL Server on Linux that can help you maximize database performance:

1. To maintain efficient Linux and SQL Scheduling behavior, it’s recommended to use the ALTER SERVER CONFIGURATION command to set PROCESS AFFINITY for all the NUMANODEs and/or CPUs. [Setting Process Affinity]

2. To reduce the risk of tempdb concurrency slowdowns in high performance environments, configure multiple tempdb files by adding additional tempdb files by using the ADD FILE command. [tempdb Contention]

3. Use mssql-conf to configure the memory limit and ensure there’s enough free physical memory for the Linux operating system.

Some of these are common for Windows and Linux (like multiple tempdb files) but there are several Linux-specific items here.


January 2019
« Dec