Press "Enter" to skip to content

Category: Containers

Using Specific R Package Versions in Docker Images

Roman Lustrik shares how to fix package versions in Docker images:

Using package in R is easy. You install from CRAN using install.packages("packagename"), it resolves dependencies and you’re good to go. What R natively doesn’t handle so well is installing a particular package version without jumping through hoops. Technically you need the source file of the package version you want to install AND all source files of the dependencies (in the correct version, of course). This has been made almost seamless with packages packrat and recently, renv.

This comes handy when you are constructing a Docker file to run in production. Usually you want to run this defensively and do not want things to change from one image build to another. To get there, you can save all your package names and version into a file (renv.lock) and use that to reconstruct the now defined package structure with predictable versions (see renv vignette here).

This is quite useful as R package developers tend not to covet backwards compatibility, and one of the key benefits of containers is to have the option to keep the same code base and configuration in all environments.

Leave a Comment

Creating a New Container from a SQL Server on Windows Dockerfile

Jamie Wick continues a series on SQL Server and Windows containers:

The docker build command sends the contents of the working directory, along with a dockerfile, to the Docker daemon, as a build context, to create the new image. A dockerfile is a plain text file that contains the name of a (base) image, along with a set of instructions for modifying the image. By default, the dockerfile is assumed to be in the root of the working directory, but a separate location can be specified using the -f parameter in the build command. Additionally, the -t parameter can be used to specify a repository and tag for the new image. Finally, the working directory can be specified using a Path or URL. In the example below, the current directory (.) is being used as the working directory (the docker build command is being run at the root level of the working directory).

Read on for examples.

Leave a Comment

OpenShift and SQL Server Big Data Clusters

Chris Adkin explains why support for OpenShift is important for SQL Server Big Data Clusters:

One thing that should become immediately apparent when installing and administering an OpenShift cluster, is that it is a lot more prescriptive and opinionated that vanilla Kubernetes. The simple reason for this is that OpenShift is intended to be deployed to environments that require enterprise grade levels of hardening and security. For example, Red Hat mandates the operating system distributions you must use, to the extent that when deploying a cluster on VMware – Red Hat’s documentation recommends the use of OVA’s, compressed files containing install-able virtual machines.

Read on for the full story.

Leave a Comment

Database Restoration in Docker

John Morehouse gives us one way to restore a database in Docker:

Here are the steps that we will take to make this work:

1. Download one of the sample databases from I have a “mssql” directory in my local profile to make things easier
2. Make sure the container is started.  You can issue a “docker ps” command terminal to see which containers are running
3. Create a directory within the container
4. Copy the sample database backup file into the directory you just created
5. Restore the database onto the SQL instance that is running within the container

The set of steps is fine and it’s what I normally do, though someone did suggest to set up an external volume linking, e.g., /var/opt/mssql/backups outside the container. That way, you can drop your backup file in and it’ll be there without the copy step.

Leave a Comment

Open-sourcing Kube2Hadoop

Cong Gu, et al, announce the open-sourcing of a project:

By default, there is a gap between the security model of Kubernetes and Hadoop. Specifically, Hadoop uses Kerberos, a three-party protocol built on symmetric key cryptography to ensure any clients accessing the cluster are who they claim to be. In order to avoid frequent authentication checks against a Kerberos server, Delegation Tokens, a lightweight two-party authentication method, was introduced to complement Kerberos authentication. The Hadoop delegation token by default has a lifespan of one day and can be renewed for up to seven days. Kubernetes, on the other hand, uses a certificate-based approach for authentication, and does not expose the owner of a job in any of its public-facing APIs. Therefore, it is not possible to securely determine the authorized user from within the pod using the native Kubernetes API and then use that username to fetch the Hadoop delegation token for HDFS access.  

To allow for Kubernetes workloads to securely access HDFS, we built Kube2Hadoop, a scalable and secure integration with HDFS Kerberos. This enables AI modelers at LinkedIn to use HDFS data in Kubernetes pods with access control through a user account or a headless account. Headless accounts are oftentimes used to denote a virtual team that is working on projects that would share the same data within the team. The data acquired can then be used in their model exploration and training with KubeFlow components such as the tf-operator and mpi-operator. In this blog, we will describe the design and authentication model of Kube2Hadoop. 

Read on to see how it works and a link to the GitHub repo.

Leave a Comment

Building a Docker Container of a SQL Server Database

I have a post showing how to turn a database in SQL Server into a Docker container:

Today, we’re going to go through the process of turning a database you’ve built into a Docker container. Before we get started, here are the expectations:

1. I want a fully running copy of SQL Server with whatever database I’m using, as well as key components installed.
2. I want this not to be on a persistent volume. In other words, when I destroy the container and create a new one from my image, I want to reset back to the original state. I’m using this for technical demos, where I want to be at the same starting point each time.
3. I want this to be as easy as possible for users of my container. I consider the use of a container here as not particularly noteworthy in and of itself, so the more time I make people think trying to set up my demo environment, the more likely it is that people will simply give up.

With that preamble aside, let’s get to work!

As a bonus, you can finally learn my real thoughts on medieval France. Fun story around that: a much longer time ago than I’m willing to admit, I played a Hundred Years War scenario in Civilization 2, and the one thing I remember from that scenario is killing the Dauphin. After that, the script spawned a new claimant to the throne, who immediately attacked my troops and died. And then the script spawned yet another new claimant, who met the same fate within a couple turns. And then a third. If I remember correctly, I ran France out of claimants to the throne by the end of it.

Comments closed

SQL Server on a Windows Container

Kevin Chant lives dangerously:

In this post I want to cover an interesting Windows Container with SQL Server installed experiment that I did. Because it was fairly involved, and it took a while.

In fact, this is the experiment I was talking about in my recent post about recent Azure Data Studio updates. Which you can read about in detail here.

My general philosophy is to avoid Windows containers at all costs, though I’m glad that there are some more adventurous than I.

Comments closed

Patching SQL Server in Docker Containers

Rob Farley takes us through updating SQL Server when it lives in a container:

Now, the thing with running SQL in containers is that the concept of downloading a patch file doesn’t work in the same way. If it were regular Linux, the commands would be very simple, such as ‘sudo yum update mssql-server’ in RHEL. But Docker doesn’t quite work the same way, as reflected by the Microsoft documentation which mentions Docker for installing but not in the Update section.

Rob then explains the process. Containers are cattle, not pets. Just make sure your data files live outside the container before you blow it away…

Comments closed

Understanding kubeadm Authentication and Authorization

Praveen Sripati takes us through the way that kubeadm handles authorization and authentication for Kubernetes processes:

In the above K8S cluster, the default user (kubernetes-admin) created during the cluster setup has admin privileges to the cluster. This time I was curious on how the authentication and authorization work in K8S for this user to have full access to the cluster. This will enable me to be create additional users with different privileges, authentication and authorization mechanisms. It took me some time to get my mind/thoughts around it, but it’s all interesting. This blog is all about the same.

Note that the cluster has been setup using kubeadm, for kops and other the below varies a little bit. And also, kubeadm cluster setup default used X509 certificates for authentication. Authentication Providers are not built into K8S and so has to be integrated with external systems like Google Accounts, Active Directory, LDAP etc.

Click through to see what Praveen learned in the process.

Comments closed

Configuring Kubernetes Pod Eviction Time

Andrew Pruski is a Kubernetes slumlord:

The default time that it takes from a node being reported as not-ready to the pods being moved is 5 minutes.

This really isn’t a problem if you have multiple pods running under a single deployment. The pods on the healthy nodes will handle any requests made whilst the pod(s) on the downed node are waiting to be moved.

But what happens when you only have one pod in a deployment? Say, when you’re running SQL Server in Kubernetes? Five minutes really isn’t an acceptable time for your SQL instance to be offline.

Click through to see how to handle this scenario.

Comments closed