Category: Containers

Open-sourcing Kube2Hadoop

Published 2020-06-12 by Kevin Feasel

Cong Gu, et al, announce the open-sourcing of a project:

By default, there is a gap between the security model of Kubernetes and Hadoop. Specifically, Hadoop uses Kerberos, a three-party protocol built on symmetric key cryptography, to ensure any clients accessing the cluster are who they claim to be. In order to avoid frequent authentication checks against a Kerberos server, Delegation Tokens, a lightweight two-party authentication method, were introduced to complement Kerberos authentication. The Hadoop delegation token by default has a lifespan of one day and can be renewed for up to seven days. Kubernetes, on the other hand, uses a certificate-based approach for authentication, and does not expose the owner of a job in any of its public-facing APIs. Therefore, it is not possible to securely determine the authorized user from within the pod using the native Kubernetes API and then use that username to fetch the Hadoop delegation token for HDFS access.
To allow Kubernetes workloads to securely access HDFS, we built Kube2Hadoop, a scalable and secure integration with HDFS Kerberos. This enables AI modelers at LinkedIn to use HDFS data in Kubernetes pods with access control through a user account or a headless account. Headless accounts are oftentimes used to denote a virtual team working on projects that share the same data within the team. The data acquired can then be used in their model exploration and training with Kubeflow components such as the tf-operator and mpi-operator. In this blog, we will describe the design and authentication model of Kube2Hadoop.
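
To make the token timing in that excerpt concrete, here is a minimal sketch of the "one day, renewable for up to seven days" rule. The `DelegationToken` class and its methods are hypothetical names of my own, not part of Hadoop or Kube2Hadoop; only the two durations come from the quoted text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RENEW_INTERVAL = timedelta(days=1)  # default lifespan quoted in the excerpt
MAX_LIFETIME = timedelta(days=7)    # renewal ceiling quoted in the excerpt


@dataclass
class DelegationToken:
    """Hypothetical stand-in for the timing fields of a delegation token."""
    issued_at: datetime
    expires_at: datetime

    @classmethod
    def issue(cls, now: datetime) -> "DelegationToken":
        # A fresh token is good for one renew interval.
        return cls(issued_at=now, expires_at=now + RENEW_INTERVAL)

    def is_valid(self, now: datetime) -> bool:
        return now < self.expires_at

    def renew(self, now: datetime) -> bool:
        """Extend the token by one interval, capped at the max lifetime."""
        hard_limit = self.issued_at + MAX_LIFETIME
        if not self.is_valid(now) or now >= hard_limit:
            return False  # expired tokens cannot be renewed in this model
        self.expires_at = min(now + RENEW_INTERVAL, hard_limit)
        return True
```

In this model, a job that renews its token every 23 hours keeps working until the seven-day mark and then needs a brand-new token, which is the point at which something has to vouch for the user again.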

Read on to see how it works and to find a link to the GitHub repo.


Building a Docker Container of a SQL Server Database

Published 2020-06-08 by Kevin Feasel

I have a post showing how to turn a database in SQL Server into a Docker container:

Today, we’re going to go through the process of turning a database you’ve built into a Docker container. Before we get started, here are the expectations:
1. I want a fully running copy of SQL Server with whatever database I’m using, as well as key components installed.
2. I want this not to be on a persistent volume. In other words, when I destroy the container and create a new one from my image, I want to reset back to the original state. I’m using this for technical demos, where I want to be at the same starting point each time.
3. I want this to be as easy as possible for users of my container. I consider the use of a container here as not particularly noteworthy in and of itself, so the more time I make people spend thinking about how to set up my demo environment, the more likely it is that people will simply give up.
With that preamble aside, let’s get to work!

As a bonus, you can finally learn my real thoughts on medieval France. Fun story around that: a much longer time ago than I’m willing to admit, I played a Hundred Years War scenario in Civilization 2, and the one thing I remember from that scenario is killing the Dauphin. After that, the script spawned a new claimant to the throne, who immediately attacked my troops and died. And then the script spawned yet another new claimant, who met the same fate within a couple turns. And then a third. If I remember correctly, I ran France out of claimants to the throne by the end of it.


SQL Server on a Windows Container

Published 2020-06-02 by Kevin Feasel

Kevin Chant lives dangerously:

In this post I want to cover an interesting experiment that I did with SQL Server installed in a Windows Container. It was fairly involved, and it took a while.
In fact, this is the experiment I was talking about in my recent post about recent Azure Data Studio updates, which you can read about in detail here.

My general philosophy is to avoid Windows containers at all costs, though I’m glad that there are some more adventurous than I.


Patching SQL Server in Docker Containers

Published 2020-05-11 by Kevin Feasel

Rob Farley takes us through updating SQL Server when it lives in a container:

Now, the thing with running SQL in containers is that the concept of downloading a patch file doesn’t work in the same way. If it were regular Linux, the commands would be very simple, such as ‘sudo yum update mssql-server’ in RHEL. But Docker doesn’t quite work the same way, as reflected by the Microsoft documentation which mentions Docker for installing but not in the Update section.

Rob then explains the process. Containers are cattle, not pets. Just make sure your data files live outside the container before you blow it away…


Understanding kubeadm Authentication and Authorization

Published 2020-04-24 by Kevin Feasel

Praveen Sripati takes us through the way that kubeadm handles authorization and authentication for Kubernetes processes:

In the above K8S cluster, the default user (kubernetes-admin) created during the cluster setup has admin privileges to the cluster. This time I was curious about how authentication and authorization work in K8S for this user to have full access to the cluster. This will enable me to create additional users with different privileges, authentication and authorization mechanisms. It took me some time to get my mind/thoughts around it, but it’s all interesting. This blog is all about the same.
Note that the cluster has been set up using kubeadm; for kops and others, the below varies a little bit. Also, a kubeadm cluster setup uses X509 certificates for authentication by default. Authentication Providers are not built into K8S and so have to be integrated with external systems like Google Accounts, Active Directory, LDAP, etc.
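
As a rough illustration of the X509 piece: Kubernetes reads the username from a client certificate's common name (CN) and group memberships from its organization (O) fields. The sketch below parses a subject string along those lines. The function name and the sample subject are my own assumptions for illustration, and the comma split is deliberately naive (real subjects can contain escaped commas).

```python
def identity_from_subject(subject: str) -> dict:
    """Split a subject like 'O=system:masters, CN=kubernetes-admin'
    into the user (CN) and groups (O) an API server would derive."""
    user = None
    groups = []
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key == "CN":
            user = value          # common name becomes the username
        elif key == "O":
            groups.append(value)  # each organization becomes a group
    return {"user": user, "groups": groups}


# Assumed example subject for a kubeadm-generated admin certificate.
print(identity_from_subject("O=system:masters, CN=kubernetes-admin"))
```

Authorization then hangs off that user and those groups, which is why a certificate carrying the right group can end up with full access to the cluster.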

Click through to see what Praveen learned in the process.


Configuring Kubernetes Pod Eviction Time

Published 2020-04-08 by Kevin Feasel

Andrew Pruski is a Kubernetes slumlord:

The default time that it takes from a node being reported as not-ready to the pods being moved is 5 minutes.
This really isn’t a problem if you have multiple pods running under a single deployment. The pods on the healthy nodes will handle any requests made whilst the pod(s) on the downed node are waiting to be moved.
But what happens when you only have one pod in a deployment? Say, when you’re running SQL Server in Kubernetes? Five minutes really isn’t an acceptable time for your SQL instance to be offline.
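
For context, that five-minute default comes from the 300-second tolerations Kubernetes adds to pods for the `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` taints. One way to shorten the wait is to set those tolerations explicitly on the pod template. The sketch below builds such a patch as plain data; the helper name and the 10-second value are illustrative assumptions of mine, and this is not necessarily the approach Andrew takes in his post.

```python
import json


def eviction_tolerations(seconds: int) -> list:
    """Build tolerations that let a pod stay on a failed node for `seconds`."""
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in ("node.kubernetes.io/not-ready", "node.kubernetes.io/unreachable")
    ]


# Patch body for a Deployment's pod template (10 seconds is an arbitrary example).
patch = {"spec": {"template": {"spec": {"tolerations": eviction_tolerations(10)}}}}
print(json.dumps(patch, indent=2))
```

Setting the value too low has its own cost: a brief network blip can then trigger a move of your only SQL Server pod.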

Click through to see how to handle this scenario.


Image Caching with Docker

Published 2020-04-03 by Kevin Feasel

etash2901 at the Knoldus blog walks us through the way Docker caches images:

If the objects on the file system that Docker is about to produce have not changed between builds, reusing a cache of a previous build on the host is a great time-saver. It makes building a new container really, really fast. None of those file structures have to be created and written to disk this time — the reference to them is sufficient to locate and reuse the previously built structures.
This is an order of magnitude faster than a fresh build. If you’re building many containers, this reduced build-time means getting that container into production costs less, as measured by compute time.
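
A toy model helps show why instruction order matters for the cache. In the sketch below, each layer's key is a hash of its parent's key plus the instruction text, so changing one instruction invalidates every layer after it. This is a simplified model of my own, not Docker's actual implementation (which, for example, also checksums the files behind COPY and ADD); the function names are hypothetical.

```python
import hashlib


def layer_keys(instructions, base="scratch"):
    """Chain a cache key per instruction: hash(parent key + instruction)."""
    keys = []
    parent = base
    for instruction in instructions:
        digest = hashlib.sha256(f"{parent}\n{instruction}".encode()).hexdigest()
        keys.append(digest)
        parent = digest  # the next layer depends on this one
    return keys


def reusable_layers(old, new):
    """Count the leading layers of `new` that match the cached `old` build."""
    count = 0
    for a, b in zip(layer_keys(old), layer_keys(new)):
        if a != b:
            break
        count += 1
    return count


old = ["FROM ubuntu:20.04", "RUN apt-get update", "RUN apt-get install -y curl"]
new = ["FROM ubuntu:20.04", "RUN apt-get update", "RUN apt-get install -y wget"]
print(reusable_layers(old, new))  # only the last layer needs rebuilding
```

The practical upshot is to put the instructions that change least often near the top of the Dockerfile.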

Click through for some advice on how to minimize the amount of time you spend waiting for image layers to download or process.


Using Ephemeral Containers for Debugging Kubernetes-Based Applications

Published 2020-04-01 by Kevin Feasel

Praveen Sripati walks us through the notion of Ephemeral Containers:

It’s always CRITICAL to pack a Container image with the minimal binaries required, as this keeps the attack surface minimal; upgrading and testing the image also become easier, as there are fewer variables to address. Distroless Docker images can be used for the same. In the above diagram Container (A) has only the application and the dependent binaries and nothing more. So, if there are no debugging tools in the Container (A) nor any way to check the status of the process, then how do we debug any problem in the application? Once a pod is created, it’s not even possible to add Containers to it for additional debugging tools.
That’s where the Ephemeral Containers come into the picture, as in the Container (B) in the above picture. These are temporary Containers that can be added to the Pod dynamically with additional debugging tools. Once an Ephemeral Container has been created, we can connect to it as usual using the kubectl attach, kubectl exec and kubectl logs commands.

It’s an interesting approach to the problem.


Diving into Kubernetes: a Workshop

Published 2020-04-01 by Kevin Feasel

Chris Adkin has been busy:

I have not blogged for a while; it was my hope to produce part 5 in the series of creating a Kubernetes cluster for production grade Big Data Clusters. However, there is a very good reason for this: I have been working on a one-day workshop to be delivered at SQL Bits in September. The material can be found here, enjoy!

I’ve only looked at the module listings, but Chris does a great job putting long-form articles together, so I’ve already added it to my todos.


Running SQL Server on a Windows Container

Published 2020-03-18 by Kevin Feasel

Jamie Wick takes us through the less-trodden path:

SQL Server containers are gaining popularity as a way of enhancing and standardizing development environments for Windows & Linux based SQL databases. SQL containers allow developers to have their ‘own’ dedicated copy of a database, usually without the need for extensive server infrastructures. Additionally, a single computer can host multiple containers, each with a different edition/version of SQL Server. This allows the user to quickly switch between environments, without the need to reinstall. Currently, a popular option for implementing containers on Windows-based computers uses Docker.
For those not familiar with containerization, here is a Microsoft article on Windows containers.

I’d definitely prefer to use Linux containers, even on Windows machines. But if Windows-based containers are your thing (or you need to use them for some reason), Jamie’s got you covered.

