Category: Containers

Whither Running Kafka On Kubernetes

Gwen Shapira walks through some of the costs and benefits of using Kubernetes to host your Apache Kafka brokers:

First, if you are running most of your other applications and microservices on Kubernetes, it becomes the organizational path of least resistance. This is just like how organizations who standardized on VMs have found it very difficult to allocate physical machines with local disks for Kafka.

I see situations with larger organizations where deploying Kafka outside of Kubernetes causes significant organizational headache that involves many approvals. When this is the case, I usually say that this isn’t a good hill to die on. It is possible to run Kafka on Kubernetes, so just do it. You’ll get your environment allocated faster and will be able to use your time to do productive work rather than fight an organizational battle.
And if things go wrong, you’ll get much better service from your internal infrastructure teams, because you’ll be running in an environment that is familiar to them.

Read on for more benefits as well as a few drawbacks.

Comments closed

What’s New With Docker For Windows Server 2019

Elton Stoneman walks us through several additions to Docker support on Windows Server 2019:

5. Volume mounts have usable directory paths

Docker volumes are how you separate storage from the lifecycle of your containers. You attach a volume to a container, and it surfaces as a directory in the container’s filesystem. Your app writes to C:\jenkins (or whatever path you mount) and the data actually gets stored in the volume, which could be storage on the Docker host – like a RAID array on the server – or a separate storage unit in the datacenter, or a cloud storage service.

The mount inside the container should be transparent to the app, but actually in Windows Server 2016 the implementation used symlink directories, and that caused a few problems.

Elton notes that Docker support on Windows is now approaching that of Linux, so check out some of the gaps that have been filled with the latest server release.

Using Containers To Build A Home Lab

Dmitri Korotkevitch walks us through creating a home lab with Docker containers:

Obviously, in real life, we do not work with a vanilla SQL Server installation. We need to customize it by changing SQL Server settings and logins, creating and/or restoring databases, and performing other actions. There are a couple of ways to do that.

The first approach is customizing an existing container manually and creating an image from it using the docker container commit command. After that, you can start new containers from the created image the same way as we already discussed. We will cover a couple of ways to move data to and from containers later.

There is a better way, however. You can automate this process by utilizing the docker build command. The process is very simple. You just need to define a Dockerfile, which contains the reference to the main image and specifies the build actions. You can copy scripts and database backups into the image and run SQLCMD, BCP and PowerShell scripts there; you pretty much have full control. Internally, Docker runs every command inside deployment containers (creating and destroying them during the process), saving the final one as the target image.

Read the whole thing.

Deploying An Azure Container Within A Virtual Network

Andrew Pruski shows us that you can now deploy an Azure container running SQL Server within an Azure virtual network:

Up until now Azure Container Instances only had one option to allow us to connect. That was assigning a public IP address that was directly exposed to the internet.

Not really great, as exposing SQL Server on port 1433 to the internet is generally a bad idea.

Now I know there’s a lot of debate about whether or not you should change the port that SQL is listening on to prevent this from happening. My personal opinion is that if someone wants to get into your SQL instance, changing the port isn’t going to slow them down much. However, a port change will stop opportunistic hacks (such as the above).

But now we have another option: the ability to deploy an ACI within a virtual network in Azure! So let’s run through how to deploy.

Click through for those instructions.

Stateful Services With Kubernetes

Kevin Sookocheff explains some scenarios in which stateful Kubernetes services can work well:

With leader election, you begin with a set of candidates that wish to become the leader, and each of these candidates races to see who will be the first to be declared the leader. Once a candidate has been elected the leader, it continually sends a heartbeat signal to keep renewing its position as the leader. If that heartbeat fails, the other candidates again race to become the new leader. Implementing a leader election algorithm usually requires either deploying software such as ZooKeeper or etcd and using it to determine consensus, or, alternatively, implementing a consensus algorithm on your own. Neither of these is ideal: ZooKeeper and etcd are complicated pieces of software that can be difficult to operate, and implementing a consensus algorithm on your own is a road fraught with peril. Thankfully, Kubernetes already runs an etcd cluster that consistently stores Kubernetes cluster state, and we can use that cluster to perform leader election simply by going through the Kubernetes API server.

Kubernetes already uses the Endpoints resource to represent a replicated set of pods that comprise a service and we can re-use that same object to retrieve all the pods that make up your distributed system. Given this list of pods, we leverage two other properties of the Kubernetes API: ResourceVersions and Annotations. Annotations are arbitrary key/value pairs that can be used by Kubernetes clients, and ResourceVersions mark the unique version of every Kubernetes resource in the cluster. Given these two primitives, we can perform leader election in a fairly straightforward manner: query the Endpoints resource to get the list of all pods running your service, and set Annotations on those resources. Each change to an Annotation also updates the ResourceVersion metadata. Because the Kubernetes API server is backed by etcd, a strongly consistent datastore, you can use Annotations and the ResourceVersion metadata to implement a simple compare-and-swap algorithm.

Google has used this approach to implement leader election as a Kubernetes Service, and you can run that service as a sidecar to your application to perform leader election backed by etcd. For more on running a leader election algorithm in Kubernetes, refer to this blog post.
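
The compare-and-swap idea described above can be sketched without a cluster. What follows is a minimal in-memory simulation in Python, not the Kubernetes client API: `FakeEndpoints`, `ConflictError`, `try_acquire`, `elect`, and the annotation key are hypothetical names invented for illustration, and heartbeats and lease renewal are deliberately left out.

```python
import threading


class ConflictError(Exception):
    """Raised when an update carries a stale resource version."""


class FakeEndpoints:
    """In-memory stand-in for an Endpoints object (hypothetical, not a real client)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._annotations = {}
        self._resource_version = 0

    def read(self):
        """Return a copy of the annotations and the version they were read at."""
        with self._lock:
            return dict(self._annotations), self._resource_version

    def update(self, annotations, expected_version):
        """Write only if nobody else has written since expected_version."""
        with self._lock:
            if expected_version != self._resource_version:
                raise ConflictError("resource version changed")
            self._annotations = dict(annotations)
            self._resource_version += 1
            return self._resource_version


LEADER_KEY = "example.invalid/leader"  # hypothetical annotation key


def try_acquire(store, candidate):
    """Attempt to become leader with a single compare-and-swap."""
    annotations, version = store.read()
    current = annotations.get(LEADER_KEY)
    if current is not None:
        # Someone already holds the annotation.
        return current == candidate
    annotations[LEADER_KEY] = candidate
    try:
        store.update(annotations, expected_version=version)
    except ConflictError:
        # Another candidate wrote first, so this candidate lost the race.
        return False
    return True


def elect(candidates):
    """Race all candidates in threads and report who won."""
    store = FakeEndpoints()
    results = {}

    def worker(name):
        results[name] = try_acquire(store, name)

    threads = [threading.Thread(target=worker, args=(c,)) for c in candidates]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store, results


if __name__ == "__main__":
    store, results = elect(["pod-a", "pod-b", "pod-c"])
    print("leader:", store.read()[0][LEADER_KEY])
```

The version check inside `update` is what makes the race safe: however many candidates read version 0, only one write against version 0 can succeed, and every other candidate either sees the annotation already set or gets a conflict.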

Statefulness is one of the problems that container platforms like Docker are striving to solve, but I don’t think they have it quite nailed down yet.

Running SQL Server 2019 In Docker

Andrew Pruski walks us through setting up SQL Server 2019 CTP 2 on Linux with Docker for Windows:

If you’ve been anywhere near social media this week you may have seen that Microsoft has announced SQL Server 2019.

I love it when a new version of SQL is released. There’s always a whole new bunch of features (and improvements to existing ones) that I want to check out. What I’m not too keen on however is installing a preview version of SQL Server on my local machine. It’s not going to be there permanently and I don’t want the hassle of having to uninstall it.

This is where containers come into their own. We can run a copy of SQL Server without it touching our local machine.

Click through for the step-by-step.

SQL Server 2019 Containers Available

The SQL Server team has a getting started post on pulling down the latest CTP in a container, as well as some additional container features:

SQL Server 2019 is now available on Red Hat Enterprise Linux as Red Hat Certified Container Images and as Ubuntu-based container images, enabling you to take advantage of the latest SQL Server engine innovations such as new SQL Graph features and Data Discovery and Classification. We are also making it possible to adopt SQL Server in containers with existing scenarios such as Replication and Distributed Transactions, which are now part of SQL Server 2019 on Linux.

This makes it easier to get started with SQL Server 2019 without potentially messing up your already-working systems.

The Basics Of Kubernetes

Chris Adkin shares some thoughts on what Kubernetes is and why it might be interesting to data platform professionals:

I strongly urge anyone with an interest in learning Kubernetes to watch this presentation, as it does a great job of explaining Kubernetes from the ground up.

“Out of the box”, Kubernetes will look after scheduling. If there is a requirement to ensure that pods only run on a specific set of nodes, there is a means of doing this via label selectors, as documented here. A label selector is a directive for influencing the resource-utilization decisions made by the cluster.
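
To make the label selector idea concrete, here is a small Python sketch of equality-based matching. It is an illustration only, not how the Kubernetes scheduler is implemented; the node names, the label keys and values, and the functions `matches` and `eligible_nodes` are all assumptions made for this example.

```python
def matches(selector, labels):
    """Return True when every key/value pair in the selector appears in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def eligible_nodes(nodes, selector):
    """List the names of nodes whose labels satisfy the selector."""
    return [name for name, labels in nodes.items() if matches(selector, labels)]


# Hypothetical cluster: node name -> labels attached to that node.
nodes = {
    "node-a": {"disktype": "ssd", "zone": "east"},
    "node-b": {"disktype": "hdd", "zone": "east"},
    "node-c": {"disktype": "ssd", "zone": "west"},
}

if __name__ == "__main__":
    # A pod that asks for SSD-backed nodes can only land on node-a or node-c.
    print(eligible_nodes(nodes, {"disktype": "ssd"}))
```

Adding more pairs to the selector narrows the candidate set, and an empty selector places no restriction at all.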

Read the whole thing.

Removing The Windows Container Service

Melody Zacharias shows us how to remove the Windows container service from a machine:

This week I wanted to try out a new product from a vendor. I thought I might use it for a demo, so I wanted to be able to run it on my laptop. You never know when you will go to do a demo and be unable to access Azure, so I always try to put my demos on my laptop as a backup, just in case.

The product required me to remove all previous versions of Windows containers on my machine. The vendor recommended I use this PowerShell command:

Remove-WindowsFeature Containers

I was fairly sure I did not have any but wanted to check to make sure.

Melody walks through a few tricky issues, including the difference between the command on Windows 10 versus Windows Server.

Thoughts On The Evolution Of Big Data

Praveen Sripati shares an opinion on where the various Hadoop and Big Data platforms are headed:

The different Cloud vendors have been offering Big Data as a service for quite some time. Athena, EMR, Redshift, and Kinesis are a few of the services from AWS. There are similar offerings from Google Cloud, Microsoft Azure, and other Cloud vendors as well. All these services are native to the Cloud (built for the Cloud) and provide tight integration with the other services from the Cloud vendor.

In the case of Cloudera, MapR, and Hortonworks, the Big Data platforms were not designed with the Cloud in mind from the beginning; the platforms were later plugged or force-fitted into the Cloud. The Open Hybrid Architecture Initiative is an effort by Hortonworks to make its Big Data platform more Cloud native.

It’ll be interesting to see where this goes.
