In this ultimate guide I will give you a simple step-by-step tutorial on installing Kafka Docker on Kubernetes. This post includes a complete video walk-through.
There has been a lot of interest lately about deploying Kafka to a Kubernetes cluster. If you are wanting to take the deep dive yourself then you found the right article. Now that we have Kafka Docker, deploying a Kafka cluster to Kubernetes is a snap.
This makes it even easier to get started with Kafka in a development environment.
This is quite new syntax in docker and you need at least docker 17.05 (client and daemon), after the images “FROM blah” you can specify a name “core” in this case, then later you can copy from the first image to the second using “–from=” on the “COPY” command.
In this dockerfile I have added Spark 2.4.3 and the default environment variables we need to get spark running, if you grab this dockerfile and run “docker build -t dotnet-spark .” you should get an images you can then run which includes the dependencies for dotnet as well as spark.
Ed includes all of the scripts needed to test this out, too.
In 2016 I created the Max Server Memory Matrix as a guide for configuring the maximum amount of memory that should be assigned to SQL Server, using an algorithm developed by Jonathan Kehayias.
SQL Server 2019 is still in preview as I write this, but I wanted to point out a new feature that Microsoft has added to SQL Server Setup, on the Windows version.
On the Database Engine Configuration screen are two new tabs, called MaxDOP and Memory. These are both new configuration options for SQL Server 2019. Last week we looked at MaxDOP, and this post will specifically look at the Memory tab.
It’s a small change but a nice one.
This will fail before the query really even gets out of the gate. Why? The proc xp_create_subdir cannot create the directory because it requires elevated permissions. The fix for that is easy enough – grant permissions to write to the Database directory after creating it while in sudo mode. I will get to that in just a bit. Let’s see what the errors would look like for now.
Msg 22048, Level 16, State 1, Line 15
xp_create_subdir() returned error 5, ‘Access is denied.’
Msg 25602, Level 17, State 23, Line 36
The target, “5B2DA06D-898A-43C8-9309-39BBBE93EBBD.package0.event_file”, encountered a configuration error during initialization. Object cannot be added to the event session. The operating system returned error 5: ‘Access is denied.
‘ while creating the file ‘C:\Database\XE\PREEMPTIVE_OS_PIPEOPS_0_132072025269680000.xel’
Read on for the solution.
The purpose of an Elastic Job is to execute a T-SQL script that is scheduled or executed ad-hoc against a group of Azure SQL databases. Targets can be in different SQL Database servers, subscriptions, and/or regions. This blog post is quite long and heavy (code wise) so grab a coffee and follow through.
The architecture you could follow is shown below.
All of the code is in Powershell and Arun talks us through it.
This will spin up one container running SQL Server 2019 CTP 3.1, accept the EULA, set the SA password, and set the default location for the database data/log/backup files using named volumes created on the fly.
Let’s convert this using Kompose and deploy to a Kubernetes cluster.
It looks pretty straightforward to use; check it out.
Using AdventureworksDW2016CTP3 we will work with the FactResellerSalesXL table which has 11.6 million rows in it. The simple query we will use as a demo just selects the ProductKey and returns some aggregations grouping them by the different product keys.
First, we will run the query with no existing columnstore index and only using the current clustered rowstore (normal) index. Note that I turned on SET STATISTICS IO and TIME on. These two SET statements will help us better illustrate the improvements provided by the columnstore index. SET STATISTICS IO displays statistics on the amount of page activity generated by the query. It gives you important details such as page logical reads, physical reads, scans, and lob reads both physical and logical. SET STATISTICS TIME displays the amount of time needed to parse, compile, and execute each statement in the query. The output shows the time in milliseconds for each operation to complete. This allows you to really see, in numbers, the differences.
Click through for the example.
Have you ever written an array formula in Excel? (Don’t worry, most people haven’t). Have you ever written a FOR loop in a programming language? (Again, don’t worry, there’s another question coming). Have you every repeated something over and over again, slowly building up to a final result?
That’s what SUMX() does. It loops through a list, performs a calc at each step, and then adds up the results of each step. That’s a pretty simple explanation, but it produces some results in pivots that are nothing short of spectacular.
Read on for a few examples.