Kafka has built-in fault tolerance by replicating partitions across a configurable number of brokers. However, when a broker fails and a new replacement broker is added, the replacement broker fetches all data the original broker previously stored from other brokers in the cluster that host the other replicas. Depending on your application, this could involve copying tens of gigabytes or terabytes of data. Fetching this data takes time and increases network traffic, which could impact the performance of the Kafka cluster for the period the data transfer is happening.
EBS volumes are persisted when an instance fails or is terminated. When an EC2 instance running a Kafka broker fails or is terminated, the broker’s on-disk partition replicas remain intact and can be mounted by a new EC2 instance. By using EBS, most of the replica data for the replacement broker will already be in the EBS volume and hence won’t need to be transferred over the network. Only data produced since the original broker failed or was terminated will need to be fetched across the network.
There are some good insights here; read the whole thing if you’re thinking about running Kafka.
I used to spend a lot of time doing patching, and I had plenty of times when:
Servers wouldn’t come back up after a reboot. Someone had to go into the iLo/Rib card and give them a firm shove
Shutdown took forever. SQL Server can be super slow to shut down! I understand this better after reading a recent post on the “SQL Server According to Bob” blog. Bob Dorr explains that when SQL Server shuts down, it waits for all administrator (sa) level commands to complete. So, if you’ve got any servers where jobs or applications are running as sa, well…. hope they finish up fast.
Patching accidentally interrupted something important. Some process was running from an app server, etc, that failed because patching rebooted the server, and it fired off alarms that had to be cleaned up.
Something failed during startup after reboot. A service started and failed, or a database wasn’t online. (Figuring out “was that database offline before we started?” was the first step. Ugh.)
Miscommunication caused a problem on a cluster. Whoops, you were working on node2 while I was working on node1? BAD TIMES.
This is a really good post. Kendra’s done a lot more patching than I have, and she’s definitely though about it in more detail. Me, I’m waiting for the day—which is very close for some companies—in which you don’t patch servers. Instead, you spin up and down virtual apps and virtual servers which are fully patched. It’s a lot harder to do with databases compared to app servers, but if you separate data from compute, your compute centers are interchangeable. When a new OS patch comes out, you spin up new machines which have this patch installed, they take over for the old ones, and after a safe period, you can delete the old versions forever. If there’s a failure, shut down the new version, spin back up the old version, and you’re back alive.
My friend and coworker Melissa Coates (aka @sqlchick) messaged me the other day to see if I could help with a DAX formula. She had a Power BI dashboard in which she needed a very particular interaction to occur. She had slicers for geographic attributes such as Region and Territory, in addition to a chart that showed the percent of the regional total that each product type represented. The product type was in the fact/data table. Region and territory were in a dimension/lookup table and formed a hierarchy where a region was made up of one or more territories and each territory had only one region.
It’s rare to hear me say “MDX was easier” but in this case, MDX was easier…
Before I begin, allow me to perform the data science Airing of Grievances. This is an important part of data analysis which most people gloss over, instead jumping right into the “clean up the dirty data” phase. But no, let’s revel in its filth for just a few moments.
Despite my protestations and complaints, I think there are some reasonable conclusions. If you need to look like you’re working for a couple of hours (or at least want to play around a bit with SQL and R), this is the post for you.
When executed through the SQL Agent, the SQLPS.exe mini-shell is called and the current working directory is switched to the SQLSERVER:\ provider. When you call a cmdlet that uses the FILESYSTEM provider under the context of the SQLSERVER provider the cmdlet will fail.
At this time, we want to remind you of a critical Microsoft Visual C++ 2013 runtime pre-requisite update that may be* required on machines where SQL Server 2016 will be, or has been, installed. Installing this, via either of the two methods described below, will update the Microsoft Visual C++ 2013 runtime to avoid a potential stability issue affecting SQL Server 2016 RTM.
* You can determine if an update is required on a machine via one of the two methods below:
Select View Installed Updates in the Control Panel and check for the existence of either KB3164398 or KB3138367. If either is present, you already have the update installed and no further action is necessary.
Check if the version of %SystemRoot%\system32\msvcr120.dll is 12.0.40649.5 or later. If it is, you already have the update installed and no further action is necessary. (To check the file version, open Windows Explorer, locate and then right-click the %SystemRoot%\system32\msvcr120.dll file, click Properties, and then click the Details tab.)
If you’re running 2016, please make sure that your systems are up to date. This post includes an easy T-SQL query you can run to see if you’re up to date already.
Pick the location based on two factors, Azure Data Factory is not available everywhere so you are limited to use only the ones where it is available. If you pick one where it isn’t available, you will get an error message letting you know why you cannot create the resource. Whenever possible within Azure to pick the same resource where your data lives. There are charges within Azure if you migrate data across resources and no charge if you stay in the same resource. You may want to go look at where the data lives which will be used in Data Factory before deciding where to put it. I always check the Pin to Dashboard option so that I can find the resource later, but it is not required and can be done later. Click on the create button to create a Data Factory Resource. If you have selected Pin to Dashboard you will see a little window which says Deploying Data Factory. This little window goes away once Data Factory is completed, and you will have an entry in the list of resources for Data Factory.
Read the whole thing if you’re thinking of getting started with Azure Data Factory.
Starting with SQL Server 2016, if you have enough RAM and suffering from the TempDB Spills that do have a significant impact on your workload, then you can enable the Trace Flag 9389 that will enable Batch Mode Iterators to request additional memory for the work and thus avoiding producing additional unnecessary I/O.
I am glad that Microsoft has created this functionality and especially that at the current release, it is hidden behind this track, and so Microsoft can learn from the applications before enabling it by default, hopefully in the next major release of SQL Server.
There’s a lot of good stuff in here. Read the whole thing.
Apache Spark 2.0 has officially been released. Vinay Shukla gives us some highlights:
Project Tungsten has completed another major phase and with new completely new stage code generation, significant performance improvements have been delivered. Parquet and ORC file processing have also delivered performance improvements.
Databricks Community Edition offers (tiny) free clusters with Spark 2.0 on top of Scala 2.10 and Scala 2.11.
But despite the longwindedness of it, I love it, because at no point did I need to figure out that Adelaide was in +10:30, or that Eastern was -5:00 – I simply needed to know the time zone by name. Figuring out whether daylight saving should apply or not was handled for me.
It works by using the Windows registry, which has all that information in it, but sadly, it’s not perfect when looking back in time. Australia changed the dates in 2008, and the US changed its dates in 2005 – both countries saving daylight for more of the year. AT TIME ZONE understands this. But it doesn’t seem to appreciate that in Australia in the year 2000, thanks to the Sydney Olympics, Australia started daylight saving about two months earlier. This is a little frustrating, but it’s not SQL’s fault – we need to blame Windows for that. I guess the Windows registry doesn’t remember the hotfix that went around that year. (Note to self: I might need to ask someone in the Windows team to fix that…)
Like most companies dealing with multiple time zones, we ended up building a table and function to translate. This is a nice, built-in way of doing something very similar.