Press "Enter" to skip to content

Curated SQL Posts

Kafka Cruise Control

Jiangjie Qin announces Cruise Control, an automated workload management system for Kafka:

Intelligent automation is critical under these circumstances, which is why we developed Cruise Control: a general-purpose system that continually monitors our clusters and automatically adjusts the resources allocated to them to meet pre-defined performance goals. In essence, users specify goals, Cruise Control monitors for violations of these goals, analyzes the existing workload on the cluster, and automatically executes administrative operations to satisfy those goals. You can see a video here about Cruise Control at the Stream Processing Meet Up last fall.

Today we are pleased to announce that we have open sourced Cruise Control and it is now available on Github. In this post, we’ll describe Cruise Control’s uses both generally and at LinkedIn, its architecture, and some unique challenges we faced when creating it. For further details about Kafka terminology used throughout this post, this reference can be a helpful guide.

This isn’t a monitoring tool per se, but rather a resource balancing tool.  And it’s now freely available to all.

Comments closed

Getting Distinct Rows In R

Rob J. Hyndman shows four different techniques (one “classic” and three tidyverse) for getting a distinct subset of a data set in R:

So that looks much better — clean, short, and easy to understand. But is it fast? Rather than grabbing the first lines of each group, it has to go searching for duplicates. But avoiding grouping and ungrouping must save some time.

So I ran some microbenchmark timings:

Click through for techniques and timings.  I’m not surprised that the “classic” method won out in terms of time, but for explanatory value, I’d definitely prefer trying to explain the tidyverse distinct version.  H/T R-Bloggers

Comments closed

Azure SQL Database Multi-Factor Authentication

Arun Sirpal notes that the latest version of SQL Server Management Studio supports Multi-Factor Authentication with Azure Active Directory:

Quite a mouth full for a title but never the less very exciting. With the new version of SQL Server Management Studio (SSMS) 17.2 You now have the option to use Azure AD authentication for Universal Authentication with Multi-factor authentication (MFA) enabled, by that I mean use a login via SSMS that is enabled for MFA where below I will show you the two step verification using a push notification to my iPhone. (Yes iPhone I love it)

Download SSMS 17.2 from this link. https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms

Once installed you will see new Authentication options, the option that I want is the one highlighted below – “Active Directory – Universal with MFA support”

Click through for a demo of this.  I wonder if (when?) something like this comes to on-prem, maybe in conjunction with a third-party multi-factor authentication service.

Comments closed

The Downside Of Trusted Assemblies

Solomon Rutzky does not like the Trusted Assembly solution to SQL Server 2017 CLR:

Hopefully, Microsoft removes all traces of “Trusted Assemblies” (as I have suggested here). In either case, please just use Certificates (and possibly Asymmetric Keys, depending on your preference and situation) as I have demonstrated in these past three posts (i.e. Parts 2, 3, and 4). Even better, especially for those using SSDT, would be if Microsoft implemented my suggestion to allow Asymmetric Keys to be created from a binary hex bytes string. But, even without that convenience, there is still no reason to ever, ever, use the “Trusted Assemblies” feature.

He’s given three alternatives so far, so if you’re interested in CLR security, there’s plenty of food for thought.

Comments closed

SQL Code Smells

Phil Factor has updated his code smells compendium:

Some time ago, Phil Factor wrote his booklet ‘SQL Code Smells’, collecting together a whole range of SQL Coding practices that could be considered to indicate the need for a review of the code. It was published as 119 code smells, even though there were 120 of them at the time. Phil Factor has continued to collect them and the current state of the art is reflected in this article. There are now around 150 of these smells and SQL Code Guard is committed to cover as many as possible of them.

I loved this booklet when it came out almost as much as I loved his Confessions of an IT Manager.  If you’re looking for some light reading over a long weekend, you can do a lot worse than this.

Comments closed

Searching In Windbg

Ewald Cress shows us how to search for a four-byte pattern in the Windows debugger:

Cracking open Windbg on 2016 SP1 with the s command to look for byte patterns yielded nothing of value. Maybe something has changed with conventions or indirection? Nope, no joy in 2014 either.

In the end, it took the extremely brave step of RTFM, in this case the Windbg online help, to realise where I was going wrong. I was searching for a four-byte pattern by searching for doublewords. Sounds reasonable on the face of it, but what I had missed was that this specifically required the doublewords to be doubleword-aligned, i.e. starting on an address divisible by four. My method only had a 25% chance of working, so it’s sheer luck I ever got good results with it.

Changing to a byte search for four consecutive bytes gave me the non-aligned semantics my taste buds craved, and the results came pouring in.

This is in the context of gathering information on an uncommon wait type related to columnstore indexes.

Comments closed

Azure Archive Blob Storage

James Serra talks about a new tier of blob storage:

Last year Microsoft introduced Azure Cool Blob storage, which cost customers a penny per GB per month in some Azure regions.  Now, users have another, lower-cost option in Azure Archive Blob Storage, along with new Blob-Level Tiering data lifecycle management capabilities.  So there are now three Azure blog storage tiers: Hot, Cool, and Archive.

Azure Archive Blob Storage costs 0.18 cents per GB per month when the service is delivered through its cloud data center in the East US 2 (for comparison, in the same region hot is 1.8 cents and cool is 1.0 cents per GB per month) .  Customers can expect a 99 percent availability SLA (service level agreement) when the service makes its way out of the preview stage.

This is Azure’s response to AWS Glacier.  The immediate sticker price is a bit higher, but if there aren’t any incremental costs associated with deletion, uploading, or retrieving files, then it could end up matching Glacier in TCO.

Comments closed

R Versus Python

Vincent Granville believes that Python is overtaking R in the realm of data science:

We use the app in question to compare search interest for R data Science versus Python Data Science, see above chart.  It looks like until December 2016, R dominated, but fell below Python by early 2017. The above chart displays an interest index, 100 being maximum and 0 being minimum. Click here to access this interactive chart on Google, and check the results for countries other than US, or even for specific regions such as California or New York.

Note that Python always dominated R by a long shot, because it is a general-purpose language, while R is a specialized language. But here, we compare R and Python in the niche context of data science. The map below shows interest for Python (general purpose) per region, using the same Google index in question.

It’s an interesting look at the relative shift between R and Python as a primary language for statistical analysis.

Comments closed

The SQL Has Landed

Drew Furgiuele’s SQL Server in Space project was a success:

We took quite a few readings to try and account for any variations, and when we were done we ended up with a little more lift than we calculated, but not by much. We then proceeded to tie off the balloon. This is by far the most stressful part of the entire operation. Weather balloons have what amounts to a very long, thick neck where you inflate them from. Unlike, say, a party balloon, you don’t just knot the end and call it a day. Instead, you need to tie one end of the neck off to prevent the escape of the helium, then bend the neck back. Then, you tie that off. You’re left with a “U” shaped bend at the bottom of the balloon that you then attach your payload train to: the parachute, then the reflector, and ultimately the payload. Oh, and you have to do this while, you know, trying to wrangle a 7 foot diameter balloon.

Tagged “cloud” even though it was a bright and sunny day.

Comments closed