Press "Enter" to skip to content

Curated SQL Posts

Joins With Kafka

Florian Trossbach and Matthias J Sax show the various sorts of joins offered in Kafka, both streams and tables:

Apache Kafka’s Streams API provides a very sophisticated API for joins that can handle many use cases in a scalable way. However, some join semantics might be surprising to developers as streaming join semantics differ from SQL semantics. Furthermore, the semantics of changelog streams and tombstone messages (that are used for deletes) are a new concept in stream processing.

Kafka’s journey from Pub/Sub broker to distributed streaming platform is well underway, and our times as engineers are very exciting!

I didn’t know you could join streams together in Kafka, so that’s really cool.

Comments closed

Clearing Cached Credentials With Powershell

Adam Bertram shows how to use Powershell and cmdkey to clear out cached credentials:

It is even easier to use cmdkey with PowerShell. IT can build a small wrapper script that can manage cached credentials on one remote computer at a time and perform the action just as quickly on multiple computers at once.

The following example uses a PowerShell module called PSCredentialManager. IT pros can download the module from the PowerShell Gallery by running Install-Module.

Read the whole thing.

Comments closed

Rounding Up The Usual Suspects

Arun Sirpal shows us the suspect pages table in msdb:

Did you know that SQL Server tracks suspect pages and puts the details into the msdb database in a table called suspect_pages? Well it does, but the question I want answering is what happens if the page within the suspect pages table is fixed? Does it automatically get removed/updated or do we as the administrators have to do anything manually?

Let’s find out.

It’s a useful table to monitor.

Comments closed

Transactional Replication And SQL On Linux

Phil Grayson shows a way to get transactional replication working on Linux:

Microsoft have stated that transactional replication isn’t supported on Linux and we’re not sure if they intend to in the future. This means that if you try to add the server to a publisher, you get the following message. So you can’t use the GUI and it also means that you can’t use pull as the necessary files won’t be there.

Despite that warning, there is a way to set up a push subscription; click through for that way.

2 Comments

Polybase And RPC Protection

Casey Karst announces that Polybase supports Hadoop RPC protection:

Supporting this configuration allows PolyBase to connect and query Hadoop clusters that have wire encryption turned on. This enables a secure connection between Hadoop and SQL Server; as well as, among the Hadoop Data Nodes.

To connect to a Hadoop cluster with the hadoop.rpc.protection set to privacy or integrity, you will need to alter the core-site.xml file that is installed with PolyBase. This file is generally found at C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf.

That’s good news for Polybase users.

Comments closed

Regular Expression Cheat Sheets

Mara Averick shows off a collection of regular expression guides:

There are helpful string-related R packages 📦, stringr (which is built on top of the more comprehensive stringi package) comes to mind. But, at some point in your computing life, you’re gonna need to get down with regular expressions.

And so, here’s a collection of some of the Regex-related links I’ve tweeted 🐦:

Click through for links to regular expression resources.

Comments closed

Visualizing A Single Number

Tim Bock shows a dozen methods for visualizing a single number:

There are a number of situations in which it can be advantageous to create a visualization to represent a single number:

  • To communicate with less numerate viewers/readers;

  • Infographics and dashboards commonly use one important number;

  • To attract the attention of distracted or busy viewers/readers;

  • To add some humanity or “color”, to create an emotional connection;

  • Or to increase the redundancy of the presentation (see Improve the Quality of Data Visualizations Using Redundancy).

To a great extent, my favorite is the first.  There are good cases for many of the others—primarily the shock value of the uncountable pictogram—but typically, the best visualization is simple.

Comments closed

Date Correlation Optimization

Monica Rathbun explains another quasi-hidden SQL Server configuration option:

According to MSDN – The DATE_CORRELATION_OPTIMIZATION database SET option improves the performance of queries that perform an equi-join between two tables whose date or datetime columns are correlated, and which specify a date restriction in the query predicate.

How many of you read what MSDN says and thinks “wuuuuuttt, English please”? I do.

Read on for the English translation.

Comments closed

Dealing With Noisy Neighbors

Kevin Kline explains what Resource Governor does:

There are lots of ways to manage noisy neighbors. For example, you could spin up additional instances of SQL Server on a single physical or virtual machine (VM), and then segregate the applications to a distinct instance. You could also follow the old adage of “one application, one SQL Server” by putting the SQL Server onto its own machine, either physical or virtual. But that can get very expensive very quickly, depending on your licensing methodology.

If you’re running SQL Server 2008 or later, you might want to investigate Resource Governor as an alternative. Resource Governor lets you create limits on the amount of system resources a database and application can consume. On versions 2008 to 2012, Resource Governor can explicitly limit CPU and memory and, starting with version 2014, limit I/O consumption as well. This is powerful medicine for multi-tenant instances with noisy neighbors!

My response to noisy neighbors is to turn my music up really loud as a passive-aggressive response.  Oh, wait, wrong kind of noisy neighbor…  H/T SentryOne

Comments closed