Author: Kevin Feasel

Summarize in KQL

Robert Cain continues a series on KQL:

When data is analyzed, it is seldom done on a row by row basis. Instead, data analysts look at the big picture, looking at total values. For example, the total number of times the disk transfer counter is recorded for a time period may give an indication of disk utilization.

To aggregate these values with KQL, we’ll use the summarize operator.

Read on for plenty of demos.
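As a rough illustration of what summarizing does, here is a minimal Python sketch of a grouped count and total. It is an analogue of the idea, not KQL itself. The sample rows, the column names `CounterName` and `CounterValue`, and the helper `summarize_by` are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a performance-counter table.
rows = [
    {"CounterName": "Disk Transfers/sec", "CounterValue": 12.0},
    {"CounterName": "Disk Transfers/sec", "CounterValue": 8.0},
    {"CounterName": "% Processor Time", "CounterValue": 40.0},
]


def summarize_by(rows, key, value):
    """Group rows by `key` and return {group: (row_count, total_of_value)}."""
    groups = defaultdict(lambda: [0, 0.0])
    for row in rows:
        acc = groups[row[key]]
        acc[0] += 1            # how many rows fell into this group
        acc[1] += row[value]   # running total of the value column
    return {name: (count, total) for name, (count, total) in groups.items()}


summary = summarize_by(rows, "CounterName", "CounterValue")
```

Each key in `summary` is one group, and the row-by-row detail collapses into a count and a total per group.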

Apache Flink Table Store

Jingsong Lee and Jiangjie Qin have an announcement:

As of now it is quite common that people deploy a few storage systems to work with Flink for different purposes. A typical setup is a message queue for stream processing, a scannable file system / object store for batch processing and ad-hoc queries, and a K-V store for lookups. Such an architecture posts challenge in data quality and system maintenance, due to its complexity and heterogeneity. This is becoming a major issue that hurts the end-to-end user experience of streaming and batch unification brought by Apache Flink.

The goal of Flink table store is to address the above issues. This is an important step of the project. It extends Flink’s capability from computing to the storage domain. So we can provide a better end-to-end experience to the users.

Click through to see how table storage works.

T-SQL Order of Execution and Aliases

Joe Billingham explains why you can’t do that thing you want to do:

So, you have just written a query, hit execute and you have encountered an error: Invalid column name ‘[column name]’.

The column you’ve used in your WHERE clause cannot be identified by its alias. You’ve defined it at the top of the query and used it fine previously as your ORDER BY condition, so why can’t the engine recognise it?

Read on for the answer. This is why some people I know have wanted a SQL-like language which runs in order of execution, so a query would start with the FROM clause rather than the SELECT clause. Languages like KQL do work that way, so there are examples in the wild.
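To make the ordering concrete, here is a minimal Python sketch that applies the clauses in their logical order: FROM, then WHERE, then SELECT, then ORDER BY. It is a toy model, not how the SQL Server engine is implemented. The table contents, the alias `line_total`, and the helper `run_query` are hypothetical names chosen for illustration.

```python
# Hypothetical rows; the column and alias names are illustrative only.
table = [
    {"price": 10.0, "qty": 3},
    {"price": 4.0, "qty": 1},
    {"price": 7.5, "qty": 2},
]


def run_query(rows, where, select, order_by):
    """Apply clauses in logical order: FROM, WHERE, SELECT, ORDER BY."""
    filtered = [r for r in rows if where(r)]    # WHERE sees source columns only
    projected = [select(r) for r in filtered]   # SELECT creates the aliases
    return sorted(projected, key=order_by)      # ORDER BY can see the aliases


def select(r):
    # The alias line_total comes into existence only at this step.
    return {"line_total": r["price"] * r["qty"]}


# Filtering on source columns works, and ORDER BY can use the alias.
ok = run_query(
    table,
    where=lambda r: r["price"] * r["qty"] > 5,
    select=select,
    order_by=lambda r: r["line_total"],
)

# Filtering on the alias fails: it does not exist yet when WHERE runs.
try:
    run_query(
        table,
        where=lambda r: r["line_total"] > 5,
        select=select,
        order_by=lambda r: r["line_total"],
    )
    alias_in_where_failed = False
except KeyError:
    alias_in_where_failed = True
```

The `KeyError` plays the role of the invalid column name error: the filter step runs before the projection that defines the alias, while the sort step runs after it.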

Left and Right Deep Hash Joins

Forrest McDaniel dives into the forest:

There’s a lot already written about left versus right deep hash joins. Unfortunately for us SQL Server nerds, “left” and “right” don’t make as much sense in SSMS query plans – those have a different orientation than the trees of database theory.

But if you rotate plans, you can see left and right that make sense (even if they still don’t match canonical shapes). Just follow the join operators.

Read on to understand the difference and what it means for query performance.

823/824 Alerts with SQL Server and VMware

David Klee loops us in on a tricky-to-catch problem:

We’ve been tracking a weird state with SQL Server virtual machines on VMware and possible warnings on database corruption while VM backups are running, largely centered around (but not isolated to) the tempdb database.

TLDR: We’ve now got a VMware KB article on this situation that you and your VM admins should read if you hit the condition and fall into the specifics listed below. Reference VMware KB 88201 for more details.

Read on for David’s thoughts and what to do if you hit this problem.

Azure Resource Locks

Craig Porteous explains the benefit (and pain) behind resource locks in Azure:

In theory, these are perfect for preventing accidental (or deliberate) deletion of resources in Azure. They don’t prevent the deletion of data though, only operating at the “control plane” of a resource. That still sounds great though. Turn them on everywhere! That’s another layer of security in your cloud data platform. Right?

Yeah, here’s where the pain comes in. I tried using resource group locks but there are some resources which use delete capabilities, such as Azure Media Service. A delete lock means no ability to delete uploaded videos.

Backups with Checksum

Chad Callihan tempts Betteridge’s Law of Headlines:

When you’re specifying WITH CHECKSUM as you’re backing up databases, SQL Server will use checksums to help catch any inconsistencies with pages. This seems like a setting that you should always use and would expect to be a default setting. So why doesn’t SQL Server include it by default?

Using the principle that a backup isn’t valid until it’s verified, CHECKSUM acts as a useful but not sufficient check.

Protecting ML Models and IP

Pete Warden has some advice:

Over the last decade I’ve helped hundreds of product teams ship ML-based products, inside and outside of Google, and one of the most frequent questions I got was “How do I protect my models?”. This usually came from executives, and digging deeper it became clear they were most worried about competitors gaining an advantage from what we released. This worry is completely understandable, because modern machine learning has become essential for many applications so quickly that best practices haven’t had time to settle and spread. The answers are complex and depend to some extent on your exact threat models, but if you want a summary of the advice I usually give it boils down to:

– Treat your training data like you do your traditional source code.

– Treat your model files like compiled executables.

Read on to see why Pete came to this as the appropriate answer, as well as what I have to consider a sly mention of duck boat tours.
