Press "Enter" to skip to content

Month: January 2023

Max and Min Functions in KQL

Robert Cain goes extreme:

The max and min aggregation functions are common to almost every language, and the Kusto Query Language is no exception. As you would think, when you pipe in a dataset max returns the maximum value for the column name you pass in. Likewise min returns the lowest value.

In addition, there are variants for each, maxif and minif. We’ll see examples for all of these in this post.

Click through for a few functions you can call via the summarize operator.

Comments closed

Organizing R Code

Tomaz Kastrun tidies up:

Keeping your R code organised is not as straightforward as one might think. Just think about the libraries, variables, functions, and many more. All these objects can be defined and later rewritten, some might get obsolete during the process.

This process is proven to be even more crucial when you are part of a larger group of engineers, and scientists, who collaborate with you.

Click through for some organizational tips specific to R code.

Comments closed

Data Types and CONCAT_WS

Koen Verbeeck hits a concatenation issue:

I was writing some dynamic SQL that generates some SQL statements to load my facts and dimensions into a data warehouse. Some of those SQL statements can become very long, for example if a dimension has a lot of columns. When debugging, I noticed a couple of statements failing with various errors. Turns out, they were truncated after 4000 characters. What was going on?

Read on to see what happened.

Comments closed

Restoring a TDE-Enabled Database

Matthew McGiffen notes how to restore a database with transparent data encryption:

When encrypting a database with Transparent Data Encryption (TDE), a vital consideration is to make sure we are prepared for the scenario where something goes wrong. For instance, if the server hosting our SQL instance goes belly-up, can we recover the data that we have encrypted with TDE?

Click through to learn what you’ll need to have.

Comments closed

Minimizing Cross-Highlighting on Direct Query

Dany Hoter explains why cross filtering might not be the best when you are using Direct Query in Power BI:

Every time a user clicks on a row or a column in a visual, every other visual in the page feels an urge to refresh itself and respond to the click.

Visual that are based on direct query may issue multiple queries for each refresh.

Depending on the number of measures used in the visual, the number of generated queries can be as high as 10 or even more for a single visual.

Read on for more information and what you can do to avoid this problem.

Comments closed

Window Functions in DAX

Marco Russo and Alberto Ferrari lay out the foundations of window functions:

Window functions by themselves do not increase the expressivity of DAX. Most if not all of the calculations performed with window functions can be expressed with more complex DAX code. The goal is to simplify authoring these calculations and improve their performance.

These new functions also introduce a new concept to the DAX language: “apply semantics”. We will publish more articles about window functions and “apply semantics” over time. SQLBI+ subscribers will get a dedicated video course later this year and already have access to the window functions whitepaper we are currently writing.

Click through for an introduction.

Comments closed

Thoughts on Code Commenting

Maelle Salmon has comments:

An important goal of good code is to be readable so that future contributors can build with and upon it as needed. Good commenting is part of the toolset for reaching that goal. In this post we shall first present principles of code commenting, and then a few tips.

I agree with the general thrust of Maelle’s argument. “What and how” types of comments are fine for pseudo-code that you write before beginning the real work, but they’re scaffolding and shouldn’t stick around when the code is done. Instead, focus on the “why.”

One area of focus I’d bring in terms of how I view comments is that I will have (and like to see) more detailed comments in the most difficult sections of code. Yeah, if you can simplify the code, that’s better than adding a lengthy comment. But there’s always some bit of code which five people have tried to simplify over the years but it doesn’t work. Knowing what the business rules are, what you’ve unsuccessfully tried in the past, and why this is the best available option (as of the time of the last update) can help prevent developer six from tilting at windmills.

Comments closed

The Importance of Re-Learning for Knowledge Updates

Neil Saunders thinks about learning:

Some years ago I read an article – I forget where – describing how our general knowledge often becomes frozen in time. Asked to name the tallest building in the world you confidently proclaim “the Sears Tower!”, because for most of your childhood that was the case – never mind that the record was surpassed long ago and it isn’t even called the Sears Tower anymore. From memory the example in the article was of a middle-aged speaker who constantly referred to a figure of 4 billion for the human population – again, because that’s what he learned in school and had never mentally updated.

Is this the case with programming too? Oh yes – as I learned today when performing the simplest of tasks: reading CSV files using R.

The specific task involved ways to read a list of CSV files in R, though the impetus behind the post is ways to keep that knowledge up to date. This is one reason why it can be useful to attend introductory-level sessions on topics you already know: there might be new things in recent versions of software which change the game. There are also times when you learn something en passant: in a talk (or blog post or video) about topic X, the author might casually use some technique or tool not related to the topic itself.

Comments closed

Orchestrating Synapse Notebooks and Spark Jobs from ADF

Abhishek Narain has an announcement:

Today, we are introducing support for orchestrating Synapse notebooks and Synapse spark job definitions (SJD) natively from Azure Data Factory pipelines. It immensely helps customers who have invested in ADF and Synapse Spark without requiring to switch to Synapse Pipelines for orchestrating Synapse Notebooks and SJD. 

NOTESynapse notebook and SJD activities were only available in Synapse Pipelines previously. 

If you’re familiar with Synapse Pipelines, the equivalent ADF operations are extremely similar, as you’d probably expect.

Comments closed

What to Use Instead of Get-EventLog

Emin Atac gets a scary message:

When you type the following

Get-EventLog-SourceMicrosoft-Windows-Kernel-General-Newest20-LogNameSystem-InstanceId1 | Select-ExpandPropertyMessage

You get

Possible detection of CVE: 2023-01-09T09:08:23.5000000Z
Additional Information: 2023-01-08T19:56:29.1492612Z
This Event is generated when an attempt to exploit a known vulnerability (2023-01-09T09:08:23.5000000Z) is detected.
This Event is raised by a User mode process.

Read on to learn what this error message means, why it pops up, and what you can do to avoid it in the future.

Comments closed