Press "Enter" to skip to content

Curated SQL Posts

The take Operator in KQL

Robert Cain continues a series on KQL:

In this example we took the Perf table, and piped the dataset it generated into the take operator. We indicated we wanted to get 10 rows, which it did as you can see.

It is important to understand that take grabs these rows at random. Further, there is no consistency between each execution of take. You are likely to get a different set of rows with each execution. Let me run the exact same query again, so you can see the different data being returned.

Take if you want a slice, if you want a piece, if it feels alright.

Comments closed

Separating Code from Presentation with Jupyter

John Mount disaggregates Jupyter notebook results:

As I switch back and forth between R and Python projects for various clients and partners, I got to thinking: “is there an easy way to separate code from presentations in Jupyter notebooks?”

The answer turns is yes. Jupyter itself exposes a rich application programming interface in Python. So it is very easy to organize Jupyter’s power into tools that give me a great data science and analysis workflow in Python.

Read on to see how.

Comments closed

Alerting on Azure Budget Thresholds

Daron Yondem makes a budget:

You can’t imagine how many of us forget to set up the proper alerting mechanisms for our cloud subscription consumption. Here is how to do it in Azure in under 2 minutes.

Read on for the answer. I do like Azure’s budgeting tools except for one big thing: you can’t set a cap. Alerting is great but I want to have a “break glass in case of emergency” capability to stop spend altogether if you hit a certain point. I wouldn’t use it in production but for personal or development accounts, that’s big. And you can do it but only when you have a subscription which uses Azure credits—as soon as dollars are involved, there are no caps.

Comments closed

Using Extended Events with AWS RDS

Grant Fritchey tries out extended events in Amazon’s RDS:

AWS has posted the documentation on what you have to do in order to enable the collection of Extended Events within RDS. Normallly, I’d follow along with the documentation. However, I’m going to approach this like I knew that Extended Events support was there, but I wasn’t aware of the docs. So, I’m starting in SSMS and I’m just going to try plugging in the Extended Events GUI to see what happens. Further, I’m going to use the simplest method for launching Extended Events, XEvent Profiler. 

Read on for Grant’s findings.

Comments closed

Empty Catch Blocks in T-SQL

Erik Darling looks disapprovingly:

So if you execute the above proc, all it returns is an empty result with no error message.

Kinda weird.

Like not having finger or toenails.

I’m dating myself here but I remember a comment in a TheDailyWTF thread in which somebody puts an ASCII art Pikachu in every empty catch block because hey, you gotta catch ’em all.

Comments closed

Securing Azure Storage

Craig Porteous continues a series on Azure Data Platform security:

This is the third in a series where I look at all of the resources common to a Data Lakehouse platform architecture and what you need to think about to get it past your security team.

Building upon Azure Databricks, I’ll move from the compute engine to our blob and data lake storage. Things are a little simpler to secure but the plethora of options available can have significant impacts on usability and cost so it’s important to understand the impact before baking them into your design.

Read on for some good advice around securing Azure storage accounts.

Comments closed

Multiple Aggregations with CASE

Chad Callihan shows off a good use of aggregate functions and the CASE statement:

Have you have been tasked with pulling multiple counts from the same table? Maybe you need to find how many records have a value for a column and how many are NULL. Or maybe you need to see how many records are true and how many are false.

It’s simple enough to run a query to count one set of criteria, run another query for the second set of criteria, and combine them when sending your results. Did you know you can get multiple counts with one query?

Here’s an example of how using COUNT and CASE can speed up your day.

Click through for an example. I mildly disagree with Chad’s conclusion that this is something you’ll rarely do—the more you work with reporting and analytical queries, the more you’ll appreciate this.

Comments closed

Power Query XML Inconsistencies

Chris Webb walks us through some inconsistencies:

A few months ago one of my colleagues at Microsoft, David Browne, showed me an interesting Power Query problem with how the Xml.Tables and Xml.Document M functions handle null or missing values. I’m posting the details here because the problem seems fairly common, it causes a lot of confusion and it’s not easy to deal with.

In XML there are two ways to represent a null or missing value:<a></a> or omitting the element completely. Unfortunately the Xml.Tables and Xml.Document M functions handle these inconsistently: they treat the <a></a> form as a table but the other as a scalar.

Click through for an example and a workaround for the issue.

Comments closed

Comparing R Package Versions with Diffify

Clarissa Barratt and Parisa Gregg announce an interesting tool:

You know that sinking feeling that you get when you’re months into a big project and you log in one day and nothing works? Turns out something has updated and things have been removed that you needed and now you need to spend hours-days figuring out what’s changed and your masters deadline is getting closer and … ok, apparently this took me back to a very specific event.

But I’m sure most of that sounds familiar to you if you’ve ever programmed something over a longer period of time.

Over the last few months, Jumping Rivers have been working on a tool that will make it easier to see differences between R package versions: Diffify.

Read on to see it in action. It looks quite useful for troubleshooting issues in which a package suddenly changed API functionality, something which tends to happen frequently in the R and Python worlds.

Comments closed

Playing with gganimate

Tomaz Kastrun tries out gganimate:

I firmly believe that animation and transition between different data states can give end-users much better insights and understanding of the data, than a single table with data points or correlation metrics.

With help of ggplot, gganimate, you can quickly create an animation based on your needs. This is a simple IRIS dataset example.

You can find more at the gganimate website. The real downside is that I don’t think it’s being maintained any longer, as the last commit was a year ago.

Comments closed