Press "Enter" to skip to content

Month: May 2021

Counts of Last-Known States of Items with DAX

Phil Seamark has an interesting problem:

The requirement was simple enough. Take the following dataset and, for any given day, produce a count of each possible State using the last known State for any given TestID. The dataset contains six unique Test IDs (A through F). At any given point in time, we first want to establish the last State for each TestID. We also want to group this by day and produce a count value for each possible State. Note, a given TestID can have more than one event in a day, and we only care about the last one.

I’m particularly interested in this because I find a lot of merit in the event-based structure in Phil’s input dataset, but it can be tricky going from that to data in a shape the customer likes.

Comments closed

To Cloud or Not to Cloud

That is the question, according to Guy Glantser:

This is not a regular blog post. I was looking for an old blog post that I wrote several years ago, and while searching, I found an even older blog post that I wrote back in 2009. It had the same title that you see here – To Cloud or Not to Cloud?

In 2009 the cloud was already a thing, but it was the early days. Microsoft’s cloud, Azure, wasn’t even announced yet until February 2010. The cloud has seen a tremendous advancement over the years. It’s interesting and also amusing to read what I wrote 12 years ago about the cloud. Some things are still true today, while others are completely irrelevant.

So here it is…

It’s good to reflect back on these thoughts to see how the industry has shifted. Issues which were show-stoppers may be completely eradicated by this point, while others remain trade-offs without an ideal answer.

Comments closed

What Helps with Readability of T-SQL Code

Erik Darling comes out of the gates with a hot take:

One line I see over and over again — I’ve probably said it too when I was young and needed words to fill space — is that CTEs make queries more readable.

Personally, I don’t think they make queries any more readable than derived tables, but whatever. No one cares what I think, anyway.

I’m going to split the middle of Erik’s take. Yes, using common table expressions by themselves doesn’t make a query easier to read. And yes, a good formatting technique helps a lot in readable code. Once that’s taken care of, I do think that common table expressions can be a bit more readable than their equivalent subqueries, for the reason that they do a better job of separating logically distinct segments of code. In those situations, I can read through each common table expression, getting a feeling for what they’re doing and let them tell a story with a top-to-bottom progression. This technique is most effective when you need several common table expressions for a query. By contrast, when creating subqueries (which I tend not to do much) or derived tables with APPLY (which I do so often, I get the employee discount), the story’s a little more disjointed, as the eyes seem to bounce more frequently between the main query and the subqueries. And don’t get me started on subqueries in the SELECT clause—those are the equivalent of somebody telling a story and saying, “Hang on, now I have to tell this story so that you get what I’m talking about.”

But going back to my agreement, if your code looks terrible, it doesn’t matter what constructs you use—it’s not very human-friendly.

Comments closed

Finding Text in a Stored Procedure

Chad Callihan has a way to search the text of stored procedures:

Story time. Let’s say a database server is receiving a new release that include a change to a stored procedure. All of the databases are supposed to get the changes but one way or another there are problems with the release and it has to be stopped part of the way through. Maybe some changes got rolled back but others weren’t rolled back. We don’t have accurate logging of what databases have been updated but we want to know if a stored procedure is on the old version or received the new version.

I’ve used this technique quite often. The only downside is that if you have a lot of procedures and don’t specify the object ID, search can get a bit slow.

Comments closed

The Basics of Finding Blocking

Alex Stuart has a way to find blocked processes:

So we need monitoring and alerting on it. Enterprise monitoring tools can do this, and do it well – but if you don’t have one, or don’t have enough licenses for your entire estate, you’ll need to roll your own. (OK, or copy someone else’s if you don’t need the learnin’). This post will demonstrate a basic method for detecting blocking and alerting based on a given threshold.

Read on for the process.

Comments closed

AI versus ML versus Deep Learning

Holger von Jouanne-Diedrich asks the expert:

This is our 101’st blog post here on Learning Machines and we have prepared something very special for you!

Oftentimes the different concepts of data science, namely artificial intelligence (AI)machine learning (ML), and deep learning (DL) are confused… so we asked the most advanced AI in the world, OpenAI GPT-3, to write a guest post for us to provide some clarification on their definitions and how they are related.

We are most delighted to present this very impressive (and only slightly redacted) essay to you – enjoy!

The machine has learned about itself. This is where I’m glad I only believe weak AI is possible…

Comments closed

Learning the Basics of Kafka via Notebook

Francesco Tisiot shares a way to learn about the basics of Apache Kafka using Jupyter notebooks:

One of the best ways to learn a new technology is to try it within an assisted environment that anybody can replicate and get working within few minutes. Notebooks represent an excellence in this field by allowing people to share and use pre-built content which includes written descriptions, media and executable code in a single page.

This blog post aims to teach you the basics of Apache Kafka Producers and Consumers through building an interactive notebook in Python. If you want to browse a full ready-made solution instead, check out our dedicated github repository.

The classic tutorials tend to use a couple command prompts and the built-in producer and consumer shell scripts. I like this approach as a way of being able to review the code and results later as a refresher.

Comments closed

Translating a Result Set into a Comma-Separated List

Kiana Bergsma shows us a tried-and-true method to confuse people:

Often times I have told developers, here is how you do it, and if you Google on it you will find some great samples.  Now it is time that I provide my own sample. I call this the FOR XML hack since it used the FOR XML command, without actually involving any XML at all.

I’m quite happy that STRING_AGG() is around as of SQL Server 2017, as it is a much clearer representation of how to solve this problem. If I had a dollar for every time somebody needed me to explain why I used FOR XML PATH() when I clearly wasn’t building XML, I’d have several dollars. Probably not a fistful of dollars, though.

Comments closed

Powershell: the Rest is Commentary

Kenneth Fisher stands on one foot:

Commenting your code, still super important. That piece of code that looks a bit strange because you couldn’t find another way to make it work? Better put in a note why so the next person doesn’t have to spend hours figuring out what you did and why. That block of code that pulls a list of zip files and unzips them? Explain what you are doing. The next person to look at this (who just may be you) could use a hint as to what you were thinking. Weird variable name? Heck, not so weird variable name. It couldn’t hurt to explain the purpose. Did I ever tell you I got a job because I did such a good job commenting my code during a technical test?

Read the whole thing.

Comments closed

VS_NEEDSNEWMETADATA in SSIS

Hadi Fadlallah discusses what was the bane of my existence for about 3 months in 2010:

In this article, we will briefly explain the VS_NEEDSNEWMETADATA SSIS exception, one of the most popular exceptions that an ETL developer may face while using SSIS. Then, we will run an experiment that reproduces this error. Then, we will show how we can fix it.

This was really annoying prior to SQL Server 2008 (at least, that’s my early-morning recollection of when the SSIS engine started trying to auto-fix this) and has been mildly annoying since. I had far too many conversations which I could summarize as “Yes, I understand that this Excel spreadsheet is basically the same, but it’s different in that the casing on one header column has changed slightly and that breaks the entire system.

Comments closed