Press "Enter" to skip to content

Author: Kevin Feasel

Using Enums in Powershell

Robert Cain quietly tells us that Powershell is a real programming language, sysadmins who claim to hate programming:

This post begins a series on using Classes in PowerShell. As a first step, we will cover the use of an Enum, as enums are frequently used in combination with classes.

An Enum is a way to provide a set of predetermined values to the end user. This allows the user to pick from a finite list, and assure a value being passed into a function or class will be valid.

Click through to learn more about enums and how they work in Powershell.

Leave a Comment

Ranking Window Functions

I continue a series on window functions in SQL Server:

The whole concept of ranking window functions is to assign some numeric ordering to a dataset. There are four ranking functions in SQL Server. Three of them are very similar to one another: ROW_NUMBER()RANK()DENSE_RANK(). The fourth one, NTILE(), is the odd cousin of the family.

Unlike aggregate window functions, all ranking window functions must have at least an ORDER BY clause in the OVER() operator. The reason is that you are attempting to bring order to the chaos of your data by assigning a number based on the order in which you join.

Watch me ramble on about monotonicity and quietly admit that I learned what it was from economics, where the naming feels utterly backward (“strongly monotonic” is the “greater than or equal to” of monotonicity, whereas “weakly monotonic” is the “greater than” of monotonicity). Also, I structured this entire post so that I could get that video from The Prisoner (the good one, not the garbage one) in it.

Leave a Comment

Data Types Matter, Even in the Serverless SQL Pool

Jovan Popovic has a public service announcement for us:

The serverless SQL pool is a distributed computing system that executes concurrent queries on a set of distributed compute nodes. Multiple compute nodes are running the parts of a distributed query plan that read the underlying files, join the data sets, group, and aggregate results. Different queries might try to use the same compute nodes to execute the parts of the queries.

The oversized column types like VARCHAR(MAX) might trick the compute node to allocate more resources than is needed. However, the allocation is based on the estimate, but these over-allocated resources will not be used in actual execution because they are not needed. If a compute node needs 100MB to sort the results it will use these 100MB although the query optimizer allocated 4GB of memory for the task on the compute node.

Read the whole thing.

Leave a Comment

Azure Synapse Database Templates

Aaron Merrill announces database templates for Azure Synapse Analytics:

The Synapse database template for Agriculture is a comprehensive data model that addresses the typical data requirements of organizations engaged in growing crops, raising livestock, and producing dairy products, including field and pasture management and satellite and drone data.

The Synapse database template for Energy & Commodity Trading is a comprehensive data model that addresses the typical data requirements of organizations engaged in trading energy, commodities, and/or carbon credits, whether as a primary trading business or in support of their supply chains, operating businesses, and hedging activities.

You may remember Microsoft buying ADRM Software a while back. This is why.

Leave a Comment

A Case for EAV?

Erik Darling makes the case:

EAV styled tables can be excellent for certain data design patterns, particularly ones with a variable number of entries.

Some examples of when I recommend it are when users are allowed to specify multiple things, like:

I’m not sure I agree on the examples. When there are specific known things with expected shapes, I’d rather have a separate entity to model each. Even if each table is a single string, I’d still like the separation for logical modeling purposes.

That said, there are cases when EAV ends up being the best approach (unfortunately), particularly when you don’t even know the types of things a customer would wish to include. Just try to fight back hard when the inevitable request comes in to pivot all of that data.

Leave a Comment

Solving Linear Constraints with Python

Luke Menzies and Gavita Regunath create a schedule:

Linear optimisation (often referred to as linear programming) is not cutting edge or new. It has been around for a very long time. It was first introduced within the field of operational research during World War II, where it was used to help minimise costings. The method proposed for solving these problems is known as the simplex method, and it hasn’t changed much today. Although this method hasn’t changed significantly, what has changed significantly is the computing power and accessibility of this technique, allowing these methods to be used on very complex scenarios with almost a click of a button. Convenient libraries have allowed the intricate complexities of setting these problems up on a computer to be simplified.

Read on for an example of linear programming. This is something I’ve always enjoyed, but haven’t had many places to use this technique in my professional career. That said, shout out to everyone who’s ever used LINGO.

Leave a Comment

Validating Pandas DataFrames with Pydantic

Sebastian Cattes continues a series on using Pydantic:

In part 1 of the article we learned about dynamic typing, Pydantic and decorators.

In this part we will learn how to combine these concepts for Pandas DataFrame validation in our codebase.

1. Combining Decorators, Pydantic and Pandas – Combine section 2. and 3. of Part 1 to showcase how to use them for output validation.

2. Let’s define ourselves a proper spaceship!

3. Summary

Check out both parts of the article.

Leave a Comment

Re-Making a Line Chart

Alex Velez cleans up a line chart:

When sharing a makeover, typically I’ll show a side-by-side “before” and “after” view. This is a powerful moment for many audiences as they witness the dramatic impact of an effective graph. I share this with you, because until recently if a combination chart was found within my makeovers it represented the “before” state. That’s because most combo charts are hard to read, so I tend to revise them into something simpler—like this makeover.

Today’s article shows the inverse of that process, where, in order to make a visual more informative and easier to understand, I chose to transform the original, a “simple” line chart, into a more “complicated” combination chart. 

This is a good reminder that visuals themselves aren’t necessarily bad (except for the pie chart, which is inherently evil and don’t try to convince me otherwise); it’s all about whether the specific chart makes sense given the story you are trying to tell.

Leave a Comment

ETL from an API

Bill Fellows unravels a bad practice:

The direction for software as a service providers is to provide APIs to access their data instead of structured file exports. Which is a pity, as every SaaS system requires a bespoke data extract solution. I inheireted a solution that had an adverse pattern I’d like to talk about.

The next level of complexity, at least in the space Bill covers in the example, is what to do when the upstream provider changes their data, some with changes even a week later.

Leave a Comment

Non-Yielding IO Completion Ports

Sean Gallardy is here to demystify a concept:

IO Completion Ports are a set of Windows APIs which allow for efficient, fast, multithreaded asynchronous IO. Great, that pretty much tells you nothing.

SQL Server uses IO Completion Ports not for disk-based IO but for general network IO when it comes into SQL Server for TDS level items. This means it’s used for things such as connecting to an instance of SQL Server, sending batch and rpc information, etc., and is used to properly take actions on the incoming items. These actions should be extremely short and quick, the name of the game is low latency and high throughput which means not doing things like reading or writing from disk, allocating memory, calling functions that may block, etc., to keep things flowing.

Read on to see what happens when there is a problem and what might cause that problem.

Leave a Comment