Press "Enter" to skip to content

Author: Kevin Feasel

Terraform Commands in Visual Studio Code

Josephine Bush deploys some resources:

I realized I never created a post to show how to deploy Terraform from VS Code. I haven’t done that in a while because I don’t do it at work. We have Azure DevOps pipelines to handle that, but I like to test my code on the side in my personal environment because I don’t have a pipeline set up to push the code. I don’t need a pipeline in my personal environment.

Now, I feel rusty on Terraform commands and how to run them from VS Code, so I’m writing this blog post so my future self can thank me. I could look it up on someone else’s website or ask an AI, but I would rather document this for myself.

Click through for a primer on those commands.
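
As a quick refresher, the typical sequence from the VS Code integrated terminal looks something like the sketch below. These are the standard Terraform CLI commands (assuming the .tf files and provider credentials are already in place), not necessarily the exact set Josephine walks through.

```sh
# Typical workflow from the VS Code integrated terminal, assuming the .tf
# files and provider credentials are already configured.
terraform init                 # download providers, initialize the working directory
terraform fmt                  # tidy up formatting of the .tf files
terraform validate             # catch syntax and reference errors before planning
terraform plan -out=tfplan     # preview what would change, saving the plan
terraform apply tfplan         # apply the saved, reviewed plan
terraform destroy              # tear a personal sandbox back down when finished
```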


Diving into Hash Tables

Hugo Kornelis dives into the arcane:

But what you probably don’t know is how that hash table is structured. How is the data stored? Where are new rows added, how is the table accessed?

To be fair, none of this is useful knowledge, unless you work for the engine team at Microsoft. And if you do, then you have access to source code and documentation, so you won’t need me to explain this structure to you. So why do I even take the trouble to investigate and describe this structure? Because I am a geek, and geeks love to dig into technical stuff and uncover things they were never meant to uncover.

“Because I can” is a perfectly valid reason to dig into a topic.
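
For a mental model before diving in, here is a generic, textbook-style hash table with separate chaining. This is purely an illustration of the general structure Hugo is probing (buckets, chains, insert, probe) and emphatically not the layout the SQL Server engine actually uses.

```python
# A generic bucketed hash table with separate chaining -- an illustration of
# the general idea only, not the structure inside the SQL Server engine.

class HashTable:
    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash function maps a key to exactly one bucket.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        # New rows are simply appended to the chain for their bucket.
        self._bucket(key).append((key, value))

    def probe(self, key):
        # A lookup scans only the single chain the key hashes to.
        return [v for k, v in self._bucket(key) if k == key]

table = HashTable()
table.insert("Davis", 1)
table.insert("Smith", 2)
print(table.probe("Davis"))   # [1]
```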


Kafka: From ZooKeeper to KRaft

Phil Yang lays out how to make a migration:

Apache Kafka has made a landmark shift in KIP-500 with the introduction of Kafka Raft (KRaft) mode, eliminating the dependency on Apache ZooKeeper for metadata management. With KRaft, the Kafka nodes themselves can be configured as KRaft controllers – which allow for metadata management and leader elections to work all within just Kafka, resulting in significant performance improvements. This cemented KRaft’s status as the metadata management protocol for Kafka moving forward.

This blog will guide you through the importance of this transition, what migrating from ZooKeeper to KRaft entails, and how we, at NetApp Instaclustr, make this seamless with our automated, streamlined process that is built into our platform.

Click through to see how you can update your own clusters, whether you’re using the Instaclustr service or not.
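
For context on the end state, here is a minimal sketch of what a combined broker-and-controller node's server.properties might look like in KRaft mode. The host names, IDs, and paths are placeholders, and this is the stock Apache Kafka configuration shape rather than anything specific to Instaclustr's migration tooling.

```properties
# Minimal sketch of a combined broker+controller KRaft node (server.properties).
# Host names, node IDs, and paths are placeholders.
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@kafka-1:9093,2@kafka-2:9093,3@kafka-3:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://kafka-1:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
log.dirs=/var/lib/kafka/kraft-data
```

Note that KRaft log directories also have to be formatted with a cluster ID (via the bundled kafka-storage.sh tool) before the first start; the migration specifics are what the post covers.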


Text Classification with Decision Trees

Ivan Palomares Carrascosa takes us through a simple natural language processing problem and solution:

It’s no secret that decision tree-based models excel at a wide range of classification and regression tasks, often based on structured, tabular data. However, when combined with the right tools, decision trees also become powerful predictive tools for unstructured data, such as text or images, and even time series data.

This article demonstrates how to build decision trees for text data. Specifically, we will incorporate text representation techniques like TF-IDF and embeddings in decision trees trained for spam email classification, evaluating their performance and comparing the results with another text classification model — all with the aid of Python’s Scikit-learn library.

Read on for the demos and to see how three different approaches work.
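
As a rough sketch of the first approach, here is a minimal TF-IDF-plus-decision-tree pipeline in scikit-learn, using a handful of made-up messages in place of the spam email dataset from the article.

```python
# Minimal sketch: TF-IDF features feeding a decision tree classifier,
# with toy messages standing in for the article's spam dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

texts = [
    "Win a free prize now", "Lowest price on meds, click here",
    "Meeting moved to 3 PM", "Can you review the quarterly report?",
    "Claim your reward today", "Lunch tomorrow?",
]
labels = [1, 1, 0, 0, 1, 0]  # 1 = spam, 0 = ham

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, random_state=42, stratify=labels
)

# TF-IDF turns each document into a sparse weighted term vector,
# which the decision tree then splits on.
model = make_pipeline(TfidfVectorizer(), DecisionTreeClassifier(random_state=42))
model.fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))
```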


Sideways Recursion in DAX Calculation Groups

Marco Russo and Alberto Ferrari’s example goes sideways:

DAX calculation items do not provide full recursion. However, a limited form of recursion is available, known as sideways recursion. We describe this complex topic through examples. Let us start by understanding what recursion is and why it is essential to discuss it. Recursion may occur when a calculation item refers to itself, resulting in an infinite loop within the application of calculation items (read the linked article in case you are not familiar with the concept of “application”, which is different from “execution”). Let us elaborate on this.

Read on for a demonstration of the principle. I haven’t dug into the topic, but I was curious because I’d never heard of “sideways recursion” before. It turns out that there’s some discussion of it in the DAX community and there was something known as Simpson’s sideways recursions from the 1980s, but I’m not sure if that’s the same thing.


Downloading Power BI Reports from the Power BI Service

Gilbert Quevauvilliers wants to download a report:

I am sure we have all had it where there is a Power BI report in the service which has been working for a long time. Then there is a requirement to make a change, and NO ONE can find the original PBIX.

There now is an easy way to download the Power BI Report or the Power BI Semantic model from the Power BI Service, and I will show you how to do this!

Click through to see how. No Power BI Report Server jokes this time around, however, because that functionality has been around for a while as long as you have appropriate permissions on the reports themselves.


Flags in SQL Server Regular Expression Functionality

Louis Davidson continues a series on regular expressions:

In this week’s sixth entry of my learning RegEx series, I am going to do two last intro entries for a while, this one on case sensitivity, and another on multi and single line searches. After this I will move into all of the functions that are available in SQL Server 2025 and Azure SQL (and I will come back if I learn any additional things that we need to cover either right after that, or anytime I learn something new I want to share about RegEx).

Read on to see which flags SQL Server currently supports. Of those, Louis tries out a pair.
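
The SQL Server syntax is in Louis's post, but the flag semantics carry over from other regex engines; here is a quick Python illustration of what case-insensitive, multiline, and single-line (dotall) matching mean, using Python's re flags as stand-ins for the single-character flags.

```python
import re

text = "Line one\nLINE two"

# Case-insensitive matching ('i' in most dialects, re.IGNORECASE here).
print(re.findall(r"line", text, flags=re.IGNORECASE))   # ['Line', 'LINE']

# Multiline ('m'): '^' and '$' match at each line boundary,
# not just the start and end of the whole string.
print(re.findall(r"^L\w+", text, flags=re.MULTILINE))   # ['Line', 'LINE']

# Single-line / dotall ('s'): '.' also matches newline characters.
print(re.findall(r"one.LINE", text, flags=re.DOTALL))   # ['one\nLINE']
```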


Loading Data from Network-Protected Storage Accounts into OneLake

Matt Basile grabs some data:

AzCopy is a powerful and performant tool for copying data between Azure Storage and Microsoft OneLake, and is the preferred tool for large-scale data movement due to its ease of use and built-in performance optimizations. AzCopy now supports copying data from firewall-enabled Azure Storage accounts into OneLake using trusted workspace access. Now you can use AzCopy to load data from even network-protected storage accounts, letting you effortlessly load data into OneLake without compromising on security or performance.

Click through for an explanation of trusted workspace access, followed by the steps to try it out for yourself.
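
As a rough idea of the shape of the transfer, an AzCopy copy into OneLake looks something like the sketch below. The storage account, workspace, and lakehouse names are placeholders, the exact OneLake path format is worth checking against the docs, and the trusted workspace access prerequisites are what the post actually walks through.

```sh
# Sign in with an identity that has access to both the storage account and
# the Fabric workspace (permission details are covered in the post).
azcopy login

# Copy a path from the firewall-enabled storage account into a lakehouse's
# Files area in OneLake. All names below are placeholders.
azcopy copy \
  "https://<storageaccount>.blob.core.windows.net/<container>/<path>" \
  "https://onelake.blob.fabric.microsoft.com/<workspace>/<lakehouse>.Lakehouse/Files/" \
  --recursive
```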


Feature Importance in XGBoost

Ivan Palomares Carrascosa takes a look at one of my favorite plots in XGBoost:

One of the most widespread machine learning techniques is XGBoost (Extreme Gradient Boosting). An XGBoost model — or an ensemble that combines multiple models into a single predictive task, to be more precise — builds several decision trees and sequentially combines them, so that the overall prediction is progressively improved by correcting the errors made by previous trees in the pipeline.

Just like standalone decision trees, XGBoost can accommodate both regression and classification tasks. While the combination of many trees into a single composite model may obscure its interpretability at first, there are still mechanisms to help you interpret an XGBoost model. In other words, you can understand why predictions are made and how input features contributed to them.

This article takes a practical dive into XGBoost model interpretability, with a particular focus on feature importance.

Read on to learn more about how feature importance works, as well as the three different views of the data you can get.
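
As a minimal sketch of what those views can look like in code, here is an XGBoost classifier on a stand-in scikit-learn dataset, pulling the weight, gain, and cover importance scores from the trained booster. These are the three importance types XGBoost reports by default, which may or may not line up exactly with the article's framing.

```python
# Minimal sketch of extracting feature importances from an XGBoost model,
# using the scikit-learn breast cancer dataset as stand-in data.
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = XGBClassifier(n_estimators=100, max_depth=3, eval_metric="logloss")
model.fit(X, y)

# Three views of importance: how often a feature is used to split ("weight"),
# the average gain of those splits ("gain"), and the number of rows those
# splits affect ("cover").
booster = model.get_booster()
for importance_type in ("weight", "gain", "cover"):
    scores = booster.get_score(importance_type=importance_type)
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:5]
    print(importance_type, top)
```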


Portfolio Theory and Risk Reduction

John Mount continues a series on risk optimization:

I want to discuss how fragile optimization solutions to real world problems can be. And how to solve that.

Small changes in modeling strategy, assumptions, data, estimates, constraints, or objective can lead to unstable and degenerate solutions. To warm up let’s discuss one of the most famous optimization examples: Stigler’s minimal subsistence diet problem.

There are some neat stories in the post as you walk through problems of linear programming.
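
To make the linear programming setup concrete, here is a toy diet-style problem in SciPy in the spirit of Stigler's: minimize cost subject to nutrient floors. The numbers are entirely made up, not Stigler's data.

```python
# A toy diet-style linear program: minimize cost subject to meeting
# nutrient minimums. All numbers below are invented for illustration.
import numpy as np
from scipy.optimize import linprog

cost = np.array([0.50, 0.30, 0.80])      # cost per unit of three foods

# Nutrients supplied per unit of each food (rows: calories, protein).
nutrients = np.array([
    [400, 300, 200],
    [10,   5,  30],
])
requirements = np.array([2000, 60])      # daily minimums

# linprog minimizes c @ x subject to A_ub @ x <= b_ub, so flip signs
# to express "supply at least the requirement".
result = linprog(c=cost,
                 A_ub=-nutrients,
                 b_ub=-requirements,
                 bounds=[(0, None)] * 3,
                 method="highs")
print(result.x, result.fun)
```

Perturbing the costs or the requirement vector slightly and re-solving is an easy way to see the kind of solution instability John describes.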

Also, Nina Zumel has a post on overestimation bias:

Revenue optimization projects can be particularly valuable and exciting. They involve:

  • Estimating demand as a function of offered features, price, and match to market.
  • Picking a set of offerings and prices optimizing the above inferred demand.

The great opportunity of these projects is that one can derive value from improving the inference of the demand estimate function, improving the optimization, and even improving the synergy between these two steps.

However, there is a common situation that can lose client trust and sink revenue optimization projects.

Read on for that article.
