Press "Enter" to skip to content

Author: Kevin Feasel

The Costs and Benefits of Dirty Reads

Chad Callihan explains what a dirty read is and does a cost-benefit analysis on it:

When you are not careful with your transaction isolation levels or you get sneaky with the NOLOCK hint, one problem you can encounter is a dirty read. Let’s look at a short example to demonstrate a dirty read.

In a vacuum, I’m not necessarily opposed to the idea of dirty reads because you can find legitimate cases in which they can be useful. In practice, I’m generally very much in opposition because of two reasons: first, Read Committed Snapshot Isolation eliminates the majority of those reasons; and second, because the misuse is almost always in the direction of over-use of NOLOCK hints.

Comments closed

A Warning: VPCs and Distributed Database Platforms

Wade Trimmer takes us through a reason why you might not want to use VPC endpoints to separate applications from distributed database platforms:

AWS PrivateLink (also known as a VPC endpoint) is a technology that allows the user to securely access services using a private IP address. It is not recommended to configure an AWS PrivateLink connection with Apache Kafka or Apache Cassandra mainly due to a single entry point problem. PrivateLink only exposes a single IP to the user and requires a load balancer between the user and the service. Realistically, the user would need to have an individual VPC endpoint per node, which is expensive and may not work. 

Using PrivateLink, it is impossible by design to contact specific IPs within a VPC in the same way you can with VPC peering. VPC peering allows a connection between two VPCs, while PrivateLink publishes an endpoint that others can connect to from their own VPC.

Read on to understand how this affects platforms like Cassandra and Kafka.

Comments closed

Configuring Classifiers for Workload Management

Reiss McSporran continues a series on workload management in Azure Synapse Analytics:

So in part 1 we found out what Workload Management was and why we wanted to use it. (If you haven’t read that part, please click here and read that post before you carry on, as we’re jumping in at the deep end!) In this part we’ll be learning how to configure it.

Continuing from the end of the previous post, you have two main options to configure this. First is to assign a user or role to one of the existing system defined workload groups, second is to create a custom workload group and assign a user or role to this instead.

Read on to learn how to create classifiers, what the rules look like for them, and their importance.

Comments closed

Fun with Excel Cube Functions

Chris Webb has a new series on Excel cube functions combined with Lambda helper functions. First up, Chris collects items from a set:

This example shows the fundamental problem that has always existed with CubeRankedMember though: in order to show all the items in a set you need to know how many items there are in advance, and populate as many cells with CubeRankedMember formulas as there are items. In this case see how the range B4:B6 contains the numbers 1, 2 and 3; these numbers are used in the formulas in the range C4:C6 to get the first, second and third items in the set.

If a fourth product was added to the table, however, it would not appear automatically – you would have to add another cell with another CubeRankedMember formula in it manually. I’ve seen some workarounds but they’re a bit hacky and require you to know what the maximum possible number of items in a set could ever be. Indeed that’s always been one of the key differences between cube functions and PivotTables: cube functions are static whereas PivotTables can grow and shrink dynamically when the data changes.

Read on for a better answer. Then, check out part 2, which covers generating a dynamic table using Excel cube functions and Lambda helpers:

In the first post in this series I showed how to use the new Excel Lambda helper functions to return an array containing all the items in a set. That isn’t very useful on its own, so in this post I’ll show you how to generate an entire dynamic table using Excel cube functions and Lambda helper functions.

In this post I’ll be using the same source data as in my previous post: a table containing sales data with just two columns.

Comments closed

Putting a SQL Server Database into Recovery Pending

John Morehouse causes harm to a perfectly good database:

You might be asking why on earth would you want to get a database into an undesirable state, more specifically into a Recovery Pending state.  Well, in my case, I had a client database that had entered into this state due to a failure of storage.  Thankfully, this database was not being used for production, so I had some time to determine the best way to fix the issue.

A phrase that was often used during my time in the fire service, was “Try Before You Pry”.  Does the front door of the house need to be kicked in?  It may be unlocked and by trying before prying (with my boot) I can prevent damage to the door.  In these types of scenarios, this philosophy holds true.  Try things out on non-critical databases will help prevent any further damage.

In this instance, I want to try it before forcing something that might be damaging.  This meant I had to get a test database into a recovering state.  Once it is in the state I need, then I can attempt different methods to recover the database properly.  Once a successful solution is determined, I can then confidently pry against the damaged database in production knowing that I am using a validated solution.

Read on to learn one way to put a database into recovery pending state, but definitely pay attention to the disclaimer. And its twin.

Comments closed

Creating a Time Zone Dimension

Jonathan Kehayias has a follow-up to a prior post:

The basis for the approach in this post is going to be the TimeZoneInfo class from .NET, which is “supposed” to be up to date with any time zone DST rules from the Windows Registry. I used a modification of the PowerShell example in the .NET Docs page for the TimeZoneInfo class to build a a flat file that is then bulk loaded to a table in SQL Server using BCP with all the time zone adjustment rules:

There’s a lot of effort in here, but as Jonathan notes, it will provide you the ability to get the right time zone as of a particular date and time. Or at least, maybe—I’m not sure if the Windows registry includes historical time zone shifts such as when the US changed when daylight savings time starts and ends.

Comments closed

An Introduction to ML.NET

Ivan Matec gives us a walkthrough of the ML.NET library and its Model Builder component:

Before we dive into our example, let’s talk a bit about ML.NET history and its current state.

ML.NET draws its origins from the 2002’s Microsoft Research project named TMSN, which stands for “test mining search and navigation.” Later it was renamed to TLC, “the learning code.” ML.NET war derived from the TLC library. Initially, it was used on internal Microsoft products.

The first publicly available version ML.NET 1.0 was released in 2019. It included the Model Builder add-in and AutoML (Automated Machine Learning) capabilities.

The current version is 1.6.0. More details about all releases can be found on the official ML.NET release page.

ML.NET is not a bad library if you need to do some fairly simple work

Comments closed

Working with App Secrets in .NET Core

Santosh Hari shows us how to use application secrets when building .NET Core applications:

I was writing a sample dotnetcore console application for a talk because why I felt using a sample aspnet core web app was overkill. The app was connecting to a bunch of Azure cloud and 3rd party services (think Twilio API for SMS or LaunchDarkly API for Feature Flags) and I had to deal with connection strings.

Now I have a nasty habit of “accidentally” checking in connection string and secrets into public GitHub repositories, so I wanted to do this right from the get go.

That’s a bad habit to be in, and Santosh shows us how we can avoid doing that via use of application secrets.

Comments closed

Fun with Arrays in Powershell

Robert Cain looks at how arrays operate in Powershell:

In this article, we’ll look at the different ways to use Arrays in PowerShell. We’ll cover the basics, then move onto more advanced topics.

For all of the examples, we’ll display the code, then under it the result of our code. In this article I’ll be using PowerShell Core, 7.1.3, and VSCode. The examples should work in PowerShell 5.1 in the PowerShell IDE, although they’ve not been tested there.

Click through for a variety of tips and tricks when working with arrays.

Comments closed

Visualizing Data over Time with F#

Codesuji takes us through creating an interesting video:

How is this accomplished? I reach into F#’s bag of tricks to leverage Deedle, Plotly.NET, and ffmpeg in order to transform a series of data files into a singular video showing county-level drought data from 1900-2016. Together these bring static data into a dynamic representation. For reference, the Palmer Drought Severity Index (PDSI) typically ranges from -10 (dry) to 10 (wet). Putting this all together is pretty straight-forward, but I wanted to call out a couple specific parts. For this particular example Deedle is overkill, but pairing it with Plotly.NET can often be useful in more complex situations. Plotly offers some nice customization options, which I take advantage of below. Once all the images are generated with Plotly, F# can shell out to ffmpeg to perform the video assembly. I do this in two parts, creating both an mp4 and webm file.

We’re reading datasets, parsing text files, deserializing JSON contents, building a visual for each point in time, and then creating a video out of it—all in 100 lines of code. Not bad.

Comments closed