Press "Enter" to skip to content

Day: August 2, 2021

A Warning: VPCs and Distributed Database Platforms

Wade Trimmer takes us through a reason why you might not want to use VPC endpoints to separate applications from distributed database platforms:

AWS PrivateLink (also known as a VPC endpoint) is a technology that allows the user to securely access services using a private IP address. It is not recommended to configure an AWS PrivateLink connection with Apache Kafka or Apache Cassandra mainly due to a single entry point problem. PrivateLink only exposes a single IP to the user and requires a load balancer between the user and the service. Realistically, the user would need to have an individual VPC endpoint per node, which is expensive and may not work. 

Using PrivateLink, it is impossible by design to contact specific IPs within a VPC in the same way you can with VPC peering. VPC peering allows a connection between two VPCs, while PrivateLink publishes an endpoint that others can connect to from their own VPC.

Read on to understand how this affects platforms like Cassandra and Kafka.

Comments closed

Fun with Excel Cube Functions

Chris Webb has a new series on Excel cube functions combined with Lambda helper functions. First up, Chris collects items from a set:

This example shows the fundamental problem that has always existed with CubeRankedMember though: in order to show all the items in a set you need to know how many items there are in advance, and populate as many cells with CubeRankedMember formulas as there are items. In this case see how the range B4:B6 contains the numbers 1, 2 and 3; these numbers are used in the formulas in the range C4:C6 to get the first, second and third items in the set.

If a fourth product was added to the table, however, it would not appear automatically – you would have to add another cell with another CubeRankedMember formula in it manually. I’ve seen some workarounds but they’re a bit hacky and require you to know what the maximum possible number of items in a set could ever be. Indeed that’s always been one of the key differences between cube functions and PivotTables: cube functions are static whereas PivotTables can grow and shrink dynamically when the data changes.

Read on for a better answer. Then, check out part 2, which covers generating a dynamic table using Excel cube functions and Lambda helpers:

In the first post in this series I showed how to use the new Excel Lambda helper functions to return an array containing all the items in a set. That isn’t very useful on its own, so in this post I’ll show you how to generate an entire dynamic table using Excel cube functions and Lambda helper functions.

In this post I’ll be using the same source data as in my previous post: a table containing sales data with just two columns.

Comments closed

Configuring Classifiers for Workload Management

Reiss McSporran continues a series on workload management in Azure Synapse Analytics:

So in part 1 we found out what Workload Management was and why we wanted to use it. (If you haven’t read that part, please click here and read that post before you carry on, as we’re jumping in at the deep end!) In this part we’ll be learning how to configure it.

Continuing from the end of the previous post, you have two main options to configure this. First is to assign a user or role to one of the existing system defined workload groups, second is to create a custom workload group and assign a user or role to this instead.

Read on to learn how to create classifiers, what the rules look like for them, and their importance.

Comments closed

Creating a Time Zone Dimension

Jonathan Kehayias has a follow-up to a prior post:

The basis for the approach in this post is going to be the TimeZoneInfo class from .NET, which is “supposed” to be up to date with any time zone DST rules from the Windows Registry. I used a modification of the PowerShell example in the .NET Docs page for the TimeZoneInfo class to build a a flat file that is then bulk loaded to a table in SQL Server using BCP with all the time zone adjustment rules:

There’s a lot of effort in here, but as Jonathan notes, it will provide you the ability to get the right time zone as of a particular date and time. Or at least, maybe—I’m not sure if the Windows registry includes historical time zone shifts such as when the US changed when daylight savings time starts and ends.

Comments closed

Putting a SQL Server Database into Recovery Pending

John Morehouse causes harm to a perfectly good database:

You might be asking why on earth would you want to get a database into an undesirable state, more specifically into a Recovery Pending state.  Well, in my case, I had a client database that had entered into this state due to a failure of storage.  Thankfully, this database was not being used for production, so I had some time to determine the best way to fix the issue.

A phrase that was often used during my time in the fire service, was “Try Before You Pry”.  Does the front door of the house need to be kicked in?  It may be unlocked and by trying before prying (with my boot) I can prevent damage to the door.  In these types of scenarios, this philosophy holds true.  Try things out on non-critical databases will help prevent any further damage.

In this instance, I want to try it before forcing something that might be damaging.  This meant I had to get a test database into a recovering state.  Once it is in the state I need, then I can attempt different methods to recover the database properly.  Once a successful solution is determined, I can then confidently pry against the damaged database in production knowing that I am using a validated solution.

Read on to learn one way to put a database into recovery pending state, but definitely pay attention to the disclaimer. And its twin.

Comments closed