November 2021 – Page 4

Managed Instance Failover Groups

Published 2021-11-23 by Kevin Feasel

Arun Sirpal takes us through Azure SQL Managed Instance failover groups:

If you have been following me for a while you will know that I really like the Fail over groups within Azure SQL DB and it is no different to when applying it to Managed Instances. If you want a rock-solid DR plan, this is the way forward.
Remember it’s an abstraction layer on top of the active geo-replication feature, before this we had to do a lot of manual one to one database setups but now this feature simplifies deployment and management of geo-replicated databases at scale. You can initiate failover manually or automatically if there is a massive failure (researching this topic this could mean things from memory leaks to wrong network cables cut during routine hardware decommissioning – you never know, it could happen so plan for it)

Click through to see how to set this up and what failover looks like.

Comments closed

Intent Shared Locks in RCSI

Published 2021-11-23 by Kevin Feasel

Tomas Zika troubleshoots a deadlocking problem:

I was analyzing a deadlock graph and there was a mystery lock of type IS (Intent Shared). That was weird by itself because the database has Read Committed Snapshot Isolation (RCSI) enabled, which is the Optimistic Concurrency model that shouldn’t take shared locks. All the statements were contained in this database. Also, the locked table was seemingly unrelated to anything that has been going on in the deadlock report.

Click through for an image, a repro script, and an explanation as to what exactly is going on.

Comments closed

Decision-Making with Bayes’s Theorem

Published 2021-11-22 by Kevin Feasel

Bill Schmarzo lays out a framework to classify decision-making:

In my blog “Making Informed Decisions in Imperfect Situations”, I discussed the importance of properly and objectively framing the decision that we seek to make and how that impacts the data that we gather (and ignore) in an effort to make an informed decision. That is:
Are you trying to gather data to determine the right decisions or are you gathering data to support the decision that you have already made?
In that blog, I introduced two tools that can help us make informed decisions using the best available data, even when that data might be incomplete, conflicting, and/or distorted by others.

Read the whole thing.

Comments closed

Using Enums in Powershell

Published 2021-11-22 by Kevin Feasel

Robert Cain quietly tells us that Powershell is a real programming language, sysadmins who claim to hate programming:

This post begins a series on using Classes in PowerShell. As a first step, we will cover the use of an Enum, as enums are frequently used in combination with classes.
An Enum is a way to provide a set of predetermined values to the end user. This allows the user to pick from a finite list, and assure a value being passed into a function or class will be valid.

Click through to learn more about enums and how they work in Powershell.

Comments closed

Ranking Window Functions

Published 2021-11-22 by Kevin Feasel

I continue a series on window functions in SQL Server:

The whole concept of ranking window functions is to assign some numeric ordering to a dataset. There are four ranking functions in SQL Server. Three of them are very similar to one another: ROW_NUMBER(), RANK(), DENSE_RANK(). The fourth one, NTILE(), is the odd cousin of the family.
Unlike aggregate window functions, all ranking window functions must have at least an ORDER BY clause in the OVER() operator. The reason is that you are attempting to bring order to the chaos of your data by assigning a number based on the order in which you join.

Watch me ramble on about monotonicity and quietly admit that I learned what it was from economics, where the naming feels utterly backward (“strongly monotonic” is the “greater than or equal to” of monotonicity, whereas “weakly monotonic” is the “greater than” of monotonicity). Also, I structured this entire post so that I could get that video from The Prisoner (the good one, not the garbage one) in it.

Comments closed

Data Types Matter, Even in the Serverless SQL Pool

Published 2021-11-22 by Kevin Feasel

Jovan Popovic has a public service announcement for us:

The serverless SQL pool is a distributed computing system that executes concurrent queries on a set of distributed compute nodes. Multiple compute nodes are running the parts of a distributed query plan that read the underlying files, join the data sets, group, and aggregate results. Different queries might try to use the same compute nodes to execute the parts of the queries.
The oversized column types like VARCHAR(MAX) might trick the compute node to allocate more resources than is needed. However, the allocation is based on the estimate, but these over-allocated resources will not be used in actual execution because they are not needed. If a compute node needs 100MB to sort the results it will use these 100MB although the query optimizer allocated 4GB of memory for the task on the compute node.

Read the whole thing.

Comments closed

Azure Synapse Database Templates

Published 2021-11-22 by Kevin Feasel

Aaron Merrill announces database templates for Azure Synapse Analytics:

The Synapse database template for Agriculture is a comprehensive data model that addresses the typical data requirements of organizations engaged in growing crops, raising livestock, and producing dairy products, including field and pasture management and satellite and drone data.
The Synapse database template for Energy & Commodity Trading is a comprehensive data model that addresses the typical data requirements of organizations engaged in trading energy, commodities, and/or carbon credits, whether as a primary trading business or in support of their supply chains, operating businesses, and hedging activities.

You may remember Microsoft buying ADRM Software a while back. This is why.

Comments closed

A Case for EAV?

Published 2021-11-22 by Kevin Feasel

Erik Darling makes the case:

EAV styled tables can be excellent for certain data design patterns, particularly ones with a variable number of entries.
Some examples of when I recommend it are when users are allowed to specify multiple things, like:

I’m not sure I agree on the examples. When there are specific known things with expected shapes, I’d rather have a separate entity to model each. Even if each table is a single string, I’d still like the separation for logical modeling purposes.

That said, there are cases when EAV ends up being the best approach (unfortunately), particularly when you don’t even know the types of things a customer would wish to include. Just try to fight back hard when the inevitable request comes in to pivot all of that data.

Comments closed

Solving Linear Constraints with Python

Published 2021-11-19 by Kevin Feasel

Luke Menzies and Gavita Regunath create a schedule:

Linear optimisation (often referred to as linear programming) is not cutting edge or new. It has been around for a very long time. It was first introduced within the field of operational research during World War II, where it was used to help minimise costings. The method proposed for solving these problems is known as the simplex method, and it hasn’t changed much today. Although this method hasn’t changed significantly, what has changed significantly is the computing power and accessibility of this technique, allowing these methods to be used on very complex scenarios with almost a click of a button. Convenient libraries have allowed the intricate complexities of setting these problems up on a computer to be simplified.

Read on for an example of linear programming. This is something I’ve always enjoyed, but haven’t had many places to use this technique in my professional career. That said, shout out to everyone who’s ever used LINGO.

Comments closed

Validating Pandas DataFrames with Pydantic

Published 2021-11-19 by Kevin Feasel

Sebastian Cattes continues a series on using Pydantic:

In part 1 of the article we learned about dynamic typing, Pydantic and decorators.
In this part we will learn how to combine these concepts for Pandas DataFrame validation in our codebase.
1. Combining Decorators, Pydantic and Pandas – Combine section 2. and 3. of Part 1 to showcase how to use them for output validation.
2. Let’s define ourselves a proper spaceship!
3. Summary

Check out both parts of the article.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Month: November 2021