Press "Enter" to skip to content

Month: February 2024

Supporting Replication in dbatools

Jess Pomfret makes an announcement:

Replication: everyone loves to hate it, but it’s been our most requested feature as far as adding commands to dbatools goes. For ages we’ve said ‘sounds great’ and ‘we would love that’, but when we started looking into it, the energy soon fizzled away. Due to its dependency on RMO (Replication Management Objects), as opposed to SMO (SQL Server Management Objects), things weren’t as easy as we had hoped.

That said, it’s now there as of version 2.1.1. It supports snapshot, transactional, and merge replication. No peer-to-peer, but is anyone actually surprised at that?

Search Patterns in T-SQL

Erik Darling puts on the fedora and grabs the bullwhip:

First, what you should not do: a universal search string.

The problem here is somewhat obvious if you’ve been hanging around SQL Server long enough. Double wildcard searches, searching with a string type against numbers and dates, strung-together OR predicates that the optimizer will hate you for.

These aren’t problems that other things will solve either. For example, using CHARINDEX or PATINDEX isn’t a better pattern for double wildcard LIKE searching, and different takes on how you handle parameters being NULL don’t buy you much.

Read on for an example of a terrible search query, a mediocre search query, a good search query, and a possible unicorn: an actually valid reason to use a non-clustered columnstore index.
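To make the anti-pattern concrete, here is a rough sketch of the kind of "universal search" procedure Erik is warning about. The table and column names are invented for illustration, not taken from his post:

    CREATE OR ALTER PROCEDURE dbo.SearchUsers_Bad
        @SearchString nvarchar(100) = NULL
    AS
    BEGIN
        /* Double wildcards defeat index seeks, and comparing one string
           against numbers and dates forces conversions on every row. */
        SELECT u.Id, u.DisplayName, u.CreationDate, u.Reputation
        FROM dbo.Users AS u
        WHERE u.DisplayName LIKE N'%' + @SearchString + N'%'
           OR CONVERT(nvarchar(100), u.Reputation) LIKE N'%' + @SearchString + N'%'
           OR CONVERT(nvarchar(100), u.CreationDate, 120) LIKE N'%' + @SearchString + N'%'
           OR @SearchString IS NULL;
    END;

Erik's post walks through progressively better versions, so I won't spoil the ending here.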

Clustered Index Rebuilds and Transaction Log Space

Chad Callihan talks about a potential problem:

Of all the problems that can cause a database transaction log to fill up, perhaps one lesser-known cause is a large clustered index rebuild. Let’s look at a scenario where this issue could cause a headache.

Click through for the scenario. I will say that I disagree with reorganizing the index under pretty much any circumstances. Jeff Moden, in a long and deep but extremely interesting presentation he gave to us at TriPASS three years ago, laid out why that is. I do agree that rebuilding a clustered index should be uncommon and may be a sign of a data modeling problem.

The other thing to note is that this problem is bigger when you rebuild online and your data drive is the same as your log drive. When you rebuild an index online, SQL Server needs enough space to make a copy of the data, so the rule of thumb is that you’ll have 2x the normal disk space requirements during online rebuild. It’s not necessarily exactly 2x because of fragmentation, compression, and other factors, but that’s why it’s a rule of thumb and not a “guarantee or your money back” type of thing.
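For reference, the operation in question looks something like the sketch below; the index and table names are illustrative. Pushing the sort to tempdb can take some pressure off the data drive, though under the full recovery model the rebuild is still fully logged, which is what eats transaction log space:

    -- Online rebuild of a clustered index (Enterprise Edition or Azure SQL)
    ALTER INDEX PK_SalesOrderDetail
        ON dbo.SalesOrderDetail
        REBUILD WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);

    -- Keep an eye on log growth while it runs
    SELECT total_log_size_in_bytes / 1048576.0 AS log_size_mb,
           used_log_space_in_bytes / 1048576.0 AS used_log_mb
    FROM sys.dm_db_log_space_usage;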

Postgres Backup and Recovery with Barman

Muhammad Ali shows off another way to back up and restore data in Postgres databases:

Barman is a production-grade tool for managing the backup and recovery process of PostgreSQL databases. It not only handles the physical backups but also provides automatic management of retention policies, compression, a near-zero Recovery Point Objective (RPO), and point-in-time recovery (PITR) at any desired point within the recovery window.

Barman’s Stream Archiving feature stands out as a key component in achieving zero Recovery Point Objective (RPO). This is achieved by using the pg_receivewal utility, which continuously backs up Write-Ahead Log (WAL) files in real time to a designated Barman server. This capability is particularly important for applications where even minimal data loss is unacceptable.

Read on for a couple of questions about it and a demonstration of how Barman works.
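If you want to sanity-check the WAL streaming side from the Postgres end, the server's own catalog views show whether a streaming client such as pg_receivewal is attached and keeping up. This is a minimal sketch using standard views, not a Barman command, and the exact picture will depend on your Barman configuration:

    -- Is a streaming client (e.g. Barman's pg_receivewal) connected?
    -- Column names are per PostgreSQL 10 and later.
    SELECT application_name, state, sent_lsn, flush_lsn
    FROM pg_stat_replication;

    -- Is the replication slot Barman uses retaining WAL as expected?
    SELECT slot_name, active, restart_lsn
    FROM pg_replication_slots;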

Common Mistakes in T-SQL Code

Rich Benner builds a list:

One potential issue here is the lack of time to delve really deeply into each programming language at one’s disposal. The thing about SQL Server is that its language performs very differently from, say, C#. Those languages are row-based, which means it’s common to iterate through a data set and process each row individually. This works quite well in those languages, but quite terribly in SQL Server. Being a set-based language, SQL Server prefers to deal with the whole data set at once.

Because of these nuances, it’s easy for developers to make mistakes when creating SQL Server code. Let’s go through some common errors that we see.

Read on for several common mistakes that we often find in our code.
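The row-by-row versus set-based point is the big one. As a quick illustration (table and column names invented for the example, not taken from Rich's post), the first pattern below is what row-by-row thinking tends to produce, while the second asks the engine to handle the whole set at once:

    -- Row-by-row: a cursor updating one row at a time
    DECLARE @Id int;
    DECLARE order_cursor CURSOR FOR
        SELECT Id FROM dbo.Orders WHERE Status = N'Pending';
    OPEN order_cursor;
    FETCH NEXT FROM order_cursor INTO @Id;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        UPDATE dbo.Orders SET Status = N'Processed' WHERE Id = @Id;
        FETCH NEXT FROM order_cursor INTO @Id;
    END;
    CLOSE order_cursor;
    DEALLOCATE order_cursor;

    -- Set-based: one statement over the whole set
    UPDATE dbo.Orders
    SET Status = N'Processed'
    WHERE Status = N'Pending';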

Adding Superscripts and Subscripts to Axis Labels in R

Steven Sanderson changes the script:

Before we dive into the code, let’s quickly review what superscripts and subscripts are.

  • Superscripts: These are smaller-sized characters or numbers that appear above the baseline of the text. They are often used to denote exponents or indices.
  • Subscripts: On the other hand, subscripts are smaller-sized characters or numbers that appear below the baseline of the text. They are commonly used in mathematical expressions or chemical formulas.

Read on to see how you can generate these in R visuals.

GUID Conversion and the Serverless SQL Pool

Reitse Eskens hits a weird error:

One of the transformations is to change one primary key column from integer to GUID. This is something you can do with some trickery you’ll see in the code. But what I found was that, even though the primary key is unique, the GUIDs weren’t. And then the fun starts: digging into the why…

Read on for the research Reitse performed. I don’t even have a good guess for this, it’s so weird. It feels like a bug but it’s weird regardless.
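For context, the sort of integer-to-GUID trickery involved usually looks something like the conversion below. This is a general illustration rather than Reitse's exact code, so treat the approach (and the table and column names) as assumptions on my part:

    -- Deterministically turn an integer key into a uniqueidentifier
    -- by padding it out to 16 bytes first
    SELECT CustomerId,
           CONVERT(uniqueidentifier, CONVERT(binary(16), CustomerId)) AS CustomerGuid
    FROM dbo.Customers;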

Full and Incremental Loads in Microsoft Fabric

John Miner continues a series on data engineering in Microsoft Fabric:

In a data lake, we have a bronze quality zone that is supposed to represent the raw data in delta file format. This might include versions of the files for auditing. In the silver quality zone, we have a single version of the truth. The data is de-duplicated and cleaned up. How can we achieve these goals using the Apache Spark engine in Microsoft Fabric?

Read on for John’s take on the answer. I’ve found that I have a fairly good answer for smaller datasets, though the larger the data gets, the less I like my answers for the raw layer.
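For the incremental side, the pattern generally boils down to merging the latest bronze slice into the silver table on the business key, which handles inserts and updates in one pass and keeps silver de-duplicated. A rough sketch, with hypothetical table and column names and the caveat that the exact syntax varies slightly between the Spark and warehouse engines:

    MERGE INTO silver.Customers AS tgt
    USING bronze.Customers_Latest AS src
        ON tgt.CustomerId = src.CustomerId
    WHEN MATCHED THEN
        UPDATE SET tgt.Name = src.Name,
                   tgt.Email = src.Email,
                   tgt.ModifiedDate = src.ModifiedDate
    WHEN NOT MATCHED THEN
        INSERT (CustomerId, Name, Email, ModifiedDate)
        VALUES (src.CustomerId, src.Name, src.Email, src.ModifiedDate);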

Community Updates to Community Stored Procedures

Erik Darling shares some updates:

If you’re the kind of person who needs quick and easy ways to troubleshoot SQL Server performance problems, and you haven’t tried my free scripts yet, you’re probably going to keep having SQL Server performance problems.

I don’t get a lot of visitor contributions to my code (and here I thought it was just because it’s perfect), but I had a couple cool recent additions to sp_QuickieStore, my free tool for searching and analyzing Query Store data.

Read on to see what’s new in a few procedures. Also, Die verboten Toten would absolutely be the name of my German punk rock or possibly DC hardcore style band.
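If you haven't tried sp_QuickieStore before, a minimal call looks roughly like the following. The database name is a placeholder, and it's worth checking the procedure's built-in help for the current parameter list rather than trusting my memory of it:

    -- Point sp_QuickieStore at a database with Query Store enabled
    EXEC dbo.sp_QuickieStore
        @database_name = N'YourDatabase';

    -- Built-in help listing the available parameters
    EXEC dbo.sp_QuickieStore
        @help = 1;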

The Risk of Query Failure on Readable Secondaries in SQL Server

Kendra Little explains a problem:

If you use readable secondaries in Availability Groups or Read-Scale out instances in Azure SQL Managed Instance, you may have queries fail repeatedly if there is a glitch and statistics are not successfully “refreshed” on the secondary replica. Those queries may keep failing until you manually intervene.

This has been the case for a few years, and it’s unclear if Microsoft will fix this. There is a well-established support deflection article that documents the issue and provides ‘workarounds’.

Read on for Kendra’s thoughts. I haven’t run into this before, myself, but I also don’t tend to make very heavy use of readable secondaries.
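For what it's worth, statistics created on a readable secondary live in tempdb as "temporary" statistics, so if you suspect you're hitting this, a quick look at what the secondary has (or hasn't) managed to create can help. A minimal diagnostic sketch, run on the secondary:

    -- Temporary statistics created on the readable secondary;
    -- their names typically end in _readonly_database_statistics
    SELECT OBJECT_NAME(s.object_id) AS table_name,
           s.name AS stats_name,
           s.is_temporary
    FROM sys.stats AS s
    WHERE s.is_temporary = 1;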