My aim with this post is to split the dataset into batches of roughly 100 rows each.
DECLARE @target_rowcount bigint=100;
I say “roughly”, because we’re not allowed to split a transaction so that a group (grouping_column_1, grouping_column_2) appears in more than one batch, although a batch can obviously contain more than one group. This means that by necessity, some of the batches are going to be slightly under 100 rows and some are going to be slightly over.
Read on for a good solution to the problem. Daniel mentions places where performance could be better, though this feels like the kind of task you don’t necessarily run all that frequently.