Sometimes you just can’t take the outage. In that case, you’ve got to proceed with your own wits, and your own code. This is tricky because changes are occurring to the table.
The solution typically looks like this:
Set up a way to track changes to the table – either triggers that duplicate off modifications or Change Data Capture (Enterprise Edition)
Create the new table with the new data type, set identity_insert on if needed
Insert data into the new table. This is typically done in small batches, so that you don’t overwhelm the log or impact performance too much. You may use a snapshot from the point at which you started tracking changes.
Start applying changed data to the new table
Make sure you’re cleaning up from the changed data you’re catching and not running out of space
Write scripts to compare data between the old and new tables to make sure you’re really in sync (possibly use a snapshot or a restored backup to compare a still point in time)
Cut over in a quick downtime at some point using renames, schema transfer, etc. If it’s an identity column, don’t forget to fix that up properly.
This method matches what I’ve done in zero downtime situations.
In part 3 of this series, I showed two workarounds to avoid widening an
IDENTITYcolumn – one that simply buys you time, and another that abandons
IDENTITYaltogether. The former prevents you from having to deal with external dependencies such as foreign keys, but the latter still doesn’t address that issue. In this post, I wanted to detail the approach I would take if I absolutely needed to move to
bigint, needed to minimize downtime, and had plenty of time for planning.
Because of all of the potential blockers and the need for minimal disruption, the approach might be seen as a little complex, and it only becomes more so if additional exotic features are being used (say, partitioning, In-Memory OLTP, or replication).
At a very high level, the approach is to create a set of shadow tables, where all the inserts are directed to a new copy of the table (with the larger data type), and the existence of the two sets of tables is as transparent as possible to the application and its users.
Those are two good posts on this topic.