Kendra Little walks through the process of expanding an int column into a bigint:
Sometimes you just can’t take the outage. In that case, you’ve got to proceed with your own wits, and your own code. This is tricky because changes are occurring to the table.
The solution typically looks like this:
- Set up a way to track changes to the table – either triggers that duplicate off modifications or Change Data Capture (Enterprise Edition)
- Create the new table with the new data type, set identity_insert on if needed
- Insert data into the new table. This is typically done in small batches, so that you don’t overwhelm the log or impact performance too much. You may use a snapshot from the point at which you started tracking changes.
- Start applying changed data to the new table
- Make sure you’re cleaning up from the changed data you’re catching and not running out of space
- Write scripts to compare data between the old and new tables to make sure you’re really in sync (possibly use a snapshot or a restored backup to compare a still point in time)
- Cut over in a quick downtime at some point using renames, schema transfer, etc. If it’s an identity column, don’t forget to fix that up properly.
This method matches what I’ve done in zero-downtime situations.
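To make those steps concrete, here is a rough T-SQL sketch of the pattern against a hypothetical dbo.Orders table with an int identity key. Every object name and the batch size are illustrative, the change-replay and cleanup steps are only hinted at in comments, and a real migration would also need to handle foreign keys, indexes, and permissions.

```sql
-- Assumed existing table (illustrative):
--   dbo.Orders (OrderID int IDENTITY PRIMARY KEY, OrderDate datetime2, Amount decimal(10,2))

-- 1. New copy of the table with the widened key
CREATE TABLE dbo.Orders_New
(
    OrderID   bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    OrderDate datetime2 NOT NULL,
    Amount    decimal(10,2) NOT NULL
);
GO

-- 2. Track changes to the original table while the backfill runs
CREATE TABLE dbo.Orders_Changes
(
    ChangeID   bigint IDENTITY(1,1) PRIMARY KEY,
    OrderID    int NOT NULL,
    ChangeType char(1) NOT NULL   -- I / U / D
);
GO

CREATE TRIGGER dbo.trg_Orders_TrackChanges
ON dbo.Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Inserts: key appears only in the inserted pseudo-table
    INSERT dbo.Orders_Changes (OrderID, ChangeType)
    SELECT i.OrderID, 'I' FROM inserted AS i
    WHERE NOT EXISTS (SELECT 1 FROM deleted AS d WHERE d.OrderID = i.OrderID);

    -- Updates: key appears in both inserted and deleted
    INSERT dbo.Orders_Changes (OrderID, ChangeType)
    SELECT i.OrderID, 'U' FROM inserted AS i
    JOIN deleted AS d ON d.OrderID = i.OrderID;

    -- Deletes: key appears only in deleted
    INSERT dbo.Orders_Changes (OrderID, ChangeType)
    SELECT d.OrderID, 'D' FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted AS i WHERE i.OrderID = d.OrderID);
END;
GO

-- 3. Backfill in small batches to limit log growth and blocking (assumes positive keys)
DECLARE @BatchSize int = 10000, @CurrentID int = 0, @MaxID int;
SELECT @MaxID = MAX(OrderID) FROM dbo.Orders;

SET IDENTITY_INSERT dbo.Orders_New ON;
WHILE @CurrentID < @MaxID
BEGIN
    INSERT dbo.Orders_New (OrderID, OrderDate, Amount)
    SELECT OrderID, OrderDate, Amount
    FROM dbo.Orders
    WHERE OrderID > @CurrentID AND OrderID <= @CurrentID + @BatchSize;

    SET @CurrentID += @BatchSize;
END;
SET IDENTITY_INSERT dbo.Orders_New OFF;
GO

-- 4. Compare the copies (should return zero rows once queued changes are replayed)
SELECT OrderID, OrderDate, Amount FROM dbo.Orders
EXCEPT
SELECT OrderID, OrderDate, Amount FROM dbo.Orders_New;
GO

-- 5. Cut over in a short downtime: stop writes, replay any rows still queued in
--    dbo.Orders_Changes (omitted here), then swap the names
BEGIN TRAN;
    EXEC sp_rename 'dbo.Orders', 'Orders_Old';
    EXEC sp_rename 'dbo.Orders_New', 'Orders';
COMMIT;

-- Sanity check: make sure the identity seed continues past the highest existing value
DBCC CHECKIDENT ('dbo.Orders', RESEED);
GO
```

Keying the backfill on ranges of the identity column is what keeps each transaction small, which is the point of batching: the log stays manageable and blocking on the source table is brief.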
Also see Aaron Bertrand’s article on the same topic:
In part 3 of this series, I showed two workarounds to avoid widening an IDENTITY column – one that simply buys you time, and another that abandons IDENTITY altogether. The former prevents you from having to deal with external dependencies such as foreign keys, but the latter still doesn’t address that issue. In this post, I wanted to detail the approach I would take if I absolutely needed to move to bigint, needed to minimize downtime, and had plenty of time for planning.

Because of all of the potential blockers and the need for minimal disruption, the approach might be seen as a little complex, and it only becomes more so if additional exotic features are being used (say, partitioning, In-Memory OLTP, or replication).
At a very high level, the approach is to create a set of shadow tables, where all the inserts are directed to a new copy of the table (with the larger data type), and the existence of the two sets of tables is as transparent as possible to the application and its users.
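One simple way to picture the “transparent” part of that shadow-table idea (not necessarily the exact mechanism Aaron uses in his article) is to put a view carrying the original table name over the new, wider copy, with an INSTEAD OF trigger redirecting writes. The dbo.Widgets objects below are purely hypothetical:

```sql
-- Shadow copy with the widened key; the original dbo.Widgets is assumed to have
-- been renamed to dbo.Widgets_Old and backfilled into this table already.
CREATE TABLE dbo.Widgets_Shadow
(
    WidgetID bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name     nvarchar(100) NOT NULL
);
GO

-- The view takes over the original name, so existing SELECTs keep working
CREATE VIEW dbo.Widgets
AS
    SELECT WidgetID, Name
    FROM dbo.Widgets_Shadow;
GO

-- Inserts against the old name are redirected to the shadow table
CREATE TRIGGER dbo.trg_Widgets_Insert
ON dbo.Widgets
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT dbo.Widgets_Shadow (Name)
    SELECT Name FROM inserted;
END;
GO
```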
Those are two good posts on this topic.