Paul Randal explains why it's so unlikely that you'd ever run out of bigint values in a table:
So with 1 million rows per second, you’ll be generating 1 million x 3,600 (seconds in an hour) x 24 (hours in a day) = 86.4 billion rows per day, so you’ll need about 1.4 terabytes of new storage per day. If you’re using the bigint identity as a cluster key, each row needs new space, so you’ll need almost exactly 0.5 petabytes of new storage every year.
At that rate, actually running out of bigint values, and storing them all, would require roughly 150 thousand petabytes. This is clearly impractical, especially when you consider that storing *just* a bigint is pretty pointless: you'd be storing a bigint and some other data too, probably at least doubling the storage necessary.
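If you want to sanity-check those figures, here's a quick back-of-the-envelope sketch in Python. The 16 bytes per row is my own assumption (an 8-byte bigint plus minimal row overhead), chosen because it's roughly what the 1.4 TB per day figure implies.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: ~16 bytes per row (8-byte bigint key plus minimal row overhead).
BIGINT_MAX = 2**63 - 1                    # 9,223,372,036,854,775,807
rows_per_day = 1_000_000 * 3_600 * 24     # 86.4 billion rows per day
bytes_per_row = 16

tb_per_day = rows_per_day * bytes_per_row / 1e12
pb_per_year = tb_per_day * 365 / 1000
years_to_exhaust = BIGINT_MAX / rows_per_day / 365
total_pb = pb_per_year * years_to_exhaust

print(f"{tb_per_day:.2f} TB/day, {pb_per_year:.2f} PB/year")
print(f"~{years_to_exhaust:,.0f} years and ~{total_pb:,.0f} PB to exhaust bigint")
```

That works out to about 1.4 TB per day, half a petabyte per year, and on the order of 150 thousand petabytes (and nearly 300,000 years) to burn through the whole bigint range.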
By contrast, if you have a staging table that flows 10 million rows a day (meaning 10 million rows leave and 10 million new ones enter), you'll overflow an int identity column in less than a year, about 215 days if it starts at 1. It's worth thinking about data sizes before deciding on the type of a surrogate key. Bigint is the safest choice, and if you think you'll need it, go with it. But there is that storage overhead.
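To put a rough number on that, here's a minimal sketch, assuming the identity column is seeded at 1 and only the positive range is used:

```python
# How long an identity column lasts at 10 million new rows per day,
# assuming it starts at 1 and uses only the positive range.
INT_MAX = 2**31 - 1        # 2,147,483,647
BIGINT_MAX = 2**63 - 1
rows_per_day = 10_000_000

print(f"int overflows after ~{INT_MAX / rows_per_day:.0f} days")              # ~215 days
print(f"bigint overflows after ~{BIGINT_MAX / rows_per_day / 365:,.0f} years")  # billions of years
```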