Unsharing The Database

Randy Shoup talks about scaling up through breaking out a shared database:

For an early- and mid-stage startup, a monolithic database is absolutely the appropriate architecture choice. With a small team and a small company, a single shared database made it simple to get started. Moving fast meant being able to make rapid changes across the entire system. A shared database made it very easy to join data between different tables, and it made transactions across multiple tables possible. These are pretty convenient.

As we have gotten larger, those benefits have become liabilities. It has become a single point of failure, where issues with the shared database can bring down nearly all of our applications. It has become a performance bottleneck, where long-running operations from one application can slow down others. Finally, and most importantly, the shared database has become a coupling point between teams, slowing down our ability to make changes.

I have my misgivings (as you’d expect from a database snob), particularly because I value highly the benefits of normalization and see sharded systems as a step backwards in that regard.  But even with that said, there are absolutely benefits to slicing out orthogonal sections of data; the point of disagreement is in those places in which two teams’ entities and attributes overlap.

Related Posts

Event Sourcing On Kafka

Adam Warski shows how you can use Apache Kafka as your event sourcing data source: There’s a number of great introductory articles, so this is going to be a very brief introduction. With event sourcing, instead of storing the “current” state of the entities that are used in our system, we store a stream of events that relate to these […]

Read More

Avoid Scalar Functions In Computed Columns

Daniel Hutmacher shows why you should not include scalar functions inside computed column definitions: Scalar functions can be a real headache when you’re performance tuning. For one, they don’t parallelize. In fact, if you use a scalar function in a computed column, it will prevent any query that uses that table from going parallel – even if you […]

Read More


May 2016
« Apr Jun »