Thinking About The GitLab Outage

Brent Ozar shares his thoughts on the recent GitLab outage:

You can read more about the details in GitLab’s outage timeline doc, which they heroically shared while they worked on the outage. Oh, and they streamed the whole thing live on YouTube with over 5,000 viewers.

There are so many amazing lessons to learn from this outage: transparency, accountability, processes, checklists, you name it. I’m not sure that you, dear reader, can actually put a lot of those lessons to use, though. After all, your company probably isn’t going to let you live stream your outages. (I do pledge to you that I’m gonna do my damnedest to do that ourselves with our own services, though.)

There are some good pointers in here.
