An RDS Postmortem

Andy Isaacson covers a performance issue Honeycomb experienced with RDS:

In retrospect, the failure chain had just 4 links:

  1. The RDS MySQL database instance backing our production environment experienced a sudden and massive performance degradation; P95(query_time) went from 11 milliseconds to >1000 milliseconds, while write-operation throughput dropped from 780/second to 5/second, in just 20 seconds.

  2. RDS did not identify a failure, so the Multi-AZ feature did not activate to fail over to the hot spare.

  3. As a result of the delays due to increased query_time, the Go MySQL client’s connection pool filled up with connections waiting for slow query results to come back and opened more connections to compensate.

  4. This exceeded the max_connections setting on the MySQL server, leading to cron jobs and newly-started daemons being unable to connect to the database and triggering many “Error 1040: Too many connections” log messages.
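Links 3 and 4 describe a client pool growing past the server's connection ceiling. As a minimal sketch of one common mitigation (not necessarily what Honeycomb did), Go's `database/sql` lets you cap the pool so that slow queries make callers wait for a free connection instead of opening new ones. The helper names `perInstanceLimit` and `capPool`, the reserved headroom, and the five-minute lifetime are all assumptions made for illustration.

```go
package main

import (
	"database/sql"
	"time"
)

// perInstanceLimit (hypothetical helper) divides the server's max_connections
// among application instances, holding back `reserved` connections so that
// cron jobs and newly started daemons can still connect.
func perInstanceLimit(maxConnections, reserved, instances int) int {
	if instances < 1 {
		instances = 1
	}
	limit := (maxConnections - reserved) / instances
	if limit < 1 {
		return 1
	}
	return limit
}

// capPool (hypothetical helper) puts a hard ceiling on the pool. By default
// database/sql allows an unlimited number of open connections; once a limit
// is set, callers block until a connection is returned to the pool.
func capPool(db *sql.DB, limit int) {
	db.SetMaxOpenConns(limit)
	db.SetMaxIdleConns(limit / 2)
	// Assumed value: recycle connections periodically.
	db.SetConnMaxLifetime(5 * time.Minute)
}
```

Calling `capPool(db, perInstanceLimit(maxConns, reserved, instances))` right after `sql.Open` applies the cap. Because a capped pool makes callers queue, it is usually paired with context deadlines (for example `db.QueryContext`) so that waiting requests eventually give up.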

This was very interesting to read, and I applaud companies that publish these kinds of post-mortems, especially because publicizing the reasons for a failure can be so scary.
