An RDS Postmortem

Andy Isaacson covers a performance issue Honeycomb experienced with RDS:

In retrospect, the failure chain had just 4 links:

  1. The RDS MySQL database instance backing our production environment experienced a sudden and massive performance degradation; P95(query_time) went from 11 milliseconds to >1000 milliseconds, while write-operation throughput dropped from 780/second to 5/second, in just 20 seconds.

  2. RDS did not identify a failure, so the Multi-AZ feature did not activate to fail over to the hot spare.

  3. As a result of the delays due to increased query_time, the Go MySQL client’s connection pool filled up with connections waiting for slow query results to come back and opened more connections to compensate.

  4. This exceeded the max_connections setting on the MySQL server, leading to cron jobs and newly-started daemons being unable to connect to the database and triggering many “Error 1040: Too many connections” log messages.

This was very interesting to read, and I applaud companies that make these kinds of post-mortems public, especially because publicizing the reasons for a failure can be so scary.

