Press "Enter" to skip to content

Tips for Limiting Redis Failures

Phil Booth provides the ammo and we provide the feet:

Production outages are great at teaching you how not to cause production outages. I’ve caused plenty and hope that by sharing them publicly, it might help some people bypass part one of the production outage learning syllabus. Previously I discussed ways I’ve broken prod with PostgreSQL and with healthchecks. Now I’ll show you how I’ve done it with Redis too.

For the record, I absolutely love Redis. It works brilliantly if you use it correctly. The gotchas that follow were all occasions when I didn’t use it correctly.

My one addition here is to be really careful if you use Redis as persistent storage rather than a cache. Redis as a cache is easy: if the server goes down or you have trouble, you simply have more database calls than normal. Redis as persistent storage is a much more complicated beast which seems to fall over a lot more often and is significantly more finicky about drivers.