Solr Lock Contention

Michael Sun shows how the Apache Solr team found and fixed a performance issue in their code:

Based on this testing, lock contention, which usually results in a performance bottleneck and underutilized resources, was our first “suspect.” We knew that using a commercial Java profiler, such as Yourkit, JProfiler and Java Flight Recorder, would help easily identify locks and determine how much time threads spend waiting on them. Meanwhile, the team had built custom infrastructure that allows one to run experiments with a profiler attached via a single command-line parameter.

In my own testing, the profiler data indeed revealed some contention particularly related to VersionBucket andHdfsUpdateLog locks, leading to long thread wait time. Although promisingly, this result corresponded somewhat to the description in SOLR-6820, nothing actionable resulted from the experiment.

I like these sorts of case studies because example is the school of mankind.  In this particular case, I really like the methodical approach, using available information to search for a root cause.  Some of the things Michael calls “false starts” I would consider to be initial steps:  checking OS, filesystem, and garbage collection metrics are important even in a case like this in which they did not lead to the culprit, as they help you eliminate suspects.

Related Posts

OLTP-Friendly Database Deployments

Michael Swart looks at one of the biggest problems when trying to do a zero-downtime deployment to an OLTP system: There are two main kinds of SQL queries. SELECT/INSERT/UPDATE/DELETE statements are examples of Data Manipulation Language (DML). CREATE/ALTER/DROP statements are examples of Data Definition Language (DDL). With schema changes – DDL – we have the […]

Read More

What Update Locks Do

Guy Glantser explains the process around updating data in SQL Server, particularly the benefit of having update locks: In order to update a row, SQL Server first needs to find that row, and only then it can perform the update. So every UPDATE operation is actually split into two phases – first read, and then […]

Read More


August 2016
« Jul Sep »