Solr Lock Contention

Michael Sun shows how the Apache Solr team found and fixed a performance issue in their code:

Based on this testing, lock contention, which usually results in a performance bottleneck and underutilized resources, was our first “suspect.” We knew that using a commercial Java profiler, such as Yourkit, JProfiler and Java Flight Recorder, would help easily identify locks and determine how much time threads spend waiting on them. Meanwhile, the team had built custom infrastructure that allows one to run experiments with a profiler attached via a single command-line parameter.

In my own testing, the profiler data indeed revealed some contention particularly related to VersionBucket andHdfsUpdateLog locks, leading to long thread wait time. Although promisingly, this result corresponded somewhat to the description in SOLR-6820, nothing actionable resulted from the experiment.

I like these sorts of case studies because example is the school of mankind.  In this particular case, I really like the methodical approach, using available information to search for a root cause.  Some of the things Michael calls “false starts” I would consider to be initial steps:  checking OS, filesystem, and garbage collection metrics are important even in a case like this in which they did not lead to the culprit, as they help you eliminate suspects.

Related Posts

Range Locks On Multi-Table Indexed Views

Erik Darling looks at the kinds of locks taken when updating an indexed view: So what causes Range Locks? Just ask Sunil. He knows everything (this assumes the serializable isolation level): Equality Predicate If the key value exists, then the range lock is only taken if the index is non-unique. In the non-unique index case, the […]

Read More

Visualizing Deadlocks In SQL Sentry & Plan Explorer

Aaron Bertrand shows off new functionality in SQL Sentry and SentryOne Plan Explorer around deadlock visualization: There’s a lot going on there, but much of it is noise. There is a whole bunch of contention on the table SqlPerf.Session — session 342 is trying to perform an update, but it is stuck waiting on shared locks taken […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031