EHC ❯ Segment.writeLock() was not released, causing a deadlock
Bug
Status: Closed
2 Major
Resolution: Not a Bug
ehcache-core
cdennis
Reporter: sharakan
November 07, 2011
0
Watchers: 2
July 27, 2012
November 07, 2011
Attachments
Description
Two threads apparently got into a deadlock on an Ehcache Segment object. See the attached file for stack traces generated by jstack.
I can tell from other logs that the last operation successfully completed by thread 1694 was some 9 hours before this stack trace was captured. During that time, thread 1696 finished all of its other work but still holds a write lock on a particular Segment, causing 1694 to wait indefinitely. I thought at first this was an error on our part, because we are using explicit locking (acquireWriteLockOnKey, etc.), but I believe the Segment lock is not related to element-level locking.
We’re using a persistent DiskStore with TTI=24 hrs and TTL=48 hrs. At the time of this error the in-memory cache should’ve been full, with up to 10000 elements on disk.
Comments
Chris Dennis 2011-11-07
The locks used by the explicit locking code are the segment locks inside the CompoundStore. The explicit locking code simply hashes the supplied key to the matching segment and then lets you lock that segment directly. The leak could still be in Ehcache, but I would check your explicit locking code first. Pay particular attention to making sure the try/finally blocks around all your explicit locking usage are correctly formed.
Chris Leon 2011-11-07
I’ll definitely review our explicit locking; it is possible that there’s a problem in that area. As for the Segment locks being the ones used by the explicit locking calls, I thought this wasn’t the case based on the comment on this page:
http://ehcache.org/documentation/user-guide/jta#performance
Since we’re using 2.4.4, I thought the explicit locks were at the element level. Am I misunderstanding that documentation?
Chris Leon 2011-11-07
Chris, you are correct. I found the missing finally block. I would’ve found it earlier, but I thought that paragraph in the documentation meant the problem could not possibly be in our code. Thanks!
Chris Dennis 2011-11-07
The documentation you linked to refers to the soft-lock implementation used by the JTA (transactional) caches. Explicit locking still locks the segment locks directly. Glad you sorted out your problem though.
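For anyone landing on this issue later, the try/finally advice in the comments can be sketched roughly as follows. This is a simplified stand-in and not Ehcache's actual code: SegmentLockSketch, updateSafely, and updateLeaky are hypothetical names, and the segment array built on java.util.concurrent's ReentrantReadWriteLock only imitates the "hash the key to a segment, then lock the segment" behaviour described in this thread. The acquireWriteLockOnKey and releaseWriteLockOnKey methods here are local helpers that borrow the Ehcache method names for readability.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: key-level locking implemented on top of shared segment locks.
public class SegmentLockSketch {
    private final ReentrantReadWriteLock[] segments;

    public SegmentLockSketch(int segmentCount) {
        segments = new ReentrantReadWriteLock[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new ReentrantReadWriteLock();
        }
    }

    // Many keys hash to the same segment, so a leaked lock blocks unrelated keys too.
    private ReentrantReadWriteLock segmentFor(Object key) {
        return segments[Math.floorMod(key.hashCode(), segments.length)];
    }

    public void acquireWriteLockOnKey(Object key) {
        segmentFor(key).writeLock().lock();
    }

    public void releaseWriteLockOnKey(Object key) {
        segmentFor(key).writeLock().unlock();
    }

    public boolean isWriteLocked(Object key) {
        return segmentFor(key).isWriteLocked();
    }

    // Correct form: the release sits in finally, so it runs even if work throws.
    public void updateSafely(Object key, Runnable work) {
        acquireWriteLockOnKey(key);
        try {
            work.run();
        } finally {
            releaseWriteLockOnKey(key);
        }
    }

    // Buggy form: if work throws, the release is skipped and the segment stays locked.
    public void updateLeaky(Object key, Runnable work) {
        acquireWriteLockOnKey(key);
        work.run();
        releaseWriteLockOnKey(key);
    }

    public static void main(String[] args) {
        Runnable failing = () -> { throw new IllegalStateException("boom"); };

        // One segment, so every key collides on the same lock.
        SegmentLockSketch leaky = new SegmentLockSketch(1);
        try {
            leaky.updateLeaky("key-a", failing);
        } catch (IllegalStateException expected) {
            // The exception skipped the unlock call.
        }
        System.out.println("leaky: segment still locked = " + leaky.isWriteLocked("key-b"));

        SegmentLockSketch safe = new SegmentLockSketch(1);
        try {
            safe.updateSafely("key-a", failing);
        } catch (IllegalStateException expected) {
            // The finally block already released the lock.
        }
        System.out.println("safe: segment still locked = " + safe.isWriteLocked("key-b"));
    }
}
```

In the leaky case the write lock acquired for "key-a" is still held when "key-b" is checked, which matches the symptom in the description: a thread that has long since finished its work still owns a Segment write lock, and any other thread needing that segment waits indefinitely.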