• Bug
  • Status: Closed
  • 1 Critical
  • Resolution: Fixed
  • hhuynh
  • Reporter: tgautier
  • April 17, 2008
  • 0
  • Watchers: 0
  • May 12, 2008
  • April 29, 2008

Attachments

Description

java.lang.AssertionError: Assert Failed : [email protected][ServerThreadID{ClientID[0],ThreadID=[1]}](HELD-LOCKS={[]}, WAITING-ON={ LockID(@1003), Level: WRITE (2)
Holders (1)
[email protected][ClientID[1],ThreadID=[-9223372036854775808],level=WRITE (2),timeout=120000]
Wait Set (0)
Pending lock requests (1)
[email protected][ClientID[0],ThreadID=[1],level=READ (1)]
}) : old = LockID(@1003), Level: WRITE (2)
Holders (1)
[email protected][ClientID[1],ThreadID=[-9223372036854775808],level=WRITE (2),timeout=120000]
Wait Set (0)
Pending lock requests (1)
[email protected][ClientID[0],ThreadID=[1],level=READ (1)]
new = LockID(@1003), Level: WRITE (2)
Holders (1)
[email protected][ClientID[1],ThreadID=[-9223372036854775808],level=WRITE (2),timeout=120000]
Wait Set (0)
Pending lock requests (1)
[email protected][ClientID[0],ThreadID=[1],level=READ (1)]

    at com.tc.objectserver.lockmanager.impl.ServerThreadContext.setWaitingOn(ServerThreadContext.java:82)
    at com.tc.objectserver.lockmanager.impl.Lock.addPending(Lock.java:357)
    at com.tc.objectserver.lockmanager.impl.Lock.addPendingTryLockRequest(Lock.java:328)
    at com.tc.objectserver.lockmanager.impl.Lock.queueRequest(Lock.java:291)
    at com.tc.objectserver.lockmanager.impl.Lock.requestLock(Lock.java:252)
    at com.tc.objectserver.lockmanager.impl.Lock.tryRequestLock(Lock.java:201)
    at com.tc.objectserver.lockmanager.impl.LockManagerImpl.basicRequestLock(LockManagerImpl.java:196)
    at com.tc.objectserver.lockmanager.impl.LockManagerImpl.requestLock(LockManagerImpl.java:180)
    at com.tc.objectserver.lockmanager.impl.LockManagerImpl.tryRequestLock(LockManagerImpl.java:168)
    at com.tc.objectserver.handler.RequestLockUnLockHandler.handleEvent(RequestLockUnLockHandler.java:39)
    at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:142)

Attached is a repro case - download to local directory.

$ javac Main.java
$ start-tc-server.sh
$ dso-java Main
$ dso-java Main

On the second dso-java invocation, the server hits the assertion shown above.

Comments

Steve Harris 2008-04-17

Now this is definitely a bug :-)

Taylor Gautier 2008-04-17

Just FYI using the tryLock() method with no timeouts works exactly as expected.
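For readers unfamiliar with the two variants being contrasted, the sketch below shows the untimed and timed forms of tryLock using the standard java.util.concurrent API. It is not the attached Main.java and does not involve Terracotta DSO; the class name TryLockSketch and the helper attemptWhileWriteHeld are made up for illustration. It only shows how each call behaves when another thread already holds the write lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only: plain JDK locks, no Terracotta involved.
public class TryLockSketch {

    /**
     * Tries to take the read lock while another thread holds the write lock.
     * Returns {untimedResult, timedResult}.
     */
    static boolean[] attemptWhileWriteHeld(long timeoutMillis) throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            lock.writeLock().lock();
            try {
                held.countDown();   // signal that the write lock is now held
                release.await();    // keep holding it until told to let go
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.writeLock().unlock();
            }
        });
        writer.start();
        held.await();

        // Untimed variant: returns immediately.
        boolean untimed = lock.readLock().tryLock();
        // Timed variant: waits up to the timeout before giving up.
        boolean timed = lock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);

        release.countDown();
        writer.join();
        return new boolean[] { untimed, timed };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = attemptWhileWriteHeld(100);
        System.out.println("untimed=" + r[0] + " timed=" + r[1]);
    }
}
```

Both attempts fail here because the write lock is never released in time. The difference relevant to this ticket is that only the timed form leaves a pending request waiting until a timer expires.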

Geert Bevin 2008-04-21

Did you forget to attach the repro case?

Taylor Gautier 2008-04-21

See DEV-1562 for repro case.

Taylor Gautier 2008-04-21

To repro, use the code from DEV-1562.

$ javac Main.java
$ start-tc-server.sh
$ dso-java.sh Main
$ dso-java.sh Main

Antonio Si 2008-04-25

There are actually 2 issues:

  1. Geert found that the condition of the assertion is not quite correct. Here is the comment from Geert:

    The current condition to throw the assertion is !(this.waitingOn == null || !this.waitingOn.equals(lock)); it should instead be (this.waitingOn != null && !this.waitingOn.equals(lock)).

  2. When a tryLock timer expires, we send out a cannotAward message, but we do not clean up the ServerThreadContext. That’s why the assertion is thrown on the same Lock. I have attached a patch for this issue.
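The two conditions quoted in item 1 can be compared side by side. By De Morgan's law the current condition is equivalent to (waitingOn != null && waitingOn.equals(lock)), so it fires when the thread is already waiting on the same lock, which matches the dump above where old and new are both LockID(@1003). The sketch below is a standalone illustration, not the Terracotta source; the class name, the method names, and LockID(@1004) are made up.

```java
// Hypothetical illustration only: evaluates both conditions from the ticket
// on plain objects instead of the real ServerThreadContext and Lock classes.
public class AssertConditionSketch {

    // Condition as quoted in the ticket ("current").
    static boolean currentCondition(Object waitingOn, Object lock) {
        return !(waitingOn == null || !waitingOn.equals(lock));
    }

    // Condition proposed by Geert.
    static boolean proposedCondition(Object waitingOn, Object lock) {
        return waitingOn != null && !waitingOn.equals(lock);
    }

    public static void main(String[] args) {
        String lock = "LockID(@1003)";
        String other = "LockID(@1004)"; // made-up second lock for comparison
        Object[] cases = { null, lock, other };
        for (Object waitingOn : cases) {
            System.out.println("waitingOn=" + waitingOn
                    + " current=" + currentCondition(waitingOn, lock)
                    + " proposed=" + proposedCondition(waitingOn, lock));
        }
    }
}
```

With waitingOn equal to the requested lock, only the current condition is true; with a different lock, only the proposed one is; with null, neither is.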

Geert Bevin 2008-04-28

OK, I read the one-line patch from Antonio and looked at the relevant code, and I’m wondering whether it’s correct. Everything else in the L2 Lock class uses the local clearWaitingOn method, which besides clearing the waitingOn field of the ServerThreadContext also removes the context from the ServerThreadContextFactory if it’s clear (not waiting on anything and no locks held). It seems to me that the local method should be used instead of the clearWaitingOn method on the context instance. Any thoughts?

(note, I’m very careful about all this since I’m really diving into the locking code with little knowledge about the specifics of the implementation)
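The difference between the two clearWaitingOn calls described above can be sketched roughly as follows. This is a simplified model, not the Terracotta implementation: ThreadContext, ContextFactory, LockSketch, and contextsLeftAfterTimeout are hypothetical stand-ins, and the fields and signatures are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the cleanup question raised in the comment above.
public class ClearWaitingOnSketch {

    static class ThreadContext {
        final String id;
        Object waitingOn;
        int heldLocks;

        ThreadContext(String id) { this.id = id; }

        // Context-level method: only resets the field.
        void clearWaitingOn() { waitingOn = null; }

        boolean isClear() { return waitingOn == null && heldLocks == 0; }
    }

    static class ContextFactory {
        private final Map<String, ThreadContext> contexts = new HashMap<>();

        ThreadContext getOrCreate(String id) {
            return contexts.computeIfAbsent(id, ThreadContext::new);
        }

        void remove(ThreadContext ctx) { contexts.remove(ctx.id); }

        int size() { return contexts.size(); }
    }

    static class LockSketch {
        private final ContextFactory factory;

        LockSketch(ContextFactory factory) { this.factory = factory; }

        // Lock-local method: resets the field and drops the context once it is clear.
        void clearWaitingOn(ThreadContext ctx) {
            ctx.clearWaitingOn();
            if (ctx.isClear()) {
                factory.remove(ctx);
            }
        }
    }

    /** Simulates a timed-out request and returns how many contexts remain registered. */
    static int contextsLeftAfterTimeout(boolean useLocalMethod) {
        ContextFactory factory = new ContextFactory();
        LockSketch lock = new LockSketch(factory);
        ThreadContext ctx = factory.getOrCreate("ClientID[0],ThreadID=[1]");
        ctx.waitingOn = "LockID(@1003)";

        if (useLocalMethod) {
            lock.clearWaitingOn(ctx);
        } else {
            ctx.clearWaitingOn();
        }
        return factory.size();
    }

    public static void main(String[] args) {
        System.out.println("context method only: " + contextsLeftAfterTimeout(false));
        System.out.println("lock-local method:   " + contextsLeftAfterTimeout(true));
    }
}
```

In this model, calling the method on the context alone leaves one stale entry in the factory, while the lock-local method leaves none. Whether the real code behaves the same way depends on details not shown in this ticket.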

Fiona OShea 2008-04-29

Geert has a fix ready,