EHC ❯ putWithWriter throws NullPointerException after cache reconnects with terracotta server
Bug
Status: Closed
1 Critical
Resolution: Cannot Reproduce
ehcache-core, ehcache-terracotta
drb
Reporter: cspros
May 23, 2011
0
Watchers: 3
July 27, 2012
August 04, 2011
Description
If a cache with a cacheWriter has been configured to communicate with Terracotta and has lost/restored connection, calling the putWithWriter method throws a NullPointerException. It appears that once the cache has rejoined the cluster, the cacheWriterManager no longer has a reference to the cache, so the call to putWithWriter(..) fails with the following exception:
Exception in thread "main" net.sf.ehcache.CacheException: net.sf.ehcache.writer.CacheWriterManagerException: java.lang.NullPointerException
at net.sf.ehcache.constructs.nonstop.NonstopExecutorServiceImpl.execute(NonstopExecutorServiceImpl.java:87)
at net.sf.ehcache.constructs.nonstop.store.ExecutorServiceStore.executeWithExecutor(ExecutorServiceStore.java:157)
at net.sf.ehcache.constructs.nonstop.store.ExecutorServiceStore.executeWithExecutor(ExecutorServiceStore.java:126)
at net.sf.ehcache.constructs.nonstop.store.ExecutorServiceStore.putWithWriter(ExecutorServiceStore.java:326)
at net.sf.ehcache.constructs.nonstop.store.NonstopStoreImpl.putWithWriter(NonstopStoreImpl.java:377)
at net.sf.ehcache.Cache.putInternal(Cache.java:1408)
at net.sf.ehcache.Cache.putWithWriter(Cache.java:1366)
at com.prosrm.webtest.TerracottaExample.
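The failure mode suspected in the description can be modelled in plain Java, without Ehcache on the classpath. This is only a sketch of the reporter's hypothesis (a writer manager whose cache reference is dropped on rejoin and never restored), not Ehcache's actual code: `SketchWriterManager`, `RejoinSketch`, `dispose()` and `putAfterRejoin()` are hypothetical names invented for illustration, and a `HashMap` stands in for the cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a cache writer manager; NOT an Ehcache class. */
final class SketchWriterManager {
    // The reference the manager writes through to (assumption: a plain map).
    private Map<String, String> cache;

    void init(Map<String, String> cache) {
        this.cache = cache;
    }

    /** Models a disconnect that drops the cache reference. */
    void dispose() {
        this.cache = null;
    }

    void put(String key, String value) {
        try {
            cache.put(key, value); // throws NullPointerException if the reference was dropped
        } catch (RuntimeException e) {
            // Wrap the cause, similar in shape to the nested exceptions in the trace above.
            throw new IllegalStateException("writer manager failed", e);
        }
    }
}

final class RejoinSketch {
    /** Returns "ok", or the class name of the root cause if the put fails. */
    static String putAfterRejoin(boolean reinitAfterRejoin) {
        Map<String, String> store = new HashMap<>();
        SketchWriterManager manager = new SketchWriterManager();
        manager.init(store);
        manager.put("k1", "v1");      // works while "connected"

        manager.dispose();            // simulated connection loss
        if (reinitAfterRejoin) {
            manager.init(store);      // rejoin restores the reference
        }
        try {
            manager.put("k2", "v2");
            return "ok";
        } catch (IllegalStateException e) {
            return e.getCause().getClass().getName();
        }
    }
}
```

In this model, `RejoinSketch.putAfterRejoin(false)` reports a `java.lang.NullPointerException` as the root cause, while `putAfterRejoin(true)` succeeds. Whether Ehcache behaves this way on rejoin is exactly what this issue leaves unconfirmed.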
Comments
James House 2011-07-27
James House 2011-07-29
I have been unable to reproduce this, and in fact think that I have a test case that proves this works (at least with trunk).
I have made the following assumptions: non-stop cache, time-out long enough to allow for rejoin while put is blocked, write-through mode writer on the cache.
I have created a test that instantiates the cache, does a small number (100) of puts, kills the TC server, then calls put() on the cache. I have verified that the put does block until rejoin occurs, and that the put unblocks and succeeds after the rejoin.
On the assumption that it may be a race condition, I have scripted the test to run over and over, and have not been able to produce a failure, even when running additional heavy processes to try to affect thread timing.
Any assistance with reconstructing your failure scenario would be much appreciated.
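The timing assumption in the test described above (a put that blocks, with a time-out long enough for the rejoin to complete) can be sketched with JDK classes alone. This is a simulation, not the actual test: `BlockingPutSketch`, `runOnce()` and `countFailures()` are hypothetical names, a `CountDownLatch` stands in for the rejoin event, and no Ehcache or Terracotta code is involved.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical simulation of a put that blocks until a rejoin or a time-out. */
final class BlockingPutSketch {

    /** Returns true if the simulated put unblocked because the rejoin happened in time. */
    static boolean runOnce(long rejoinDelayMs, long timeoutMs) {
        CountDownLatch rejoined = new CountDownLatch(1);

        // Simulated cluster: signals the rejoin after a delay.
        Thread cluster = new Thread(() -> {
            try {
                Thread.sleep(rejoinDelayMs);
                rejoined.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        cluster.start();

        boolean completed = false;
        try {
            // The "put" blocks here until the rejoin or until the time-out elapses.
            completed = rejoined.await(timeoutMs, TimeUnit.MILLISECONDS);
            cluster.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed;
    }

    /** Repeats the scenario, as a scripted loop would, and counts the runs that timed out. */
    static int countFailures(int runs, long rejoinDelayMs, long timeoutMs) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            if (!runOnce(rejoinDelayMs, timeoutMs)) {
                failures++;
            }
        }
        return failures;
    }
}
```

With a generous time-out, for example `countFailures(5, 10, 2000)`, every run should complete. A time-out shorter than the rejoin delay, as in `runOnce(200, 20)`, makes the simulated put give up first, which is the configuration the assumptions above are meant to rule out.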
James House 2011-08-04
I am still unable to reproduce this issue after having spent many hours trying to do so (on both 2.4.2 and trunk).
If you can give more details and/or submit a test case that reproduces it we can look further into it.
Is there a test case for reproducing this?
Can you be more specific about the scenario where this occurs? “has lost/restored connection” is a bit vague - are we talking about a cluster rejoin, or was the client application down for a while and then restarted, or?