• Bug
  • Status: Resolved
  • 1 Critical
  • Resolution: Fixed
  • ehcache-terracotta
  • mscott
  • Reporter: gadams00
  • April 23, 2013
  • 0
  • Watchers: 10
  • June 06, 2013
  • April 25, 2013

Attachments

Description

I’m using BigMemory Max 4.0.0 with a free license with the client in a tomcat web application. Both the TSA and tomcat client are running on CentOS 6.2 on the Oracle 64-bit server VM version 1.7.0_21. I’ve enabled restartability in my tc-config.xml. After I start bigmemory and tomcat and do a cache put, then stop tomcat, stop bigmemory, start bigmemory, start tomcat, and try to do a cache get, I receive the following error:

net.sf.ehcache.CacheException: Uncaught exception in get() - java.io.StreamCorruptedException: invalid type code: 00
        at org.terracotta.modules.ehcache.store.ClusteredSafeStore.get(ClusteredSafeStore.java:203)
        at org.terracotta.modules.ehcache.store.nonstop.NonStopStoreWrapper.get(NonStopStoreWrapper.java:517)
        at net.sf.ehcache.Cache.get(Cache.java:1634)
...

This is a repeatable process - I get the error every time I follow the steps above. I should mention that both servers are VMWare virtual servers, since that might be relevant.

I’ve attached all relevant logs and config files.

Comments

Yi Zhang 2013-04-23

Hi Greg is it possible for you to attach any relevant source code? In particular, the object you’re storing in the cache and the put/get logic. This will help us reproduce your issue. Additionally, can you confirm that the put/get is fine if you do not restart and/or only a restart of Tomcat? Thanks.

Greg Adams 2013-04-24

I can’t give you the full source code because of IP rules here at work, but I can serialize the object that is giving us fits and give you a simple WAR project that does cache gets and puts using the serialized object.

I’ve attached ehc-1022.zip, which contains the following:

  1. ehc-1022/* - maven war project that creates the simple web application that reproduces the issue
  2. tc-client-config.xml - terracotta client config file that specifies the big memory TSA server to connect to ; more on that later.
  3. tc-config.xml - terracotta server config file I’ve been using

To reproduce in full, you’ll need 2 servers; one for the TSA and one for tomcat. Install latest tomcat 7 on one, and make sure you start tomcat with a JVM argument like this:
-Dterracotta.config.location=/srv/tomcat/idstore/conf/tc-client-config.xml

with the path to the tc-client-config from the zip. Also, edit the tc-client-config.xml used on your tomcat server to reference your TSA. This externalizes the TSA server information from the WAR.

Install BigMemory Max 4.0.0 on the TSA server. Place the tc-config.xml from the zip in the server/bin directory. You’ll probably need to edit it to change paths to data, logs, etc.

Now, run mvn clean package in ehc-1022, which will create target/ehc-1022.war. Deploy this to the tomcat server. In a browser, go to http://tc_server:tc_port/ehc-1022/productComposite. The servlet mapped to this url tries to get the product composite object via ehcache and if present, outputs a message indicating a successful get. If not present, it loads the product composite object via ObjectInputStream from a file included in the war, puts it in ehcache and outputs a message indicating a put.

When I initially deploy in tomcat and do the put followed by repeated gets, everything is fine. Even restarting tomcat, everything is fine. As soon as I restart bigmemory max, however, I get the exception mentioned about the corrupt stream on the get, no matter how many times I try. Even clearing the productComposite cache doesn’t get rid of the error (successful put followed by failed gets result). Even clearing the productComposite cache and restarting BigMemory doesn’t get rid of the error (successful put followed by failed gets result). The only thing that clears the error is stopping bigmemory, emptying the data directory, and starting bigmemory.

Tim Eck 2013-04-25

thanks for the test case – it makes a *huge* difference in getting a resolution!

I can confirm it reproduces the exception

Tim Eck 2013-04-25

Myron – FWIW, I suspect you could take this code out of tomcat and get the same exception in a standalone program

Myron Scott 2013-04-25

Deserialization of serialized entries assumed that one read operation would would suffice. That is not the case for large objects, as buffering in the layered streams may require several read operations to completely pull the required bytes into the allocated array.

Saravanan Subbiah 2013-04-26

Should this be fixed in Go too ?

Greg Adams 2013-04-26

Thanks for fixing this issue. Do you have any idea what about the ProductComposite object causes this issue? I’ve tested with other simple objects and the issues does not occur with those, which leads me to believe it’s either just related to the size of the ProductComposite objects we’re using, or possibly due to the types used within ProductComposite or it’s contained objects.

Also, can you tell me when this fix will be available?

Myron Scott 2013-04-26

This issue does not affect Go. It was an error in the way the ManagedObjectState of ClusteredObjects was reconstituted.

Should I respond openly to Greg Adam’s question?

Myron Scott 2013-05-07

A combination of size and serialization pattern of ProductComposite exposed an issue in stream processing on fetches from the cluster.