• Bug
  • Status: New
  • 2 Major
  • Resolution:
  • ehcache-terracotta
  • drb
  • Reporter: nizara
  • September 23, 2014
  • 0
  • Watchers: 2
  • October 09, 2014

Attachments

Description

Hello, We are experiencing a serious problem with TC 3.7.5 , hibernate 2nd level cache (ehcache 2.6.6). We have one of our services keep crashing every one or 2 weeks. We are suspecting TC client ( service A) lost connection and never recovered and this caused Service A to crashed.

NO rejoin and no nonstop cache configured. We are using concurrency strategy READ_WRITE .

Any recommendation / suggestions or know issue when hibernate 2nd cache and TC used together specially with READ-WRITE strategy???

BTW, We think the issue happened when deleting / updating hibernate collections ( delete / update property see our configs below).

Hibernate mapping look like:

A: many-many with B <?xml version=”1.0”?> <!DOCTYPE hibernate-mapping PUBLIC “-//Hibernate/Hibernate Mapping DTD 3.0//EN” “http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd”>

B: many - many with A <?xml version=”1.0”?> <!DOCTYPE hibernate-mapping PUBLIC “-//Hibernate/Hibernate Mapping DTD 3.0//EN” “http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd”>

Property: B and prop are (1 - many) <?xml version=”1.0”?> <!DOCTYPE hibernate-mapping PUBLIC “-//Hibernate/Hibernate Mapping DTD 3.0//EN” “http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd”>

AB : m - m mapping <?xml version=”1.0”?> <!DOCTYPE hibernate-mapping PUBLIC “-//Hibernate/Hibernate Mapping DTD 3.0//EN” “http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd”>

Comments

Louis Jacomet Jacomet 2014-10-01

I would start with looking at logs and understand why the Terracotta client crashed. No reason to look into your usage or mappings at this point.

NIZAR ALSEDDEG 2014-10-01

FYI, It happen when deleting / updating hibernate collections, not sure what’s exactly happening between TC and hibernate 2nd level cache. Also if we disabled TC from ehcache.xml the problem does not occur, that why I think something between TC and hibernate .

Here what has shown in the log:

ERROR 2014-09-17 14:25:10,139 org.hibernate.engine.ActionQueue pool-2-thread-12 - could not release a cache lock org.hibernate.cache.CacheException: net.sf.ehcache.CacheException: Underlying Terracotta connection closed! Can’t act on clustered lock anymore… at net.sf.ehcache.hibernate.regions.EhcacheTransactionalDataRegion.writeLock(EhcacheTransactionalDataRegion.java:204) at net.sf.ehcache.hibernate.strategy.ReadWriteEhcacheEntityRegionAccessStrategy.afterUpdate(ReadWriteEhcacheEntityRegionAccessStrategy.java:92) at net.sf.ehcache.hibernate.nonstop.NonstopAwareEntityRegionAccessStrategy.afterUpdate(NonstopAwareEntityRegionAccessStrategy.java:82) at org.hibernate.action.EntityUpdateAction.afterTransactionCompletion(EntityUpdateAction.java:256) at org.hibernate.engine.ActionQueue.afterTransactionCompletion(ActionQueue.java:198) at org.hibernate.impl.SessionImpl.afterTransactionCompletion(SessionImpl.java:452) at org.springframework.orm.hibernate3.SpringSessionSynchronization.afterCompletion(SpringSessionSynchronization.java:215) at org.springframework.transaction.support.TransactionSynchronizationUtils.invokeAfterCompletion(TransactionSynchronizationUtils.java:157) at org.springframework.transaction.support.AbstractPlatformTransactionManager.invokeAfterCompletion(AbstractPlatformTransactionManager.java:951) at org.springframework.transaction.support.AbstractPlatformTransactionManager.triggerAfterCompletion(AbstractPlatformTransactionManager.java:926) at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:754) at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:678) at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:321) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:116) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171) at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:635) at com.org.x.dao.xDAO\(EnhancerByCGLIB\)b0547a30.saveOrUpdate() at com.org.x.service.XServlet.setYProperty(XServlet.java:253) at com.org.x.service.XServlet.doGet(XServlet.java:60) at javax.servlet.http.HttpServlet.service(HttpServlet.java:690) at javax.servlet.http.HttpServlet.service(HttpServlet.java:803) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:530) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:427) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:931) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:362) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:867) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113) at org.eclipse.jetty.server.Server.handle(Server.java:334) at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:559) at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:992) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:541) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:203) at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:406) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:462) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: net.sf.ehcache.CacheException: Underlying Terracotta connection closed! Can't act on clustered lock anymore... at org.terracotta.modules.ehcache.concurrency.TcSync.hideTCNotRunningException(TcSync.java:112) at org.terracotta.modules.ehcache.concurrency.TcSync.lock(TcSync.java:38) at net.sf.ehcache.hibernate.regions.EhcacheTransactionalDataRegion.writeLock(EhcacheTransactionalDataRegion.java:199) ... 37 more

Thanks,

Louis Jacomet Jacomet 2014-10-01

This is your application log. It contains an exception about client code failing because the Terracotta connection is dead. Can you also provide your terracotta client log? It probably has the root cause of why the client crashed.

NIZAR ALSEDDEG 2014-10-01

I have attached the tc client log,

Thanks.

NIZAR ALSEDDEG 2014-10-01

TC client log

Louis Jacomet Jacomet 2014-10-03

{noformat} 2014-09-19 19:41:55,945 [L1_L2:TCComm Main Selector Thread_R (listen 0:0:0:0:0:0:0:0:42332)] WARN com.tc.net.protocol.transport.ClientMessageTransport - ConnectionID(0.436cc15d49144167b76c61487f1609cb.3fac0c22-99a8-4914-a216-06050b958f9f-1488ea59b2e)[]: CLOSE EVENT : com.tc.net.core.TCConnectionImpl@781189897: connected: false, closed: true local=127.0.0.1:37312 remote=127.0.0.1:9510 connect=[Fri Sep 19 16:01:36 UTC 2014] idle=534ms [777429 read, 620280 write]. STATUS : ESTABLISHED {noformat}

The line above indicates that the connection was closed. Could you also attach server logs?

NIZAR ALSEDDEG 2014-10-06

server log

NIZAR ALSEDDEG 2014-10-06

I did look at both logs client and server , but I have not seen any useful info. Please take a look see if you can get some clue.

NIZAR ALSEDDEG 2014-10-09

Hi, Here are both TC server/client logs + service threats dump. It has crashed again today, thx for all the help.

Louis Jacomet Jacomet 2014-10-09

The attached server logs shows lots of connections rejected. Any idea where these are coming from?

And the given client log does not show anything, it connects and stays so till the end of the log.

And the thread dump does not seem to indicate anything specifc.

NIZAR ALSEDDEG 2014-10-09

Hi Louis, I believe the connections rejection in the server from the restart of the platform yesterday, I don’t think they are related to the issue. They were from yesterday and the issue has happened today.