CDV ❯ Tomcat JMX Vector with Spring QuartzScheduler causes intermittent deadlock (complete server failure)

Type: Bug
Status: Open
Priority: 2 Major
Resolution:
Component: Sessions
Reporter: eellis00
Created: February 13, 2009
Watchers: 2
Updated: April 17, 2009

Attachments
Description
Please see attached thread dump and screen shot.
I am aware that Terracotta may not be the cause of the deadlock, and that Terracotta always shows up in thread dumps whether it is at fault or not.
Thread Name: DefaultQuartzScheduler_Worker-9
State: Deadlock / Waiting on monitor
Owns Monitor Lock on: 0x00002aab1164cfe8
Waiting for Monitor Lock on: 0x00002aaaf12d2038
Java Stack:
    at java.util.Vector.size(Vector.java:268)
    - waiting to lock [0x00002aaaf12d2038] (a java.util.Stack)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.getCurrentThreadsBusy(PoolTcpEndpoint.java:280)
    at sun.reflect.GeneratedMethodAccessor5291.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.commons.modeler.BaseModelMBean.getAttribute(BaseModelMBean.java:346)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
    at org.jstripe.tomcat.probe.tools.JmxTools.getIntAttr(JmxTools.java:47)
    at org.jstripe.tomcat.probe.beans.ContainerListenerBean.getThreadPools(ContainerListenerBean.java:153)
    - locked [0x00002aab1164cfe8] (a org.jstripe.tomcat.probe.beans.ContainerListenerBean)
    at org.jstripe.tomcat.probe.beans.stats.collectors.ConnectorStatsCollectorBean.collect(ConnectorStatsCollectorBean.java:33)
    at sun.reflect.GeneratedMethodAccessor5282.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:248)
    at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:165)
    at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:90)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:195)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
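For reference, a monitor deadlock like the one in the dump can also be confirmed from inside the affected JVM with java.lang.management. The sketch below is illustrative only (the DeadlockCheck class name is invented, and it assumes it is run in the same JVM, e.g. from a diagnostic servlet or a scheduled job):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative helper: reports threads deadlocked on object monitors
// (synchronized blocks), which is what the dump above shows for the
// java.util.Stack and ContainerListenerBean locks.
public class DeadlockCheck {
    public static void report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findMonitorDeadlockedThreads();
        if (ids == null) {
            System.out.println("no monitor deadlock detected");
            return;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids, Integer.MAX_VALUE)) {
            System.out.println(info.getThreadName() + " blocked on " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
    }
}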
Comments
Fiona OShea 2009-02-17
Fiona OShea 2009-03-05
This is not a tested configuration.
Eric Ellis 2009-04-16
We have had four server crashes today, all caused by the Vector.size() deadlock. Please look into this and take it seriously.
The stack trace above shows the deadlocked thread. Below is the stack shared by every blocked request thread:
    at java.util.Vector.__tc_addElement(Vector.java:572)
    - waiting to lock [0x00002aaaf1881c80] (a java.util.Stack)
    at java.util.Vector.addElement(Vector.java)
    at java.util.Stack.push(Stack.java:50)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.recycleWorkerThread(PoolTcpEndpoint.java:608)
    at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:115)
    at java.lang.Thread.run(Thread.java:619)
Looking at these stack traces, it appears that this is a bug in the Terracotta instrumentation of Tomcat. We are running Tomcat version 5.5.
Eric Ellis 2009-04-16
Correction: here are some of the other stack traces of blocked request threads:
    at java.util.Vector.size(Vector.java:268)
    - waiting to lock [0x00002aaaf1881c80] (a java.util.Stack)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.getCurrentThreadsBusy(PoolTcpEndpoint.java:280)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:796)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
    at java.lang.Thread.run(Thread.java:619)
    at org.jstripe.tomcat.probe.beans.ContainerListenerBean.handleNotification(ContainerListenerBean.java:65)
    - waiting to lock [0x00002aab4d5c25a8] (a org.jstripe.tomcat.probe.beans.ContainerListenerBean)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor$ListenerWrapper.handleNotification(DefaultMBeanServerInterceptor.java:1732)
    at javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:257)
    at javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:322)
    at javax.management.NotificationBroadcasterSupport$1.execute(NotificationBroadcasterSupport.java:307)
    at javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:229)
    at javax.management.MBeanServerDelegate.sendNotification(MBeanServerDelegate.java:193)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.sendNotification(DefaultMBeanServerInterceptor.java:1524)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:448)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:403)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:506)
    at org.apache.commons.modeler.Registry.unregisterComponent(Registry.java:643)
    at org.apache.coyote.http11.Http11Protocol$MXPoolListener.threadEnd(Http11Protocol.java:109)
    at org.apache.tomcat.util.threads.ThreadPool.removeThread(ThreadPool.java:276)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:728)
    at java.lang.Thread.run(Thread.java:619)
Alex Miller 2009-04-17
Deadlock in the dump is below. The http thread sits waiting for incoming TCP connections and hands them off. It synchronizes on the Stack and then spins off a worker thread; starting that thread makes a JMX call to register an MBean, which triggers notification handling that ultimately tries to lock the ContainerListenerBean. The Quartz scheduler thread is invoking a job that collects stats via the same listener bean: it locks the bean, then looks up stats in the TCP endpoint and tries to lock the same collection held by the first thread.
This looks like a deadlock between Tomcat and Quartz to me. I don’t see how Terracotta is involved at all?
"http-8081":
    at org.jstripe.tomcat.probe.beans.ContainerListenerBean.handleNotification(ContainerListenerBean.java:65)
    - waiting to lock <0x00002aab1164cfe8> (a org.jstripe.tomcat.probe.beans.ContainerListenerBean)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor$ListenerWrapper.handleNotification(DefaultMBeanServerInterceptor.java:1732)
    at javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:257)
    at javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:322)
    at javax.management.NotificationBroadcasterSupport$1.execute(NotificationBroadcasterSupport.java:307)
    at javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:229)
    at javax.management.MBeanServerDelegate.sendNotification(MBeanServerDelegate.java:193)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.sendNotification(DefaultMBeanServerInterceptor.java:1524)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1501)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:963)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:917)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:482)
    at org.apache.commons.modeler.Registry.registerComponent(Registry.java:871)
    at org.apache.coyote.http11.Http11Protocol$JmxHttp11ConnectionHandler.init(Http11Protocol.java:147)
    at org.apache.tomcat.util.net.MasterSlaveWorkerThread.start(MasterSlaveWorkerThread.java:131)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.newWorkerThread(PoolTcpEndpoint.java:595)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.createWorkerThread(PoolTcpEndpoint.java:574)
    - locked <0x00002aaaf12d2038> (a java.util.Stack)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.run(PoolTcpEndpoint.java:631)
    at java.lang.Thread.run(Thread.java:619)

"DefaultQuartzScheduler_Worker-9":
    at java.util.Vector.size(Vector.java:268)
    - waiting to lock <0x00002aaaf12d2038> (a java.util.Stack)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.getCurrentThreadsBusy(PoolTcpEndpoint.java:280)
    at sun.reflect.GeneratedMethodAccessor5291.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.commons.modeler.BaseModelMBean.getAttribute(BaseModelMBean.java:346)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
    at org.jstripe.tomcat.probe.tools.JmxTools.getIntAttr(JmxTools.java:47)
    at org.jstripe.tomcat.probe.beans.ContainerListenerBean.getThreadPools(ContainerListenerBean.java:153)
    - locked <0x00002aab1164cfe8> (a org.jstripe.tomcat.probe.beans.ContainerListenerBean)
    at org.jstripe.tomcat.probe.beans.stats.collectors.ConnectorStatsCollectorBean.collect(ConnectorStatsCollectorBean.java:33)
    at sun.reflect.GeneratedMethodAccessor5282.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:248)
    at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:165)
    at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:90)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:195)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
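To make the lock ordering explicit, here is a minimal, hypothetical reduction of the two stacks above. The class, field, and method names are invented; only the order in which the two monitors are taken mirrors the dump:

import java.util.Stack;

// Hypothetical reduction: two threads take the same two monitors in opposite
// order, so each ends up holding one lock while waiting forever for the other.
public class LockOrderingDemo {
    private final Stack<Object> workerPool = new Stack<Object>(); // stands in for PoolTcpEndpoint's worker Stack
    private final Object listenerBean = new Object();             // stands in for ContainerListenerBean

    // "http-8081" path: createWorkerThread() locks the Stack, then the JMX
    // registration notification tries to lock the listener bean in handleNotification().
    void acceptConnection() {
        synchronized (workerPool) {
            synchronized (listenerBean) {
                // register worker MBean, deliver notification ...
            }
        }
    }

    // Quartz worker path: getThreadPools() locks the listener bean, then the JMX
    // attribute read reaches getCurrentThreadsBusy(), i.e. Vector.size(), which
    // needs the Stack's monitor.
    void collectStats() {
        synchronized (listenerBean) {
            synchronized (workerPool) {
                int busy = workerPool.size();
            }
        }
    }
}

The usual fix for this pattern is to have both paths acquire the two monitors in the same order, or to avoid holding one while acquiring the other (for example, reading the pool size outside the listener bean's lock).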
Is this the kind of thing that will be solved by the new Unlocked Sessions?