• Bug
  • Status: Resolved
  • 2 Major
  • Resolution: Fixed
  • Communications Layer
  • gkeim
  • Reporter: ecd
  • October 10, 2012
  • 1
  • Watchers: 12
  • November 16, 2012
  • November 15, 2012

Description

Originally described on the terracotta forums [here http://forums.terracotta.org/forums/posts/list/0/7572.page]
  1. Start up terracotta server
 $ tar xzf terracotta-3.7.0.tar.gz 
 $ terracotta-3.7.0/bin/start-tc-server.sh
  1. Make some requests to pull the config at port 9530
 $ for i in $(seq 1000); do curl -s "http://localhost:9530/config" > /dev/null; done
  1. Check terracotta’s open sockets
 $ lsof -i -P | grep $(pgrep java) | grep CLOSE_WAIT | wc -l
 781

It seems that Terracotta’s HTTP server isn’t properly closing its TCP connections. The file descriptor limit can be reached easily, causing Terracotta to die:

 2012-10-08 15:17:02,245 [WorkerThread(dso-http-bridge, 0)] ERROR com.tc.net.protocol.HttpConnectionContext - Exception thrown
 java.net.SocketException: Too many open files
 	at java.net.Socket.createImpl(Socket.java:414)
 	at java.net.Socket.getImpl(Socket.java:477)
 	at java.net.Socket.getSoTimeout(Socket.java:1050)
 	at com.tc.server.TerracottaConnector$SocketWrapper.getSoTimeout(TerracottaConnector.java:156)
 	at org.mortbay.jetty.bio.SocketConnector$Connection.<init>(SocketConnector.java:183)
 	at com.tc.server.TerracottaConnector.handleSocketFromDSO(TerracottaConnector.java:39)
 	at com.tc.server.HttpConnectionHandler.handleEvent(HttpConnectionHandler.java:35)
 	at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:145)

Sometimes the terracotta server becomes unresponsive and the process crashes:

 java.lang.RuntimeException: java.lang.AssertionError: Protocol is 2
 	at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.AssertionError: Protocol is 2
 	at com.tc.net.protocol.ProtocolSwitch.getReadBuffers(ProtocolSwitch.java:139)
 	at com.tc.net.core.TCConnectionImpl.getReadBuffers(TCConnectionImpl.java:794)
 	at com.tc.net.core.TCConnectionImpl.doReadFromBufferInternal(TCConnectionImpl.java:433)
 	at com.tc.net.core.TCConnectionImpl.doReadFromBuffer(TCConnectionImpl.java:306)
 	at com.tc.net.core.TCConnectionImpl.doReadInternal(TCConnectionImpl.java:290)
 	at com.tc.net.core.TCConnectionImpl.doRead(TCConnectionImpl.java:266)
 	at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:624)
 	at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)

The CLOSE_WAIT sockets will never die without the application (terracotta) closing them.

Comments

Karthik Lalithraj 2012-10-10

Comments restricted to Terracotta_Internal

A couple of different things here

a) The TCP Socket issue identified. b) Certain enterprise customers want to entirely disable http access to config files - sometimes considered a security issue.

Saravanan Subbiah 2012-10-10

I think we get more than just the config from the server. Some L1 side config like reconnect settings, version checks etc are also going over HTTP. I think the right thing to do would be is to secure that channel too.

Tim Eck 2012-10-10

Has anyone confirmed if this is a regression?

Fiona OShea 2012-10-10

IS this a regression?

Saravanan Subbiah 2012-10-10

Can we get someone from QA validate the claims and assign it back to serverteam ?

Sandeep Bansal [X] 2012-10-14

Tested on 3.7.1-SNAPSHOT, 3.6.5-SNAPSHOT and 3.6.4.

Terracotta Enterprise 3.7.1-SNAPSHOT, as of 20121014-080232 (Revision 21462-20951) - 992 open sockets Terracotta Enterprise 3.6.5-SNAPSHOT, as of 20121014-080218 (Revision 21459-20949) - 0 open sockets Terracotta Enterprise 3.6.4, as of 20120907-052304 (Revision 19983-20642) - 0 open sockets

So, no open sockets in 3.6.4 and 3.6.5-SNAPSHOT. But 3.7.1 has got this issue.

Sandeep Bansal [X] 2012-10-14

Assigning to server team.

Nishant Bangarwa 2012-10-18

fixed in trunk and 3.7

Gary Keim 2012-11-15

When I run the above steps, I also get the result posted. But if I run this:

gkeim-> lsof | grep “java” | grep “PIPE” | wc -l 2002

This seems to indicate that the 2 Pipes created by the PipeSocket at TCConnectionImpl.detachImpl, are never being released.

Linking to DEV-8438.