• Bug
  • Status: Closed
  • 2 Major
  • Resolution: Fixed
  • hhuynh
  • Reporter: tgautier
  • November 19, 2007
  • 0
  • Watchers: 0
  • January 23, 2008
  • November 19, 2007

Attachments

Description

Orion and I are trying some network a/p stuff and we got this error.

It happened after I had my server running which had already elected itself as the active. Then orion started his and I got this exception:

tgautier-laptop:~/terracotta-2.5-stable1/samples tgautier$ ./start-demo-server.sh -n taylor 2007-11-19 15:57:45,160 INFO - Terracotta 2.5-stable1, as of 20071113-181118 (Revision 6296 by cruise@rh4mo0 from 2.5) 2007-11-19 15:57:45,839 INFO - Configuration loaded from the file at ‘/Users/tgautier/terracotta-2.5-stable1/samples/tc-config.xml’. 2007-11-19 15:57:45,878 INFO - Log file: ‘/Users/tgautier/terracotta-2.5-stable1/samples/logs/terracotta-server.log’. 2007-11-19 15:57:46,421 INFO - JMX Server started. Available at URL[service:jmx:jmxmp://localhost:9520] Nov 19, 2007 3:57:47 PM org.apache.catalina.tribes.transport.ReceiverBase bind INFO: Receiver Server Socket bound to:localhost/127.0.0.1:9530 2007-11-19 15:57:52,488 INFO - Becoming State[ ACTIVE-COORDINATOR ] 2007-11-19 15:57:52,522 INFO - Terracotta Server has started up as ACTIVE node on port 9510 successfully, and is now ready for work. 2007-11-19 16:05:55,103 INFO - NodeID[tcp://10.0.0.144:9530] joined the cluster Nov 19, 2007 4:05:55 PM org.apache.catalina.tribes.io.BufferPool getBufferPool INFO: Created a buffer pool with max size:104857600 bytes of type:org.apache.catalina.tribes.io.BufferPool15Impl Nov 19, 2007 4:05:55 PM org.apache.catalina.tribes.transport.nio.ParallelNioSender doLoop WARNING: Member send is failing for:tcp://10.0.0.144:9530 ; Setting to suspect and retrying. Nov 19, 2007 4:05:55 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://10.0.0.144:9530,10.0.0.144,9530, alive=0,id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 }, payload={}, command={}, domain={}, ]] message. Will verify. Nov 19, 2007 4:05:55 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://10.0.0.144:9530,10.0.0.144,9530, alive=0,id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 }, payload={}, command={}, domain={}, ]] 2007-11-19 16:05:55,130 WARN - Requesting node to quit due to the following error NodeID : NodeID[tcp://10.0.0.144:9530] Error Type : COMMUNICATION ERROR Details : Error publishing states to NodeID[tcp://10.0.0.144:9530] : Exception : com.tc.net.groups.GroupException: org.apache.catalina.tribes.ChannelException: Send failed, attempt:2 max:1; Faulty members:tcp://10.0.0.144:9530; at com.tc.net.groups.TribesGroupManager.sendTo(TribesGroupManager.java:468) at com.tc.l2.state.StateManagerImpl.publishActiveState(StateManagerImpl.java:269) at com.tc.l2.ha.L2HACoordinator.nodeJoined(L2HACoordinator.java:256) at com.tc.l2.handler.GroupEventsDispatchHandler.handleEvent(GroupEventsDispatchHandler.java:24) at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:142) Caused by: org.apache.catalina.tribes.ChannelException: Send failed, attempt:2 max:1; Faulty members:tcp://10.0.0.144:9530; at org.apache.catalina.tribes.transport.nio.ParallelNioSender.doLoop(ParallelNioSender.java:172) at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:78) at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48) at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80) at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175) at com.tc.net.groups.TribesGroupManager.sendTo(TribesGroupManager.java:466) … 4 more Caused by: java.net.SocketException: Invalid argument at sun.nio.ch.Net.setIntOption0(Native Method) at sun.nio.ch.Net.setIntOption(Net.java:152) at sun.nio.ch.SocketChannelImpl$1.setInt(SocketChannelImpl.java:372) at sun.nio.ch.SocketOptsImpl.setInt(SocketOptsImpl.java:46) at sun.nio.ch.SocketOptsImpl$IP.typeOfService(SocketOptsImpl.java:249) at sun.nio.ch.OptionAdaptor.setTrafficClass(OptionAdaptor.java:158) at sun.nio.ch.SocketAdaptor.setTrafficClass(SocketAdaptor.java:330) at org.apache.catalina.tribes.transport.nio.NioSender.completeConnect(NioSender.java:147) at org.apache.catalina.tribes.transport.nio.NioSender.process(NioSender.java:89) at org.apache.catalina.tribes.transport.nio.ParallelNioSender.doLoop(ParallelNioSender.java:130) … 16 more

Nov 19, 2007 4:05:55 PM org.apache.catalina.tribes.transport.nio.ParallelNioSender doLoop WARNING: Not retrying send for:tcp://10.0.0.144:9530; Sender is disconnected. Nov 19, 2007 4:05:55 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://10.0.0.144:9530,10.0.0.144,9530, alive=0,id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 }, payload={}, command={}, domain={}, ]] message. Will verify. Nov 19, 2007 4:05:55 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://10.0.0.144:9530,10.0.0.144,9530, alive=0,id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 }, payload={}, command={}, domain={}, ]] 2007-11-19 16:05:55,140 WARN - NodeID[tcp://10.0.0.144:9530] left the cluster

Comments

Taylor Gautier 2007-11-19

We didn’t use the same config file.

Here is mine.

orion 2007-11-19

network-active-passive.xml is the configuration file I was using.

Taylor Gautier 2007-11-19

Orion and I made sure our files are exactly the same, namely the servers section looks like this:

<server host="10.0.0.101" name="taylor">
  <dso>
    <persistence>
      <mode>permanent-store</mode>
    </persistence>
  </dso>
</server>
<server host="10.0.0.144" name="orion">
  <dso>
    <persistence>
      <mode>permanent-store</mode>
    </persistence>
  </dso>
</server>
<ha>
 <mode>networked-active-passive</mode>
 <networked-active-passive>
   <election-time>5</election-time>
   </networked-active-passive>
</ha>

Taylor Gautier 2007-11-19

we are still seeing the same issue…

Taylor Gautier 2007-11-19

and we made sure to clear the data directory

Saravanan Subbiah 2007-11-19

Thanks ! I will ask them to try that !

Filip Hanik wrote:

Yes, the problem can be fixed by: -Djava.net.preferIPv4Stack=true

To the command line Let me know if that doesn’t resolve it Filip

Hi Filip,

How are you ? We made a release with Active-Passive in 2.4 and generally we have been getting good feedbacks, so that is a good news.

Someone recently reported that they were getting this exception in their MAC when running it though.

Caused by: java.net.SocketException: Invalid argument at sun.nio.ch.Net.setIntOption0(Native Method) at sun.nio.ch.Net.setIntOption(Net.java:152) at sun.nio.ch.SocketChannelImpl$1.setInt(SocketChannelImpl.java:372) at sun.nio.ch.SocketOptsImpl.setInt(SocketOptsImpl.java:46) at sun.nio.ch.SocketOptsImpl$IP.typeOfService(SocketOptsImpl.java:249) at sun.nio.ch.OptionAdaptor.setTrafficClass(OptionAdaptor.java:158) at sun.nio.ch.SocketAdaptor.setTrafficClass(SocketAdaptor.java:330) at org.apache.catalina.tribes.transport.nio.NioSender.completeConnect(NioSender.java:147) at org.apache.catalina.tribes.transport.nio.NioSender.process(NioSender.java:89) at org.apache.catalina.tribes.transport.nio.ParallelNioSender.doLoop(ParallelNioSender.java:130) … 26 more

Looking at the code and researching around the internet tells me that setTrafficClass() is not supported in MAC and I see that tribes sets it to

private int soTrafficClass = 0x04 0x08 0x010;

Is there a reason its set to this ? Have you seen this problem before ? Can this be changed ?

thanks, Saravanan

Saravanan Subbiah 2007-11-19

Like explained above, it is a problem with MAC. It can be resolved by starting with

-Djava.net.preferIPv4Stack=true