• Bug
  • Status: Closed
  • 0 Showstopper
  • Resolution: As Designed
  • Failover
  • serverteam
  • Reporter: siyer
  • December 19, 2007
  • 0
  • Watchers: 0
  • July 27, 2012
  • January 09, 2008

Description

There seems to be a corner case where the client-reconnect-window (default of 2 minutes) is not honored in the Network A/P configuration.

See http://forums.terracotta.org/forums/posts/list/667.page

This snippet from the Terracotta Server log either reveals the bug or is at best confusing:

1> It shows the election being won by this server at 14:48 (so the clock should start ticking then), and the expectation is that the client-reconnect-window closes at 14:50.
2> Instead, it closes at 14:54. Is this because a lot more work happens before the server is actually Active, so that the timer only started at 14:52?

2007-12-04 14:48:07,338 [WorkerThread(l2_state_message_handler_stage,0)] ERROR com.tc.l2.state.StateManagerImpl - State[ ACTIVE-COORDINATOR ] Received Election Won Msg : L2StateMessage [ NodeID[tcp://iadadobcnapp01s.ood.ops:9530], type = ELECTION_WON, Enrollment [ NodeID[tcp://iadadobcnapp01s.ood.ops:9530], isNew = false, weights = 9223372036854775807,9223372036854775807 ]]. Possible split brain detected
… … …
2007-12-04 14:54:27,510 [Reconnect timer] INFO com.tc.objectserver.handshakemanager.ServerClientHandshakeManager - Reconnect window closed. All dead clients removed.

Comments

Saravanan Subbiah 2008-01-09

The reconnect window did in fact close in 2 minutes. If you look closely at the logs, there are these lines:

2007-12-04 14:47:45,599 [WorkerThread(group_events_dispatch_stage,0)] INFO com.terracottatech.console - Becoming State[ ACTIVE-COORDINATOR ]
…
2007-12-04 14:47:48,650 [WorkerThread(l2_state_change_stage,0)] INFO com.terracottatech.console - Terracotta Server has started up as ACTIVE node on port 9510 successfully, and is now ready for work.
…
2007-12-04 14:49:48,626 [Reconnect timer] INFO com.tc.objectserver.handshakemanager.ServerClientHandshakeManager - Reconnect window closing. Killing any previously connected clients that failed to connect in time: [ChannelID=[59], ChannelID=[58], ChannelID=[60]]

The cleanup completes at a later point in time. My guess is that, because of the network glitch, the cleanup took longer to complete.
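
The timing can be checked directly from the timestamps quoted in this issue. The following is a minimal sketch, not part of Terracotta: the class name ReconnectWindowCheck and the method elapsedMillis are hypothetical helpers introduced only for illustration, and the timestamp pattern is an assumption based on the log lines shown here. It computes the gap between the "started up as ACTIVE" message and the "Reconnect window closing" message, and, for comparison, the gap between the "Election Won" message and the "Reconnect window closed" message that the reporter measured.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not a Terracotta class.
final class ReconnectWindowCheck {
    // Assumed timestamp layout, taken from the log lines quoted in this issue.
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Returns the elapsed time in milliseconds between two log timestamps.
    static long elapsedMillis(String from, String to) {
        LocalDateTime start = LocalDateTime.parse(from, LOG_TS);
        LocalDateTime end = LocalDateTime.parse(to, LOG_TS);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        // "started up as ACTIVE node" -> "Reconnect window closing"
        long window = elapsedMillis("2007-12-04 14:47:48,650",
                                    "2007-12-04 14:49:48,626");
        // "Received Election Won Msg" -> "Reconnect window closed"
        long reported = elapsedMillis("2007-12-04 14:48:07,338",
                                      "2007-12-04 14:54:27,510");
        System.out.println("ACTIVE to window closing: " + window + " ms");
        System.out.println("Election Won to window closed: " + reported + " ms");
    }
}
```

The first interval is within a few tens of milliseconds of the 120-second default, which matches the explanation above. The second interval is much longer because it runs from a different starting event to the end of the cleanup rather than to the moment the window starts closing.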

Saravanan Subbiah 2008-01-09

Reconnect window closes in 2 minutes.