CDV ❯ client-reconnect-window not honored when the Active is re-elected in a Networked A/P configuration.
-
Bug
-
Status: Closed
-
0 Showstopper
-
Resolution: As Designed
-
Failover
-
-
serverteam
-
Reporter: siyer
-
December 19, 2007
-
0
-
Watchers: 0
-
July 27, 2012
-
January 09, 2008
Description
There seems to be a corner case in which the client-reconnect-window (default of 2 minutes) is not honored in the Networked A/P configuration.
See http://forums.terracotta.org/forums/posts/list/667.page
This snippet from the Terracotta Server log either reveals the bug or is at best confusing:
1> It shows the election being won by this server at 14:48 (so the clock should start ticking then), and the expectation is that the client-reconnect-window closes at 14:50.
2> Instead, it closes at 14:54. Is this because a lot more work happens before the server is actually Active, so that the timer only started at 14:52?
2007-12-04 14:48:07,338 [WorkerThread(l2_state_message_handler_stage,0)] ERROR com.tc.l2.state.StateManagerImpl - State[ ACTIVE-COORDINATOR ] Received Election Won Msg : L2StateMessage [ NodeID[tcp://iadadobcnapp01s.ood.ops:9530], type = ELECTION_WON, Enrollment [ NodeID[tcp://iadadobcnapp01s.ood.ops:9530], isNew = false, weights = 9223372036854775807,9223372036854775807 ]]. Possible split brain detected
… … …
2007-12-04 14:54:27,510 [Reconnect timer] INFO com.tc.objectserver.handshakemanager.ServerClientHandshakeManager - Reconnect window closed. All dead clients removed.
Comments
Saravanan Subbiah 2008-01-09
Reconnect window closes in 2 minutes.
The reconnect window did in fact close in 2 minutes. If you look closely at the logs, there are these lines:
2007-12-04 14:47:45,599 [WorkerThread(group_events_dispatch_stage,0)] INFO com.terracottatech.console - Becoming State[ ACTIVE-COORDINATOR ]
…
2007-12-04 14:47:48,650 [WorkerThread(l2_state_change_stage,0)] INFO com.terracottatech.console - Terracotta Server has started up as ACTIVE node on port 9510 successfully, and is now ready for work.
…
2007-12-04 14:49:48,626 [Reconnect timer] INFO com.tc.objectserver.handshakemanager.ServerClientHandshakeManager - Reconnect window closing. Killing any previously connected clients that failed to connect in time: [ChannelID=[59], ChannelID=[58], ChannelID=[60]]
The cleanup completes at a later point in time. My guess is that, because of the network glitch, the cleanup took longer to complete.
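The timing argument can be checked directly from the timestamps quoted in this ticket. The following is a minimal sketch in Python, not part of Terracotta; the helper name `elapsed_seconds` is hypothetical, and treating the "ready for work" line as the moment the reconnect timer starts is an assumption made here for illustration.

```python
from datetime import datetime

# Timestamp layout used by the server log lines quoted in this ticket.
FMT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Hypothetical helper: seconds between two log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Timestamps copied from the log excerpts above.
becoming_active = "2007-12-04 14:47:45,599"  # "Becoming State[ ACTIVE-COORDINATOR ]"
ready_for_work = "2007-12-04 14:47:48,650"   # "started up as ACTIVE node ... ready for work"
window_closing = "2007-12-04 14:49:48,626"   # "Reconnect window closing."
window_closed = "2007-12-04 14:54:27,510"    # "Reconnect window closed."

# Assumption: the timer starts at (or near) the "ready for work" line.
window_from_ready = elapsed_seconds(ready_for_work, window_closing)
window_from_becoming = elapsed_seconds(becoming_active, window_closing)
cleanup = elapsed_seconds(window_closing, window_closed)

print(f"ready -> closing:    {window_from_ready:.3f} s")
print(f"becoming -> closing: {window_from_becoming:.3f} s")
print(f"closing -> closed:   {cleanup:.3f} s")
```

Under that assumption the window itself lasts about 120 seconds, matching the 2-minute default, while the gap between "closing" and "closed" is roughly 4 minutes 39 seconds, which is the cleanup delay described in the comment.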