• Documentation
  • Status: Closed
  • 2 Major
  • Resolution: Fixed
  • docs group
  • Reporter: ari
  • March 14, 2010
  • 0
  • Watchers: 1
  • July 15, 2010
  • March 23, 2010

Description

I spoke to Igal on Friday about issues users w/ having getting HA wrong. After speaking to Igal, I ended up reading the doc to help a customer in a sales engagement who is screwing up. IMO the doc is incorrect:

  1. It doesn’t mention the reconnect window at all (when an L2 fails and an L1 at the same time).
  2. It doesn’t document ping probes and stuff correctly (although it is a straight cut-and-paste of the .properties file)
  3. It doesn’t document the difference between L2 and L1 reconnect.
  4. It should list the L1 socket connect timeout from the embedded tc.properties l1.max.connect.retries = -1 l1.connect.versionMatchCheck.enabled = true l1.socket.connect.timeout=10000 l1.socket.reconnect.waitInterval=1000

tc.transport.handshake.timeout=10000 tc.config.getFromSource.timeout=30000

  1. and the L2 ones as well: l2.nha.tcgroupcomm.handshake.timeout = 5000

Essentially, the doc is woefully inadequate (pardon the rudeness) and we need to think of a more explicit strategy for how to “keep it simple” yet still keep all the potential settings within easy reach so that we can help people help themselves.

Comments

Saravanan Subbiah 2010-03-16

I want to make these settings much simpler, not have so many things to learn to use it. Will be good if we do that and then change the document to describe that.

ilevy 2010-03-23

Closing all jiras files against the HA chapter as its been redone. File new jiras against new version.