CDV ❯ Cluster info queries that time out return empty results, which makes the timeout error condition undetectable
-
Bug
-
Status: Open
-
2 Major
-
Resolution:
-
DSO:L1
-
-
cdennis
-
Reporter: cdennis
-
November 13, 2009
-
0
-
Watchers: 1
-
October 11, 2011
-
Description
When a cluster info query times out due to a lack of server response, we currently return an empty collection. This means that the timeout error condition is indistinguishable from a normal but empty return. At a minimum, I would like to change this to throw a TimeoutException. We could also think about adding methods that take the timeout as an argument.
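A minimal sketch of the proposed behaviour, for illustration only. The class name, the default timeout value, and the use of a CompletableFuture to stand in for the server response are all assumptions and are not taken from the Terracotta code base. It shows a query that throws java.util.concurrent.TimeoutException instead of returning an empty collection, plus an overload that takes the timeout as an argument.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: names and the default timeout are assumptions.
class ClusterInfoQuerySketch {
    // Assumed default; the real timeout value is not given in this issue.
    static final long DEFAULT_TIMEOUT_MILLIS = 100;

    // Variant using the default timeout.
    static Set<String> getNodesWithObject(CompletableFuture<Set<String>> serverResponse)
            throws TimeoutException, InterruptedException, ExecutionException {
        return getNodesWithObject(serverResponse, DEFAULT_TIMEOUT_MILLIS, TimeUnit.MILLISECONDS);
    }

    // Variant taking the timeout as an argument.
    // Future.get(long, TimeUnit) throws TimeoutException when no response
    // arrives in time, so a timeout can no longer be mistaken for an empty result.
    static Set<String> getNodesWithObject(CompletableFuture<Set<String>> serverResponse,
                                          long timeout, TimeUnit unit)
            throws TimeoutException, InterruptedException, ExecutionException {
        return serverResponse.get(timeout, unit);
    }
}
```

With this shape, a caller that receives an empty set knows the server answered, while a missing answer surfaces as an exception it has to handle.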
Comments
Tim Eck 2009-11-16
Steve Harris 2010-08-11
If this is an issue for the toolkit and/or Ehcache, maybe we should fix it? Is it?
Fiona OShea 2010-08-11
Do you have more information on this?
Chris Dennis 2010-08-12
From what I can see, this is still true for the cluster info code in the toolkit. However, changing this now would, I believe, be a non-backwards-compatible API change (adding a checked exception to the relevant toolkit methods), so I don’t think we can fix it now. Probably the best thing to do is to put this in a “we would like to fix this, but it requires going to API 2.0” bucket, and wait until the bucket is either really full or a PM request forces us to move to 2.0 anyway.
The affected methods are:
DsoClusterImpl.getNodesWithObject(Object)
DsoClusterImpl.getNodesWithObjects(Object...)
DsoClusterImpl.getNodesWithObjects(Collection<?>)
These two methods are complex: if the remote request times out, it returns an empty collection, but the local client still merges in its own knowledge, so it looks like the object is only local here.
DsoClusterImpl.getKeysForOrphanedValues(TCMap map)
This will just return an empty Set if the operation times out.
These then translate straight into the toolkit equivalents.
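To illustrate why the merge makes the getNodesWithObject(s) case harder to spot, here is a rough sketch. The class, the method names, and the boolean timeout flag are hypothetical and do not reflect the actual DsoClusterImpl internals. It assumes the object is held by the local node, and compares a lenient merge (the behaviour described above) with a strict one that fails on timeout.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: not the real DsoClusterImpl logic.
class LocalMergeSketch {
    // Models the described behaviour: a timed-out remote request contributes
    // nothing, and the local node is merged in regardless.
    static Set<String> mergeLenient(Set<String> remoteNodes, boolean remoteTimedOut, String localNode) {
        Set<String> merged = new HashSet<>();
        if (!remoteTimedOut) {
            merged.addAll(remoteNodes);
        }
        merged.add(localNode); // assumes the object is held locally
        return merged;
    }

    // Alternative: refuse to merge when the remote half of the answer is missing.
    static Set<String> mergeStrict(Set<String> remoteNodes, boolean remoteTimedOut, String localNode)
            throws TimeoutException {
        if (remoteTimedOut) {
            throw new TimeoutException("remote cluster info query timed out");
        }
        return mergeLenient(remoteNodes, false, localNode);
    }
}
```

In the lenient version, a timeout and a genuine "only this node has the object" answer produce the same single-element set, which is the ambiguity described above.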
Chris Dennis 2011-04-25
My recent work on DEV-4460 has revealed that these timeouts actually trigger null returns internally. Some, but not all, of the callers correctly handle this null return…
I think I’m leaning towards an exception on timeout, but I’d have to run through the code a bit more and look at the existing callers of the methods. One saving grace is that the timeout is logged.
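One possible way to handle the internal null consistently, sketched under assumptions: the helper name and its placement are hypothetical, and the real call sites have not been checked. The idea is to convert a null response into an exception at a single point, so individual callers do not each need their own null check.

```java
import java.util.concurrent.TimeoutException;

// Hypothetical helper: not part of the existing code base.
class NullResponseGuardSketch {
    // Treats a null internal response as a timeout and raises it explicitly.
    static <T> T requireResponse(T response, String queryName) throws TimeoutException {
        if (response == null) {
            throw new TimeoutException(queryName + " timed out waiting for the server");
        }
        return response;
    }
}
```

This only works if null is never a legitimate non-timeout response, which would need confirming against the existing callers.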