• Bug
  • Status: Closed
  • 2 Major
  • Resolution: Fixed
  • DSO:L1
  • nbangarw
  • Reporter: jrray
  • April 07, 2011
  • 0
  • Watchers: 8
  • October 02, 2012
  • August 22, 2012

Description

java.util.concurrent.ConcurrentHashMap specifies that entrySet() and values() return an iterator that is “weakly consistent” and are not expected to return null or throw NullPointerException. org.terracotta.collections.ClusteredMap and its implementors do not specify one way or the other what kind of behavior to be expected, so please excuse me if this bug report is reporting expected behavior.

When using the Toolkit via a connected client, requesting a ClusteredMap via getMap(), iterating over the map has two different failure modes in the face of concurrent modification.

Using entrySet().iterator(), iterator.next() will sometimes return null. Using values().iterator(), iterator.next() will sometimes raise NullPointerException.

My test code creates a map and in 8 threads repeatedly calls put() and remove() with random integer keys from 0 to 100. The main thread attempts to iterate over the map and reports errors.

With entrySet().iterator(), the nulls are returned after a random number of calls to next(): {{entry null after 145 entry null after 8 entry null after 243 entry null after 180}}

With values.iterator(), the exception is similarly random: {{NPE after 202 NPE after 26 NPE after 112 NPE after 86}}

Here is the stack for the NPE: {{java.lang.NullPointerException at java.util.AbstractMap$2$1.next(AbstractMap.java:360) at com.terracotta.toolkit.util.AggregateMapIterator.next(AggregateMapIterator.java:49) at com.imageworks.st.config.AppConfig.iterator_demo(AppConfig.scala:172)}}

As a workaround, using getAllEntriesSnapshot() survives the test with no nulls or NPEs.

Here’s my test code (scala): {{ def iterator_demo(t: org.terracotta.api.ClusteringToolkit) = { import scala.actors.Actor

val m: org.terracotta.collections.ClusteredMap[Int, String] = t.getMap("iterator.test2")

val actors = for (i <- 1 to 8) yield {
  val a = new Actor {
    def act = {
      val rand = new java.util.Random
      while (true) {
        m.put(rand.nextInt(100), "1")
        m.remove(rand.nextInt(100))
      }
    }
  }
  a.start
  a
}

var count = 0
while (true) {
  try {
    var i = m.values.iterator
    while ((i ne null) && i.hasNext) {
      val entry = i.next
      if (entry eq null) {
        logger.debug("entry null after " + count)
        count = 0
        i = null
      }
      /*
      else if (entry.getValue eq null) {
        logger.debug("entry.getValue null after " + count)
        count = 0
        i = null
      }
      */
      else {
        count += 1
      }
    }
  }
  catch {
    case e: NullPointerException =>
      logger.debug("NPE after " + count, e)
      count = 0
  }
}   \} \}\}

Comments

Tim Eck 2011-04-08

Thanks. That is pretty clearly a bug. The values() view is expressed via the entrySet() so I think there is probably only one bug here at least.

The problem is easy to reproduce as you’ve shown

Fiona OShea 2011-04-12

Have you looked into fixing this? How hard is it to fix?

J Robert Ray 2011-04-14

My workaround of using getAllEntriesSnapshot is having a problem, raising “java.lang.ClassCastException: com.tc.object.ObjectID cannot be cast to [B” on some calls to getValue() while iterating. This only happens shortly after bringing up a new L1 client. My real values are byte[].

Is this worthy of a separate bug report? I don’t have a small standalone repro example for this one, unfortunately.

Tim Eck 2012-08-03

Is this still a bug in toolkit2? Perhaps we can close if not

Nishant Bangarwa 2012-08-22

Have verified that its not a bug in toolkit2. Its being handled in ServerMapEntrySet. Have also added tests to ensure that its working properly in Rev 20577.

Nishant Bangarwa 2012-08-22

Already Fixed in Trunk.