CDV ❯ Some additional lock-levels (apart from write, synchronous-write) can give applications along more choices along correctness/latency trade-offs.
-
New Feature
-
Status: Open
-
2 Major
-
Resolution:
-
-
-
prodmgmt
-
Reporter: siyer
-
April 15, 2008
-
1
-
Watchers: 0
-
March 19, 2010
-
Description
Lock–level Write => Terracotta Transaction written to Commit Buffer and based on l1.transactionmanager properties, it gets flushed to L2 Asnchronously.
There is some exposure here in that
- If this JVM (L1) dies, some uncommitted “Terracotta Transactions” in the commit buffer aren’t yet been flushed,
- If any other component of the cluster fails, there is *no* exposure since upon rejoining the cluster, this JVM will retransmit the transaction and manage it as “in-flight” until the L2 Acks back.
So with regards to the exposure in (1), i.e. for applications with requirements around greater levels of guarantees around Terracotta-transaction commits, Terracotta supports Synchronous-Write Lock-Level, where upon the calling thread blocks until the “Terracotta Transaction” is committed. However adoption of this feature is stunted, given the performance costs associated with it. Current understanding is that the Application Calling thread will block until the L2 Acks, which it will do so only after:
- The transaction gets applied on Primary L2 (which in turn may synch-relay it to Secondary L2 in NAP and/or persist to disk).
- ACKs are received from all other L1s to which the change was broadcast to.
So to trade-off the risk/reward (i.e. guarantees v/s latency-issues-experienced-by-calling-application-thread), several DSO users request that we support more finer-grained synchronous-write lock levels.
EXAMPLE - support synchronous-write-Lite where-in the flush from the L1 Commit-Buffer to the Primary L2 is synchronous (so data is in at least 2 places instead of the usual N for the best possible guarantee), but the persistence to disk and/or ACKS from other L1s happen asynchronously.
Steve and I have discussed this - but in the context of just an optimization. We think that synchronous-write can just be optimized to not wait for acks from disk or passives anyway….
However this is an interesting idea. Maybe there is value in providing increasing levels of paranoia that can be enabled at a lock level. As you state, the current sync-write setting goes “all the way” but the lite just goes to the server (which is the optimization Steve and I considered).