• Type: Bug
• Status: Open
• Priority: 1 Critical
• Resolution:
• Project: ehcache-terracotta
• Labels: FCSFeedback
• Assignee: dmccartn
• Reporter: gadams00
• Created: April 26, 2013
• Votes: 0
• Watchers: 10
• Updated: June 18, 2013

Attachments

• bigmemory-pinning.zip (reproduction test case, Maven project)
• tc-config.xml (server configuration)

Description

We’re using BigMemory Max 4.0.0 with a web application client running on Tomcat with Oracle JVM 1.7.0_21. We have a particular cache that contains several very large metadata elements that are updated very infrequently. The cache entries are deep, complicated objects, largely composed of strings, and they range from 1-32MB in size. When these entries get faulted into L1 from L2, we get warnings about message size in the TSA server logs and in the Terracotta client logs, and we notice a marked impact on performance when that happens. In an effort to optimize performance where these large entries are concerned, I’ve pinned the cache to localMemory. However, when L1 fills up with other entries, I still see the warnings about large messages. Under heavy load, these messages occur every few seconds. It appears that pinning doesn’t work in BigMemory Max 4.0.0.

I’ve attached a reproduction test case as a Maven project (bigmemory-pinning.zip), as well as the server tc-config.xml that I’m using, though I don’t think the latter is very relevant here.

To reproduce, install BigMemory Max 4.0.0 and start the TSA using the attached tc-config.xml. Extract bigmemory-pinning.zip somewhere and execute mvn install in the project root. The unit test puts and gets a single 10MB byte[] in a cache named “largeItems”, which is configured as pinned in src/main/resources/ehcache.xml. It then asserts that the largeItems cache contains 1 item in L1, using Cache.getStatistics().getLocalHeapSize(). Next, it puts and gets 10000 1KB byte arrays in another cache called “smallItems”, which is not pinned; this fills up L1 (configured in ehcache.xml with a 20MB local heap pool) with items that should not be pinned. It then asserts that largeItems still has 1 item in L1, which fails. Since largeItems is pinned and smallItems is not, I shouldn’t see any faulting on largeItems: Ehcache should prefer to evict unpinned items from L1 before evicting pinned items, but it doesn’t appear to work that way.
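
The core of the attached test is roughly the following (a minimal sketch reconstructed from the description above, not the attached project itself; it assumes the described ehcache.xml, with the pinned largeItems cache and the 20MB local heap pool, is on the classpath):

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class PinningReproTest {
        @Test
        public void pinnedEntriesShouldSurviveUnpinnedCachePressure() {
            CacheManager cacheManager = CacheManager.newInstance(); // reads ehcache.xml from the classpath

            Cache large = cacheManager.getCache("largeItems");      // pinned to localMemory in ehcache.xml
            large.put(new Element("big", new byte[10 * 1024 * 1024]));
            large.get("big");
            assertEquals(1, large.getStatistics().getLocalHeapSize()); // passes

            Cache small = cacheManager.getCache("smallItems");      // not pinned
            for (int i = 0; i < 10000; i++) {                       // ~10MB of 1KB entries into the 20MB pool
                small.put(new Element("key-" + i, new byte[1024]));
                small.get("key-" + i);
            }
            assertEquals(1, large.getStatistics().getLocalHeapSize()); // fails: the pinned entry was faulted out
        }
    }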

Comments

Greg Adams 2013-04-26

On a side note, if someone ends up looking at the bigmemory-pinning test case, note that I had to add a Thread.sleep(1000) after initializing the CacheManager. Initially this sleep wasn’t there, and I was having issues where the largeItems cache put call would finish but the element I added wouldn’t actually be added to L1 or L2. Is there some CacheManager warm-up time needed? Is there some way to tell when that’s complete? I tried using CacheManager.getStatus() and Cache.getStatus() and asserting that the status was Status.STATUS_ALIVE, but neither of these fixed my problem.
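
For the record, the workaround amounts to the following (a sketch; the status checks shown are the ones that did not help):

    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Status;

    public class WarmupSketch {
        public static void main(String[] args) throws InterruptedException {
            CacheManager cacheManager = CacheManager.newInstance();
            Thread.sleep(1000); // without this pause, the first largeItems put never reached L1 or L2

            // Both of these report ALIVE immediately, so neither signals readiness:
            System.out.println(cacheManager.getStatus() == Status.STATUS_ALIVE);
            System.out.println(cacheManager.getCache("largeItems").getStatus() == Status.STATUS_ALIVE);
        }
    }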

Greg Adams 2013-04-26

Ignore the previous comment. I’m sure this is a bug with nonstop caches, so I’m creating another JIRA issue.

Tim Wu 2013-05-02

Admittedly, it’s not terribly clear, but this is the intended behavior of localMemory pinning. What it does is aggressively refetch entries from the L2 when there is an invalidation caused by a change on another node. It does not keep entries from being evicted from the L1 due to resource constraints. To get the behavior you want, you will need to adjust the ARC constraints so that other caches cannot force evictions in the cache that you want to keep in local memory.

For example, in your ehcache.xml you could add a maxBytesLocalHeap="5m" constraint to the “smallItems” cache. That would leave the remaining 15MB for the “largeItems” cache.
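
In ehcache.xml that would look something like this (a sketch; everything other than the cache names and sizes is illustrative):

    <ehcache maxBytesLocalHeap="20m">
        <cache name="smallItems" maxBytesLocalHeap="5m">
            <terracotta/>
        </cache>
        <cache name="largeItems">
            <pinning store="localMemory"/>
            <terracotta/>
        </cache>
    </ehcache>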

Will that work for you?

Greg Adams 2013-05-02

According to this page: http://terracotta.org/documentation/bigmemorymax/configuration/data-life#pinning-data

localMemory – Cache data is pinned to the memory store or the off-heap store. Unexpired entries can never be flushed to a lower tier or be evicted.

Your workaround would be fine for the simple sample application I provided, where we have only 2 caches, but in my real scenario there are dozens of caches. We prefer to let the overall pool limit the size of L1 and let Ehcache decide, across all caches, which least-used elements to evict to L2 when necessary. In Terracotta open source 3.7.2 we use pinning="localHeap" for our cache of large metadata objects, and this does exactly what I want: it prevents eviction of that particular cache to L2. Eviction of such large objects seems to cause a lot of GC thrash. localHeap pinning doesn’t seem to be a valid configuration option with BigMemory Max.
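
For reference, the 3.7.2-era configuration being described looks like this (per the Ehcache 2.x schema; sizing is left to the CacheManager-level pool):

    <!-- Terracotta open source 3.7.2 (Ehcache 2.x): entries never leave the local heap -->
    <cache name="largeItems">
        <pinning store="localHeap"/>
        <terracotta/>
    </cache>
    <!-- BigMemory Max 4.0.0 no longer accepts store="localHeap" (per the report above) -->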

Fiona OShea 2013-05-03

Greg, we will update the misleading documentation.

We will also incorporate your use case into a feature enhancement discussion on this topic. Any resulting EHC tasks will be linked to this Jira. Thank you for your feedback.

Fiona OShea 2013-05-03

Can we fix the docs in the next site push? Thanks, Fiona

Fiona OShea 2013-05-03

Server team: assigned to you until we have discussed what we want to do with this at a future date.

Greg Adams 2013-05-03

Isn’t this actually a loss of functionality compared to Terracotta 3.7? Was localHeap pinning removed for any particular reason?

ilevy 2013-05-03

We’ll try to pin down a clear picture of how this actually works now and whether it’s going to stick around (I had heard talk of it being evicted for good). Seems it’s being faulted for both the way it’s worked in 3.7.x and the way it’s working now. Once we’re flush with knowledge we’ll put the new info so users can get it.

Karen Quastler 2013-05-03

I can change the localMemory pinning definition; however, the page already has this clarification:

http://www.terracotta.org/documentation/bigmemorymax/configuration/data-life#pinning-and-cache-sizing

For background, see DEV-8741 http://jira.terracotta.org/jira/browse/DEV-8741?focusedCommentId=71878&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-71878

Greg Adams 2013-05-04

Could someone please point to any documentation, forum/blog postings, or anything at all that explains what you’re saying about localMemory pinning? The behavior you’ve described, aggressively refetching entries from L2 when changes are made on another node, is the default behavior I would expect from a distributed cache. If you have two client nodes and expect write-behind caching to work, then the behavior you describe as optionally enabled by localMemory pinning should be the default. Cache puts on any node should immediately be apparent on all nodes. Otherwise, cache gets could return stale data whenever an entry changed in L1 on another node happens to also reside in L1 on the current node. Does BigMemory Max not make this guarantee unless localMemory pinning is used? Is this a difference from what Terracotta 3.7 did?

Everywhere I see anything written about cache pinning, it’s in reference to keeping cache entries from being evicted from the specified tier.

http://terracotta.org/documentation/bigmemorymax/configuration/data-life#pinning-a-cache
http://ehcache.org/documentation/configuration/data-life
http://performanceterracotta.blogspot.com/2012/01/performance-benefits-of-pinned-cache.html
http://terracotta.org/documentation/bigmemorygo/configuration/data-life#pinning-a-cache

Even the Ehcache javadocs included with BigMemory Max 4.0.0 for net.sf.ehcache.config.PinningConfiguration.store() and getStore() indicate the meaning I expected:

store

    public PinningConfiguration store(PinningConfiguration.Store store)

    Set the lowest store from which elements must not be evicted from
    Parameters: store - the store
    Returns: this

getStore

    public PinningConfiguration.Store getStore()

    Return the lowest store from which elements must not be evicted from
    Returns: the lowest store from which elements must not be evicted from
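
Programmatically, the same pinning setting is expressed like this (a sketch against the net.sf.ehcache.config API quoted above; the zero maxEntriesLocalHeap is illustrative):

    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.PinningConfiguration;

    // Programmatic equivalent of <pinning store="localMemory"/> in ehcache.xml.
    CacheConfiguration largeItems = new CacheConfiguration("largeItems", 0); // 0 = no entry limit
    largeItems.addPinning(new PinningConfiguration()
            .store(PinningConfiguration.Store.LOCALMEMORY));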

The company I work for is considering purchasing the full version of BigMemory Max, but this issue is a show-stopper for us, and I can’t imagine the way cache pinning has been explained to work being acceptable to anyone else.

Dustin McCartney 2013-05-07

Greg,

Pinning and consistency (puts on one node being visible on other nodes) are two unrelated concepts. Can you contact me directly and I’ll clarify? My email is: [email protected]

If you want, you can email me your phone number and we can talk… it would be good to understand what you are trying to do.

Thanks,

Dustin McCartney
Sr. Solutions Architect
Terracotta, Inc.

Karen Quastler 2013-06-18

Removed the check from the Doc Required checkbox, as all terracotta.org docs that reference pinning now describe the current implementation.