• Type: Bug
  • Status: Closed
  • Priority: 1 Critical
  • Resolution: Fixed
  • Component: ehcache-core
  • Assignee: asingh
  • Reporter:
  • Created: October 13, 2009
  • Votes: 2
  • Watchers: 2
  • Updated: February 17, 2010
  • Resolved: November 04, 2009

Description

I’m having the exact same issue. I cannot, and probably should not, redo the logic behind how Hibernate names its caches with packageName.className.

Is there another workaround?

Thanks

Date: Tue, 13 Oct 2009 11:05:13 -0500 From: [email protected] To: [email protected] Subject: Re: [Ehcache-List] FATAL [MulticastKeepaliveHeartbeatSender] Heartbeat is not working. Configure fewer caches for replication.

I just realized that my reply went straight to your email and not to the list… Sorry…

Thank you much for the response Greg!

I can’t get logging to show me debug info for Ehcache… I’ve tried a bunch of things… I’m using Tomcat 5.5 and evidently my changes to the logging config are not being picked up!

One of the challenges is going to be that I don’t control either variable (server name or cache name). AFAIK, Hibernate names the caches after the entity names. I may have to rewrite how Hibernate deals with that and see if I can plug that into our app.

The other issue is that it seems there will still be a limit on how many more entities I’ll be able to roll out…

Is this affecting me only because I’m using multicast replication? Is there anything else I could do to have replication without this issue?

Thanks!

Greg Luck wrote:

Alexander

1500 is a hard limit. I did not think anyone would ever get to it.

It is based on the size of the URLs. A workaround is to shorten the URLs by reducing your host name length and your cache names. Turn on debugging and take a look at the strings.

It should look something like rmiUrls=//server1:40000/sampleCache1|//server2:40000/sampleCache1|//server1:40000/sampleCache2|//server2:40000/sampleCache2
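For reference, one minimal way to surface those strings, assuming Ehcache 1.4.1 logs through commons-logging with Log4j as the backend (as is typical in a Liferay/Tomcat setup), is a log4j.properties fragment along these lines; the exact logger that prints the rmiUrls may differ:

    # Hypothetical log4j.properties fragment: turn on debug output for the
    # replication code so the generated rmiUrls strings show up in the log.
    log4j.logger.net.sf.ehcache=DEBUG
    log4j.logger.net.sf.ehcache.distribution=DEBUG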

On 13/10/2009, at 7:54 AM, Alexander Wallace wrote:

Hi all!

This is my first post to the list.

I use Liferay 5.1.2, which in turn uses Hibernate and Ehcache. I’ve configured RMI cache replication using multicast.

The whole setup has been working great until now. With the last build of our application we added a few db entities and right after starting the app, I get this in my log:

FATAL [net.sf.ehcache.distribution.MulticastKeepaliveHeartbeatSender] Heartbeat is not working. Configure fewer caches for replication. Size is 1539 but should be no greater than 1500

After that, a lot of lines like the following:

FATAL [net.sf.ehcache.distribution.PayloadUtil] Could not ungzip. Heartbeat will not be working. Unexpected end of ZLIB input stream

I don’t know enough of this issue yet, but it seems to me that it is related to how many entities we have and the names used for each one of them…

I really hope there is a configuration workaround… Is there?

Please advise!

Greg

Comments

gluck 2009-10-13

We should at least improve the efficiency of the compression so that we can increase the number of caches.

Alexander Wallace 2009-10-13

When I first reported this I did not mention that I’m on 1.4.1… It would be nice if the fix could be written in a way that can be backported.

While better compression can buy us some time, I wonder how far it will go… Are you thinking of using bzip2, perhaps?

We currently have 146 caches, and the URLs for each vary from 60 to 105 characters.
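As a rough sanity check on those numbers: 146 caches at 60 to 105 characters per URL is roughly 8,800 to 15,300 bytes of rmiUrls text before compression, and the gzipped heartbeat reported above came out at 1,539 bytes, just over the 1,500-byte ceiling.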

I think we would probably add up to 20 more within a year… I certainly will be more mindful about adding caches. In my case these translate to DB tables (via Hibernate). I never thought my long package names and class names would end up making me pay this way…

I understand that 1500 bytes is the MTU for Ethernet, but not knowing the internals of Ehcache and the multicast heartbeat, I can only wish that it could cope with datagram fragmentation if someone exceeds the MTU.

In any case, I really appreciate your help!

Petr H 2009-10-16

Hello,

We currently face the same problem during development of one of our projects.

Though that hardcoded limit is quite a nasty thing, there appear to be some options.

If you take a look at the following: http://docs.jboss.org/hibernate/stable/core/reference/en/html/performance.html#performance-cache-mapping you’ll see that you can actually define the region attribute for each cache entry in Hibernate, and this is then used as the cache name. So switching from the default class names to some short names here should do the trick. We haven’t got to the stage of trying that yet, as we have other tasks to do first, but I believe it should really be the answer.

Another option, related to the first one, is that it should also be possible to share one cache region among several Hibernate cache entries, so if you consolidate all your caches into a few logical groups (read-only, read-write, type of data cached, roughly estimated cache sizes, lifetimes, whatever…) it should reduce the number of actual caches and make the heartbeat packet even smaller. Again, we haven’t tried this one yet either, so I can’t tell whether it really works (and safely enough).

Also, you may specify in ehcache.xml which caches are replicated and which are not - but I agree that for a high number of caches (which is the case here) that may be a pretty difficult task, not to mention the resulting “mess” in ehcache.xml.
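To illustrate the first two of those options, a Hibernate 3 hbm.xml mapping can point an entity at an explicitly named (and possibly shared) region, and ehcache.xml then defines that one region. The entity and region names below are made up, and whether sharing a region is safe for a given mix of entities still needs to be verified:

    <!-- Hypothetical hbm.xml fragment: give the entity a short, shared region
         name instead of the default packageName.className. -->
    <class name="com.example.Customer" table="CUSTOMER">
        <cache usage="read-write" region="rw-shared"/>
        <!-- id and property mappings omitted -->
    </class>

    <!-- Hypothetical ehcache.xml entry for that region; dropping the
         cacheEventListenerFactory element would exclude it from replication. -->
    <cache name="rw-shared"
           maxElementsInMemory="10000"
           eternal="false"
           timeToLiveSeconds="600">
        <cacheEventListenerFactory
            class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
            properties="replicateAsynchronously=true"/>
    </cache>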

Alex Miller 2009-10-29

Abhishek, I know Greg has some thoughts on this one, so please follow up with him on the direction to go with this.

Abhishek Singh 2009-11-04

Fixed. Payload will be broken up into multiple payloads if it exceeds the MTU.
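A minimal sketch of that idea, not the actual patch (the class and method names below are invented): group the rmiUrls, compress each group separately, and keep every compressed payload within the MTU.

    // Illustrative only - not the real Ehcache code. Splits a list of rmiUrls
    // into groups whose individually gzipped form stays within the 1500-byte MTU.
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.zip.GZIPOutputStream;

    public class HeartbeatPayloadSplitter {

        private static final int MTU = 1500;

        public static List<byte[]> split(List<String> rmiUrls) throws IOException {
            List<byte[]> payloads = new ArrayList<byte[]>();
            StringBuilder group = new StringBuilder();
            for (String url : rmiUrls) {
                String candidate = group.length() == 0 ? url : group + "|" + url;
                if (group.length() > 0 && gzip(candidate).length > MTU) {
                    // Adding this URL would push the compressed payload over the
                    // MTU, so flush the current group and start a new one.
                    payloads.add(gzip(group.toString()));
                    group = new StringBuilder(url);
                } else {
                    group = new StringBuilder(candidate);
                }
            }
            if (group.length() > 0) {
                payloads.add(gzip(group.toString()));
            }
            return payloads;  // each element is sent as its own heartbeat datagram
        }

        private static byte[] gzip(String s) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            GZIPOutputStream gzipStream = new GZIPOutputStream(out);
            gzipStream.write(s.getBytes("UTF-8"));
            gzipStream.close();
            return out.toByteArray();
        }
    }

Keeping each payload within the MTU would also let every heartbeat datagram be un-gzipped on its own, avoiding the “Could not ungzip … Unexpected end of ZLIB input stream” errors seen when the single oversized payload was truncated.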

Alexander Wallace 2009-11-04

Petr H: Thanks a lot for the info on a good workaround!

Abhishek: That is great news! I will try to backport it to 1.4.1 in December. We were able to avoid the limit by eliminating a few unused entities, but we will hit it again for sure in a couple of months if we don’t apply a fix…

Thanks to all!

Himadri Singh 2009-11-19

RMICacheReplicatorWithLargePayloadTest covers this, running on monkeys.

Alexander Wallace 2010-02-17

I have successfully backported the fix to 1.4.1.

Thanks to all!