• Bug
  • Status: Closed
  • 1 Critical
  • Resolution: As Designed
  • ehcache-core
  • yzhang
  • Reporter: lijie
  • December 13, 2012
  • 0
  • Watchers: 5
  • April 01, 2013
  • February 20, 2013

Attachments

Description

We are trying out Bigmemroy Go in our app with the Fast Restartable store. We load 2800000 objects(each object’s size is about 2K~3Kbytes), the time cost 1780250 ms. When we restart our app, the Bigmemory recovery cost too much time, about 1180500 ms, is close to the load time.

The detail log please see the attachment.

Our running environment: CPU: 64core @2.00GHz Memory: 128G OS:CentOS release 6.2 (Final) jdk1.6( -XX:MaxDirectMemorySize=20480m)

Tip: when the Bigmemory in the recovering, the OS’s cpu/memory/io is not very busy.

Comments

Gautam Jayaprakash 2012-12-13

Can you try reproducing this?

Saravanan Subbiah 2012-12-14

What disk architecture are u running with ? do u have SSDs ? Also do have search indexes ? Can u list the side of the data files and search files ?

Yi Zhang 2012-12-14

Some info from testing shows that the problem seems to be a dirty recovery (INFO: Detected un-clean shutdown in the former session. Deleting existing search index files in /iodrive/bench/yzhang/testng/frsdata/RestartableStoreWithSearchTestData/search-index) Emulating the use case, I can load 2.8m entries with 100 search attributes on bigmemory07 using its iodrive persistence in about 3000 seconds. With clean shutdown recovery takes about 27 seconds. Deleting the .cleanShutdown file and forcing reconstruction of the search index files brings the recovery to 600 seconds. With only 4 attributes the recovery time is down to 75 seconds.

Yi Zhang 2012-12-14

The timing difference is due to the versions My initial numbers were for 3.7.2, however he appears to be using 3.7.1 (w/ lucene 3.0.3) Trying against that, I see load time of about 1500s and recovery time of about 1250s It seems interesting that 3.7.2 has almost double the load time and half the recovery time. Trunk has similar timings to 3.7.2

li jie 2012-12-14

Our disk is SATA, not SSD. We have 110 search attributes. The data files and search files as below(5600000 objects that I have loaded):

[[email protected] t0]$ ll */*/* -rw-rw-r–. 1 bigm bigm 0 12月 14 06:59 persistence/cachedata/frs.backup.lck -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 persistence/cachedata/FRS.lck -rw-rw-r–. 1 bigm bigm 537216132 12月 14 07:04 persistence/cachedata/seg000000000.frs -rw-rw-r–. 1 bigm bigm 536971643 12月 14 07:10 persistence/cachedata/seg000000001.frs -rw-rw-r–. 1 bigm bigm 537301421 12月 14 07:15 persistence/cachedata/seg000000002.frs -rw-rw-r–. 1 bigm bigm 537134844 12月 14 07:20 persistence/cachedata/seg000000003.frs -rw-rw-r–. 1 bigm bigm 536936055 12月 14 07:25 persistence/cachedata/seg000000004.frs -rw-rw-r–. 1 bigm bigm 537345892 12月 14 07:30 persistence/cachedata/seg000000005.frs -rw-rw-r–. 1 bigm bigm 537290431 12月 14 07:35 persistence/cachedata/seg000000006.frs -rw-rw-r–. 1 bigm bigm 537112044 12月 14 07:40 persistence/cachedata/seg000000007.frs -rw-rw-r–. 1 bigm bigm 536912075 12月 14 07:46 persistence/cachedata/seg000000008.frs -rw-rw-r–. 1 bigm bigm 536991916 12月 14 07:51 persistence/cachedata/seg000000009.frs -rw-rw-r–. 1 bigm bigm 537102157 12月 14 07:56 persistence/cachedata/seg000000010.frs -rw-rw-r–. 1 bigm bigm 60513994 12月 14 08:07 persistence/cachedata/seg000000011.frs -rw-rw-r–. 1 bigm bigm 0 12月 14 14:28 persistence/metadata/frs.backup.lck -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 persistence/metadata/FRS.lck -rw-rw-r–. 1 bigm bigm 139507 12月 14 08:07 persistence/metadata/seg000000000.frs -rw-rw-r–. 1 bigm bigm 133356 12月 14 10:32 persistence/metadata/seg000000001.frs -rw-rw-r–. 1 bigm bigm 133356 12月 14 13:53 persistence/metadata/seg000000002.frs -rw-rw-r–. 1 bigm bigm 116918 12月 14 14:33 persistence/metadata/seg000000003.frs -rw-rw-r–. 1 bigm bigm 20 12月 14 13:53 search-index/alarmcache/__terracotta_cache_name.txt -rw-rw-r–. 1 bigm bigm 2158 12月 14 14:07 search-index/alarmcache/__terracotta_schema.properties

search-index/alarmcache/0: 总用量 3026220 -rw-rw-r–. 1 bigm bigm 1096559013 12月 14 14:32 _0.cfx -rw-rw-r–. 1 bigm bigm 1204 12月 14 13:59 _33.fnm -rw-rw-r–. 1 bigm bigm 205414327 12月 14 13:59 _33.frq -rw-rw-r–. 1 bigm bigm 2467158 12月 14 13:59 _33.prx -rw-rw-r–. 1 bigm bigm 955723 12月 14 13:59 _33.tii -rw-rw-r–. 1 bigm bigm 80554172 12月 14 13:59 _33.tis -rw-rw-r–. 1 bigm bigm 1204 12月 14 14:04 _66.fnm -rw-rw-r–. 1 bigm bigm 194793658 12月 14 14:04 _66.frq -rw-rw-r–. 1 bigm bigm 2434860 12月 14 14:04 _66.prx -rw-rw-r–. 1 bigm bigm 964079 12月 14 14:04 _66.tii -rw-rw-r–. 1 bigm bigm 81899569 12月 14 14:04 _66.tis -rw-rw-r–. 1 bigm bigm 1204 12月 14 14:10 _99.fnm -rw-rw-r–. 1 bigm bigm 215578745 12月 14 14:10 _99.frq -rw-rw-r–. 1 bigm bigm 2538628 12月 14 14:10 _99.prx -rw-rw-r–. 1 bigm bigm 934900 12月 14 14:10 _99.tii -rw-rw-r–. 1 bigm bigm 78571051 12月 14 14:10 _99.tis -rw-rw-r–. 1 bigm bigm 1204 12月 14 14:15 _cc.fnm -rw-rw-r–. 1 bigm bigm 178425732 12月 14 14:15 _cc.frq -rw-rw-r–. 1 bigm bigm 2359173 12月 14 14:15 _cc.prx -rw-rw-r–. 1 bigm bigm 950477 12月 14 14:15 _cc.tii -rw-rw-r–. 1 bigm bigm 81302265 12月 14 14:15 _cc.tis -rw-rw-r–. 1 bigm bigm 1204 12月 14 14:20 _ff.fnm -rw-rw-r–. 1 bigm bigm 197110261 12月 14 14:21 _ff.frq -rw-rw-r–. 1 bigm bigm 2445501 12月 14 14:21 _ff.prx -rw-rw-r–. 1 bigm bigm 945507 12月 14 14:21 _ff.tii -rw-rw-r–. 1 bigm bigm 79895123 12月 14 14:21 _ff.tis -rw-rw-r–. 1 bigm bigm 1218 12月 14 14:25 _ii.fnm -rw-rw-r–. 1 bigm bigm 187189882 12月 14 14:26 _ii.frq -rw-rw-r–. 1 bigm bigm 2398253 12月 14 14:26 _ii.prx -rw-rw-r–. 1 bigm bigm 918088 12月 14 14:26 _ii.tii -rw-rw-r–. 1 bigm bigm 78263176 12月 14 14:26 _ii.tis -rw-rw-r–. 1 bigm bigm 1218 12月 14 14:31 _ll.fnm -rw-rw-r–. 1 bigm bigm 193808865 12月 14 14:31 _ll.frq -rw-rw-r–. 1 bigm bigm 2441380 12月 14 14:31 _ll.prx -rw-rw-r–. 1 bigm bigm 955047 12月 14 14:31 _ll.tii -rw-rw-r–. 1 bigm bigm 80643014 12月 14 14:31 _ll.tis -rw-rw-r–. 1 bigm bigm 2843704 12月 14 14:31 _lv.cfs -rw-rw-r–. 1 bigm bigm 28846540 12月 14 14:31 _lw.cfs -rw-rw-r–. 1 bigm bigm 2858644 12月 14 14:31 _lx.cfs -rw-rw-r–. 1 bigm bigm 2864500 12月 14 14:31 _ly.cfs -rw-rw-r–. 1 bigm bigm 2860052 12月 14 14:31 _lz.cfs -rw-rw-r–. 1 bigm bigm 2808529 12月 14 14:31 _m0.cfs -rw-rw-r–. 1 bigm bigm 1920966 12月 14 14:32 _m1.cfs -rw-rw-r–. 1 bigm bigm 32 12月 14 13:53 segments_1 -rw-rw-r–. 1 bigm bigm 20 12月 14 13:53 segments.gen -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 __terracotta_init.txt -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 write.lock

search-index/alarmcache/1: 总用量 3020328 -rw-rw-r–. 1 bigm bigm 1094542186 12月 14 14:32 _0.cfx -rw-rw-r–. 1 bigm bigm 1204 12月 14 13:59 _33.fnm -rw-rw-r–. 1 bigm bigm 205263422 12月 14 13:59 _33.frq -rw-rw-r–. 1 bigm bigm 2466467 12月 14 13:59 _33.prx -rw-rw-r–. 1 bigm bigm 953358 12月 14 13:59 _33.tii -rw-rw-r–. 1 bigm bigm 80621120 12月 14 13:59 _33.tis -rw-rw-r–. 1 bigm bigm 1219 12月 14 14:04 _66.fnm -rw-rw-r–. 1 bigm bigm 194468076 12月 14 14:04 _66.frq -rw-rw-r–. 1 bigm bigm 2435569 12月 14 14:04 _66.prx -rw-rw-r–. 1 bigm bigm 965089 12月 14 14:04 _66.tii -rw-rw-r–. 1 bigm bigm 82168899 12月 14 14:04 _66.tis -rw-rw-r–. 1 bigm bigm 1233 12月 14 14:10 _99.fnm -rw-rw-r–. 1 bigm bigm 215793855 12月 14 14:10 _99.frq -rw-rw-r–. 1 bigm bigm 2538445 12月 14 14:10 _99.prx -rw-rw-r–. 1 bigm bigm 931448 12月 14 14:10 _99.tii -rw-rw-r–. 1 bigm bigm 78463300 12月 14 14:10 _99.tis -rw-rw-r–. 1 bigm bigm 1233 12月 14 14:15 _cc.fnm -rw-rw-r–. 1 bigm bigm 178871143 12月 14 14:15 _cc.frq -rw-rw-r–. 1 bigm bigm 2362784 12月 14 14:15 _cc.prx -rw-rw-r–. 1 bigm bigm 954338 12月 14 14:15 _cc.tii -rw-rw-r–. 1 bigm bigm 81470708 12月 14 14:15 _cc.tis -rw-rw-r–. 1 bigm bigm 1233 12月 14 14:20 _ff.fnm -rw-rw-r–. 1 bigm bigm 197148202 12月 14 14:21 _ff.frq -rw-rw-r–. 1 bigm bigm 2445626 12月 14 14:21 _ff.prx -rw-rw-r–. 1 bigm bigm 947575 12月 14 14:21 _ff.tii -rw-rw-r–. 1 bigm bigm 79959380 12月 14 14:21 _ff.tis -rw-rw-r–. 1 bigm bigm 1233 12月 14 14:25 _ii.fnm -rw-rw-r–. 1 bigm bigm 186736099 12月 14 14:26 _ii.frq -rw-rw-r–. 1 bigm bigm 2398078 12月 14 14:26 _ii.prx -rw-rw-r–. 1 bigm bigm 924534 12月 14 14:26 _ii.tii -rw-rw-r–. 1 bigm bigm 78123215 12月 14 14:26 _ii.tis -rw-rw-r–. 1 bigm bigm 1246 12月 14 14:31 _ll.fnm -rw-rw-r–. 1 bigm bigm 193533119 12月 14 14:31 _ll.frq -rw-rw-r–. 1 bigm bigm 2438457 12月 14 14:31 _ll.prx -rw-rw-r–. 1 bigm bigm 949621 12月 14 14:31 _ll.tii -rw-rw-r–. 1 bigm bigm 80338300 12月 14 14:31 _ll.tis -rw-rw-r–. 1 bigm bigm 2832529 12月 14 14:31 _lv.cfs -rw-rw-r–. 1 bigm bigm 28856109 12月 14 14:31 _lw.cfs -rw-rw-r–. 1 bigm bigm 2863550 12月 14 14:31 _lx.cfs -rw-rw-r–. 1 bigm bigm 2840559 12月 14 14:31 _ly.cfs -rw-rw-r–. 1 bigm bigm 2809266 12月 14 14:31 _lz.cfs -rw-rw-r–. 1 bigm bigm 1292559 12月 14 14:32 _m0.cfs -rw-rw-r–. 1 bigm bigm 32 12月 14 13:53 segments_1 -rw-rw-r–. 1 bigm bigm 20 12月 14 13:53 segments.gen -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 __terracotta_init.txt -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 write.lock

search-index/alarmcache/2: 总用量 3021020 -rw-rw-r–. 1 bigm bigm 1094724541 12月 14 14:32 _0.cfx -rw-rw-r–. 1 bigm bigm 1204 12月 14 13:59 _33.fnm -rw-rw-r–. 1 bigm bigm 205568206 12月 14 13:59 _33.frq -rw-rw-r–. 1 bigm bigm 2468702 12月 14 13:59 _33.prx -rw-rw-r–. 1 bigm bigm 953615 12月 14 13:59 _33.tii -rw-rw-r–. 1 bigm bigm 80632219 12月 14 13:59 _33.tis -rw-rw-r–. 1 bigm bigm 1204 12月 14 14:04 _66.fnm -rw-rw-r–. 1 bigm bigm 194839372 12月 14 14:04 _66.frq -rw-rw-r–. 1 bigm bigm 2435413 12月 14 14:04 _66.prx -rw-rw-r–. 1 bigm bigm 960172 12月 14 14:04 _66.tii -rw-rw-r–. 1 bigm bigm 82080331 12月 14 14:04 _66.tis -rw-rw-r–. 1 bigm bigm 1204 12月 14 14:10 _99.fnm -rw-rw-r–. 1 bigm bigm 215717000 12月 14 14:10 _99.frq -rw-rw-r–. 1 bigm bigm 2538774 12月 14 14:10 _99.prx -rw-rw-r–. 1 bigm bigm 931018 12月 14 14:10 _99.tii -rw-rw-r–. 1 bigm bigm 78616548 12月 14 14:10 _99.tis -rw-rw-r–. 1 bigm bigm 1217 12月 14 14:15 _cc.fnm -rw-rw-r–. 1 bigm bigm 178423093 12月 14 14:15 _cc.frq -rw-rw-r–. 1 bigm bigm 2359026 12月 14 14:15 _cc.prx -rw-rw-r–. 1 bigm bigm 955366 12月 14 14:15 _cc.tii -rw-rw-r–. 1 bigm bigm 81351662 12月 14 14:15 _cc.tis -rw-rw-r–. 1 bigm bigm 1217 12月 14 14:20 _ff.fnm -rw-rw-r–. 1 bigm bigm 197340026 12月 14 14:21 _ff.frq -rw-rw-r–. 1 bigm bigm 2446313 12月 14 14:21 _ff.prx -rw-rw-r–. 1 bigm bigm 949492 12月 14 14:21 _ff.tii -rw-rw-r–. 1 bigm bigm 79952970 12月 14 14:21 _ff.tis -rw-rw-r–. 1 bigm bigm 1217 12月 14 14:25 _ii.fnm -rw-rw-r–. 1 bigm bigm 186889871 12月 14 14:26 _ii.frq -rw-rw-r–. 1 bigm bigm 2397640 12月 14 14:26 _ii.prx -rw-rw-r–. 1 bigm bigm 918934 12月 14 14:26 _ii.tii -rw-rw-r–. 1 bigm bigm 78144316 12月 14 14:26 _ii.tis -rw-rw-r–. 1 bigm bigm 1217 12月 14 14:31 _ll.fnm -rw-rw-r–. 1 bigm bigm 193516133 12月 14 14:31 _ll.frq -rw-rw-r–. 1 bigm bigm 2437593 12月 14 14:31 _ll.prx -rw-rw-r–. 1 bigm bigm 949668 12月 14 14:31 _ll.tii -rw-rw-r–. 1 bigm bigm 80527618 12月 14 14:31 _ll.tis -rw-rw-r–. 1 bigm bigm 28842134 12月 14 14:31 _lv.cfs -rw-rw-r–. 1 bigm bigm 2853608 12月 14 14:31 _lw.cfs -rw-rw-r–. 1 bigm bigm 2849823 12月 14 14:31 _lx.cfs -rw-rw-r–. 1 bigm bigm 2836506 12月 14 14:31 _ly.cfs -rw-rw-r–. 1 bigm bigm 2798383 12月 14 14:31 _lz.cfs -rw-rw-r–. 1 bigm bigm 1203508 12月 14 14:32 _m0.cfs -rw-rw-r–. 1 bigm bigm 32 12月 14 13:53 segments_1 -rw-rw-r–. 1 bigm bigm 20 12月 14 13:53 segments.gen -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 __terracotta_init.txt -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 write.lock

search-index/alarmcache/3: 总用量 3022240 -rw-rw-r–. 1 bigm bigm 1095215710 12月 14 14:32 _0.cfx -rw-rw-r–. 1 bigm bigm 1204 12月 14 13:59 _33.fnm -rw-rw-r–. 1 bigm bigm 205359366 12月 14 13:59 _33.frq -rw-rw-r–. 1 bigm bigm 2466753 12月 14 13:59 _33.prx -rw-rw-r–. 1 bigm bigm 956292 12月 14 13:59 _33.tii -rw-rw-r–. 1 bigm bigm 80554027 12月 14 13:59 _33.tis -rw-rw-r–. 1 bigm bigm 1219 12月 14 14:04 _66.fnm -rw-rw-r–. 1 bigm bigm 194542884 12月 14 14:04 _66.frq -rw-rw-r–. 1 bigm bigm 2434188 12月 14 14:04 _66.prx -rw-rw-r–. 1 bigm bigm 960235 12月 14 14:04 _66.tii -rw-rw-r–. 1 bigm bigm 82162802 12月 14 14:04 _66.tis -rw-rw-r–. 1 bigm bigm 1246 12月 14 14:10 _99.fnm -rw-rw-r–. 1 bigm bigm 215665348 12月 14 14:10 _99.frq -rw-rw-r–. 1 bigm bigm 2538473 12月 14 14:10 _99.prx -rw-rw-r–. 1 bigm bigm 933599 12月 14 14:10 _99.tii -rw-rw-r–. 1 bigm bigm 78645510 12月 14 14:10 _99.tis -rw-rw-r–. 1 bigm bigm 1246 12月 14 14:15 _cc.fnm -rw-rw-r–. 1 bigm bigm 178671191 12月 14 14:15 _cc.frq -rw-rw-r–. 1 bigm bigm 2359992 12月 14 14:15 _cc.prx -rw-rw-r–. 1 bigm bigm 963383 12月 14 14:15 _cc.tii -rw-rw-r–. 1 bigm bigm 81386473 12月 14 14:15 _cc.tis -rw-rw-r–. 1 bigm bigm 1246 12月 14 14:20 _ff.fnm -rw-rw-r–. 1 bigm bigm 197324105 12月 14 14:21 _ff.frq -rw-rw-r–. 1 bigm bigm 2447058 12月 14 14:21 _ff.prx -rw-rw-r–. 1 bigm bigm 947356 12月 14 14:21 _ff.tii -rw-rw-r–. 1 bigm bigm 79958780 12月 14 14:21 _ff.tis -rw-rw-r–. 1 bigm bigm 1246 12月 14 14:25 _ii.fnm -rw-rw-r–. 1 bigm bigm 186725594 12月 14 14:26 _ii.frq -rw-rw-r–. 1 bigm bigm 2394282 12月 14 14:26 _ii.prx -rw-rw-r–. 1 bigm bigm 924965 12月 14 14:26 _ii.tii -rw-rw-r–. 1 bigm bigm 78161324 12月 14 14:26 _ii.tis -rw-rw-r–. 1 bigm bigm 1246 12月 14 14:31 _ll.fnm -rw-rw-r–. 1 bigm bigm 193756247 12月 14 14:31 _ll.frq -rw-rw-r–. 1 bigm bigm 2441020 12月 14 14:31 _ll.prx -rw-rw-r–. 1 bigm bigm 947531 12月 14 14:31 _ll.tii -rw-rw-r–. 1 bigm bigm 80456810 12月 14 14:31 _ll.tis -rw-rw-r–. 1 bigm bigm 2838914 12月 14 14:31 _lv.cfs -rw-rw-r–. 1 bigm bigm 28882259 12月 14 14:31 _lw.cfs -rw-rw-r–. 1 bigm bigm 2844219 12月 14 14:31 _lx.cfs -rw-rw-r–. 1 bigm bigm 2870654 12月 14 14:31 _ly.cfs -rw-rw-r–. 1 bigm bigm 2798419 12月 14 14:31 _lz.cfs -rw-rw-r–. 1 bigm bigm 2123755 12月 14 14:32 _m0.cfs -rw-rw-r–. 1 bigm bigm 32 12月 14 13:53 segments_1 -rw-rw-r–. 1 bigm bigm 20 12月 14 13:53 segments.gen -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 __terracotta_init.txt -rw-rw-r–. 1 bigm bigm 0 12月 14 13:53 write.lock

Yi Zhang 2012-12-14

It appears that you are using the 3.7.1 version of BigMemoryGo. There are several enhancements in the latest version (3.7.2) that improved recovery time. Can you please give it a try from http://terracotta.org/downloads/bigmemorygo and let us know if it performs better for you?

In general, constructing/reconstructing search indices is very intensive, especially for the large number (~110) of search indexes that you have. If you can do a clean shutdown then recovery will be much much faster. We are also constantly trying to improve the speed of a complete recovery so your feedback on these issues is very helpful.

li jie 2012-12-15

hi,ZhangYi I have deployed the latest version 3.7.2, The recovery time not obvious shorten.But as you said, when I do a clean shutdown, the recovery is much faster.

But another problem is when recovery is ok, why the search not return the results?

The load & search log:

11:14:16,225 INFO SearchTest:308 - load 2800000 objects, costs = 1925659 ms 11:14:17,704 INFO SearchTest:42 - query1 count = 14042, costs = 1475ms 11:14:18,321 INFO SearchTest:42 - query1 count = 14042, costs = 616ms 11:14:18,839 INFO SearchTest:42 - query1 count = 14042, costs = 517ms 11:14:20,363 INFO SearchTest:66 - query1 topN return 1000, costs = 1523ms 11:14:21,047 INFO SearchTest:66 - query1 topN return 1000, costs = 684ms 11:14:21,606 INFO SearchTest:66 - query1 topN return 1000, costs = 558ms 11:14:21,607 INFO SearchTest:319 - ——————————- 11:14:25,717 INFO SearchTest:121 - query2 count = 88910, costs = 4096 11:14:28,960 INFO SearchTest:121 - query2 count = 88910, costs = 3235 11:14:32,033 INFO SearchTest:121 - query2 count = 88910, costs = 3066 11:14:35,245 INFO SearchTest:168 - query2TopN return 1000, costs = 3208ms 11:14:38,495 INFO SearchTest:168 - query2TopN return 1000, costs = 3246ms 11:14:41,519 INFO SearchTest:168 - query2TopN return 1000, costs = 3020ms 11:14:41,519 INFO SearchTest:326 - ——————————- 11:14:42,733 INFO SearchTest:277 - query3 queryTimeOutB count = 0, costs = 1205 11:14:43,746 INFO SearchTest:277 - query3 queryTimeOutB count = 0, costs = 1005 11:14:44,778 INFO SearchTest:277 - query3 queryTimeOutB count = 0, costs = 1025 十二月 16, 2012 11:14:44 上午 com.terracottatech.frs.compaction.CompactorImpl$CompactorThread run 信息: Compactor is interrupted. Shutting down. 十二月 16, 2012 11:14:44 上午 com.terracottatech.frs.compaction.CompactorImpl$CompactorThread run 信息: Compactor is interrupted. Shutting down. 十二月 16, 2012 11:14:45 上午 net.sf.ehcache.store.offheap.search.LuceneIndexedSearchManager createCleanShutdownFile 信息: Creating clean-shutdown file in [/home/lijie/lj/tmp/t0/.cleanShutdown]

—– recovery & search log:

十二月 16, 2012 11:32:54 上午 com.terracottatech.frs.recovery.RecoveryManagerImpl$ProgressLoggingFilter filter 信息: Recovery progress 90% 十二月 16, 2012 11:32:56 上午 com.terracottatech.frs.recovery.RecoveryManagerImpl$ProgressLoggingFilter filter 信息: Recovery progress 100% 十二月 16, 2012 11:32:56 上午 net.sf.ehcache.store.offheap.search.LuceneIndexedSearchManager recoveryComplete 信息: Search Indices recovered, deleting old clean-shutdown file and resuming normal operation. 11:32:56,938 INFO SearchTest:310 - recover cost = 35926 ms, cache.size = 2800000 11:32:56,969 INFO SearchTest:44 - query1 count = 0, costs = 17ms 11:32:56,969 INFO SearchTest:44 - query1 count = 0, costs = 0ms 11:32:56,970 INFO SearchTest:44 - query1 count = 0, costs = 0ms 11:32:56,970 INFO SearchTest:68 - query1 TopN return 0, costs = 0ms 11:32:56,971 INFO SearchTest:68 - query1 TopN return 0, costs = 1ms 11:32:56,971 INFO SearchTest:68 - query1 TopN return 0, costs = 0ms 11:32:56,972 INFO SearchTest:319 - ——————————- 11:32:56,998 INFO SearchTest:123 - query2 queryTimeOutA count = 0, costs = 11 11:32:57,016 INFO SearchTest:123 - query2 queryTimeOutA count = 0, costs = 10 11:32:57,022 INFO SearchTest:123 - query2 queryTimeOutA count = 0, costs = 3 11:32:57,028 INFO SearchTest:170 - query2 queryTimeOutATopN return 0, costs = 2ms 11:32:57,034 INFO SearchTest:170 - query2 queryTimeOutATopN return 0, costs = 2ms 11:32:57,038 INFO SearchTest:170 - query2 queryTimeOutATopN return 0, costs = 2ms 11:32:57,039 INFO SearchTest:326 - ——————————- 11:32:57,044 INFO SearchTest:277 - query3 queryTimeOutB count = 0, costs = 1 11:32:57,050 INFO SearchTest:277 - query3 queryTimeOutB count = 0, costs = 0 11:32:57,056 INFO SearchTest:277 - query3 queryTimeOutB count = 0, costs = 0 十二月 16, 2012 11:32:57 上午 com.terracottatech.frs.compaction.CompactorImpl$CompactorThread run 信息: Compactor is interrupted. Shutting down. 十二月 16, 2012 11:32:57 上午 com.terracottatech.frs.compaction.CompactorImpl$CompactorThread run 信息: Compactor is interrupted. Shutting down. 十二月 16, 2012 11:32:57 上午 net.sf.ehcache.store.offheap.search.LuceneIndexedSearchManager createCleanShutdownFile 信息: Creating clean-shutdown file in [/home/lijie/lj/tmp/t0/.cleanShutdown]

Yi Zhang 2012-12-16

Hi I do not see any problem with getting search results after recovery.

[testng] INFO: Completed recovery in 27635 millis [testng] Dec 16, 2012 2:44:22 AM com.terracotta.qa.test.RestartableStoreWithSearchTest verifyData [testng] INFO: Cache size after recovery: 2800000 [testng] Dec 16, 2012 2:44:23 AM com.terracotta.qa.test.RestartableStoreWithSearchTest queryData [testng] INFO: Found results of size = 1000 in 204 millis

Can you please attach your test so that we can try to reproduce the issue?

li jie 2012-12-16

Project for recovery data but can’t query results

li jie 2012-12-16

Hi I run the test.bigm.SearchTest main method twice, first time load the data and call ‘query1’ method then return the results. Second time it recovery data and also call ‘query1’ method, but not results return.

The data and the test project in the attachment.

Yi Zhang 2012-12-17

Hi I’ve replicated your test case and it appears that you’ve identified a bug with recovering search indices when recreating a dynamic cache. While we’re working on a proper fix for this issue, the workaround is to specify the cache explicitly in the configuration.

e.g. in EhcacheManger.getMostIndexesCache() instead of {noformat} cm = CacheManager.create(); Cache cache2 = cm.getCache(CACHE_NAME); if(cache2!=null && cache2.getSize()>0) { return cache2; } ccf = new CacheConfiguration(); ccf.persistence(new PersistenceConfiguration().strategy(Strategy.LOCALRESTARTABLE).synchronousWrites(false)); ccf.setName(CACHE_NAME); ccf.setMaxEntriesLocalHeap(MAX_ELEMENTS_IN_MEMORY); ccf.setOverflowToOffHeap(Boolean.TRUE); ccf.setMaxMemoryOffHeap(MAX_MEMORY_OFF_HEAP); Searchable searchable = new Searchable();

for(String fieldName : fieldNames) { searchable.addSearchAttribute(new SearchAttribute().name(fieldName)); } ccf.addSearchable(searchable); cache = new Cache(ccf); cm.addCache(cache); logger.info(“Printing Ehchache configuration:”); logger.info(cm.getActiveConfigurationText(CACHE_NAME)); return cache; {noformat}

use {noformat} Configuration config = new Configuration(); config.diskStore(new DiskStoreConfiguration().path(“D:\Temp\b0”));

ccf = new CacheConfiguration(); ccf.persistence(new PersistenceConfiguration().strategy(Strategy.LOCALRESTARTABLE).synchronousWrites(false)); ccf.setName(CACHE_NAME); ccf.setMaxEntriesLocalHeap(MAX_ELEMENTS_IN_MEMORY); ccf.setOverflowToOffHeap(Boolean.TRUE); ccf.setMaxMemoryOffHeap(MAX_MEMORY_OFF_HEAP); Searchable searchable = new Searchable(); for(String fieldName : fieldNames) {
searchable.addSearchAttribute(new SearchAttribute().name(fieldName)); } ccf.addSearchable(searchable); config.addCache(ccf);

cm = CacheManager.create(config); Cache cache = cm.getCache(CACHE_NAME); logger.info(“Printing Ehchache configuration:”); logger.info(cm.getActiveConfigurationText(CACHE_NAME)); return cache; {noformat}

Note I’ve also explicitly specified the diskStore path for the persistence data to the directory you were using in the SearchTest instead of relying on the java.io.tmpdir.

Yi Zhang 2012-12-17

Actually under further inspection, the workaround shown above may not be safe. There is still a potential problem with recovering the indices. A “dirty” shutdown as you had before which recreates the indices should work, however we recognize that this is slower. Please continue to follow this jira as we work on a proper fix.

li jie 2012-12-19

Okay. Using the workaround is works. The 3.7.2 really improve the recovery time when If I shutdown cm cleanly, it is much faster than 3.7.1. This is my testing logs: 3.7.1 load cost = 139809 ms recover cost = 91795 ms, cache.size = 200000 3.7.2 load cost = 152784 ms recover cost = 17761 ms, cache.size = 200000

Thanks.

Yi Zhang 2013-01-31

Will revisit for 4.0

Yi Zhang 2013-02-20

Re-running the test case against 3.7.4 and trunk 1.4M entries w/ 100 search indices Load time ~1100 seconds for both, recovery time about 15 seconds, dirty/full recovery about 300 seconds This seems to be on par with what is acceptable to the user and better than the near 1:1 load:recovery time originally noted so closing this for now. Other jira will track the FRS-search recovery issues

li jie 2013-02-28

Thanks. When will the go version 4.0 release?

li jie 2013-02-28

Thanks. When will the go version 4.0 release?

Yi Zhang 2013-03-08

The 4.0.0 release is live on the website now.

li jie 2013-03-28

Hi I have download the 4.0.0 and try it,but found the performance not improved. Here is my test results: 4.0 Not do “cache.getCacheManager().shutdown();” Load data costs = 95500 ms, cache.getSize() = 100000 Recover data costs = 57564 ms, cache.getSize() = 99799 ## also slowly

Do "cache.getCacheManager().shutdown();"
Load data costs = 91055 ms, cache.getSize() = 100000
Recover data costs = 12919 ms, cache.getSize() = 100000

3.7.2
Not do "cache.getCacheManager().shutdown();"
Load data costs = 91119 ms, cache.getSize() = 100000
Recover data costs = 54949 ms, cache.getSize() = 99812  ##slowly

Do "cache.getCacheManager().shutdown();"
Load data costs = 87103 ms, cache.getSize() = 100000
Recover data costs = 16622 ms, cache.getSize() = 100000	

[Code snippet] long st = System.currentTimeMillis(); Cache cache = getBigmCache(); if (cache.getSize() == 0) { loadData(cache, dataPath, fileCount); System.out.println(“Load data costs = “ + (System.currentTimeMillis() - st) + “ ms, cache.getSize() = “ + cache.getSize()); } else { System.out.println(“Recover data costs = “ + (System.currentTimeMillis() - st) + “ ms, cache.getSize() = “ + cache.getSize()); } // cache.getCacheManager().shutdown(); [/Code snippet]

Yi Zhang 2013-04-01

Hi, 4.0 contains several stability and recovery fixes, but as you note, does not significantly affect the recovery time. Faster recovery of search indices is still something we’re actively working on. My earlier comment was just to verify that 4.0 did not contain a regression compared to 3.7.