Hi Sergio,
I've done a quick analysis of the time spent on each defrag under the new method of defragging every x inserts (the default being 1000000). While this is nice and predictable (we see a defrag event on average every 60 minutes), the time spent in each defrag tends to increase with each subsequent run. This is not practical in a production environment: within a week we would be dropping significant numbers of log records every time a defrag runs:
./defragtime.py
1: 26.0 seconds
2: 35.0 seconds
3: 35.0 seconds
4: 40.0 seconds
5: 47.0 seconds
6: 53.0 seconds
7: 62.0 seconds
8: 54.0 seconds
9: 103.0 seconds
10: 170.0 seconds
11: 238.0 seconds
12: 260.0 seconds
13: 287.0 seconds
14: 348.0 seconds
15: 344.0 seconds
16: 441.0 seconds
17: 387.0 seconds
18: 407.0 seconds
19: 425.0 seconds
20: 480.0 seconds
21: 522.0 seconds
22: 540.0 seconds
23: 558.0 seconds
24: 561.0 seconds
25: 580.0 seconds
26: 590.0 seconds
27: 611.0 seconds
28: 632.0 seconds
29: 646.0 seconds
30: 653.0 seconds
31: 681.0 seconds
32: 672.0 seconds
33: 699.0 seconds
34: 727.0 seconds
35: 733.0 seconds
36: 905.0 seconds
37: 778.0 seconds
38: 872.0 seconds
39: 859.0 seconds
40: 877.0 seconds
41: 876.0 seconds
42: 915.0 seconds
43: 940.0 seconds
44: 945.0 seconds
45: 960.0 seconds
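
For reference, defragtime.py just measures the gap between each defrag's start and finish as recorded in our server log. A minimal sketch of that measurement is below; note the timestamp format and the "defrag started"/"defrag finished" message text are placeholders I've assumed here, not HSQLDB's actual log output.

    #!/usr/bin/env python
    # Minimal sketch of the measurement defragtime.py makes.
    # ASSUMPTION: the log contains lines like
    #   2011-05-10 09:14:02 defrag started
    #   2011-05-10 09:14:28 defrag finished
    # The timestamp format and message text are placeholders.
    import re
    from datetime import datetime

    PATTERN = re.compile(
        r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) defrag (started|finished)')

    def defrag_durations(path):
        """Yield the duration in seconds of each completed defrag."""
        start = None
        with open(path) as log:
            for line in log:
                m = PATTERN.match(line)
                if not m:
                    continue
                ts = datetime.strptime(m.group(1), '%Y-%m-%d %H:%M:%S')
                if m.group(2) == 'started':
                    start = ts
                elif start is not None:
                    yield (ts - start).total_seconds()
                    start = None

    if __name__ == '__main__':
        for n, secs in enumerate(defrag_durations('server.log'), 1):
            print('%d: %.1f seconds' % (n, secs))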
This is after running from a fresh database for 2 days. While increasing the number of inserts between defrags would cut down on their frequency, each defrag would then take longer, and the durations would still grow over time. This looks to be a limitation of the defrag implementation in the HSQLDB backend. Have you considered adapting Nimrod for other storage engines, such as Cassandra?
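
To put the overhead in perspective, a rough back-of-envelope using the figures above (the 60-minute interval is the observed average, and 960 seconds is the most recent duration):

    # Fraction of wall time spent defragging after 2 days, using the
    # observed ~60-minute interval and the duration of defrag 45 above.
    interval = 60 * 60            # average seconds between defrag events
    duration = 960.0              # seconds spent in the latest defrag
    print('%.0f%% of wall time spent defragging' % (100 * duration / interval))
    # -> 27% of wall time spent defragging

and that fraction is still climbing.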
Regards,
Adam Horwich