Questions regarding Krati performance

116 views
Skip to first unread message

mhgrove

unread,
Feb 21, 2011, 8:01:57 AM2/21/11
to Krati
I am still trying to use Krati and I'm having some issues with its
performance. I've switched uses from my earlier post [1], and I'm
trying to just use it as a persistent key-value store, which I feel is
more directly in its core use case.

What I've noticed is that Krati's performance is ok up to some
threshold that I can't quite put my finger on, and after that,
performance degrades rapidly.

I have a simple test program that iterates over a data set and adds
key-value pairs to Krati. Every million cycles, I print out some
basic stats. The time between each cycle is very consistent for an
initial period. Eventually, Krati seems to hit a wall, and
performance quickly degrades, slowing to about 1/4 the speed it was
running at, and with much more variance in running times. Below is a
simple chart w/ the number of seconds between processing cycles and
the number reported by getLoadCount of the store. For a number of
cycles previous to the data below, I was consistently seeing times
between 450 and 500, but then the spike below occurs before leveling
out at about 1500s per cycle.

Two things I noticed sitting at the computer, my disk goes crazy at
the same time that the performance degradation hits -- the disk sounds
like its thrashing, though iostat does not show any noticeable dip in
the throughput to disk, and also, after *hours* of no debug messages,
I suddenly start getting debug output -- which probably does not help
the performance degradation.

I let my test harness run over the weekend hooked up to the Netbeans
profiler. It shows that 30.1% of the total time was spent in
krati.core.segment.ChannelSegment.read(int, byte[]). No other method
was over 3.3%. It says that read was called 18.6M times, totaling
18.2M ms, basically just under 1ms per read. I realize that the
profiling adds non trivial overhead to an application, but that's
*significantly* slower than what I would expect given the numbers
reported on your website.

To contrast, the other Krati method listed as a top hotspot is
krati.core.segment.ChannelSegment.append(byte[], int, int) which is
called 21.9M times totalling just 416k ms. I assume this is the main
write method for krati and it's order of magnitude faster than reading
from the store.

I am initializing the store as such:
return new DynamicDataStore(theDirectory, new
ChannelSegmentFactory());

I also tried this configuration:
return new DynamicDataStore(theDirectory, 0, 5000000, 5, 256, new
ChannelSegmentFactory(), .5, .75, new FnvHashFunction());

Hoping that we would sync to disk fewer times and get better
performance, but it did not seem to affect the results.

I am hoping that someone can help shed some light on why Krati seems
to be hitting a wall, what's going on, is there anything that can be
done to work around it? Why do reads seem massively slower than
writes according to the profiling information? And where is the
performance mismatch overall?

I'm running these tests on your ordinary run of the mill iMac, gave
the test harness 12G of RAM even though it only needed probably 3 or
4. 2.8Ghz processors, Java 1.6.

Thanks.

Mike

(seconds b/w cycles in seconds - number reported by getLoadCount)
473 - 20579692
573 - 20908800
782 - 21472796
761 - 21984972
843 - 22536690
1014 - 23070578
1512 - 23631670
1326 - 24121331
1693 - 24706459
1813 - 25269074
1554 - 25821772
1381 - 26333938
1238 - 26905789
1436 - 27452486

[1] http://groups.google.com/group/krati/browse_thread/thread/187679075dba41a4

Jingwei

unread,
Feb 21, 2011, 6:22:53 PM2/21/11
to Krati
Hi Michael,

First, I am sorry for not getting back to your previous use case last
week as I was complete occupied by another project.

In your new use case you, the way you construct DynamicDataStore won't
give you good read/write throughput performance. The reason is Krati
uses linear hashing to grow hashtable as the number of keys grows. The
initial hastable capacity is critical to the read/write performance to
Krati DynamicDataStore.

The two constructors you used have initLevel set to 0. That means the
initial capacity of DynamicDataStore is 64K = 2^0 * 64K. As the number
of keys grows, krati performs linear hashing by constantly moving data
from segment to segments. What you ended up with is to write the same
data to dist multiple times rather than once. The Krati compactor
reads data from disk crazily during rehashing.

Try to construct DynamicDataStore with initLevel=10, which gives you
the initial capacity of 2^10 * 64K = 2^6 * 1M = 64 Million keys. You
should have reasonably good performance at the scale of 30,000,000
unique keys. If you have more keys, increase the initLevel
accordingly.

new DynamicDataStore(theDirectory, 10, 5000000, 5, 256, new
ChannelSegmentFactory(), .5, .75, new FnvHashFunction());

However, I would suggest you to use IndexedDataStore. You can Krati
from github and run the following command to see the read/write
performance.

mvn test -Dtest=TestIndexedStore -Dkrati.test.jvm.args="-Xloggc:target/
logs/krati.gc -XX:+PrintGCDetails" -Dkrati.test.initLevel=8 -
Dkrati.test.keyCount=5000000 -Dkrati.test.segFileSizeMB=256 -
Dkrati.test.runTimeSeconds=600

This test uses MemorySegment to store keys and values and internally
creates an IndexedDataStore and the following. The test will bootstrap
the test with 5,000,000 keys and then perform random writes only,
random reads/writes, and random checks/writes for 600 seconds. You can
check out target/logs/stats.log to see the performance stats.

new IndexedDataStore(
Files.createTempDir(),
10000,
5,
8, 32, new MemorySegmentFactory(),
8, 256, new MemorySegmentFactory()).

If you want to test more data and more unique keys (say 50,000,000),
try the following

mvn clean
mvn test -Dtest=TestIndexedStoreWriteBuffer -Dkrati.test.jvm.args="-
Xloggc:target/logs/krati.gc -XX:+PrintGCDetails" -
Dkrati.test.initLevel=10 -Dkrati.test.keyCount=50000000 -
Dkrati.test.segFileSizeMB=256 -Dkrati.test.runTimeSeconds=3600

This test internally creates an IndexedDataStore as the following. It
bootstraps the store with 50,000,000 keys and then run the test for 1
hour.

new IndexedDataStore(
Files.createTempDir(),
10000,
5,
10, 32, new MemorySegmentFactory(),
10, 256, new MemorySegmentFactory()).

I will collect some performance data tonight and get back to you.

Thanks.

-jingwei
> [1]http://groups.google.com/group/krati/browse_thread/thread/187679075db...

Jingwei

unread,
Feb 21, 2011, 6:26:37 PM2/21/11
to Krati

FYI, the 5000000 updateBatchSize is too much. Anywhere between 1000 to
10000 should be good enough.

new DynamicDataStore(theDirectory, 10, 10000, 5, 256, new
ChannelSegmentFactory(), .5, .75, new FnvHashFunction());

On Feb 21, 5:01 am, mhgrove <m...@clarkparsia.com> wrote:
> [1]http://groups.google.com/group/krati/browse_thread/thread/187679075db...

Jingwei

unread,
Feb 22, 2011, 12:09:02 AM2/22/11
to Krati
Hi Michael,

Here are the performance results I collected this afternoon.

mvn test -Dtest=TestIndexedStore -Dkrati.test.jvm.args="-Xloggc:target/
logs/krati.gc -XX:+PrintGCDetails" -Dkrati.test.initLevel=8 -
Dkrati.test.keyCount=5000000 -Dkrati.test.segFileSizeMB=256 -
Dkrati.test.runTimeSeconds=3600

>>> ========================================================
>>> BEGIN TestIndexedStore
>>> populate
elapsedTime=246752 ms
writeCount=5000000 rate=20.26 per ms
>>> read only
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
elapsedTime=10001 ms
readCount[0]=2267934 rate=226.77 per ms
readCount[1]=2311932 rate=231.17 per ms
readCount[2]=2259143 rate=225.89 per ms
readCount[3]=2319399 rate=231.92 per ms
Total Read Rate=915.75 per ms
>>> write only
Writer 0 started
writeCount=299999
writeCount=280000
writeCount=280000
writeCount=190000
writeCount=221859
writeCount=261161
writeCount=261483
writeCount=260948
writeCount=274549
writeCount=248470
writeCount=261513
writeCount=240017
writeCount=199406
writeCount=205789
writeCount=212609
writeCount=274007
writeCount=217233
writeCount=260119
writeCount=230837
writeCount=80000
writeCount=270000
writeCount=288052
writeCount=177043
writeCount=274905
writeCount=262364
writeCount=257492
writeCount=240144
writeCount=320000
writeCount=270000
writeCount=160000
writeCount=239823
writeCount=244811
writeCount=231413
writeCount=243953
writeCount=280000
writeCount=160000
writeCount=210686
writeCount=249314
writeCount=263891
writeCount=304516
writeCount=290379
writeCount=181853
writeCount=259361
writeCount=290000
writeCount=160000
writeCount=290000
writeCount=310000
writeCount=295421
writeCount=214579
writeCount=266394
writeCount=283606
writeCount=310000
writeCount=170000
writeCount=300000
writeCount=306528
writeCount=233472
writeCount=219026
writeCount=270974
writeCount=310000
writeCount=246758
writeCount=273242
writeCount=295773
writeCount=260863
writeCount=230518
writeCount=262846
writeCount=290000
writeCount=229259
writeCount=230741
writeCount=270000
writeCount=262307
writeCount=232701
writeCount=304992
writeCount=300000
writeCount=243234
writeCount=216766
writeCount=295517
writeCount=304483
writeCount=240279
writeCount=261675
writeCount=288046
writeCount=280000
writeCount=200000
writeCount=270000
writeCount=317545
writeCount=284806
writeCount=253247
writeCount=244402
writeCount=290000
writeCount=310000
writeCount=189156
writeCount=260844
writeCount=280000
writeCount=240000
writeCount=230000
writeCount=290000
writeCount=297755
writeCount=268353
writeCount=242979
writeCount=290714
writeCount=270199
writeCount=212713
writeCount=277287
writeCount=289697
writeCount=297528
writeCount=188927
writeCount=263127
writeCount=260019
writeCount=250702
writeCount=260000
writeCount=310000
writeCount=263491
writeCount=196509
writeCount=309417
writeCount=299097
writeCount=223476
writeCount=228353
writeCount=279657
writeCount=260000
writeCount=200000
writeCount=300000
elapsedTime=1206652 ms
writeCount[0]=30690000 rate=25.43 per ms
Total Write Rate=25.43 per ms
>>> validate
Quit: running time is over 60000 ms
Validated 3531444/5000000 in 60001 ms
OK
>>> read & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=139999 read=6335588
write=173322 read=8232138
write=146678 read=7436175
write=220000 read=8082597
write=120000 read=7571583
write=230000 read=8064708
write=110000 read=7617817
write=226423 read=8108744
write=123577 read=7537376
write=230000 read=8191044
write=110000 read=7488390
write=197370 read=8345290
write=142630 read=7356578
write=211953 read=8278819
write=138047 read=7409775
write=140000 read=8376531
write=170000 read=7367370
write=124395 read=8150683
write=222477 read=7542464
write=133128 read=8242937
write=208860 read=7448167
write=141140 read=8082238
write=220000 read=7590470
write=141913 read=8406680
write=188087 read=7292202
write=140000 read=8273652
write=154964 read=7469064
write=185036 read=7962731
write=180000 read=7700760
write=180000 read=8021117
write=138240 read=7683501
write=201760 read=8088715
write=117869 read=7611752
write=222131 read=8049027
write=110000 read=7623323
write=230000 read=8086585
write=110000 read=7599774
write=198852 read=8247870
write=141148 read=7457392
write=186916 read=8365178
write=163084 read=7322300
write=196101 read=8254397
write=145087 read=7425291
write=158812 read=8128397
write=180000 read=7577147
write=130943 read=8193001
write=211589 read=7510349
write=140912 read=8159017
write=216556 read=7489935
write=120000 read=8097675
write=210000 read=7615326
write=150000 read=8362542
write=198773 read=7292786
write=131227 read=8002372
write=150000 read=7747755
write=172411 read=8128659
write=120891 read=7619038
write=206698 read=8111171
write=130000 read=7589288
write=230000 read=8332814
write=118906 read=7358308
write=227423 read=8350933
write=113671 read=7344156
write=209894 read=8367891
write=120106 read=7333082
write=187736 read=8231574
write=152264 read=7438855
write=196506 read=8073386
write=163494 read=7555889
write=156609 read=8330240
write=173594 read=7377330
write=123517 read=8243703
write=216280 read=7477269
write=126540 read=8128438
write=223460 read=7577273
write=117383 read=8085430
write=182617 read=7662228
write=162645 read=8531583
write=149195 read=7161558
write=198160 read=8589872
write=157538 read=7077855
write=181718 read=8543153
write=110744 read=7228329
write=220000 read=8441069
write=95480 read=7314994
write=214520 read=8464773
write=130000 read=7239678
write=201158 read=8117997
write=138842 read=7589975
write=180563 read=8115470
write=169437 read=7567151
write=198598 read=8128803
write=159505 read=7512345
write=132164 read=8321637
write=187382 read=7391981
write=122351 read=8205421
write=210000 read=7488648
write=146390 read=8049055
write=222587 read=7640004
write=141023 read=8545153
write=215141 read=7130610
write=134859 read=8536757
write=190000 read=7178855
write=140000 read=8262735
write=165439 read=7459730
write=174561 read=8245767
write=175008 read=7405539
write=183899 read=8131016
write=139160 read=7573382
write=191933 read=7943085
write=120000 read=7776086
write=210000 read=8120142
write=110000 read=7575233
write=213563 read=8182271
write=124999 read=7539808
write=181809 read=8179299
write=149629 read=7522652
write=186737 read=8072598
write=173264 read=7596244
write=197516 read=8279740
elapsedTime=1237831 ms
writeCount[0]=20087526 rate=16.23 per ms
Total Write Rate=16.23 per ms
readCount[0]=235737800 rate=190.44
readCount[1]=235944823 rate=190.61
readCount[2]=234264235 rate=189.25
readCount[3]=234723320 rate=189.62
Total Read Rate=759.92 per ms
writer latency stats:
1,1
5,0
10,1988546
50,18068975
100,1343
200,304
300,3391
400,2072
500,31
600,7
700,3
800,1
900,0
1000,0
1+ms,22852
reader latency stats:
1,630
5,235498069
10,194982
50,43684
100,8
200,0
300,1
400,0
500,0
600,0
700,0
800,0
900,0
1000,0
1+ms,426


mvn test -Dtest=TestIndexedStoreWriteBuffer -Dkrati.test.jvm.args="-
Xloggc:target/logs/krati.gc -XX:+PrintGCDetails" -
Dkrati.test.initLevel=8 -Dkrati.test.keyCount=5000000 -
Dkrati.test.segFileSizeMB=256 -Dkrati.test.runTimeSeconds=3600

>>> ========================================================
>>> BEGIN TestIndexedStoreWriteBuffer
>>> populate
elapsedTime=243532 ms
writeCount=5000000 rate=20.53 per ms
>>> read only
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
elapsedTime=10000 ms
readCount[0]=698474 rate=69.85 per ms
readCount[1]=730364 rate=73.04 per ms
readCount[2]=699966 rate=70.0 per ms
readCount[3]=745322 rate=74.53 per ms
Total Read Rate=287.42 per ms
>>> write only
Writer 0 started
writeCount=274710
writeCount=285289
writeCount=266749
writeCount=263251
writeCount=198676
writeCount=244864
writeCount=261290
writeCount=285170
writeCount=299796
writeCount=257570
writeCount=262634
writeCount=252144
writeCount=285393
writeCount=242587
writeCount=254110
writeCount=258529
writeCount=283425
writeCount=203812
writeCount=130000
writeCount=225895
writeCount=289962
writeCount=294143
writeCount=357490
writeCount=365417
writeCount=297093
writeCount=296648
writeCount=323352
writeCount=310000
writeCount=300000
writeCount=304610
writeCount=345390
writeCount=370000
writeCount=360000
writeCount=320000
writeCount=357162
writeCount=333159
writeCount=309679
writeCount=340000
writeCount=310000
writeCount=330076
writeCount=349924
writeCount=350000
writeCount=350000
writeCount=340000
writeCount=336757
writeCount=361346
writeCount=351897
writeCount=330000
writeCount=290000
writeCount=361512
writeCount=344468
writeCount=336946
writeCount=266400
writeCount=300674
writeCount=338773
writeCount=299393
writeCount=341834
writeCount=312352
writeCount=315352
writeCount=356696
writeCount=335367
writeCount=350233
writeCount=306386
writeCount=327404
writeCount=356210
writeCount=341045
writeCount=334632
writeCount=314323
writeCount=343517
writeCount=356483
writeCount=350000
writeCount=320000
writeCount=363170
writeCount=306830
writeCount=317263
writeCount=328974
writeCount=309528
writeCount=334235
writeCount=354095
writeCount=335905
writeCount=349108
writeCount=334256
writeCount=320346
writeCount=339099
writeCount=302912
writeCount=294279
writeCount=300000
writeCount=340773
writeCount=355156
writeCount=344071
writeCount=359777
writeCount=340223
writeCount=340000
writeCount=330000
writeCount=340000
writeCount=348883
writeCount=321117
writeCount=330000
writeCount=346542
writeCount=315753
writeCount=358438
writeCount=321090
writeCount=305977
writeCount=336212
writeCount=334670
writeCount=331318
writeCount=340000
writeCount=356471
writeCount=343529
writeCount=341822
writeCount=348178
writeCount=325979
writeCount=314021
writeCount=326958
writeCount=353027
writeCount=325103
writeCount=321701
writeCount=346143
writeCount=337068
writeCount=352042
elapsedTime=1200047 ms
writeCount[0]=38242046 rate=31.87 per ms
Total Write Rate=31.87 per ms
>>> validate
Quit: running time is over 60000 ms
Validated 2635465/5000000 in 60001 ms
OK
>>> read & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=357953 read=2993861
write=322490 read=3070788
write=315159 read=3133293
write=340985 read=3072041
write=291857 read=3094127
write=333746 read=3050915
write=335763 read=3068028
write=329560 read=3040627
write=310440 read=3049114
write=334184 read=3044211
write=345816 read=3080145
write=340000 read=3056979
write=330000 read=3069226
write=341243 read=3065353
write=358368 read=3046422
write=320389 read=3054541
write=280000 read=3090466
write=340000 read=3046335
write=316455 read=3047652
write=325248 read=3053164
write=318297 read=3020446
write=320701 read=3041663
write=323071 read=3043841
write=344826 read=3017294
write=320383 read=3056497
write=337157 read=3047072
write=329618 read=3025958
write=312194 read=3031471
write=342050 read=3056694
write=335024 read=3003783
write=308400 read=3069982
write=336576 read=3046329
write=340000 read=3117194
write=333839 read=3033357
write=341986 read=3002913
write=275353 read=3017138
write=191773 read=3012196
write=351175 read=2979102
write=331672 read=2996747
write=314235 read=3027706
write=319967 read=3008475
write=308307 read=2973702
write=331693 read=3009915
write=280000 read=3014422
write=340000 read=3075458
write=344032 read=3062360
write=305968 read=3009621
write=330000 read=3002557
write=350000 read=3006787
write=316835 read=3050911
write=353165 read=3061053
write=338602 read=3054002
write=317665 read=3046989
write=324209 read=3010672
write=319524 read=2978857
write=270000 read=3004153
write=320000 read=3062174
write=344124 read=3067112
write=315876 read=3092513
write=329116 read=3075129
write=319915 read=3055547
write=308004 read=3084007
write=332965 read=3062553
write=294735 read=3080906
write=325265 read=3054750
write=340000 read=3092615
write=320000 read=3021435
write=322437 read=3074929
write=337563 read=3095561
write=351901 read=3069732
write=349874 read=3063933
write=326690 read=3078051
write=321535 read=3076340
write=298970 read=3063347
write=344082 read=3010618
write=306948 read=3040804
write=315109 read=3086977
write=334891 read=3027912
write=320361 read=3072445
write=327310 read=3057694
write=304014 read=3028432
write=214235 read=2881194
write=274080 read=2898051
write=284129 read=3044214
write=334014 read=2985236
write=175932 read=2970931
write=217925 read=3022798
write=302685 read=3098679
write=288878 read=3071662
write=266437 read=3026107
write=285894 read=3067490
write=314106 read=3056598
write=350000 read=3020122
write=130956 read=3006407
write=275802 read=3064723
write=343242 read=3028718
write=328714 read=3042647
write=341286 read=3032545
write=293616 read=3076258
write=266384 read=3044147
write=302611 read=3042294
write=322869 read=3010525
write=343314 read=3040875
write=341206 read=3020826
write=333186 read=3022644
write=337611 read=3086355
write=329203 read=3092772
write=320000 read=3041712
write=300000 read=3008151
write=311683 read=2999169
write=330744 read=2982712
write=347573 read=2994600
write=334001 read=2995659
write=345999 read=2973246
write=320000 read=3010031
write=310920 read=3042029
write=329080 read=2947946
write=302276 read=2979660
write=271339 read=2979921
write=306385 read=3000853
elapsedTime=1200257 ms
writeCount[0]=37967954 rate=31.63 per ms
Total Write Rate=31.63 per ms
readCount[0]=90773485 rate=75.63
readCount[1]=90851598 rate=75.69
readCount[2]=91073955 rate=75.88
readCount[3]=91851611 rate=76.53
Total Read Rate=303.73 per ms
writer latency stats:
1,1
5,0
10,4347827
50,33569561
100,1413
200,81
300,10096
400,611
500,10
600,9
700,13
800,19
900,54
1000,71
1+ms,38188
reader latency stats:
1,0
5,3900327
10,3759054
50,83027440
100,63935
200,21547
300,516
400,80
500,4
600,0
700,0
800,0
900,1
1000,0
1+ms,581


mvn test -Dtest=TestIndexedStoreWriteBuffer -Dkrati.test.jvm.args="-
Xloggc:target/logs/krati.gc -XX:+PrintGCDetails" -
Dkrati.test.initLevel=11 -Dkrati.test.keyCount=50000000 -
Dkrati.test.segFileSizeMB=256 -Dkrati.test.runTimeSeconds=3600

>>> ========================================================
>>> BEGIN TestIndexedStoreWriteBuffer
>>> populate
elapsedTime=2468708 ms
writeCount=50000000 rate=20.25 per ms
>>> read only
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
elapsedTime=10038 ms
readCount[0]=230 rate=0.02 per ms
readCount[1]=230 rate=0.02 per ms
readCount[2]=240 rate=0.02 per ms
readCount[3]=236 rate=0.02 per ms
Total Read Rate=0.08 per ms
>>> write only
Writer 0 started
writeCount=247376
writeCount=257156
writeCount=302960
writeCount=291357
writeCount=249964
writeCount=261186
writeCount=254722
writeCount=271419
writeCount=256619
writeCount=277240
writeCount=310000
writeCount=284330
writeCount=260543
writeCount=268642
writeCount=266485
writeCount=254677
writeCount=292487
writeCount=271046
writeCount=260963
writeCount=261564
writeCount=261222
writeCount=261279
writeCount=296762
writeCount=230000
writeCount=80000
writeCount=100000
writeCount=150000
writeCount=160000
writeCount=243243
writeCount=146757
writeCount=184633
writeCount=208677
writeCount=129989
writeCount=247140
writeCount=99561
writeCount=210000
writeCount=145272
writeCount=234728
writeCount=128707
writeCount=281293
writeCount=89218
writeCount=224057
writeCount=152653
writeCount=51468
writeCount=79984
writeCount=189668
writeCount=65155
writeCount=27797
writeCount=180000
writeCount=1465
writeCount=59459
writeCount=129076
writeCount=205853
writeCount=64147
writeCount=209069
writeCount=100931
writeCount=96944
writeCount=192071
writeCount=160285
writeCount=224149
writeCount=146551
writeCount=235913
writeCount=126243
writeCount=227844
writeCount=107055
writeCount=232945
writeCount=128298
writeCount=233522
writeCount=111755
writeCount=260967
writeCount=135224
writeCount=230180
writeCount=100054
writeCount=260000
writeCount=77342
writeCount=152658
writeCount=120593
writeCount=82592
writeCount=205216
writeCount=150327
writeCount=211897
writeCount=146404
writeCount=222971
writeCount=133717
writeCount=206283
writeCount=152478
writeCount=217522
writeCount=140828
writeCount=169172
writeCount=189229
writeCount=82179
writeCount=230367
writeCount=127775
writeCount=201170
writeCount=108444
writeCount=248991
writeCount=81772
writeCount=271024
writeCount=84737
writeCount=213891
writeCount=141666
writeCount=148755
writeCount=180000
writeCount=115608
writeCount=234392
writeCount=120097
writeCount=238276
writeCount=127238
writeCount=234389
writeCount=130525
writeCount=239475
writeCount=119644
writeCount=228855
writeCount=123076
writeCount=203171
writeCount=126070
writeCount=231159
writeCount=124712
writeCount=234788
writeCount=88525
elapsedTime=1203491 ms
writeCount[0]=21860000 rate=18.16 per ms
Total Write Rate=18.16 per ms
>>> validate
Quit: running time is over 60000 ms
Validated 17841/50000000 in 60015 ms
OK
>>> read & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=49999 read=2791
write=50000 read=2660
write=40000 read=2767
write=40000 read=2898
write=40000 read=2712
write=40000 read=3117
write=13601 read=2214
write=16399 read=2144
write=40000 read=2502
write=18806 read=1965
write=3701 read=1179
write=17493 read=2336
write=0 read=2655
write=30000 read=2189
write=30000 read=2513
write=40000 read=2479
write=20000 read=2525
write=20000 read=2568
write=30000 read=2723
write=20000 read=2617
write=40000 read=2662
write=20000 read=2911
write=10000 read=2516
write=22011 read=1471
write=47989 read=2534
write=20000 read=1583
write=50000 read=2307
write=30000 read=3003
write=40000 read=2927
write=40000 read=3067
write=40000 read=2809
write=30000 read=2693
write=40000 read=2318
write=10000 read=1762
write=30000 read=2434
write=35780 read=2261
write=14220 read=1561
write=50000 read=2610
write=40000 read=2640
write=40000 read=2992
write=40000 read=2981
write=50000 read=2900
write=40000 read=2857
write=15167 read=2020
write=4833 read=2622
write=50000 read=2403
write=14586 read=2164
write=25414 read=2187
write=40000 read=2568
write=40000 read=2815
write=40000 read=2883
write=30000 read=2873
write=40000 read=2889
write=50000 read=3006
write=22868 read=2178
write=7436 read=2374
write=29696 read=2672
write=30000 read=2330
write=10000 read=1826
write=30000 read=2633
write=50000 read=2560
write=40000 read=2830
write=50000 read=3010
write=30000 read=3070
write=40000 read=2995
write=30000 read=2881
write=10000 read=1617
write=30000 read=2508
write=42276 read=2020
write=27724 read=2523
write=40022 read=2725
write=49978 read=2984
write=50000 read=2900
write=30000 read=2941
write=30000 read=3130
write=40000 read=2962
write=14217 read=1819
write=35783 read=2490
write=30000 read=2495
write=30000 read=1557
write=30000 read=2584
write=50000 read=2926
write=40000 read=2877
write=40000 read=2945
write=50000 read=2978
write=40000 read=3099
write=15460 read=1716
write=34540 read=2644
write=39833 read=2843
write=7207 read=1639
write=32960 read=2506
write=50000 read=2690
write=40000 read=2780
write=40000 read=3023
write=30000 read=3014
write=30000 read=3224
write=40000 read=2767
write=14989 read=1421
write=35011 read=2162
write=40000 read=2429
write=8988 read=1601
write=31012 read=2495
write=40000 read=2612
write=50000 read=2911
write=40000 read=2589
write=30000 read=2799
write=40000 read=2968
write=40000 read=2762
write=6342 read=1138
write=43658 read=2487
write=30000 read=1637
write=30000 read=2184
write=40000 read=2294
write=40000 read=2872
write=40000 read=2891
write=50000 read=2752
write=40000 read=2879
write=40000 read=2752
write=9244 read=1667
write=40756 read=2329
elapsedTime=1201515 ms
writeCount[0]=3910000 rate=3.25 per ms
Total Write Rate=3.25 per ms
readCount[0]=75110 rate=0.06
readCount[1]=76097 rate=0.06
readCount[2]=75333 rate=0.06
readCount[3]=76163 rate=0.06
Total Read Rate=0.24 per ms
writer latency stats:
1,127
5,0
10,551106
50,3201678
100,39
200,50
300,684
400,4
500,3
600,0
700,5
800,6
900,6
1000,2
1+ms,156290
reader latency stats:
1,3
5,1591
10,6312
50,45090
100,10
200,0
300,1
400,1
500,0
600,0
700,0
800,0
900,0
1000,0
1+ms,22102

The last test has more than 50GB data values, which cannot fit into
either main memory (16GB) or OS page cache. Once your data is over
memory, read performance suffers big time assuming that read access
pattern is random. Basically, every read may require a disk seek.
Write throughput is better since an in-memory buffer is used for
buffering writes. That is how WriteBufferSegment is named.


On Feb 21, 5:01 am, mhgrove <m...@clarkparsia.com> wrote:
> [1]http://groups.google.com/group/krati/browse_thread/thread/187679075db...

Mike Grove

unread,
Feb 22, 2011, 10:02:32 AM2/22/11
to kr...@googlegroups.com
Hi.  Having the correct configuration parameters for DynamicDataStore helps some, but I still see the same basic behavior.  There's an order of magnitude slowdown, very consistently, and roughly the same size store.  Here's data I collected this morning:

The store configuration:
return new DynamicDataStore(theDirectory, 12, 10000, 5, 256, new ChannelSegmentFactory(), .5, .75, new FnvHashFunction());

For the first 28M iterations, these are the results:
curr: 28000000
keys: 15422673
Name  |    Count |  Total (ms)
====================================
write | 15422673 |     752958
read  | 83999997 |      254613

read avg:   3031ns      330 r/ms
write avg:  48821ns     20.5 w/ms

Write throughput is good, matches exactly what you publish on your site, but the read throughput is 1/3 of the expected 1000-2000 r/ms.

You'll see that there is a little over 15M keys in the store at this point, these numbers are very steady up to this level.  Beyond 15M, performance quickly degrades, here's the diff between cycle 33M and 34M where we have between 18.1M and 18.7M keys in the store

curr: 33000000
keys: 18189290
->
curr: 34000000
keys: 18696642

Name  |     Count |   Total (ms)
=====================================
write |    507352 |       32401
read  |   3000000 |       70194

read avg    23398ns     41.71 r/ms
write avg   63862ns     15.65 w/ms 

We've lost nearly an order of magnitude on the read throughput and 25% performance loss on the write throughput.

Between cycles 35 and 35M, with keys between 19.2 and 19.8M, the performance loss continues:

curr: 35000000
keys: 19274084
->
curr: 36000000
keys: 19824691

Name  |     Count |  Avg | Total (ms)
=====================================
write |    550607 |       44363
read  |   3000000 |       98997

read avg    32999ns     30.3 r/ms
write avg   80571ns     11.6 w/ms

Now we've lost nearly 50% of our write throughput from the start, and now over an order of magnitude slowdown from the original and 25% slower than the previous iteration.

This degradation continues and the application is dead in the water.

Changing the configuration to a higher initLevel:

return new DynamicDataStore(theDirectory, 12, 10000, 5, 256, new ChannelSegmentFactory(), .5, .75, new FnvHashFunction());

Yields pretty much the same story, except on a shorter timeline, the performance degradation begins at around 13M keys, although write performance does seem to degrade much less with this configuration.

curr: 23000000
keys: 12734798
Name  |    Count |   Total (ms)
====================================
write | 12734798 |      610921
read  | 68999997 |      196033

read avg    2841ns      351.98 r/ms
write avg   47972ns     20.84  w/ms

curr: 26000000
keys: 14371535
->
curr: 27000000
keys: 14923973

Name  |     Count |  Total (ms)
=====================================
write |    552438 |    30454
read  |   3000000 |       55000

read avg    18333ns     54.54 r/ms
write avg   55126ns     18.14 w/ms

curr: 31000000
keys: 17088687
->
curr: 32000000
keys: 17652147

Name  |     Count |   Total (ms)
=====================================
write |    563460 |       30398 
read  |   3000000 |       90512

read avg    30170ns     31.54 r/ms
write avg   53948ns     18.53 w/ms

I then tried IndexedDataStore, but at less than 1M keys, it was averaging about 1.5 writes per millisecond and 30 r/s which is entirely too slow.

I am going to check out the source and run your performance tests and compare the results to isolate issues with my disk, but I am wondering if you have any idea for why Krati seems to hit a wall in the ballpark of 15M keys with dramatically decreasing performance.

Is that roughly the level in which you're no longer adding new segments to the hash and lookup degrades towards O(n) as you add overflow pages to the segments and stop splitting?

Thanks.

Mike

P.S.  The following configuration yields an IO exception on both of my machines, stacktrace is copied as well.

return new DynamicDataStore(theDirectory, 12, 10000, 5, 256, new ChannelSegmentFactory(), .5, .75, new FnvHashFunction());
Caused by: java.io.IOException: Invalid argument
at java.io.RandomAccessFile.setLength(Native Method)
at krati.core.array.basic.ArrayFile.setArrayLength(ArrayFile.java:721)
at krati.core.array.basic.DynamicLongArray.expandCapacity(DynamicLongArray.java:152)
at krati.store.DynamicDataStore.<init>(DynamicDataStore.java:318)
... 3 more

Jingwei

unread,
Feb 22, 2011, 11:25:49 AM2/22/11
to Krati
Hi Michael,

A few simple questions:

1. You have 12G of RAM?
2. What is the average size of your value (I know your key is long)?
3. How many readers do you have?

Thanks.

-jingwei
> ...
>
> read more »

Mike Grove

unread,
Feb 22, 2011, 12:12:10 PM2/22/11
to kr...@googlegroups.com
On Tue, Feb 22, 2011 at 11:25 AM, Jingwei <jingw...@gmail.com> wrote:
Hi Michael,

A few simple questions:

1. You have 12G of RAM?

Yes, that's what I've been allocating to the test.
 
2. What is the average size of your value (I know your key is long)?

The values are strings, most will be a dozen or two characters long.  There are the occasional values which could be large blocks of text of perhaps a few KB, but those are not the norm.
 
3. How many readers do you have?

Just a single reader (and writer).  The test I'm running is reading in a data file, pulling out some relevant bits, and putting them into krati.  I expect somewhere in the ballpark of 25-30M unique keys for this particular test.  

The actual datasets will probably have keys in excess of 250M unique keys, however, once the initial data store is created, nearly every access will be a read.

I was able to reproduce your results for the 5M key tests from a krati checkout.  I am currently running one for 20M, but do not have any results yet.

Cheers,

Mike

Mike Grove

unread,
Feb 23, 2011, 9:07:01 AM2/23/11
to kr...@googlegroups.com
On Tue, Feb 22, 2011 at 12:09 AM, Jingwei <jingw...@gmail.com> wrote:
Hi Michael,

Here are the performance results I collected this afternoon.

mvn test -Dtest=TestIndexedStore -Dkrati.test.jvm.args="-Xloggc:target/
logs/krati.gc -XX:+PrintGCDetails" -Dkrati.test.initLevel=8 -
Dkrati.test.keyCount=5000000 -Dkrati.test.segFileSizeMB=256 -
Dkrati.test.runTimeSeconds=3600

fwiw, Running this test from the checkout but with a key count of 20M, initLevel of 11, and 8g of RAM to the JVM does not even get past the populate phase.

Cheers,

Mike 

Jingwei

unread,
Feb 23, 2011, 11:12:47 AM2/23/11
to Krati
Hi Michael,

It does sound really weird. Would you please show the first 10 lines
of target/logs/krati.log?
With 20M keys, it is better to have initLevel set to 9.

On Feb 23, 6:07 am, Mike Grove <m...@clarkparsia.com> wrote:

Mike Grove

unread,
Feb 23, 2011, 11:29:13 AM2/23/11
to kr...@googlegroups.com
On Wed, Feb 23, 2011 at 11:12 AM, Jingwei <jingw...@gmail.com> wrote:
Hi Michael,

It does sound really weird. Would you please show the first 10 lines
of target/logs/krati.log?
With 20M keys, it is better to have initLevel set to 9.

It never makes it past creating the store:

Michael-Groves-iMac:krati mhgrove$ more target/logs/stats.log 
>>> ========================================================
>>> BEGIN TestIndexedStore
>>> populate

That's all there is in the file.

Am I trying to use Krati at scales it is not designed for?

Thanks.

Mike

Jingwei

unread,
Feb 23, 2011, 11:52:13 AM2/23/11
to Krati
Hi Michael,

In Krati test, the avg key size is 30-40 bytes and value size is 1 KB.

With 20M keys, you have at least 20G data to load into memory when
TestIndexedStore is run. So it won't be able to do it.

The convention in krati test is like following

TestIndexedStore uses MemorySegment for storing values

TestIndexedStoreMapped uses MappedSegment for storing values

TestIndexedStoreChannel uses ChannelSegment for storing values

TestIndexedStoreWriteBuffer uses WriteBufferSegment for storing values

When dealing with at 20GB data with 8GB memory, I will choose
WriteBufferSegment or ChannelSegment.

So, you can try the following

mvn test -Dtest=TestIndexedStoreWriteBuffer -Dkrati.test.jvm.args="-
Xloggc:target/logs/krati.gc -XX:+PrintGCDetails" -
Dkrati.test.initLevel=9 -Dkrati.test.keyCount=20000000 -
Dkrati.test.segFileSizeMB=256 -Dkrati.test.runTimeSeconds=3600

Thanks.

-jingwei

On Feb 23, 6:07 am, Mike Grove <m...@clarkparsia.com> wrote:

Mike Grove

unread,
Feb 23, 2011, 2:28:33 PM2/23/11
to kr...@googlegroups.com
Ok, I ran TestIndexedStoreChannel with the configuration you suggested.

These are the snips from the log where it prints the read and write rates (full log is at the end of the email):

writeCount=20000000 rate=4.37 per ms
Total Write Rate=7.68 per ms
Total Write Rate=0.15 per ms

Total Read Rate=0.14 per ms
Total Read Rate=0.16 per ms 
Total Read Rate=0.16 per ms

As you can see, these are dreadfully slow.  I am able to get the same basic numbers on two different machines and this replicates what I'm seeing with my test application using the indexed store.

Cheers,

Mike

Here's the full log:

>>> ========================================================
>>> BEGIN TestIndexedStoreChannel
>>> populate
elapsedTime=4576668 ms
writeCount=20000000 rate=4.37 per ms
>>> read only
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
elapsedTime=10057 ms
readCount[0]=377 rate=0.04 per ms
readCount[1]=384 rate=0.04 per ms
readCount[2]=336 rate=0.03 per ms
readCount[3]=329 rate=0.03 per ms
Total Read Rate=0.14 per ms
>>> write only
Writer 0 started
writeCount=59999
writeCount=80000
writeCount=89756
writeCount=70244
writeCount=83174
writeCount=56826
writeCount=90000
writeCount=60000
writeCount=70000
writeCount=66739
writeCount=93261
writeCount=60000
writeCount=89694
writeCount=90306
writeCount=60000
writeCount=90000
writeCount=90000
writeCount=89651
writeCount=60349
writeCount=90000
elapsedTime=200584 ms
writeCount[0]=1540000 rate=7.68 per ms
Total Write Rate=7.68 per ms
>>> validate
Quit: running time is over 60000 ms
Validated 5281724/20000000 in 60001 ms
OK
>>> read & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=9999 read=1802
write=0 read=1690
write=0 read=1726
write=0 read=1743
write=0 read=1652
write=0 read=1772
write=0 read=1650
write=0 read=1701
write=0 read=1813
write=0 read=1792
write=0 read=1745
write=0 read=1743
write=0 read=1688
write=0 read=1735
write=0 read=1661
write=0 read=1694
write=0 read=1697
write=0 read=1736
write=0 read=1791
write=0 read=1703
elapsedTime=200466 ms
writeCount[0]=10000 rate=0.05 per ms
Total Write Rate=0.05 per ms
readCount[0]=8655 rate=0.04
readCount[1]=8742 rate=0.04
readCount[2]=8764 rate=0.04
readCount[3]=8377 rate=0.04
Total Read Rate=0.16 per ms
writer latency stats:
   1,1
   5,2833
  10,6550
  50,604
 100,7
 200,5
 300,0
 400,0
 500,0
 600,0
 700,0
 800,0
 900,0
1000,0
1+ms,0
reader latency stats:
   1,0
   5,8
  10,1198
  50,3209
 100,11
 200,1
 300,1
 400,0
 500,0
 600,0
 700,0
 800,0
 900,0
1000,0
1+ms,4227
>>> validate
Quit: running time is over 60000 ms
Validated 7171684/20000000 in 60001 ms
OK
>>> check & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=9999 read=1682
write=0 read=1639
write=0 read=1760
write=10000 read=1718
write=0 read=1711
write=0 read=1707
write=10000 read=1734
write=0 read=1651
write=0 read=1635
write=0 read=1701
write=0 read=1661
write=0 read=1660
write=0 read=1718
write=0 read=1603
write=0 read=1683
write=0 read=1678
write=0 read=1604
write=0 read=1686
write=0 read=1701
write=0 read=1752
elapsedTime=200749 ms
writeCount[0]=30000 rate=0.15 per ms
Total Write Rate=0.15 per ms
readCount[0]=8791 rate=0.04
readCount[1]=7566 rate=0.04
readCount[2]=8632 rate=0.04
readCount[3]=8699 rate=0.04
Total Read Rate=0.16 per ms
writer latency stats:
   1,2
   5,8880
  10,18717
  50,2384
 100,11
 200,1
 300,0
 400,0
 500,0
 600,1
 700,1
 800,2
 900,0
1000,0
1+ms,1
>>> validate
Quit: running time is over 60000 ms
Validated 7441294/20000000 in 60005 ms
OK
>>> iterate
read 16555 key-value(s) in 100005 ms
>>> END TestIndexedStoreChannel
>>> 

Jingwei

unread,
Feb 23, 2011, 2:58:47 PM2/23/11
to Krati
Hi Michael,

Actually, that sounds promising. We are closer to the final
resolution.

The reason that read is slow is because reads are random and hit all
keys with equal possibilities. This leads to disk seek almost for
every operation. When disk seek requires 10ms, read throughput looks
fine given that you are using ChannelSegment. When data size is larger
than main memory, you do not have much opportunity to optimize random
reads unless read access pattern has locality characteristics.

Now how can we increase write throughput?

The solution is WriteBufferSegment. You can compare Krati with BDB at
the scale of 20000000 keys.

mvn test -Dtest=TestIndexedStoreWriteBuffer -
Dkrati.test.keyCount=20000000 -Dkrati.test.initLevel=10 -
Dkrati.test.segFileSizeMB=256 -Dkrati.test.runTimeSeconds=600

mvn test -Dtest=TestBdbBytes -Dkrati.test.keyCount=20000000 -
Dkrati.test.runTimeSeconds=600

The detailed results I have collected are here

>>> ========================================================
>>> BEGIN TestIndexedStoreWriteBuffer
>>> populate
elapsedTime=844816 ms
writeCount=20000000 rate=23.67 per ms
>>> read only
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
elapsedTime=10025 ms
readCount[0]=2034 rate=0.2 per ms
readCount[1]=2062 rate=0.21 per ms
readCount[2]=1740 rate=0.17 per ms
readCount[3]=1991 rate=0.2 per ms
Total Read Rate=0.78 per ms
>>> write only
Writer 0 started
writeCount=259999
writeCount=290000
writeCount=302037
writeCount=277963
writeCount=298769
writeCount=261067
writeCount=261385
writeCount=261201
writeCount=261523
writeCount=279376
writeCount=289988
writeCount=308278
writeCount=278413
writeCount=280000
writeCount=310000
writeCount=280000
writeCount=318124
writeCount=269113
writeCount=261490
writeCount=261422
writeCount=261150
writeCount=261328
writeCount=277373
writeCount=300000
writeCount=267711
writeCount=302289
writeCount=290000
writeCount=270000
writeCount=200000
writeCount=250000
writeCount=279146
writeCount=294665
writeCount=306189
writeCount=240000
writeCount=290000
writeCount=290000
writeCount=280000
writeCount=275810
writeCount=277741
writeCount=275415
writeCount=261412
writeCount=261582
writeCount=261546
writeCount=261096
writeCount=261297
writeCount=261350
writeCount=261136
writeCount=261165
writeCount=261255
writeCount=261506
writeCount=261173
writeCount=261098
writeCount=261591
writeCount=261207
writeCount=261206
writeCount=261139
writeCount=280275
writeCount=273236
writeCount=276271
writeCount=269521
writeCount=270972
writeCount=270000
writeCount=280000
writeCount=279641
writeCount=259253
writeCount=279949
writeCount=281157
writeCount=120000
writeCount=100000
writeCount=180000
writeCount=136601
writeCount=223399
writeCount=120000
writeCount=224604
writeCount=205396
writeCount=189193
writeCount=237277
writeCount=142935
writeCount=260595
writeCount=260828
writeCount=108584
writeCount=260588
writeCount=210000
writeCount=187115
writeCount=258855
writeCount=141329
writeCount=274131
writeCount=277386
writeCount=106696
writeCount=294488
writeCount=228543
writeCount=162559
writeCount=285326
writeCount=201474
writeCount=230722
writeCount=291376
writeCount=110000
writeCount=280000
writeCount=228885
writeCount=189572
writeCount=300864
writeCount=141893
writeCount=298786
writeCount=264083
writeCount=119275
writeCount=313917
writeCount=132725
writeCount=190750
writeCount=123554
writeCount=232965
writeCount=92731
writeCount=240843
writeCount=129157
writeCount=195011
writeCount=194132
writeCount=131454
writeCount=239403
writeCount=118562
writeCount=245347
writeCount=119487
elapsedTime=1200423 ms
writeCount[0]=28853406 rate=24.04 per ms
Total Write Rate=24.04 per ms
>>> validate
Quit: running time is over 60000 ms
Validated 12965/20000000 in 60012 ms
OK
>>> read & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=56593 read=2249
write=50000 read=2232
write=50000 read=2131
write=55257 read=2132
write=14743 read=1117
write=46758 read=1512
write=23242 read=1399
write=40000 read=1144
write=50000 read=1805
write=50000 read=2248
write=50000 read=2129
write=50000 read=2059
write=22172 read=1189
write=37828 read=1823
write=30000 read=1147
write=45626 read=1329
write=34374 read=1564
write=50000 read=1995
write=55827 read=2063
write=44173 read=2060
write=41202 read=1787
write=18798 read=1411
write=37026 read=1281
write=32974 read=1358
write=30000 read=1016
write=50000 read=1764
write=50000 read=2044
write=50000 read=2056
write=50000 read=1996
write=23560 read=1131
write=46440 read=1728
write=30000 read=1095
write=40000 read=1244
write=40000 read=1496
write=50000 read=1868
write=50000 read=1905
write=50000 read=1984
write=30092 read=1669
write=19908 read=1310
write=40000 read=1194
write=40000 read=1487
write=30000 read=935
write=50000 read=1665
write=50000 read=1960
write=50000 read=1893
write=50000 read=1928
write=15665 read=1096
write=44335 read=1488
write=30000 read=1094
write=42103 read=1080
write=37897 read=1416
write=50000 read=1801
write=50000 read=1853
write=50000 read=1894
write=23722 read=1061
write=46278 read=1634
write=20000 read=983
write=50000 read=1288
write=40000 read=1185
write=50000 read=1677
write=48496 read=1813
write=51504 read=1902
write=22210 read=1333
write=27790 read=1497
write=36187 read=957
write=43813 read=1432
write=30000 read=922
write=50000 read=1608
write=60000 read=1899
write=40000 read=1799
write=31837 read=1385
write=31127 read=1267
write=37036 read=975
write=50000 read=1533
write=30000 read=851
write=50000 read=1526
write=50000 read=1853
write=50000 read=1811
write=22103 read=1341
write=17897 read=1206
write=45665 read=1024
write=34335 read=1349
write=50000 read=1689
write=50000 read=1792
write=57592 read=1762
write=49526 read=1793
write=52882 read=1753
write=30493 read=1313
write=19507 read=1273
write=40000 read=838
write=50000 read=1388
write=40000 read=1699
write=57541 read=1762
write=43698 read=1826
write=48761 read=1774
write=50000 read=1767
write=32068 read=1346
write=17932 read=1262
write=40000 read=1148
write=40000 read=1264
write=50000 read=1645
write=50000 read=1710
write=54564 read=1834
write=45436 read=1766
write=50000 read=1795
write=32066 read=1154
write=37340 read=1352
write=26831 read=869
write=43763 read=1539
write=60000 read=1739
write=50000 read=1755
write=50000 read=1802
write=50000 read=1803
write=50000 read=1791
write=13790 read=983
write=46210 read=1461
write=30000 read=893
write=40000 read=1467
write=50000 read=1781
write=60000 read=1778
elapsedTime=1200892 ms
writeCount[0]=4976594 rate=4.14 per ms
Total Write Rate=4.14 per ms
readCount[0]=46466 rate=0.04
readCount[1]=46013 rate=0.04
readCount[2]=46238 rate=0.04
readCount[3]=47018 rate=0.04
Total Read Rate=0.16 per ms
writer latency stats:
1,58
5,79505
10,2998391
50,1766995
100,412
200,987
300,423
400,10
500,0
600,3
700,7
800,3
900,1
1000,0
1+ms,129799
reader latency stats:
1,0
5,135
10,668
50,21276
100,62
200,11
300,0
400,0
500,0
600,1
700,1
800,0
900,0
1000,0
1+ms,24312
>>> validate
Quit: running time is over 60000 ms
Validated 13943/20000000 in 60006 ms
OK
>>> check & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=49999 read=1765
write=50000 read=1766
write=50000 read=1733
write=10000 read=1030
write=50000 read=1195
write=40000 read=1107
write=40000 read=1657
write=60000 read=1762
write=50000 read=1778
write=50000 read=1723
write=50000 read=1720
write=30991 read=1376
write=29009 read=1271
write=38204 read=1030
write=41796 read=1287
write=50000 read=1613
write=50000 read=1773
write=45293 read=1725
write=54707 read=1756
write=50000 read=1756
write=22764 read=1017
write=37236 read=1490
write=30000 read=843
write=50000 read=1463
write=50000 read=1728
write=50000 read=1764
write=50000 read=1817
write=60000 read=1721
write=40000 read=1741
write=20000 read=998
write=40000 read=1550
write=30000 read=855
write=50000 read=1546
write=50000 read=1775
write=50000 read=1702
write=50000 read=1724
write=50000 read=1735
write=40000 read=1558
write=20000 read=1011
write=40000 read=1118
write=40000 read=1029
write=50000 read=1471
write=50000 read=1730
write=50000 read=1879
write=50000 read=1800
write=50000 read=1654
write=23023 read=1275
write=26977 read=1267
write=40000 read=1144
write=30000 read=1140
write=50000 read=1544
write=50000 read=1715
write=50000 read=1807
write=50000 read=1863
write=50000 read=1850
write=32034 read=1274
write=27966 read=1275
write=39366 read=869
write=40634 read=1367
write=58213 read=1687
write=51787 read=1762
write=50000 read=1795
write=50000 read=1747
write=44889 read=1835
write=15111 read=940
write=50000 read=1306
write=40000 read=1010
write=44211 read=1620
write=55789 read=1728
write=50000 read=1699
write=50000 read=1808
write=50000 read=1782
write=22842 read=1313
write=27158 read=1287
write=40000 read=1100
write=40000 read=1222
write=50000 read=1667
write=50000 read=1764
write=50000 read=1807
write=50000 read=1796
write=50000 read=1839
write=14983 read=946
write=45017 read=1463
write=30000 read=817
write=57761 read=1497
write=52239 read=1818
write=50000 read=1759
write=50000 read=1820
write=60000 read=1882
write=21865 read=1177
write=38135 read=1392
write=32993 read=838
write=47007 read=1525
write=50000 read=1828
write=60000 read=1749
write=50000 read=1826
write=50000 read=1808
write=40000 read=1660
write=20000 read=1122
write=50000 read=1156
write=30000 read=1170
write=60000 read=1638
write=50000 read=1781
write=50000 read=1771
write=50000 read=1823
write=50000 read=1777
write=20000 read=954
write=40000 read=1361
write=40000 read=1099
write=60000 read=1661
write=50000 read=1822
write=50000 read=1878
write=50000 read=1829
write=54298 read=1899
write=15702 read=956
write=50000 read=1512
write=30000 read=951
write=50000 read=1581
write=60000 read=1803
write=50000 read=1794
elapsedTime=1201373 ms
writeCount[0]=5240000 rate=4.36 per ms
Total Write Rate=4.36 per ms
readCount[0]=45450 rate=0.04
readCount[1]=45581 rate=0.04
readCount[2]=45327 rate=0.04
readCount[3]=45235 rate=0.04
Total Read Rate=0.16 per ms
writer latency stats:
1,53
5,83363
10,3196661
50,1849803
100,115
200,803
300,375
400,4
500,0
600,0
700,0
800,0
900,0
1000,0
1+ms,108823
>>> validate
Quit: running time is over 60000 ms
Validated 16625/20000000 in 60002 ms
OK
>>> iterate
read 18079 key-value(s) in 100010 ms
>>> END TestIndexedStoreWriteBuffer
>>>



>> ========================================================
>>> BEGIN TestBdbBytes.testPerformance
cacheSize=9162601267
TxnNoSync=true
Transactional=true
>>> populate
elapsedTime=1011925 ms
writeCount=20000000 rate=19.76 per ms
>>> read only
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
elapsedTime=10040 ms
readCount[0]=245 rate=0.02 per ms
readCount[1]=247 rate=0.02 per ms
readCount[2]=237 rate=0.02 per ms
readCount[3]=243 rate=0.02 per ms
Total Read Rate=0.08 per ms
>>> write only
Writer 0 started
writeCount=841
writeCount=859
writeCount=955
writeCount=776
writeCount=950
writeCount=972
writeCount=955
writeCount=1033
writeCount=1014
writeCount=1026
writeCount=1047
writeCount=1090
writeCount=1091
writeCount=1017
writeCount=1033
writeCount=1099
writeCount=1015
writeCount=1095
writeCount=1056
writeCount=1031
writeCount=1038
writeCount=1047
writeCount=1035
writeCount=1076
writeCount=1047
writeCount=1038
writeCount=1062
writeCount=1089
writeCount=1022
writeCount=1068
writeCount=1084
writeCount=1037
writeCount=1051
writeCount=1057
writeCount=1047
writeCount=1058
writeCount=1054
writeCount=1084
writeCount=1068
writeCount=1106
writeCount=1059
writeCount=1044
writeCount=1165
writeCount=1058
writeCount=1088
writeCount=1123
writeCount=1103
writeCount=844
writeCount=928
writeCount=773
writeCount=1030
writeCount=1087
writeCount=1058
writeCount=1111
writeCount=1080
writeCount=1123
writeCount=1121
writeCount=1080
writeCount=1102
writeCount=1084
writeCount=1078
writeCount=1090
writeCount=1074
writeCount=1087
writeCount=1099
writeCount=1087
writeCount=1078
writeCount=1184
writeCount=1101
writeCount=1092
writeCount=994
writeCount=1008
writeCount=1046
writeCount=1175
writeCount=1131
writeCount=1081
writeCount=1178
writeCount=1144
writeCount=1177
writeCount=1110
writeCount=1136
writeCount=1206
writeCount=1135
writeCount=1085
writeCount=1120
writeCount=1170
writeCount=1127
writeCount=1181
writeCount=1181
writeCount=1119
writeCount=1280
writeCount=1167
writeCount=1110
writeCount=1074
writeCount=1188
writeCount=1147
writeCount=1136
writeCount=1146
writeCount=1096
writeCount=1072
writeCount=1112
writeCount=1149
writeCount=1146
writeCount=1183
writeCount=1173
writeCount=1172
writeCount=1140
writeCount=1066
writeCount=1183
writeCount=1164
writeCount=1198
writeCount=1217
writeCount=1243
writeCount=1141
writeCount=331
writeCount=1224
writeCount=1197
writeCount=1220
writeCount=1219
writeCount=1184
elapsedTime=1203035 ms
writeCount[0]=129737 rate=0.11 per ms
Total Write Rate=0.11 per ms
>>> validate
Quit: running time is over 60000 ms
Validated 30038/20000000 in 60001 ms
OK
>>> read & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=291 read=1240
write=303 read=1214
write=285 read=1171
write=310 read=1164
write=337 read=1206
write=286 read=1242
write=318 read=1210
write=318 read=1215
write=301 read=1213
write=307 read=1208
write=282 read=1241
write=293 read=1253
write=315 read=1279
write=337 read=1234
write=311 read=1281
write=274 read=1224
write=344 read=1277
write=321 read=1221
write=295 read=1232
write=343 read=1339
write=354 read=1224
write=329 read=1178
write=273 read=1341
write=345 read=1317
write=333 read=1254
write=318 read=1230
write=315 read=1225
write=323 read=1103
write=303 read=1273
write=332 read=1239
write=329 read=1270
write=294 read=1316
write=310 read=1213
write=310 read=1149
write=323 read=1262
write=324 read=1268
write=342 read=1322
write=259 read=1149
write=337 read=1289
write=277 read=1123
write=300 read=1197
write=316 read=1324
write=342 read=1335
write=336 read=1297
write=332 read=1289
write=287 read=1218
write=351 read=1264
write=294 read=1264
write=305 read=1217
write=304 read=1237
write=284 read=1233
write=292 read=1212
write=292 read=1182
write=362 read=1407
write=329 read=1360
write=332 read=1308
write=319 read=1363
write=349 read=1281
write=307 read=1295
write=250 read=939
write=340 read=1300
write=299 read=1125
write=277 read=1196
write=300 read=1302
write=363 read=1333
write=377 read=1267
write=348 read=1362
write=340 read=1394
write=342 read=1273
write=305 read=1199
write=386 read=1396
write=332 read=1334
write=330 read=1353
write=338 read=1368
write=325 read=1285
write=328 read=1292
write=352 read=1292
write=357 read=1334
write=342 read=1380
write=340 read=1456
write=317 read=1333
write=342 read=1228
write=311 read=1329
write=317 read=1405
write=345 read=1330
write=394 read=1440
write=345 read=1340
write=344 read=1269
write=338 read=1317
write=370 read=1351
write=349 read=1330
write=366 read=1351
write=359 read=1465
write=370 read=1363
write=383 read=1358
write=361 read=1414
write=324 read=1392
write=349 read=1422
write=354 read=1435
write=265 read=1124
write=255 read=1015
write=334 read=1376
write=287 read=1104
write=328 read=1285
write=358 read=1453
write=360 read=1296
write=386 read=1453
write=358 read=1476
write=350 read=1407
write=366 read=1476
write=387 read=1306
write=333 read=1342
write=360 read=1442
write=377 read=1448
write=343 read=1462
write=369 read=1445
write=358 read=1411
write=346 read=1462
write=370 read=1427
write=370 read=1503
elapsedTime=1200122 ms
writeCount[0]=39404 rate=0.03 per ms
Total Write Rate=0.03 per ms
readCount[0]=38959 rate=0.03
readCount[1]=38929 rate=0.03
readCount[2]=38440 rate=0.03
readCount[3]=39133 rate=0.03
Total Read Rate=0.12 per ms
writer latency stats:
1,0
5,0
10,0
50,7465
100,4094
200,1489
300,751
400,134
500,44
600,14
700,8
800,1
900,0
1000,0
1+ms,25404
reader latency stats:
1,0
5,0
10,63
50,13020
100,44
200,546
300,4
400,4
500,1
600,1
700,0
800,0
900,0
1000,0
1+ms,25276
>>> validate
Quit: running time is over 60000 ms
Validated 71915/20000000 in 60001 ms
OK
>>> check & write
Reader 0 started
Reader 1 started
Reader 2 started
Reader 3 started
Writer 0 started
write=365 read=1643
write=347 read=1481
write=378 read=1495
write=417 read=1433
write=348 read=1553
write=399 read=1521
write=428 read=1533
write=389 read=1521
write=375 read=1523
write=385 read=1605
write=389 read=1503
write=442 read=1547
write=389 read=1607
write=357 read=1628
write=384 read=1546
write=362 read=1446
write=362 read=1425
write=448 read=1561
write=346 read=1468
write=359 read=1464
write=467 read=1505
write=357 read=1453
write=427 read=1622
write=392 read=1515
write=431 read=1525
write=366 read=1570
write=394 read=1530
write=375 read=1434
write=408 read=1526
write=369 read=1578
write=406 read=1649
write=385 read=1530
write=387 read=1549
write=385 read=1581
write=402 read=1572
write=365 read=1586
write=364 read=1669
write=419 read=1553
write=392 read=1552
write=391 read=1532
write=446 read=1589
write=403 read=1564
write=389 read=1626
write=395 read=1578
write=402 read=1560
write=388 read=1537
write=306 read=1124
write=363 read=1533
write=367 read=1387
write=414 read=1565
write=406 read=1547
write=362 read=1492
write=399 read=1691
write=407 read=1583
write=391 read=1620
write=411 read=1633
write=394 read=1630
write=371 read=1535
write=407 read=1612
write=412 read=1597
write=438 read=1585
write=444 read=1649
write=351 read=1263
write=444 read=1684
write=423 read=1592
write=416 read=1651
write=442 read=1622
write=417 read=1677
write=422 read=1602
write=342 read=1587
write=385 read=1703
write=415 read=1636
write=369 read=1635
write=395 read=1627
write=405 read=1632
write=383 read=1666
write=376 read=1669
write=390 read=1694
write=440 read=1642
write=434 read=1628
write=432 read=1702
write=391 read=1533
write=371 read=1646
write=419 read=1716
write=455 read=1695
write=407 read=1655
write=399 read=1698
write=358 read=1563
write=426 read=1538
write=437 read=1737
write=399 read=1717
write=437 read=1717
write=403 read=1637
write=359 read=1634
write=370 read=1663
write=462 read=1600
write=416 read=1669
write=409 read=1699
write=448 read=1621
write=383 read=1682
write=429 read=1685
write=469 read=1700
write=398 read=1684
write=431 read=1661
write=453 read=1610
write=308 read=1407
write=404 read=1692
write=390 read=1715
write=435 read=1716
write=426 read=1730
write=433 read=1716
write=422 read=1789
write=417 read=1729
write=432 read=1741
write=409 read=1781
write=430 read=1781
write=428 read=1716
write=384 read=1571
write=439 read=1683
write=411 read=1736
elapsedTime=1200106 ms
writeCount[0]=48046 rate=0.04 per ms
Total Write Rate=0.04 per ms
readCount[0]=47974 rate=0.04
readCount[1]=47980 rate=0.04
readCount[2]=47969 rate=0.04
readCount[3]=48027 rate=0.04
Total Read Rate=0.16 per ms
writer latency stats:
1,0
5,0
10,0
50,8210
100,4936
200,2891
300,898
400,176
500,71
600,17
700,16
800,5
900,6
1000,0
1+ms,30820
>>> validate
Quit: running time is over 60000 ms
Validated 195668/20000000 in 60001 ms
OK
>>> END TestBdbBytes.testPerformance
>>>

Mike Grove

unread,
Feb 24, 2011, 1:50:55 PM2/24/11
to kr...@googlegroups.com
On Wed, Feb 23, 2011 at 2:58 PM, Jingwei <jingw...@gmail.com> wrote:
Hi Michael,

Actually, that sounds promising. We are closer to the final
resolution.

The reason that read is slow is because reads are random and hit all
keys with equal possibilities. This leads to disk seek almost for
every operation. When disk seek requires 10ms, read throughput looks
fine given that you are using ChannelSegment. When data size is larger
than main memory, you do not have much opportunity to optimize random
reads unless read access pattern has locality characteristics.

Jingwei,

I appreciate your help on these matters.  I think the read rate is just a little slow for what we were hoping to see.  The more keys I put into krati, the more it seems to slow down and I anticipate needing to store upwards of 500M keys in production, I think that might just be too much data for it.

I am curious what kind of hashing strategies you are using internally?  

Thanks.

Mike

Jingwei

unread,
Feb 24, 2011, 2:25:18 PM2/24/11
to Krati
Hi Mike,

The internal hashing strategy is fairly standard: FnvHash + Linear
Hashing

For DynamicDataStore, hash collision is handled by concatenating key-
value pairs together.

[count:int][key-length:int][key:bytes][value-length:int][value:bytes]
[key-length:int][key:bytes][value-length:int][value:bytes]...

IndexedDataStore uses the same hashing strategy as DynamicDataStore.
But it is only at the index level in the format key-meta pairs. The
meta stores array indexes for values so that values can be retrieved
by array index lookup.

The krati stores are not optimized for simple keys like integer or
long. But it is relatively easy to customize those store
implementation.

Thanks.

-jingwei

On Feb 24, 10:50 am, Mike Grove <m...@clarkparsia.com> wrote:

Mike Grove

unread,
Feb 24, 2011, 2:27:31 PM2/24/11
to kr...@googlegroups.com
On Thu, Feb 24, 2011 at 2:25 PM, Jingwei <jingw...@gmail.com> wrote:
Hi Mike,

The internal hashing strategy is fairly standard: FnvHash + Linear
Hashing

Do you do perform partial expansions or priority splitting on the hash segments, or do you just roll from one bucket to the next as they overflow in the standard algorithm?

Jingwei

unread,
Feb 25, 2011, 5:00:58 PM2/25/11
to Krati
Hi Michael,

The current linear hashing in krati does neither partial expansion nor
priority splitting. It is the classical standard linear hashing.

Thanks.

-jingwei

On Feb 24, 11:27 am, Mike Grove <m...@clarkparsia.com> wrote:
Reply all
Reply to author
Forward
0 new messages