High lookup times with big ChronicleMap


Kranti Parisa

unread,
Dec 18, 2014, 6:46:50 PM12/18/14
to java-ch...@googlegroups.com
Hi,

I have built a ChronicleMap<byte[], byte[]>, and the details are:

- Available RAM on the machine: ~200GB
- Enough disk space too
- Keys = byte[] => a String with a max length of 1,000
- Values = byte[24]
- Map size (number of entries): 850M+
- Map size on disk: 200GB+
- Key length distribution:
50% / 90% / 99% // 99.9% / 99.99% / Max Length ====  35 / 48 / 76 // 140 / 258 / 1,029


## Map builder configs while writing
int MAX_ENTRIES = 1800000000;
double entriesWithMargin = MAX_ENTRIES * 1.1;
long actualSegments = Maths.nextPower2((long) (entriesWithMargin / (1L << 16)), 1L);
ChronicleMap<byte[], byte[]> map = ChronicleMapBuilder.of(byte[].class, byte[].class)
        .actualSegments((int) actualSegments)
        .actualEntriesPerSegment((long) (entriesWithMargin / actualSegments))
        .putReturnsNull(true)
        .removeReturnsNull(true)
        .createPersistedTo(file);

## Map builder configs while reading (after the write is completed)
map = ChronicleMapBuilder.of(byte[].class, byte[].class)
        .createPersistedTo(file);


## Lookup 
byte[] using = new byte[24];
long time = System.currentTimeMillis();
byte[] value = map.getUsing(key.getBytes(), using);
System.out.println("Lookup time: " + (System.currentTimeMillis() - time));

I tried a bunch of keys and the lookup times are 5-10 ms.


Is this expected, or is something wrong with the map builder configs?

If I produce smaller maps and have a multi-threaded lookup, will it make any difference? 

Do I need to use Strings instead of byte[] for keys and values?

Do I need to use any other API than "getUsing"?

Do I need to implement BytesMarshallable? If so, can you provide an example for my case?


Thanks,
KP



Peter Lawrey

unread,
Dec 18, 2014, 10:22:13 PM12/18/14
to java-ch...@googlegroups.com

Can you run the test with the following changes:
- Use the latest release. There have been some changes recently.
- If I remember correctly, this was tuned down to minimise the space used, which will slow performance. Can you try setting the entries to number * 1.2 and the entry size to 80 bytes? (See the sketch below.)
- Move key.getBytes() so it is not in the timing.
- Use the actual key type as the key, which doesn't appear to be a byte[].
- Check the map size on disk with: du -h {file}
- Time warmed-up code, e.g. after 20k operations.
- Can you test this with fewer entries to see if it is a scalability problem, e.g. 50 million entries?
- Having multiple threads will improve throughput, but at a cost to latency.
- Smaller maps shouldn't help much, but they might.
- getUsing should be fine.
- You might need to use BytesMarshallable for the key.
- I would consider using the natural type for the value unless it really is just 24 bytes.
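A minimal sketch of how those builder and timing changes might look (assuming the same file and key variables as in the original post; the 1.2 margin and 80-byte entry size are just the suggestions above, not tuned values):

int maxEntries = 1800000000;
ChronicleMap<byte[], byte[]> map = ChronicleMapBuilder.of(byte[].class, byte[].class)
        .entries((long) (maxEntries * 1.2))
        .entrySize(80)
        .createPersistedTo(file);

byte[] keyBytes = key.getBytes();      // conversion done outside the timed section
byte[] using = new byte[24];
for (int i = 0; i < 20_000; i++)       // warm up before measuring
    map.getUsing(keyBytes, using);
long start = System.nanoTime();
byte[] value = map.getUsing(keyBytes, using);
System.out.println("Lookup time: " + (System.nanoTime() - start) + " ns");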


Peter Lawrey

unread,
Dec 18, 2014, 10:44:38 PM12/18/14
to java-ch...@googlegroups.com

How many NUMA regions do you have? E.g.

numactl --show

If you have multiple NUMA regions, you effectively have multiple machines working together over a bus (or buses) and you don't get the same performance. I suggest you use no more than 90% of one NUMA region in a single process so it doesn't have to go across the interconnect.

On 18/12/2014 11:46 PM, "Kranti Parisa" <kranti...@gmail.com> wrote:

Roman Leventov

unread,
Dec 18, 2014, 11:29:44 PM12/18/14
to java-ch...@googlegroups.com

My 5 cents:
- YES, there is surely something wrong with your code; the main miss is not specifying the key/value size. When you omit it, the entry size defaults to 256 bytes, while you need keySize(48).constantValueSizeBySample(new byte[24]) (see the sketch at the end of this message).
- Don't do the library's work yourself. It seems the actual key type is String, so just specify ChronicleMapBuilder.of(String.class, …) and don't bother with translating from/to bytes, unless you need an encoding different from UTF-8.
- The same for values: if they are constant-size, there is a good chance they are suitable for the data value generation abstraction, which is zero-copy, produces zero garbage on ser/deser and allows direct off-heap updates.
- It seems you copy-pasted some cumbersome code from a recent thread in this group, aiming to work around a bug which is now fixed in master. I'm not sure it is included in the latest release; please wait for the next release, rc1, it should be there in a couple of days. Generally, this code: entries(ACTUAL_MAX_NUMBER_OF_ENTRIES_WITHOUT_MARGIN), without specifying the number of segments, entries per segment, margin etc., MUST work well. Otherwise it is a bug which should be reported.
- BUT don't forget that the number of entries you configure is the number of "chunks" of the entry size. Since 90% of your entries should fit in 1 entrySize (if keySize(48) is configured), and almost all others should fit in 2, you should configure .9 * 1 + .1 * 2 = 1.1 "entries" over your actual max number of entries. (A note for people reading this message: you should copy-paste entries * 1.1 everywhere! It depends on the exact configuration and variance profile!)
- If you have a special type of keys/values, not a boxed primitive / String / char sequence / byte[] / char[] etc., for which efficient marshallers are already written and configured in the lib (all such types will be listed soon in the docs), and your type is not suitable for data value generation, then yes, you might consider implementing BytesMarshallable. But BytesReader/BytesWriter/BytesInterop is better; that could save some memory and CPU cost compared to implementing BytesMarshallable, if keys/values are small.

- Even if you do everything we noted, you could still have a problem, because there could be bugs in our code. That's very hard to identify without live playing with your case.
- Unfortunately it is hard to account for all details without good knowledge of the lib; currently only we (the developers) are able to cook Chronicle Map best.
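For illustration, a sketch of a builder configured along these lines (the entry count and file are placeholders; keySize(48) is the 90th-percentile key length from the first post):

ChronicleMap<String, byte[]> map = ChronicleMapBuilder.of(String.class, byte[].class)
        .entries((long) (1800000000L * 1.1))   // actual max entries plus the ~1.1x chunk margin discussed above
        .keySize(48)                           // 90% of keys fit in 48 bytes
        .constantValueSizeBySample(new byte[24])
        .createPersistedTo(file);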

On 19 Dec 2014 at 10:44, "Peter Lawrey" <peter....@gmail.com> wrote:

Kranti Parisa

unread,
Dec 19, 2014, 12:17:15 AM12/19/14
to java-ch...@googlegroups.com
Hi Peter & Roman,

Thanks for the feedback. ChronicleMap is a great idea and I really want to see it fit the different use cases that we have.
I'll look into the points you mentioned below and try to build the map again.

Thanks,
KP

Kranti Parisa

unread,
Dec 19, 2014, 1:00:15 AM12/19/14
to java-ch...@googlegroups.com
FYI: I'm using version 2.0.14b.

Kranti Parisa

unread,
Dec 19, 2014, 3:05:07 AM12/19/14
to java-ch...@googlegroups.com
Hi Peter,

Here is the result:
numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1

Roman Leventov

unread,
Dec 19, 2014, 4:20:48 AM12/19/14
to java-ch...@googlegroups.com
On Fri, Dec 19, 2014 at 11:29 AM, Roman Leventov <roman.l...@higherfrequencytrading.com> wrote:

- BUT don't forget that the number of entries you configure is the number of "chunks" of the entry size. Since 90% of your entries should fit in 1 entrySize (if keySize(48) is configured), and almost all others should fit in 2, you should configure .9 * 1 + .1 * 2 = 1.1 "entries" over your actual max number of entries. (A note for people reading this message: you should copy-paste entries * 1.1 everywhere! It depends on the exact configuration and variance profile!)

Should NOT copy-paste, of course :)

Kranti Parisa

unread,
Dec 19, 2014, 1:57:41 PM12/19/14
to java-ch...@googlegroups.com
Hi Peter & Roman,

I've generated a new map with the following configs

ChronicleMap<String, byte[]> map = ChronicleMapBuilder.of(String.class, byte[].class)
        .entries(110000000)
        .keySize(100)
        .entrySize(124)
        .constantValueSizeBySample(new byte[24])
        .createPersistedTo(file);

I modified the key to be a String (length would be 48-100 characters, not constant), but the value is always a byte[] of size 24 (to be more specific, the value contains 6 float values packed into a byte[]).

While writing to the map, there were no reads. After completing the write: 
Map.size() = 205,202,432
map size on disk (du -sh map.dat) = 26G

Using the version: 2.0.14b

Questions:
-----------
1. The Map.size() after the write is 205M+, but builder.entries() was configured as 110M. How did it accommodate more than 110M entries?

2. Lookup times seem to be between 5-7 ms. That sounds high, doesn't it?
byte[] using = new byte[24];
long time = System.currentTimeMillis();
byte[] value = map.getUsing(key, using);
System.out.println("Lookup time: " + (System.currentTimeMillis() - time));

Thanks,
KP

On Thursday, December 18, 2014 8:29:44 PM UTC-8, Roman Leventov wrote:

Roman Leventov

unread,
Dec 19, 2014, 2:15:10 PM12/19/14
to java-ch...@googlegroups.com

1) The config should be just keySize().constantValueSizeBySample(). entrySize() is redundant in this case, and it hides the other two configs. Actually, entrySize() is very rarely needed. In the next release it will be renamed to actualChunkSize() to emphasize its low-level nature.

2) Define an interface

interface MyValue {
    void setValueAt(@MaxSize(6) int index, float value);
    float getValueAt(int index);
}

Obtain a value using map.newValueInstance().

I.e. use data value generation (see the sketch at the end of this message).

3) The reported 200m size looks strange, like a bug. It would be helpful if you could test with the latest master ChMap (you can install via get clone & mvn install -DskipTests=true, with 2.0.18rc1-SNAPSHOT in your deps), or wait a couple of days until the next release.
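For point 2, a rough usage sketch with a data value generated instance (the map configuration, key and floats[] array are hypothetical; only newValueInstance(), put() and getUsing() are from the discussion above):

ChronicleMap<String, MyValue> map = ChronicleMapBuilder.of(String.class, MyValue.class)
        .entries(210000000)
        .keySize(100)
        .createPersistedTo(file);

MyValue value = map.newValueInstance();
for (int i = 0; i < 6; i++)
    value.setValueAt(i, floats[i]);    // pack the six floats via the generated setter
map.put(key, value);

MyValue read = map.getUsing(key, map.newValueInstance());
float first = read.getValueAt(0);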

On 20 Dec 2014 at 1:57, "Kranti Parisa" <kranti...@gmail.com> wrote:

Roman Leventov

unread,
Dec 19, 2014, 2:20:28 PM12/19/14
to java-ch...@googlegroups.com

*git clone

On 20 Dec 2014 at 2:15, "Roman Leventov" <roman.l...@higherfrequencytrading.com> wrote:

Kranti Parisa

unread,
Dec 19, 2014, 2:32:41 PM12/19/14
to java-ch...@googlegroups.com
I can try the ChMap master code, with the following changes

ChronicleMap<String, byte[]> map = ChronicleMapBuilder.of(String.class, MyTestValue.class)
        .entries(225000000) // as I would like to insert 205m+ entries
        .keySize(100)
        .constantValueSizeBySample(new MyTestValue())
        .createPersistedTo(file);

where MyTestValue implements MyValue (that you described below)

Does this look good for another try?

Thanks,
KP

Kranti Parisa

unread,
Dec 19, 2014, 2:33:25 PM12/19/14
to java-ch...@googlegroups.com
I can try the ChMap master code, with the following changes

ChronicleMap<String, MyTestValue> map = ChronicleMapBuilder.of(String.class, MyTestValue.class)
        .entries(225000000) // as I would like to insert 205m+ entries
        .keySize(100)
        .constantValueSizeBySample(new MyTestValue())
        .createPersistedTo(file);

where MyTestValue implements MyValue (that you described below)

Does this look good for another try?


Roman Leventov

unread,
Dec 19, 2014, 2:59:54 PM12/19/14
to java-ch...@googlegroups.com

If you need 100m entries, you should configure 100m. Or do you not know exactly? Reporting 200m if you inserted only 100m seems like a bug.

Regarding key size - you should think about how many bytes 90% of your keys would fit into, in UTF-8 encoding. From your previous mail, I thought this was 48 bytes.

No need to configure a constant value size if using data value generated values. Just remove this line.

Put MyTestValue as the generic param on the left-hand side.

Everything else seems ok.

On 20 Dec 2014 at 2:32, "Kranti Parisa" <kranti...@gmail.com> wrote:

Kranti Parisa

unread,
Dec 19, 2014, 3:08:13 PM12/19/14
to java-ch...@googlegroups.com
Is this the right repo?

After mvn install, it gives me chronicle-map-2.0.15b-SNAPSHOT.jar.


On Friday, December 19, 2014 11:20:28 AM UTC-8, Roman Leventov wrote:

Roman Leventov

unread,
Dec 19, 2014, 3:26:42 PM12/19/14
to java-ch...@googlegroups.com

Yes, it's 15b now. It doesn't matter.

On 20 Dec 2014 at 3:08, "Kranti Parisa" <kranti...@gmail.com> wrote:

Peter Lawrey

unread,
Dec 19, 2014, 4:22:50 PM12/19/14
to java-ch...@googlegroups.com
Using the latest update, the following

public class LotsOfEntriesMain {
    public static void main(String[] args) throws IOException {
        workEntries(true);
        workEntries(false);
    }

    private static void workEntries(boolean add) throws IOException {
        int entries = 100_000_000;
        File file = new File("/tmp/lotsOfEntries.dat");
        ChronicleMap<CharSequence, byte[]> map = ChronicleMapBuilder
                .of(CharSequence.class, byte[].class)
                .entries(entries)
                .entrySize(80)
                .createPersistedTo(file);
        Random rand = new Random();
        StringBuilder sb = new StringBuilder();
        byte[] bytes = new byte[24];
        long start = System.nanoTime();
        for (int i = 0; i < entries; i++) {
            sb.setLength(0);
            int length = (int) (24 / (rand.nextFloat() + 24.0 / 1000));
            if (length > 2000)
                throw new AssertionError();
            sb.append(i);
            while (sb.length() < length)
                sb.append("-key");
            if (add)
                map.put(sb, bytes);
            else
                map.getUsing(sb, bytes);
            if (i << -7 == 0 && i % 2000000 == 0)
                System.out.println(i);
        }
        long time = System.nanoTime() - start;
        System.out.printf("Map.size: %,d took an average of %,d ns to %s.%n",
                map.size(), time / entries, add ? "add" : "get");
        map.close();
    }
}
prints

Map.size: 100,000,000 took an average of 1,435 ns to add.
Map.size: 100,000,000 took an average of 661 ns to get.

The disk used is 16.8 GB, or 168 bytes for a typically 90-byte entry, though some entries will be larger than this, up to 1 KB. With an 80-byte entrySize there will be around 30-40 bytes of overhead, and typically half of an entry won't be used.
E.g. if the size is 120 bytes, it will use 2 * 80-byte entries to store this. If you add 16 bytes for the hash lookup you can expect an average of 90 + 40 + 16 or 146 bytes.
That leaves 22 bytes unaccounted for, which could be due to the extra 10% we add to the size, or a bug, or an error in my calculations.

I will add an example using an interface.

Peter Lawrey

unread,
Dec 19, 2014, 4:31:45 PM12/19/14
to java-ch...@googlegroups.com
This example uses a type which is a fixed length, which saves a byte.

public class LotsOfEntriesMain {
    public static void main(String[] args) throws IOException {
        workEntries(true);
        workEntries(false);
    }

    private static void workEntries(boolean add) throws IOException {
        long entries = 100_000_000;
        File file = new File("/tmp/lotsOfEntries.dat");
        ChronicleMap<CharSequence, MyFloats> map = ChronicleMapBuilder
                .of(CharSequence.class, MyFloats.class)
                .entries(entries)
                .entrySize(80)
                .createPersistedTo(file);
        Random rand = new Random();
        StringBuilder sb = new StringBuilder();
        MyFloats mf = map.newValueInstance();
        if (add)
            for (int i = 0; i < 6; i++)
                mf.setValueAt(i, i);
        long start = System.nanoTime();
        for (long i = 0; i < entries; i++) {
            sb.setLength(0);
            int length = (int) (24 / (rand.nextFloat() + 24.0 / 1000));
            if (length > 2000)
                throw new AssertionError();
            sb.append(i);
            while (sb.length() < length)
                sb.append("-key");
            if (add)
                map.put(sb, mf);
            else
                map.getUsing(sb, mf);
            if (i << -7 == 0 && i % 2000000 == 0)
                System.out.println(i);
        }
        long time = System.nanoTime() - start;
        System.out.printf("Map.size: %,d took an average of %,d ns to %s.%n",
                map.size(), time / entries, add ? "add" : "get");
        map.close();
    }
}

interface MyFloats {
    public void setValueAt(@MaxSize(6) int index, float f);

    public float getValueAt(int index);
}
This prints

Map.size: 100,000,000 took an average of 1,422 ns to add.
Map.size: 100,000,000 took an average of 676 ns to get.

Kranti Parisa

unread,
Dec 19, 2014, 6:31:20 PM12/19/14
to java-ch...@googlegroups.com
Hi Peter,

Thanks again for sharing the example. 
So there is no significant difference between a fixed-length byte[] and a fixed-size value type? The lookup (get) times are slightly higher in the case of the value type (in this specific example)?
I'm going to regenerate the map with the actual data and see how it behaves.

Roman Leventov

unread,
Dec 19, 2014, 6:48:51 PM12/19/14
to java-ch...@googlegroups.com

A data value generated type is better 1) because you don't need to convert from/to bytes yourself, 2) you can update off-heap bytes directly without ser/deser of the whole value and a repeated search by key, 3) it doesn't copy bytes on deserialization, it only saves a reference to those bytes. If your value size is 24 it might not be so important, but at larger sizes it makes some difference.
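A rough sketch of point 2, assuming the MyFloats interface from Peter's example; whether acquireUsing binds the generated value directly to the off-heap entry may depend on the exact version, so treat this as an illustration rather than a guarantee:

MyFloats mf = map.acquireUsing(key, map.newValueInstance());
mf.setValueAt(0, 1.5f);    // with a generated value, the setter can write straight to the off-heap entry

// reading without copying the whole value onto the heap
MyFloats read = map.getUsing(key, map.newValueInstance());
float f0 = read.getValueAt(0);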

On 20 Dec 2014 at 6:31, "Kranti Parisa" <kranti...@gmail.com> wrote:

Kranti Parisa

unread,
Dec 20, 2014, 3:30:17 AM12/20/14
to java-ch...@googlegroups.com
Thanks again.

Hi Peter,

I've regenerated the map, and here are the findings:

int MAX_ENTRIES = 210000000; // I don't know the exact number before writing the map, and I also want to maintain a margin so that I can write to the same map in future
ChronicleMap<CharSequence, byte[]> map = ChronicleMapBuilder.of(CharSequence.class, byte[].class)
        .entries(MAX_ENTRIES)
        .entrySize(80)
        .createPersistedTo(file);

After writing:
---------
map.size() => 187,541,584
du -sh map.dat => 19GB
Lookup times are still (tried only a few keys) => 6-7 ms

Key format:
<string_can_have_spaces><6_digit_int><varying_length_int>
example keys:
chronicle map1234561234567890
java345678637828292973

Value is fixed byte[24]

I understand that value types might perform better, but I want to see the lookup times in nanos rather than millis with the byte[24] values. Also, I'm not sure if I need to clean up any OS/disk caches on the machine, as I've been generating this map frequently (of course I delete the old map before creating a new one).

I'm still using 2.0.14b. Maybe I should wait until the next release and try this again.

Thanks,
KP


Kranti Parisa

unread,
Dec 22, 2014, 9:09:00 PM12/22/14
to java-ch...@googlegroups.com
Hi Peter & All,

Any tentative date for the next release?

Roman Leventov

unread,
Dec 23, 2014, 12:18:44 AM12/23/14
to java-ch...@googlegroups.com

There has already been a release since our discussion (2.0.16b), you can try it.

On 23 Dec 2014 at 9:09, "Kranti Parisa" <kranti...@gmail.com> wrote:

Kranti Parisa

unread,
Dec 23, 2014, 1:37:18 AM12/23/14
to java-ch...@googlegroups.com
Cool, can we add that to Maven?
http://search.maven.org/#search|gav|1|g:"net.openhft" AND a:"chronicle-map"

Rob Austin

unread,
Dec 23, 2014, 3:56:38 AM12/23/14
to java-ch...@googlegroups.com
We just released another Beta.

Kranti Parisa

unread,
Dec 23, 2014, 2:01:14 PM12/23/14
to java-ch...@googlegroups.com
Cool, thanks. Will try that.

Kranti Parisa

unread,
Dec 23, 2014, 4:37:29 PM12/23/14
to java-ch...@googlegroups.com
I tried with 2.0.16b. It seems the writes (put) are faster now, but the lookups (get) are still at 6-7 ms!!
Does this mean chronicle-map is slower when the keys vary in length?

P.S. The map structure (keys & values) is described below.

Roman Leventov

unread,
Dec 23, 2014, 4:46:31 PM12/23/14
to java-ch...@googlegroups.com

There must be a flaw somewhere: either in your test, or the benchmarking method, or our code. Or problems with your machine. Peter posted an example above, quite close to your case, where put was taking 1,500 ns on average and get 600 ns.

Variable-sized keys/values cost additional bits and a few nanoseconds, but not milliseconds!

On 24 Dec 2014 at 4:37, "Kranti Parisa" <kranti...@gmail.com> wrote:

Peter Lawrey

unread,
Dec 23, 2014, 5:02:07 PM12/23/14
to java-ch...@googlegroups.com

In the test I have done and provided, the latency is an average of 0.0014 ms, which I consider to be high, but nothing like 6 ms. Can you provide a reproducible test to demonstrate this, as I suspect you are testing something different to us.

Kranti Parisa

unread,
Dec 23, 2014, 5:19:50 PM12/23/14
to java-ch...@googlegroups.com
It seems I was still testing with the old version. If I switch to the new version, it throws an error while creating the map, discussed via a separate thread. Rob Austin provided a workaround. I'm going to try that. I'll also post the test code in a while.

Kranti Parisa

unread,
Dec 23, 2014, 5:30:52 PM12/23/14
to java-ch...@googlegroups.com
Hi Peter & Roman,

Here is the test code. I might be doing something wrong to get 6-7 ms responses for lookups (gets). Getting 0.0014 ms would be sweet!!

import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.map.ChronicleMapBuilder;

import java.io.*;
import java.util.zip.GZIPInputStream;

public class CMapTest {

    static int MAX_ENTRIES = 200000000;
    ChronicleMap<CharSequence, byte[]> chMap = null;
    public static final String inputPath = "/code/ChronicleMapTest/dataInput";
    public static final String outputPath = "/code/ChronicleMapTest/dataOutput/cmapTest.dat";

    public static void main(String[] args) {
        try {
            new CMapTest().demoMap();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void demoMap() throws IOException {
        chMap = createChronicleMap();
        System.out.println("Map Size Before= " + chMap.size());
        buildChronicleMap();
        System.out.println("Map Size After = " + chMap.size());
        chMap.close();
    }

    private boolean buildChronicleMap() throws IOException {
        boolean status = false;
        String line = null;
        BufferedReader reader = null;
        try {
            int numOfProcessRecords = 0;
            System.out.println("Processing input path " + inputPath);
            for (Object inputFile : getFiles(inputPath)) {
                System.out.println("Processing file " + inputFile.toString());
                reader = getReader(inputFile);
                int i = 0;
                boolean isProcessed = true;
                long start = System.currentTimeMillis();
                while ((line = reader.readLine()) != null) {
                    i++;
                    isProcessed = processRecord(line, i);
                    if (isProcessed) {
                        numOfProcessRecords++;
                    }
                }
                reader.close();
                reader = null;
                status = true;
                System.out.println("No of lines processed: " + i + " in " + (System.currentTimeMillis() - start) / 1000 + " seconds");
            }
            if (numOfProcessRecords == 0) {
                status = false;
            }
            System.out.println("numOfProcessRecords: " + numOfProcessRecords);
        } catch (IOException e) {
            status = false;
            System.out.println("Failed building the map" + e);
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {
                    System.out.println("Failed to close the reader" + e);
                }
            }
        }
        return status;
    }

    private boolean processRecord(String line, int lineNumber) throws IOException {
        boolean status = true;
        String[] parts = line.split("\t");
        if (parts.length == 13) {
            try {
                if (parts[2].length() > 100) {
                    return false;
                }

                float a = Float.parseFloat(parts[6]);
                float b = Float.parseFloat(parts[5]);
                float c = Float.parseFloat(parts[9]);
                float d = Float.parseFloat(parts[8]);
                float e = Float.parseFloat(parts[12]);
                float f = Float.parseFloat(parts[11]);

                add(createKey(parts[2], parts[0], parts[3]),
                        createValue(a, b, c, d, e, f));
            } catch (Exception e) {
                System.out.println("Failed to process record at lineNumber: " + lineNumber + " Record: " + line);
                e.printStackTrace();
            }
        } else {
            status = false;
        }
        return status;
    }

    private ChronicleMap<CharSequence, byte[]> createChronicleMap() throws IOException {
        File file = new File(outputPath);

        ChronicleMap<CharSequence, byte[]> map = ChronicleMapBuilder.of(CharSequence.class, byte[].class)
                .entries(MAX_ENTRIES)
                // .keySize(41)
                .entrySize(80)
                .constantValueSizeBySample(new byte[24])
                .createPersistedTo(file);

        return map;
    }

    private void add(CharSequence key, byte[] val) throws IOException {
        chMap.put(key, val);
    }

    private CharSequence createKey(String a, String b, String c) {
        StringBuilder sb = new StringBuilder();
        sb.append(a);
        sb.append(b);
        sb.append(c);
        return sb.toString();
    }

    private byte[] createValue(float a, float b, float c, float d, float e, float f) {
        byte[] value = new byte[4 + 4 + 4 + 4 + 4 + 4];
        putInt(value, 0, Float.floatToRawIntBits(a));
        putInt(value, 4, Float.floatToRawIntBits(b));
        putInt(value, 8, Float.floatToRawIntBits(c));
        putInt(value, 12, Float.floatToRawIntBits(d));
        putInt(value, 16, Float.floatToRawIntBits(e));
        putInt(value, 20, Float.floatToRawIntBits(f));
        return value;
    }

    private void putInt(byte[] arr, int offset, int val) {
        arr[offset] = (byte) ((val >>> 24) & 0xFF);
        arr[offset + 1] = (byte) ((val >>> 16) & 0xFF);
        arr[offset + 2] = (byte) ((val >>> 8) & 0xFF);
        arr[offset + 3] = (byte) (val & 0xFF);
    }

    private Object[] getFiles(String dirName) throws IOException {
        File dir = new File(dirName);
        if (dir.isDirectory()) {
            return dir.listFiles(new FilenameFilter() {
                public boolean accept(File d, String name) {
                    return name.startsWith("part");
                }
            });
        }
        return null;
    }

    public BufferedReader getReader(Object inputFile) throws IOException {
        InputStream is = new FileInputStream((File) inputFile);
        if (is == null) {
            return null;
        }
        return new BufferedReader(new InputStreamReader(new GZIPInputStream(is)));
    }
}

Roman Leventov

unread,
Dec 23, 2014, 5:37:39 PM12/23/14
to java-ch...@googlegroups.com

There isn't a map.get() in this code.

On 24 Dec 2014 at 5:30, "Kranti Parisa" <kranti...@gmail.com> wrote:

Kranti Parisa

unread,
Dec 23, 2014, 5:41:08 PM12/23/14
to java-ch...@googlegroups.com
Here it is (in a separate class)

Scanner in = new Scanner(System.in);
System.out.println("Key to lookup: ");
String key = in.nextLine();

ChronicleMap<CharSequence, byte[]> map = ChronicleMapBuilder.of(CharSequence.class, byte[].class)
        .keySize(50)
        .createPersistedTo(file);
byte[] using = new byte[24];
long start = System.nanoTime();

byte[] value = map.getUsing(key, using);
System.out.println("Lookup time: " + (System.nanoTime() - start) + " ns");


Roman Leventov

unread,
Dec 23, 2014, 5:46:45 PM12/23/14
to java-ch...@googlegroups.com

Why are you creating a new map on each getUsing operation? It seems you are measuring the map bootstrap :)

On 24 Dec 2014 at 5:41, "Kranti Parisa" <kranti...@gmail.com> wrote:

Kranti Parisa

unread,
Dec 23, 2014, 6:00:19 PM12/23/14
to java-ch...@googlegroups.com
So the map should be warmed up, and we get faster lookups over a period of time?

Peter Lawrey

unread,
Dec 24, 2014, 3:12:06 AM12/24/14
to java-ch...@googlegroups.com

To be fair, taking 6-7 ms to reload a map from disk with 200 million entries is pretty good.
If you want to time how long get() takes, I suggest timing it repeatedly after the code has warmed up, e.g. running a test of at least 2 seconds multiple times, say 5 times, and taking the median.
BTW, if you want to store 6 floats I suggest using an off-heap reference with named fields, otherwise a float[].
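A rough sketch of that timing approach (the keys array and run counts are made up for illustration):

byte[] using = new byte[24];
long[] avgNanos = new long[5];
for (int run = 0; run < 5; run++) {
    long ops = 0;
    long start = System.nanoTime();
    while (System.nanoTime() - start < 2_000_000_000L) {   // run for at least 2 seconds
        map.getUsing(keys[(int) (ops % keys.length)], using);
        ops++;
    }
    avgNanos[run] = (System.nanoTime() - start) / ops;     // average ns per get for this run
}
java.util.Arrays.sort(avgNanos);
System.out.println("Median ns per get: " + avgNanos[2]);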

Kranti Parisa

unread,
Dec 24, 2014, 4:00:10 PM12/24/14
to java-ch...@googlegroups.com
That makes sense. I'll run the test with a timer and capture the times again.
...

Kranti Parisa

unread,
Dec 24, 2014, 5:12:10 PM12/24/14
to java-ch...@googlegroups.com
I generated a keyArray[] with 100 keys that exist in the map, and ran the lookup test 5 times, each for 15 seconds, randomly looking up keys from keyArray[].

Here are the test results in nanoseconds:

50% / 90% / 99% // 99.9% / 99.99% / Max Latency ====  509 / 634 / 848 // 9,666 / 31,423 / 538,873

50% / 90% / 99% // 99.9% / 99.99% / Max Latency ====  487 / 600 / 799 // 9,677 / 31,073 / 612,622

50% / 90% / 99% // 99.9% / 99.99% / Max Latency ====  487 / 595 / 800 // 5,528 / 31,292 / 566,429

50% / 90% / 99% // 99.9% / 99.99% / Max Latency ====  507 / 625 / 844 // 5,927 / 31,467 / 339,205

50% / 90% / 99% // 99.9% / 99.99% / Max Latency ====  505 / 624 / 842 // 9,220 / 31,414 / 638,444

Looks pretty good.
Thanks Peter & team for the support. Now I should plan to test this in a real-world use case :)
Will keep you posted!
...

voodoowill

unread,
Dec 24, 2014, 11:58:04 PM12/24/14
to java-ch...@googlegroups.com
Kranti: How did you calculate the percentile values? This is not a binomial distribution. And I presume your CPU choice gives actual nanosecond resolution for timing (as opposed to a difference in CPU cycles)?
...

voodoowill

unread,
Dec 24, 2014, 11:59:35 PM12/24/14
to java-ch...@googlegroups.com
It's called the compile threshold and profile generation, which allows the JVM JIT compiler to optimize the code. It takes a little time and a good bit of load.
...

Kranti Parisa

unread,
Dec 25, 2014, 12:31:01 AM12/25/14
to java-ch...@googlegroups.com
It's a very basic test: record endTime - startTime into an array, sort the array and get the percentiles (very similar to what Peter had in the load tests). See the sketch below.
I agree we will need to do a comprehensive test, but I was looking for the basic test results before moving forward to test other ways.
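For what it's worth, a minimal sketch of that percentile step (latencies is assumed to be the array of per-lookup endTime - startTime values in nanoseconds):

java.util.Arrays.sort(latencies);
long p50  = latencies[(int) (latencies.length * 0.50)];
long p90  = latencies[(int) (latencies.length * 0.90)];
long p99  = latencies[(int) (latencies.length * 0.99)];
long p999 = latencies[(int) (latencies.length * 0.999)];
long max  = latencies[latencies.length - 1];
System.out.printf("50%% / 90%% / 99%% / 99.9%% / max = %,d / %,d / %,d / %,d / %,d ns%n",
        p50, p90, p99, p999, max);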
...

Kranti Parisa

unread,
Dec 25, 2014, 1:38:45 AM12/25/14
to java-ch...@googlegroups.com
My next test would be to drive the lookups using an ExecutorService, record the lookup times, and also monitor the JVM & GC activity during the test.
...

Kranti Parisa

unread,
Dec 25, 2014, 1:44:08 AM12/25/14
to java-ch...@googlegroups.com
Peter,
Using float[6] instead of byte[24] seems to increase the lookup times by 6x. I haven't looked at the details, but I generated the map with float[6] as the value and ran the same lookup test.

On Wednesday, December 24, 2014 12:12:06 AM UTC-8, Peter Lawrey wrote:
...

Roman Leventov

unread,
Dec 25, 2014, 2:20:47 AM12/25/14
to java-ch...@googlegroups.com
Yes, because float[6] is just serialized as of now. Peter had in mind:
interface MyFloats { float getValueAt(@MaxSize(6) int index); void setValueAt(int index, float value); }

Peter Lawrey

unread,
Dec 25, 2014, 2:34:45 AM12/25/14
to java-ch...@googlegroups.com
Or you could give fields meaningful names.

interface MyData {
    float getFieldOne();
    void setFieldOne(float x);
    float getFieldTwo();
    void setFieldTwo(float x);
    float getFieldThree();
    void setFieldThree(float x);
    float getFieldFour();
    void setFieldFour(float x);
    float getFieldFive();
    void setFieldFive(float x);
    float getFieldSix();
    void setFieldSix(float x);
}

Kranti Parisa

unread,
Dec 28, 2014, 3:53:55 PM12/28/14
to java-ch...@googlegroups.com
Ya, will try that and compare the numbers.