How to search persisted Java objects

Chris

Apr 14, 2015, 7:28:05 AM

to java-ch...@googlegroups.com

Hi,

I was wondering whether there is another (more efficient) way of searching for particular objects stored in a Vanilla Chronicle. My code persists Externalizable instances, then when a search is required, I just run a simple iteration over all entries like this:

ExcerptTailer tailer;
try {
	tailer = chronicle.createTailer();
} catch (IOException e) {
	logger.log(Level.WARNING, "Could not create ExcerptTailer. Check exception", e);
	return Collections.emptyList();
}

// Start from the first entry.
tailer.toStart();

Collection<RestDataHolder> result = new ArrayList<RestDataHolder>();

// nextIndex() returns false once there is no more data.
while (tailer.nextIndex()) {
	Object obj = tailer.readObject();
	tailer.finish();

	if (!(obj instanceof RestDataHolder))
		continue; // should be treated as an exception though

	RestDataHolder data = (RestDataHolder) obj;

	if (!accountId.equals(data.getAccountId()))
		continue;
...

In the code above, "accountId" is the key to search for. I am not sure whether I am missing something and there is another way to search for the data.

Thanks,

Chris

Peter Lawrey

Apr 15, 2015, 7:01:02 AM

to java-ch...@googlegroups.com

There are two approaches.
- Add the accountId to the start of the message to speed up the scan. If the field is monotonically increasing, a binary search is supported.
- Add a Map as an index for fast lookups.
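
The Map approach can be sketched roughly as follows. This is a minimal, in-memory illustration in plain Java, not part of Chronicle: the class name AccountIndex and its methods are hypothetical, and it assumes the application knows the index at which each entry was written and can later position a tailer at a given index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: maps an account ID to the indexes of its persisted entries.
class AccountIndex {
    private final Map<String, List<Long>> indexesByAccount = new HashMap<>();

    // Call once per appended entry, passing the index the entry was written at.
    void record(String accountId, long entryIndex) {
        indexesByAccount.computeIfAbsent(accountId, k -> new ArrayList<>()).add(entryIndex);
    }

    // Returns the recorded indexes for an account, or an empty list if none.
    List<Long> lookup(String accountId) {
        return indexesByAccount.getOrDefault(accountId, Collections.emptyList());
    }
}
```

A search then reads only the entries whose indexes lookup() returns instead of scanning everything. Since this map lives only in memory, it would have to be rebuilt by one full scan after a restart, or be persisted separately.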


Chris

Apr 15, 2015, 10:59:55 AM

to java-ch...@googlegroups.com

Dear Peter,

Thank you for your reply. I do not understand how to practically implement the 1st approach.

Are there any examples/snippets of how this can be done? For example, I don't get how adding "accountId" to the start of the message will speed up the scan. Should I persist two separate messages, one containing only the account ID and the subsequent one containing the actual data?

Thanks,

Chris

Peter Lawrey

Apr 15, 2015, 12:51:59 PM

to java-ch...@googlegroups.com

I guess you imagine it to be more complex than it is. Instead of writing

appender.writeObject(message);

you write

appender.writeUTFΔ(message.getAccountId());

appender.writeObject(message);

Now the accountId appears at the start of every message. This means that to check the accountId of a message, all you need to do is read the start of the message.

List<RestDataHolder> result = new ArrayList<RestDataHolder>();

while (tailer.nextIndex()) {
	// Read only the leading accountId; deserialize the object only on a match.
	String accountId = tailer.readUTFΔ();
	if (accountId.equals(findAccountId))
		result.add((RestDataHolder) tailer.readObject());
	tailer.finish();
}

This speeds up the search as it doesn't need to deserialize the object if the object is not needed.

To speed this up further, you can make the object Externalizable or even BytesMarshallable and make the accountId the first field you serialize. This would be more efficient in general, and would also avoid writing the accountId twice.
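
The "accountId as the first serialized field" idea can be illustrated with plain java.io. This is only a sketch: AccountRecord is a hypothetical stand-in for RestDataHolder, the payload field is invented, and the helper methods use standard object streams rather than Chronicle, so how Chronicle itself frames an Externalizable written with writeObject is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for RestDataHolder; the payload field is invented for illustration.
class AccountRecord implements Externalizable {
    private String accountId;
    private String payload;

    public AccountRecord() { } // Externalizable needs a public no-arg constructor

    AccountRecord(String accountId, String payload) {
        this.accountId = accountId;
        this.payload = payload;
    }

    String getAccountId() { return accountId; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(accountId); // key first, so a reader can stop after it
        out.writeUTF(payload);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        accountId = in.readUTF();
        payload = in.readUTF();
    }

    // Writes only the fields (no class descriptor) to a byte array.
    static byte[] toBytes(AccountRecord r) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                r.writeExternal(out);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads just the leading account ID, leaving the rest undecoded.
    static String peekAccountId(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the key is the first thing written, a scan can compare it and skip the remaining fields when it does not match, without storing the accountId a second time in front of the object.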

Chris

Apr 15, 2015, 1:16:54 PM

to java-ch...@googlegroups.com

Thank you very much, your example makes it completely clear.

I will also check how to use BytesMarshallable as a better alternative (I suppose) to plain Externalizable.

Thanks,

Chris

Peter Lawrey

Apr 15, 2015, 2:09:44 PM

to java-ch...@googlegroups.com

Chronicle is already much more efficient with Externalizable than with Java Serialization; however, Externalizable only exposes a basic set of methods. BytesMarshallable gives you the full range of methods Bytes offers. That is, if you don't see an opportunity to use one of the alternative methods Bytes offers, there is no benefit.

Reasons you might use Bytes:
- compressed data types, e.g. stop-bit encoding.
- object pooling built in to avoid garbage.
- doing your own object management to avoid garbage completely.
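
To show why stop-bit encoding counts as a compressed data type, here is a small sketch in plain Java. It is an illustration of the general technique only, not Chronicle's implementation; the class StopBit and its methods are hypothetical, and it handles non-negative values only.

```java
import java.io.ByteArrayOutputStream;

// Illustrative stop-bit codec for non-negative longs; not Chronicle's own code.
class StopBit {
    // Emits 7 bits per byte, low bits first; a set high bit means "more bytes follow".
    static byte[] encode(long value) {
        if (value < 0) throw new IllegalArgumentException("non-negative values only");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value); // final byte has the high bit clear
        return out.toByteArray();
    }

    // Rebuilds the value, stopping at the first byte whose high bit is clear.
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
        }
        return result;
    }
}
```

Small values take a single byte instead of the eight a fixed-width long would need, which is where the saving comes from.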

Nicholas Whitehead

Apr 15, 2015, 5:05:01 PM

to java-ch...@googlegroups.com

Apologies for hijacking, but how does one implement the binary search on the first [monotonic] field? Is that built in?

Peter Lawrey

Apr 15, 2015, 6:08:46 PM

to java-ch...@googlegroups.com

We support binary search and range searches. You provide a comparator, and the binary search finds either an index where the comparator returns 0 or, if there is no match, the point where the entry below returns -1 and the entry above returns +1.
For a range search, it finds both the point where the result changes from -1 to 0 and the point where it changes from 0 to +1. This allows you to find the entries for a particular hour, for example.
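
The comparator-driven search described above can be sketched generically as follows. This is not Chronicle's built-in API: the class IndexSearch and its methods are hypothetical, the comparator is modelled as a function from entry index to -1/0/+1, and the searched field is assumed to be non-decreasing across indexes 0 to size-1.

```java
import java.util.function.LongToIntFunction;

// Hypothetical sketch: searches entry indexes 0..size-1 over a non-decreasing key.
class IndexSearch {
    // compare.applyAsInt(i) is <0 if entry i is below the target, 0 if it matches, >0 if above.
    // Returns the first index whose entry is not below the target (size if none).
    static long lowerBound(long size, LongToIntFunction compare) {
        long lo = 0, hi = size;
        while (lo < hi) {
            long mid = (lo + hi) >>> 1;
            if (compare.applyAsInt(mid) < 0) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Returns the first index whose entry is above the target (size if none).
    static long upperBound(long size, LongToIntFunction compare) {
        long lo = 0, hi = size;
        while (lo < hi) {
            long mid = (lo + hi) >>> 1;
            if (compare.applyAsInt(mid) <= 0) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

The matching entries are those from lowerBound (inclusive) to upperBound (exclusive); if the two are equal, nothing matched and the value marks the boundary between the -1 and +1 entries. In a real setup the comparator would read the leading field of the entry at the given index.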

Alex Smith

May 21, 2015, 5:45:28 AM

to java-ch...@googlegroups.com

As far as I can see, binary search is only supported for IndexedChronicle. Are there any plans to support it for VanillaChronicle as well?
