Expose API to get longest prefix match

dusan....@gmail.com

unread,

Aug 7, 2013, 9:28:22 AM8/7/13

to concurrent-t...@googlegroups.com

I'm trying achieve something similar as was mentioned at issue#2 (Expose API to get longest prefix match / "closest" keys).
(https://code.google.com/p/concurrent-trees/issues/detail?id=2)

As asked by "phraktle" (and as I understand his question) "longest prefix match" is key from Tree (RadixTree)
where:
- queried key.startWith(key) is true and is the longest from within all such keys in RadixTree.

simple example:
RadixTree<Integer> tree = new RadixTree<Integer>(new DefaultCharArrayNodeFactory());
tree.put("421902903",1);
tree.put("421",2);

usage:
tree.getLongestKey("4219"); // returns "421"
tree.getLongestKey("42190290"); // returns "421"
tree.getLongestKey("4219029031"); // returns "421902903"
tree.getLongestKey("42"); //returns null
tree.getLongestKey("421"); //returns "421"

this can be used for example for routing (if phone number (queried key) starts with 421902903 do something,
if phone number starts with 421 do something else). I think RadixTree is most suited
for such lookups.

tree.getClosestKey has different behaviour and will not return for queried key "42190290" "421" instead it will return iterator
with only key "421902903".

But I think, it's not possible within current implementation (concurrent-trees-2.0.0).

dusan....@gmail.com

unread,

Aug 7, 2013, 9:42:51 AM8/7/13

to concurrent-t...@googlegroups.com

same, can be achieved using TreeMap:

        TreeMap<String, String> map = new TreeMap<String, String>();
        map.put("421903", "2");
        map.put("421", "1");

        System.out.println( map.floorEntry("4") ); // null
        System.out.println( map.floorEntry("421") ); // 421=1
        System.out.println( map.floorEntry("4212") ); // 421=1
        System.out.println( map.floorEntry("42190") ); // 421=1
        System.out.println( map.floorEntry("421903") ); // 421903=2
        System.out.println( map.floorEntry("4219032") ); // 421903=2

Niall

unread,

Aug 7, 2013, 5:49:01 PM8/7/13

to concurrent-t...@googlegroups.com

Hi Dusan,

This was a good suggestion, thanks. I have implemented it and released it as Concurrent-Trees 2.1.0. It should sync to Maven central in the next few hours.

The feature is in InvertedRadixTree, as opposed to RadixTree.

There is a (somewhat artificial) distinction between the RadixTree and the InvertedRadixTree. Both use the same data structure, but require quite different traversal algorithms.

The RadixTree is concerned with looking up keys stored in the tree given prefixes of those keys. The InvertedRadixTree is concerned with scanning input documents for keys contained in the tree (i.e. the other way around).

The traversal algorithm required to implement this feature was already implemented in InvertedRadixTree as it is needed by the existing getKeysContainedIn method. However it was not exposed directly because getKeysContainedIn wraps it in additional logic.

I have added InvertedRadixTree.getKeysPrefixing(CharSequence document) and related methods there.

There is a difference between this implementation and the exact functionality you were looking for, however it should be pretty efficient for your use case nonetheless. Basically the algorithm will traverse the tree and instead of returning the longest key, it will emit every key in the tree which it finds to prefix the input document. This functionality is basically free as the relevant nodes must be traversed anyway, and it might have additional applications beyond your use case. So for the longest key, just iterate through all keys returned and discard all but the last key. The last key will be the longest key in the tree which is a prefix of your input document (phone number in your case).

I tracked this in issue 5 and you can find example usage there.

Thanks for the suggestion, and let me know if you have any problems.

Niall

hcouplet

unread,

Jun 24, 2014, 11:38:31 AM6/24/14

to concurrent-t...@googlegroups.com

TreeMap will not work if you add in you sample

map.put("4210", "3");

System.out.println( map.floorEntry("4212") ); will then go to //4210=3 instead of 421=1

InvertedRadixTree works in that case.

Reply all

Reply to author

Forward