longest prefix match query

47 views
Skip to first unread message

Glen Lockhart

unread,
Sep 18, 2019, 11:53:35 AM9/18/19
to cqengine-discuss
My company is looking to adopt CQEngine for use in a telco application.  We have a requirement for a best / longest prefix match type search.  While we could implement this in a library that makes use of CQEngine, we thought we'd see if this is something that could be built in.

The best prefix match requirement breaks down as the following: 

Given a cache with the following records

| id | code | Operator |
| 0 | 35387 | Vodafone |
| 1 | 35385 | Eir |
| 2 | 353855 | A.N.Other |

A best prefix query using a  query as shown below would return the entity with id 0
Query<Foo> query1 = bestPrefix(Foo.CODE, "35387123456"));

Similarly, queries with the following values would return these rows
353851234567 -> 1
353855234567 -> 2


Do you think this would be something that
  1. Would be feasible to implement in CQEngine?
  2. Would be something you'd like to see added?
If the answer to 1 and 2 above is yes, could you suggest a good place for us to start looking at this (we're happy to do the work, but aren't overly familiar with the internals of CQEngine and would appreciate any pointers you might have for us to get started)

thanks
Glen

Niall Gallagher

unread,
Sep 18, 2019, 3:19:49 PM9/18/19
to cqengine...@googlegroups.com
You could take a look at my concurrent-trees project: https://github.com/npgall/concurrent-trees

Some of the data structures in that project are used as indexes in CQEngine. But not all of their features are exposed via CQEngine.

In particular, the InvertedRadixTree data structure's method getKeysPrefixing() might do what you want?


If you would like to integrate that with CQEngine it should be quite easy to do, and I'd definitely welcome a pull request to do that. 

To do so, you would just need to implement a query like StringIsContainedIn (perhaps called StringIsPrefixOf):


And then extend the InvertedRadixTreeIndex (which wraps the InvertedRadixTree) to accelerate that query as well:


Basically at the moment, it looks like CQEngine is only leveraging a subset of the lookup operations supported by InvertedRadixTree. So it should be straightforward to expose some additional ones.

Alternatively,  you might find that the InvertedRadixTree on it's own would suffice for your use case, and you don't need a full query engine around it. 

Does this help?
Niall

Glen Lockhart

unread,
Sep 19, 2019, 8:46:06 AM9/19/19
to cqengine-discuss
Thanks for the pointers!  I see now how that could be implemented.  I'll try updating CQEngine to add this feature and issue a PR for when I get it working.

Glen Lockhart

unread,
Sep 19, 2019, 12:23:16 PM9/19/19
to cqengine-discuss
Niall, I have a working solution based on the pointers you sent me. But there are 2 issues with it you might be able to help with.

  1. The new longestPrefix query only works correctly if used in conjunction with the InvertedRadixTreeIndex.  If no index is supplied (or a different index is used), then the user will get incorrect results (it behaves more like a StringIsContainedIn with a prefix).
    1. Given that this gives incorrect results, not just a slow query, Is there a way to programatically restrict a query to only work with an index and throw an exception if used incorrectly?
  2. Using this query with a matchesNonSimpleAttribute just doesn't make sense, so I've hardcoded the query to always return false.  Again, is there a way to indicate that this usage is incorrect? throw a runtime exception or something?

cheers,
Glen

On Wednesday, 18 September 2019 20:19:49 UTC+1, Niall wrote:
Reply all
Reply to author
Forward
0 new messages