Parsing dynamic queries...


Lewis Henderson

Jan 10, 2014, 3:58:09 PM
to cqengine...@googlegroups.com
I have been evaluating CQEngine and am looking for help in using the query parser.

1) How do I register 'dynamic' attributes? By this I mean that my domain objects contain Maps as well as simple properties, e.g.

import java.util.Map;

public class DomainObject {
    private Long id;
    private Map<String, Object> details;

    public Long getId() {
        return id;
    }

    public Map<String, Object> getDetails() {
        return details;
    }

    // Looks up a single dynamic property by key.
    public Object getDetail(String key) {
        return details.get(key);
    }
}



2) How do I register attributes for a foreign collection?

An example of how to code the Car -> Garage using the parser would be useful.

I will have at most 3 or 4 joins.



Cheers

Niall Gallagher

Jan 10, 2014, 4:44:20 PM
to cqengine...@googlegroups.com
Hi Lewis,

You mean the string query parser in package com.googlecode.cqengine.query.parser?

I would actually not recommend that you use anything in that package yet. That package is not a documented feature yet :) It contains experimental support to perform SQL and other string-based queries, but it's still under development.

The main way to perform queries in CQEngine is by composing queries programmatically, as you can see in the existing documentation.

You don't need to register attributes with the collection. You can just write the attribute, and then perform queries with it on the collection. CQEngine will ask the attribute to return a value for an object being examined, and compare that to a value supplied in the query. If you wish though, you can build an index on the attribute. And you can then register the index with the collection. Then CQEngine will use the index for queries on that attribute.
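
To make the idea concrete, here is a minimal plain-Java sketch of an attribute that reads one key from a details map, plus an optional hash index over it. It deliberately does not use CQEngine's classes: `KeyedAttribute`, `scan` and `buildIndex` are hypothetical names chosen for illustration, and CQEngine's own attribute and index types differ in detail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an attribute: it knows how to read one value from an object.
final class KeyedAttribute {
    private final String key;

    KeyedAttribute(String key) {
        this.key = key;
    }

    // Returns the value stored under this attribute's key, or null if the key is absent.
    Object getValue(Map<String, Object> details) {
        return details.get(key);
    }
}

final class AttributeSketch {
    // Without an index: ask the attribute for each object's value and compare it to the query value.
    static List<Map<String, Object>> scan(List<Map<String, Object>> objects,
                                          KeyedAttribute attribute,
                                          Object wanted) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> object : objects) {
            if (wanted.equals(attribute.getValue(object))) {
                matches.add(object);
            }
        }
        return matches;
    }

    // With an index: group objects by attribute value once, so later lookups avoid the scan.
    static Map<Object, List<Map<String, Object>>> buildIndex(List<Map<String, Object>> objects,
                                                             KeyedAttribute attribute) {
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> object : objects) {
            Object value = attribute.getValue(object);
            if (value != null) {
                index.computeIfAbsent(value, k -> new ArrayList<>()).add(object);
            }
        }
        return index;
    }
}
```

The scan works with no setup at all, which mirrors the point that registration is optional; the index only changes how quickly matching objects are found.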

Your example is quite abstract. Could you provide more specific examples of what the data would look like, and the types of queries you want to run on it?

--
-- You received this message because you are subscribed to the "cqengine-discuss" group.
http://groups.google.com/group/cqengine-discuss

Lewis Henderson

Jan 10, 2014, 5:50:35 PM
to cqengine...@googlegroups.com
Niall,

Yes, I am looking at compiling queries based on a text representation. I already use JoSQL (http://josql.sourceforge.net/) to perform what I am looking for in my current applications.

The syntax does not have to be SQL, but it must be easily understandable by a (knowledgeable!) end user.

My problem, and the reason for this post, is that the 'attributes' depend on the query more than on the 'domain object'. You can think of the 'domain objects' as Maps with one or two mandatory keys (attributes) such as ID and DESCRIPTION. As the system runs, these objects have keys added, updated and possibly removed.

An example :-

Workitem DomainObject
    Long id = 1 (PK)
    String description = 'Timesheet for Joe'
    Map details =>
        monday = 7hrs
        tuesday = 6.5hrs

Address Domain Object
    Long id = 1 (PK)
    Long workitem = 1 (FK)
    process = 'Timesheet'
    step = 'Create Timesheet'

There is a 1 to many relationship between Workitem and Address due to parallel processing where a workitem can be at two or more steps(addresses) at the same time.

The query could be :-

Give me all Workitems where step = 'Create Timesheet' and 'tuesday' > 3hrs.

The join columns will never change so the indexes/attributes for these can be set up outside of the query.


Lewis

Niall

Jan 10, 2014, 7:29:47 PM
to cqengine...@googlegroups.com
Hi Lewis,

In a relational database, you would model the details map as a separate table: WorkItemEntry [ id (PK), workItemId (FK), day, time ]

So one option is to do the same in CQEngine. This way, the hours worked on particular days live in a separate IndexedCollection, and you can add and remove items in that collection dynamically.
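
A minimal sketch of that flattening step, in plain Java: each key of a work item's details map becomes one row with a foreign key back to the work item. The `EntryRow` class and the `flatten` method are hypothetical and are not the WorkItemEntry class from the attached zip.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical row type: one (workItemId, day, time) tuple per key in the details map.
final class EntryRow {
    final long id;          // primary key of this row
    final long workItemId;  // foreign key to the owning work item
    final String day;
    final double time;

    EntryRow(long id, long workItemId, String day, double time) {
        this.id = id;
        this.workItemId = workItemId;
        this.day = day;
        this.time = time;
    }
}

final class FlattenSketch {
    // Turns one work item's details map into rows, assigning ids sequentially from firstId.
    static List<EntryRow> flatten(long workItemId, Map<String, Double> details, long firstId) {
        List<EntryRow> rows = new ArrayList<>();
        long nextId = firstId;
        for (Map.Entry<String, Double> entry : details.entrySet()) {
            rows.add(new EntryRow(nextId++, workItemId, entry.getKey(), entry.getValue()));
        }
        return rows;
    }
}
```

Adding or removing a key on a work item then becomes adding or removing one row, which is what lets the set of "attributes" vary per object.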

I created POJO classes for Address, WorkItem, and WorkItemEntry, and auto-generated attributes for them. You can find source in the attached .zip file.

Here's the main method which demonstrates creating an indexed collection for each of these types of object, and doing a join between them.

    public static void main(String[] args) {
        IndexedCollection<WorkItem> workItems = CQEngine.newInstance();
        workItems.add(new WorkItem(1L, "Timesheet for Joe"));
        workItems.add(new WorkItem(2L, "Timesheet for Jane"));

        IndexedCollection<WorkItemEntry> workItemEntries = CQEngine.newInstance();
        workItemEntries.add(new WorkItemEntry(1L, 1L, "monday", 7.0));
        workItemEntries.add(new WorkItemEntry(1L, 1L, "tuesday", 6.5));
        workItemEntries.add(new WorkItemEntry(2L, 2L, "tuesday", 2.0)); // Jane also worked on tuesday, but for < 3 hrs

        IndexedCollection<Address> addresses = CQEngine.newInstance();
        addresses.add(new Address(1L, 1L, "Timesheet", "Create Timesheet"));
        addresses.add(new Address(2L, 2L, "Timesheet", "Create Timesheet"));

        ResultSet<WorkItem> results = workItems.retrieve(
            and(
                existsIn(workItemEntries,
                        WorkItem.WORK_ITEM_ID,
                        WorkItemEntry.WORK_ITEM_ID,
                        and(
                            equal(WorkItemEntry.DAY, "tuesday"),
                            greaterThan(WorkItemEntry.TIME, 3.0)
                        )
                ),
                existsIn(addresses,
                        WorkItem.WORK_ITEM_ID,
                        Address.WORK_ITEM_ID,
                        equal(Address.STEP, "Create Timesheet")
                )
            )
        );

        System.out.println(results.uniqueResult().getDescription()); // "Timesheet for Joe"
    }

If you change "greaterThan" to "lessThan" in the query, you get "Timesheet for Jane" instead.
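
Conceptually, an existsIn join keeps a local object when the foreign collection contains at least one object with a matching key that also satisfies the nested query. The plain-Java sketch below mimics that with streams. It illustrates the semantics only, not how CQEngine evaluates the query, and the `existsIn` helper here is a hypothetical stand-in for the real one.

```java
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class JoinSketch {
    // Keeps each local object for which some foreign object has an equal key and passes the filter.
    static <L, F, K> List<L> existsIn(Collection<L> local,
                                      Collection<F> foreign,
                                      Function<L, K> localKey,
                                      Function<F, K> foreignKey,
                                      Predicate<F> foreignFilter) {
        return local.stream()
                .filter(l -> foreign.stream()
                        .anyMatch(f -> localKey.apply(l).equals(foreignKey.apply(f))
                                && foreignFilter.test(f)))
                .collect(Collectors.toList());
    }
}
```

This naive version rescans the foreign collection for every local object; an index on the foreign key is what would avoid that cost in practice.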

HTH,
Niall
temp_worksheet.zip

Niall

Jan 10, 2014, 7:49:06 PM
to cqengine...@googlegroups.com
BTW I had a typo in the snippet above, but you get the idea.

The primary keys should be unique (1, 2, 3):
        workItemEntries.add(new WorkItemEntry(1L, 1L, "monday", 7.0));
        workItemEntries.add(new WorkItemEntry(2L, 1L, "tuesday", 6.5));
        workItemEntries.add(new WorkItemEntry(3L, 2L, "tuesday", 2.0)); // Jane also worked on tuesday, but for < 3 hrs

Lewis Henderson

Jan 16, 2014, 6:26:14 AM
to cqengine...@googlegroups.com
Niall,

Thanks for that, and sorry for the delay in replying. For some reason I did not get an email!?

I have moved on from where I was, and have created a parser to create ALL of your current queries and added spring security ACLs into the mix...

I can now parse and execute queries such as :-

and(
    equal(
        Workitem[queueType]
        , "next"
    )
    , existsIn(
        Address
        , Workitem[id]
        , Address[id]
        , and(
            not(
                has(
                    Address[lockedBy]
                )
            )
            , hasPermission(
                Address[threadIdentity]
                , CLAIM
            )
        )
    )
)

I *may* change the syntax slightly. I had the [] to add options but they are currently not needed and it would be better to have something like Address.lockedBy instead of Address[lockedBy]...

From the above, the context of the query (an IndexedCollection) is provided at query execution time. The attributes are looked up at query parse time, and any other context named in the query (Address in this example) is matched to its IndexedCollection on the fly.
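
The lookup described here might be sketched as follows in plain Java. The `Context[attribute]` pattern comes from the queries above; the `ReferenceResolver` class, its regular expression, and the use of plain functions over maps as attributes are assumptions made for illustration and are not taken from the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReferenceResolver {
    // Matches references such as Address[lockedBy]: a context name, then an attribute name in brackets.
    private static final Pattern REFERENCE = Pattern.compile("(\\w+)\\[(\\w+)\\]");

    // Context name -> (attribute name -> function reading that attribute from an object).
    private final Map<String, Map<String, Function<Map<String, Object>, Object>>> contexts = new HashMap<>();

    void register(String context, String attribute, Function<Map<String, Object>, Object> reader) {
        contexts.computeIfAbsent(context, k -> new HashMap<>()).put(attribute, reader);
    }

    // Resolves a reference at parse time, failing fast on unknown contexts or attributes.
    Function<Map<String, Object>, Object> resolve(String reference) {
        Matcher m = REFERENCE.matcher(reference.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a Context[attribute] reference: " + reference);
        }
        Map<String, Function<Map<String, Object>, Object>> attributes = contexts.get(m.group(1));
        if (attributes == null || !attributes.containsKey(m.group(2))) {
            throw new IllegalArgumentException("Unknown reference: " + reference);
        }
        return attributes.get(m.group(2));
    }
}
```

Resolving names while parsing means a misspelt context or attribute is reported before any collection is queried.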

The performance is excellent! It is FAR better than the current alternative database solution.

I *may* add the ability to create/remove indexes and ordering via a similar syntax as the user/client app may be able to help. For example a lookup on a dynamic property of, say 'Reference', would perform better with an index.


Cheers

Niall

Jan 16, 2014, 6:55:11 AM
to cqengine...@googlegroups.com
Hi Lewis,

Wow that's cool!

So are you taking the string queries and compiling them to regular Java CQEngine queries, or did you extend the string parser to add support for the rest of the query types?

Nice work!
Niall

Lewis Henderson

Jan 16, 2014, 7:12:12 AM
to cqengine...@googlegroups.com
Niall,

I built a parser using ANTLR4 to enable me to quickly change the syntax and/or add features.

The main work is done in a visitor. Here is an example of the usage in my spring MVC test controller :-

    @RequestMapping(value = "/", method = RequestMethod.GET)
    protected ModelAndView testGet(
            @RequestParam(required = false) String filter
            , @RequestParam(defaultValue = MODEL_ADDRESS) String type) {
        // Set defaults
        if (filter == null)
            if (MODEL_PROCESS.equals(type))
                filter = "equal(" + MODEL_PROCESS + "[state],\"active\")";
            else if (MODEL_WORKITEM.equals(type))
                filter = "equal(" + MODEL_WORKITEM + "[queueType],\"next\")";
            else
                filter = "hasPermission(" + MODEL_ADDRESS + "[threadIdentity],CLAIM)";

        // Create an input stream from the query text...
        ANTLRInputStream input = new ANTLRInputStream(filter);
        
        // The lexer is injected.
        lexer.setInputStream(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        tokens.fill();

        // The visitor is injected.
        NativeQueryVisitorImpl visitor = (NativeQueryVisitorImpl) parseTreeVisitor;

        /* 
            Create 'context' providers. The key is the name of the context in the query. For example, in the following
            query we have 'Workitem' and 'Address' contexts. The providers return both Attributes and IndexedCollections
            to the visitor to enable it to create its queries.
and(
    equal(
        Workitem[queueType]
        , "next"
    )
    , existsIn(
        Address
        , Workitem[id]
        , Address[id]
        , not(
            has(
                Address[lockedBy]
            )
        )
    )
)
         */
        Map<String, Provider> providers = new HashMap<String, Provider>();
        providers.put(MODEL_PROCESS, new ProviderImpl(
                processAttributes
                , processCollection
        ));
        providers.put(MODEL_WORKITEM, new ProviderImpl(
                workitemAttributes
                , workitemCollection
        ));
        providers.put(MODEL_ADDRESS, new ProviderImpl(
                addressAttributes
                , addressCollection
        ));

        visitor.setProviders(providers);

        // Give the parser the tokens created by the lexer...
        parser.setTokenStream(tokens);
        // Create a parse tree.
        ParserRuleContext tree = ((NativeQueryParser) parser).startRule();
        // Visit the nodes in the parse tree creating the query
        Query query = (Query) parseTreeVisitor.visit(tree);

        // Use it...
        ResultSet<Model> resultSet = providers.get(type).getCollection().retrieve(query);

        Map<String, Object> model = new HashMap<String, Object>();
        model.put("filter", filter);
        model.put("type", type);
        model.put("items", resultSet.iterator());
        return new ModelAndView(type.toLowerCase() + "List", model);
    }


This was FAR easier and more flexible than extending the string parser!

Cheers

Niall Gallagher

Jan 16, 2014, 11:01:35 AM
to cqengine...@googlegroups.com
Hi Lewis,

That is pretty cool. If you'd like to contribute any of that string query support to the CQEngine project I'll look at including it :D

I'm definitely in favour of not having to modify any string parsers if additional native query types are added, so I like your approach.

Thanks,
Niall

Lewis Henderson

Jan 19, 2014, 1:25:35 PM
to cqengine...@googlegroups.com
Niall,

I have the source code available for my 'proof of concept'.

1) It is used in a spring framework environment.
2) The cf_extentions artifact contains Model and implementations. Model is purely a marker interface.
    In my case I have three models, ProcessModel, WorkitemModel and AddressModel. They all simply provide getters for the available properties. The complex one is WorkitemModel which contains a workitem object which is similar to nested Maps...
3) I have included my stab at extending your syntax with a 'hasPermission' query. This uses spring security ACLs.

I have attached the two maven projects for your perusal...


Cheers





Lewis Henderson
 
Director
CobraFlow Limited

T:01748 850045 
M:0788 7788 436
Skype:CobraFlow
 


cqengine.zip

Lewis Henderson

Jan 19, 2014, 1:31:59 PM
to cqengine...@googlegroups.com
Niall,

The controller changed a little, so here it is as a framework for calling the parser etc...

My implementation of the Hazelcast queries (Predicate) has faltered at the final hurdle!

They use static final classes and do not allow you to add something like your Attribute to enable bespoke value retrieval from the objects. This is a complete 'show stopper' as not all objects have simple getters and I can't see an alternative ;-(


Cheers

Lewis Henderson
 
 


CQEngineGenericListController.java

Niall Gallagher

Jan 20, 2014, 3:56:18 PM
to cqengine...@googlegroups.com
Hi Lewis,

Cool, thanks for sharing the code! When I get a chance I'm going to look at using this kind of an approach instead of the approach I started in trunk.

Re: Hazelcast, can you just put the attributes in a companion class instead? That is, you don't need to modify the classes to define attributes on their fields.

There's an example of a companion class here: http://code.google.com/p/cqengine/wiki/AttributesGenerator

HTH,
Niall


Lewis Henderson

Jan 21, 2014, 5:25:11 AM
to cqengine...@googlegroups.com
Niall,

If only it was that easy!

I tried but there is no way, that I can see, of getting *between* the final classes and the code that gets the attributes. They go for getters/setters only.

I am leaving it for now. That is the only issue that I have with the Hazelcast implementation but it is a 'showstopper' for me. I am currently not able to invest the time to take it further.

Issues with CQEngine :-

1) You expect that the attributes will implement Comparable. I can understand this for the Greater/LessThan/Between queries, but if the properties do not implement Comparable, you get a ClassCastException. Would it not be better to report an error or assume that the row will not appear in the results?
2) I have a performance issue on my Join. I am trying to figure out why at the moment.

The query :-

and(
    equal(
        Workitem[queueType]
        , "next"
    )
    , existsIn(
        Address
        , Workitem[id]
        , Address[id]
        , and(
            hasPermission(
                Address[threadIdentity]
                ,CLAIM
            )
            , not(
                has(
                    Address[lockedBy]
                )
            )
        )
    )
)
is SLOW!

but...

and(
    equal(
        Workitem[queueType]
        , "next"
    )
    , existsIn(
        Address
        , Workitem[id]
        , Address[id]
        , not(
            has(
                Address[lockedBy]
            )
        )
    )
)

and ...

hasPermission(
    Address[threadIdentity]
    ,CLAIM
)

are fast in their own right...

I have an index on 'threadIdentity' which is a simple object with a single Serializable property, which in this case is a Long.

Any pointers?


Cheers



Lewis Henderson
 
 


Niall Gallagher

Jan 21, 2014, 9:29:44 AM
to cqengine...@googlegroups.com
Hi Lewis,

It's only necessary to implement Comparable if you use queries which require it, such as lessThan() or greaterThan(), or indexes which require it, such as NavigableIndex.

You can see that the APIs of the equal() and in() queries, and of HashIndex, don't require Comparable.

Regarding the performance issue, what happens if you stub out hasPermission?

One option to try, is to implement hasPermission as a boolean attribute instead of as a query. That way CQEngine will avoid calling it if an index is available. I'd profile the code to see where it is spending the time.

Also, not() queries cannot benefit from standard indexes, because a not() query means "give me objects that are not contained in the index" :)
An alternative could be to define an attribute which returns the negation, and build an index on that attribute, so that the query can use an index.
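
A small plain-Java sketch of that workaround: rather than evaluating not(has(lockedBy)) against every object, derive an "unlocked" flag and group objects by it once. `NegationIndexSketch`, `isUnlocked` and `indexByUnlocked` are hypothetical names; in CQEngine the counterpart would be an attribute returning the negation, with an index built on it.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class NegationIndexSketch {
    // Derived attribute: true when the row has no lockedBy value, i.e. the negation of has(lockedBy).
    static boolean isUnlocked(Map<String, Object> address) {
        return address.get("lockedBy") == null;
    }

    // Partitions the rows once by the derived flag; get(true) then answers the "not locked" query directly.
    static Map<Boolean, List<Map<String, Object>>> indexByUnlocked(List<Map<String, Object>> addresses) {
        return addresses.stream()
                .collect(Collectors.partitioningBy(NegationIndexSketch::isUnlocked));
    }
}
```

The partition here is a snapshot, so it would need to be rebuilt or updated whenever a row is locked or unlocked; a real index takes care of that maintenance.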

Lewis Henderson

Jan 21, 2014, 9:58:48 AM
to cqengine...@googlegroups.com
Niall,

Thanks for those pointers, I'll investigate further.

The problem with Comparable is not the fact that it is required, but the fact that we may not be in control of the query and so a *user* may use those predicates on unsuitable attributes. It may be better to report errors during the parse phase for *fixed* models. In my case however, I do not know the datatypes beforehand in one of my models.
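
One way to surface such errors early, sketched in plain Java: when the attribute's declared type is known, reject range predicates at parse time; when it is not known, treat values that cannot be compared with the bound as non-matching at evaluation time. All names here (`RangeCheckSketch`, `validateRangeAttribute`, `greaterThan`) are hypothetical, and neither behaviour is something CQEngine is stated to provide in this thread.

```java
final class RangeCheckSketch {
    // Parse-time check for fixed models: range predicates need a Comparable attribute type.
    static void validateRangeAttribute(String attributeName, Class<?> attributeType) {
        if (!Comparable.class.isAssignableFrom(attributeType)) {
            throw new IllegalArgumentException(
                    "Attribute '" + attributeName + "' of type " + attributeType.getName()
                            + " cannot be used with greaterThan/lessThan/between");
        }
    }

    // Evaluation-time fallback for dynamic models: a null value, or a value whose class
    // differs from the bound's class, simply does not match instead of throwing.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean greaterThan(Object value, Comparable bound) {
        if (value == null || value.getClass() != bound.getClass()) {
            return false;
        }
        return ((Comparable) value).compareTo(bound) > 0;
    }
}
```

The exact-class test is deliberately strict: an Integer value compared against a Double bound is treated as a non-match rather than being converted, which a real implementation might want to relax for numeric types.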


Cheers

Lewis Henderson
 
 

