Construction of disjunctive queries in AQL

39 views
Skip to first unread message

Fred Eisele

unread,
Sep 20, 2017, 10:46:28 AM9/20/17
to categoricaldata
What is the recommended approach for complex where clauses in AQL?

Ryan pointed out two approaches:
1) encode OR using the typeside.  
2) encode OR using the union (sum) of many queries.


'and' logic is pretty simple but 'or' is more involved.

        select s.name
from source as s
where  s.channel = 5 and s.type = 'a-n-A-C-F-m'

       query q_and = simple : S2 {
   from
s : Source
   where
s.channel = 5
                s.type = "a-n-A-C-F-m"
   return
name -> s.name
        }

 Setting aside the problem that only the equality relational operator is supported
the following example illustrates disjunction using the (1) typeside approach.

        select s.name
from source as s
where  s.channel = 5 or s.type = 'a-n-A-C-F-m'

       query q_and = simple : S2 {
   from
s : Source
   where
or(eqInt(s.channel,5), eqVc(ce.cot_type,"a-n-A-C-F-m")) = true
   return
name -> s.name
        }

The or is dealt with here using java_functions.

         java_functions
eqVc : VARCHAR, VARCHAR -> BOOLEAN = "return input[0].equals(input[1])"
eqInt : INTEGER, INTEGER -> BOOLEAN = "return input[0] == input[1]"
or : BOOLEAN, BOOLEAN -> BOOLEAN = "return input[0] || input[1]"

The typeside approach can be used to handle the lack of relational operators.

How is the (2) summation approach done in AQL?

Fred Eisele

unread,
Sep 20, 2017, 3:35:17 PM9/20/17
to categoricaldata

How is the (2) summation approach done in AQL?

One way, as suggested by Peter, is to use the coproduct.

   schema S5 = literal : Ty {
entities
        Q
    attributes
        name : Q -> VARCHAR
        time : Q -> INTEGER
        type : Q -> VARCHAR
        channel : Q -> INTEGER
    }

  query q5c = literal : S2 -> S5  {
entities Q -> {
from
s : Source
return
name -> s.name
type -> s.type
channel -> s.channel
time -> s.time
    }}

  instance s5_inst = eval q5c s2_inst

  query q5a = literal : S5 -> S5 {
entities
Q -> {
from q: Q
where eqInt(q.channel,3) = true
return
name -> q.name
type -> q.type
channel -> q.channel
time -> q.time
}
    }
  instance q5a_inst = eval q5a s5_inst

  query q5b = literal : S5 -> S5 {
entities
Q -> {
from q: Q
where 
q.type = "a-n-A-C-F-m"
eqInt(q.channel,3) = false
return
name -> q.name
type -> q.type
channel -> q.channel
time -> q.time
}
     }
  instance q5b_inst = eval q5b s5_inst

  instance q5c_inst = coproduct q5a_inst + q5b_inst : S5

Is it possible to use some time of pullback to avoid the 'eqInt(q.channel,3) = false' ?

Ryan Wisnesky

unread,
Sep 27, 2017, 7:07:14 PM9/27/17
to categor...@googlegroups.com
Hi Fred,

I’m not sure it’s possible to avoid explicit checks of 'eqInt(q.channel,3)’ for true or false, but we could talk about it further.  However, I did want to point out another approach to this problem.

Suppose you have a query Q : S -> T that has disjunctions in the where clauses.  Rather than create a bunch of queries S -> T, you could instead create a new schema T’ that is a sub-schema of T + … + T (i.e., you’d put some extra entities in T’).  Then you’d add extra sub-queries to Q, similar to what you’re doing now, to populate these extra entities depending on the disjuncts.  You now have a query Q’ : S -> T’.  There is a functor T’ -> T taking the extra entities to the original, and sigma_F will union the extra tables together, so you’d run eval(Q’) followed by sigma_F.

Sigmas of the above form can be executed extremely quickly (see the ‘Relational Foundations’ paper), but AQL does not yet recognize them as special, so they may be slow.  If you try the above approach and run into performance issues, please let me know.

Ryan

--
You received this message because you are subscribed to the Google Groups "categoricaldata" group.
To unsubscribe from this group and stop receiving emails from it, send an email to categoricalda...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Fred Eisele

unread,
Feb 13, 2018, 3:31:26 PM2/13/18
to categoricaldata
I had the thought that I could make an intermediate schema and query.
The intermediate query having a separate entity for each term in the disjunction.
Then the final query would join those intermediate entities together.
Next use query composition on the intermediate query and final.
Finally I could ask AQL to show me what the single query would look like.
Here is what I get.

query Q = simple : S {
   
from
       
"(c,ce)" : cot_event
       
"(t,ce)" : cot_event
   
where
       
"(c,ce)".servertime = "(t,ce)".servertime
       
"(c,ce)".source_id.channel = 3
       
"(t,ce)".cot_type = a-n-A-C-F-m
       
"(c,ce)".source_id.name = "(t,ce)".source_id.name
    attributes
        channel
-> "(c,ce)".source_id.channel
        name
-> "(c,ce)".source_id.name
        time
-> "(c,ce)".servertime
        type
-> "(t,ce)".cot_type
}

This is illuminating to me.
And solves the problem of needing the pesky java_functions which do not translate into SQL.
What it seems to do is bring in as many copies of the entity as there are disjoint terms.


David Spivak

unread,
Feb 14, 2018, 8:16:38 AM2/14/18
to categor...@googlegroups.com
Hi Fred,

I didn't really follow. If you're taking a disjunction, I would have expected to see a coeval, not a query eval. Is that what you did?

And how did you get AQL to show you what the single query would look like?

David

--
You received this message because you are subscribed to the Google Groups "categoricaldata" group.
To unsubscribe from this group and stop receiving emails from it, send an email to categoricaldata+unsubscribe@googlegroups.com.

Fred Eisele

unread,
Feb 14, 2018, 9:07:48 AM2/14/18
to categor...@googlegroups.com
That is because I am wrong :-)
My idea of having an intermediate query with multiple entities, one for each disjunction, is fine.
Let’s say there are some records:
(a1 b c) (a2 b d) (a3 e c) (a4 b c)
The intermediate query forms two pattern-based entities.
(_ b _) : (a1 b c) (a2 b d) (a4 b c)
(_ _ c) : (a1 b c) (a3 e c) (a4 b c)
Joining them on the first field gives...
(a1 (b b) (c c))
(a2 (b _) (d _))
(a3 (_ e) (_ c))
(a4 (b _) (c _))
It is similar to a disjunction inasmuch as it contains the right data and has the correct cardinality.

To unsubscribe from this group and stop receiving emails from it, send an email to categoricalda...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "categoricaldata" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/categoricaldata/OVq5uDsRvl0/unsubscribe.
To unsubscribe from this group and all its topics, send an email to categoricalda...@googlegroups.com.

Fred Eisele

unread,
Feb 14, 2018, 11:08:42 AM2/14/18
to categoricaldata

It is wrong because I was assuming an outer rather than an inner join.

Fred Eisele

unread,
Feb 14, 2018, 11:11:31 AM2/14/18
to categoricaldata


On Wednesday, February 14, 2018 at 7:16:38 AM UTC-6, David Spivak wrote:
Hi Fred,
And how did you get AQL to show you what the single query would look like?
 
This is a feature Ryan added. catdata.aql.AqlCmdLine/queryToSql
There is a method that produces SQL for [as far as I can tell] any query.

Ryan Wisnesky

unread,
Feb 14, 2018, 1:22:31 PM2/14/18
to categor...@googlegroups.com
Yes, AQL can emit SQL(s) for any AQL query, but only those AQL queries where the type side functions line up with actual SQL functions will the SQL(s) so generated be executable.  

--
You received this message because you are subscribed to the Google Groups "categoricaldata" group.
To unsubscribe from this group and stop receiving emails from it, send an email to categoricalda...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages