Processing a scalar subquery as a Filter instead of a Join

27 views
Skip to first unread message

Aman Sinha

unread,
Jun 2, 2014, 6:27:44 PM6/2/14
to opti...@googlegroups.com
For the query below, it would be quite useful if Optiq could detect that the subquery in the HAVING clause is scalar and process it as a local filter on the outer 'nation' table instead of as a join. 
For comparison, I have provided the Postgres plan for the same query.   Is there any open enhancement bug for this ?   In general, I am also curious to know if Optiq has functionality to detect if a subquery block is provably scalar or not.   

select n.n_regionkey from nation n group by n.n_regionkey having n.n_regionkey > (select min(n2.n_regionkey) from nation n2);


Optiq logical plan:  (I have removed the cost information since that is not relevant): 

ProjectRel(n_regionkey=[$0]): 
  FilterRel(condition=[>($0, $1)])
    JoinRel(condition=[true], joinType=[left]):
      AggregateRel(group=[{0}]): 
        ProjectRel(n_regionkey=[$1]):
          EnumerableTableAccessRel(table=[[cp, tpch/nation.parquet]]): 
      AggregateRel(group=[{}], EXPR$0=[MIN($0)]):
        ProjectRel(n_regionkey=[$1]): 
          EnumerableTableAccessRel(table=[[cp, tpch/nation.parquet]]):


Postgres plan: 

                                QUERY PLAN
--------------------------------------------------------------------------
 HashAggregate  (cost=24.92..25.55 rows=63 width=4)
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=12.38..12.39 rows=1 width=4)
           ->  Seq Scan on nation n2  (cost=0.00..11.90 rows=190 width=4)
   ->  Seq Scan on nation n  (cost=0.00..12.38 rows=63 width=4)
         Filter: (n_regionkey > $0)
(6 rows)


Thanks !
Aman

Julian Hyde

unread,
Jun 2, 2014, 7:09:19 PM6/2/14
to opti...@googlegroups.com
I agree that would be useful.

In Optiq we’d model that as a nested loop join. The outer loop (not shown in your Postgres plan) executes once and sets $0. Then the inner loop gets executed.

See CorrelatorRel, which implements a join by setting a variable in an outer loop, and NestedLoopsJoinRule, which creates it. If you produce an implementation of CorrelatorRel in Drill (or in any convention — performance is not an issue) you’re done.

Please log a feature request and add my above comments.

Julian

Aman Sinha

unread,
Jun 2, 2014, 9:39:49 PM6/2/14
to opti...@googlegroups.com
Thanks, Julian.  It looks like the JIRA for Optiq does not exist yet so I logged it here along with your comments:  https://github.com/julianhyde/optiq/issues/297

Aman


On Mon, Jun 2, 2014 at 4:09 PM, Julian Hyde <julia...@gmail.com> wrote:
I agree that would be useful.

In Optiq we'd model that as a nested loop join. The outer loop (not shown in your Postgres plan) executes once and sets $0. Then the inner loop gets executed.

See CorrelatorRel, which implements a join by setting a variable in an outer loop, and NestedLoopsJoinRule, which creates it. If you produce an implementation of CorrelatorRel in Drill (or in any convention -- performance is not an issue) you're done.


Please log a feature request and add my above comments.

Julian

--
You received this message because you are subscribed to a topic in the Google Groups "optiq-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/optiq-dev/FI72pXz_YVg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to optiq-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages