Very nice. A couple of review comments..
I wonder if we keep naming and aliasing consistent enough to make the
tableName comparisons work on aliased tables? Aliases have been a
source of defects in planner code that resolves tables.
For example: select * from P1 as A, P2 as B where A.id = B.id;
Second, I think we need to enforce equality expressions on the
partition key TVE. I think this example may produce an incorrect
answer for: select * from P1, P2 where P1.id < P2.id. Apologies if I
missed an existing equality test while reading the diff.
I spent some trying to decide if it would be cleaner to add an
isPartitionKeyEquality() method to the AccessPath join expressions.
This logic seems like it should be encapsulated with the AccessPath or
with an AccessPath's expression list. Your thoughts?
Ryan.
----- Original Message -----
From: "Ryan Betts" <rbe...@voltdb.com>
To: <voltd...@googlegroups.com>
Sent: Wednesday, February 29, 2012 8:41 AM
Subject: Re: [VoltDB-dev] ENG-496
>> Attached are two patches for ENG-496. I actually took John's advise and
>> modified planner not to generate intermediate Send/Receive nodes for each
>> pair of tables but have only one at the top of the graph in case of join
>> on
>> partition key.
>
> Very nice. A couple of review comments..
>
> I wonder if we keep naming and aliasing consistent enough to make the
> tableName comparisons work on aliased tables? Aliases have been a
> source of defects in planner code that resolves tables.
>
> For example: select * from P1 as A, P2 as B where A.id = B.id;
So TVE.m_tableName may hold alias name while Table.getTypeName returns the
actual table name in this case, right? Than where is the alias-to-name
association for the table is kept? There is a column alias field in TVE but
it doesn't look like a right place.
>
> Second, I think we need to enforce equality expressions on the
> partition key TVE. I think this example may produce an incorrect
> answer for: select * from P1, P2 where P1.id < P2.id. Apologies if I
> missed an existing equality test while reading the diff.
>
No, you didn't miss anything. :) My original thought was to restrict the
join condition to a simple Equality expression (P1.id = P2.id) but then
what about P1.id - P2.id = 0? Anyway you are correct. I maid my validation
too lax.
> I spent some trying to decide if it would be cleaner to add an
> isPartitionKeyEquality() method to the AccessPath join expressions.
> This logic seems like it should be encapsulated with the AccessPath or
> with an AccessPath's expression list. Your thoughts?
>
I agree that it would be a cleaner approach. My only question is what
expression complexity do we want to support?.
Here are some example
(P1.id = 5 and P2.id = 5).
(P1.id = P2.id or P1.non_partion_column =
P2.another_non_partition_column)
The first one doesn't have direct equality test but still qualifies. The
second one does have it but the whole join can't be reduced to
partion-by-partition path.
Or do we want to keep it simple : P1.id = P2.id and nothing else?
Mike
> Ryan.
I would have to write a testcase and investigate, honestly.
> join condition to a simple Equality expression (P1.id = P2.id) but then what
> about P1.id - P2.id = 0?
Expression re-writing would be another fun project. For this case, I
would prefer to keep the optimization very simple: A.id = B.id. It is
beneficial to be able to predict (and document) which queries produce
efficient plans.
Thanks,
Ryan.
>> For example: select * from P1 as A, P2 as B where A.id = B.id;> So TVE.m_tableName may hold alias name while Table.getTypeName ...
I would have to write a testcase and investigate, honestly.
Expression re-writing would be another fun project. For this case, I
> join condition to a simple Equality expression (P1.id = P2.id) but then what
> about P1.id - P2.id = 0?
would prefer to keep the optimization very simple: A.id = B.id. It is
beneficial to be able to predict (and document) which queries produce
efficient plans.
Thanks,
Ryan.
I'll look at this tomorrow afternoon. I'm excited for this
functionality - it has been long missing. Unsure what version control
you prefer to use; you might find it easier to make a github account,
fork the VoltDB project and hack from there? Whatever works for you.
Ryan.
On Thu, Mar 1, 2012 at 9:19 PM, Michael Alexeev
Hey Mike,
I'll look at this tomorrow afternoon. I'm excited for this
functionality - it has been long missing. Unsure what version control
you prefer to use; you might find it easier to make a github account,
fork the VoltDB project and hack from there? Whatever works for you.
>> I'll look at this tomorrow afternoon.
I pushed that patch to an integration branch. Our CI harness will run
the testbase against the change now.
https://github.com/VoltDB/voltdb/tree/eng496-integration
Thanks,
Ryan.
Didn't quite make it through CI. TestPlansGroupBySuite fails while
compiling its catalog .. it asserts() that it finds a non-replicated
table that doesn't specify a partition key. This looks like an error
in the catalog more than an error on the logic for eng-496. I'll
investigate the error this afternoon.
rbetts@rbetts-desktop:voltdb(eng496-integration)$ ant junitclass
-Djunitclass=TestPlansGroupBySuite
[junit] Caused an ERROR
[junit] null
[junit] java.lang.reflect.InvocationTargetException
[junit] at
org.voltdb.planner.AccessPath.isPartitionKeyEquality(AccessPath.java:78)
[junit] at
org.voltdb.planner.SelectSubPlanAssembler.getSelectSubPlanForAccessPath(SelectSubPlanAssembler.java:266)
[junit] at
org.voltdb.planner.SelectSubPlanAssembler.generateMorePlansForJoinOrder(SelectSubPlanAssembler.java:237)
[junit] at
org.voltdb.planner.SelectSubPlanAssembler.nextPlan(SelectSubPlanAssembler.java:212)
[junit] at
org.voltdb.planner.PlanAssembler.getNextSelectPlan(PlanAssembler.java:354)
[junit] at
org.voltdb.planner.PlanAssembler.getNextPlan(PlanAssembler.java:285)
[junit] at
org.voltdb.planner.QueryPlanner.compilePlan(QueryPlanner.java:180)
[junit] at
org.voltdb.compiler.StatementCompiler.compile(StatementCompiler.java:109)
[junit] at
org.voltdb.compiler.ProcedureCompiler.compileJavaProcedure(ProcedureCompiler.java:213)
[junit] at
org.voltdb.compiler.ProcedureCompiler.compile(ProcedureCompiler.java:67)
[junit] at
org.voltdb.compiler.VoltCompiler.compileDatabaseNode(VoltCompiler.java:674)
[junit] at
org.voltdb.compiler.VoltCompiler.compileXMLRootNode(VoltCompiler.java:483)
[junit] at
org.voltdb.compiler.VoltCompiler.compileCatalog(VoltCompiler.java:442)
[junit] at org.voltdb.compiler.VoltCompiler.compile(VoltCompiler.java:320)
[junit] at
org.voltdb.compiler.VoltProjectBuilder.compile(VoltProjectBuilder.java:615)
[junit] at
org.voltdb.compiler.VoltProjectBuilder.compile(VoltProjectBuilder.java:495)
[junit] at
org.voltdb.compiler.VoltProjectBuilder.compile(VoltProjectBuilder.java:485)
[junit] at
org.voltdb.regressionsuites.LocalSingleProcessServer.compile(LocalSingleProcessServer.java:82)
[junit] at
org.voltdb.regressionsuites.TestPlansGroupBySuite.suite(TestPlansGroupBySuite.java:628)
[junit]
[junit] Test org.voltdb.regressionsuites.TestPlansGroupBySuite FAILED
Ryan.
Ah - I see. This test suite performs a join on a view, which returns
fails to isReplicated() but does not have a partition key.
Ryan.
I pushed a fix and am planning on merging this branch to trunk later today.
Ryan.
VoltDB has implicitly partitioned materialized views - they aren't
replicated, but they also are not user partitioned. I check for views
in the condition.
https://github.com/VoltDB/voltdb/commit/fd75015bdb7721445355d6a8bcac8d6ae4b3d9b3
> I also re-enabled commented out test case for ENG-490 in
> TestFixedSQLSuite.java. I had to change the partition column for the ASSET
> table to make join on the partition key.
Okay - I'll take a look. I need to add a testcase to our sqlcoverage /
generated SQL suite and then merge this to master.
Ryan.
This is now on master. Thanks again!
Ryan.