pull prototype

Michael Alexeev

Apr 17, 2012, 9:23:52 PM
to voltdb-dev
Hi Paul,

I started putting the prototype together and it turned out to be much harder than I imagined.  The work is very much in progress and I haven't committed anything yet.

The main obstacles so far are:
- Ideally, the same chain would mix pull and push executors. For example, the send executor must wait until all downstream executors finish their work and only then do its own job.  I found a workaround for this particular case, but I'm not sure it would work everywhere. Other examples are select from views and select from select (it looks like that's not supported, but anyway).
- I can't find a clean way to separate iteration from tuple processing in individual executors.  An example is a sequential scan with a predicate (or a limit executor), where a single pull may require the executor to iterate several times before it can return a tuple. So far my p_next methods are a mixture of both.

For example, the p_next_pull for the sequential scan executor is:

TableTuple SeqScanExecutor::p_next_pull(const NValueArray &params, bool& status) {
    TableTuple tuple(m_state->m_targetTable->schema());
    bool result = false;
    //
    // if there is a predicate find the next tuple which satisfies it
    //
    if (m_state->m_node->getPredicate() != NULL)
    {
        while (m_state->m_iterator.next(tuple) == true)
        {
            VOLT_TRACE("INPUT TUPLE: %s, %d/%d\n",
                       tuple.debug(m_state->m_targetTable->name()).c_str(), m_state->m_tupleCtr,
                       (int)m_state->m_targetTable->activeTupleCount());
            if ((result = m_state->m_node->getPredicate()->eval(&tuple, NULL).isTrue()) == true)
                // found one
                break;
        }
    }
    else
        // simply get next from the table
        result = m_state->m_iterator.next(tuple);
   
    // set status success
    status = true;
    // and return the tuple
    if (result == true)
    {
        ++m_state->m_tupleCtr;
        return tuple;
    }
    else
        // return an empty one
        return TableTuple(m_state->m_targetTable->schema());
}

where m_state is an aggregate of type SeqScanExecutorState that keeps state between the calls:
    struct SeqScanExecutorState
    {  
        SeqScanPlanNode* m_node;
        Table* m_outputTable;
        Table* m_targetTable;
        TableIterator m_iterator;
        AbstractExpression* m_predicate;
        int m_tupleCtr;
    };
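
One possible way to split iteration from tuple processing is to give each concern its own pull source and chain them. The sketch below is not the prototype's code: PullSource, ScanSource, FilterSource, and drain are hypothetical names, and a row is modeled as a plain vector of ints instead of a TableTuple. It only borrows the `bool next(out)` shape that TableIterator::next already has.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tuple: a row of ints.
using Row = std::vector<int>;

// Hypothetical pull interface: fills 'out' and returns true, or returns false at end of input.
struct PullSource {
    virtual ~PullSource() {}
    virtual bool next(Row& out) = 0;
};

// Iteration only: walks a table and knows nothing about predicates.
class ScanSource : public PullSource {
public:
    explicit ScanSource(const std::vector<Row>& table) : m_table(table), m_pos(0) {}
    bool next(Row& out) override {
        if (m_pos == m_table.size()) return false;
        out = m_table[m_pos++];
        return true;
    }
private:
    const std::vector<Row>& m_table;
    std::size_t m_pos;
};

// Tuple processing only: keeps pulling from the child until a row passes the predicate.
class FilterSource : public PullSource {
public:
    FilterSource(PullSource& child, std::function<bool(const Row&)> pred)
        : m_child(child), m_pred(std::move(pred)) {
        assert(m_pred); // an empty predicate would throw when called
    }
    bool next(Row& out) override {
        while (m_child.next(out)) {
            if (m_pred(out)) return true;
        }
        return false;
    }
private:
    PullSource& m_child;
    std::function<bool(const Row&)> m_pred;
};

// Drains a source into a vector, which is handy for checking a chain end to end.
std::vector<Row> drain(PullSource& src) {
    std::vector<Row> rows;
    Row row;
    while (src.next(row)) rows.push_back(row);
    return rows;
}
```

With this split, the loop that skips non-matching tuples lives only in the filter, and a scan without a predicate is just the bare ScanSource. Whether that maps cleanly onto the existing executor classes is an open question.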

As soon as I have something presentable (or at least compilable), I will commit it so you can take a look.

Mike

Michael Alexeev

Apr 20, 2012, 10:03:24 AM
to voltdb-dev
Hi All,

For the pull prototype I need to disable the optimization that inlines
the Projection node into the scan node. The access plan for a simple
select like 'select c from t', where t is a replicated table, should
look like

Send
  Projection
     SeqScan

I commented out the IF branch that makes the Projection node inline (lines
802-805, PlanAssembler::addProjection), and it seems to be working. I can
see in the debugger that the best plan (QueryPlanner::compilePlan)
consists of three nodes (Send, Projection, and SeqScan) and that SeqScan
doesn't have an inlined Projection node. But what puzzles me is that the JSON
output shows the Projection node in two places (if I read it right):

1. Child of Send node
2. Inline node of SeqScan

SQL: SELECT lgn_name FROM log ;
COST: 3000000.0
PLAN:
{
    "EXECUTE_LIST": [
        7,
        9,
        10
    ],
    "PARAMETERS": [],
    "PLAN_NODES": [
        {
            "CHILDREN_IDS": [9],
            "ID": 10,
            "INLINE_NODES": [],
            "OUTPUT_SCHEMA": [{
                "COLUMN_ALIAS": "LGN_NAME",
                "COLUMN_NAME": "LGN_NAME",
                "EXPRESSION": {
                    "COLUMN_ALIAS": "LGN_NAME",
                    "COLUMN_IDX": 0,
                    "COLUMN_NAME": "LGN_NAME",
                    "TABLE_NAME": "LOG",
                    "TYPE": "VALUE_TUPLE",
                    "VALUE_SIZE": 100,
                    "VALUE_TYPE": "STRING"
                },
                "SIZE": 100,
                "TABLE_NAME": "LOG",
                "TYPE": "STRING"
            }],
            "PARENT_IDS": [],
            "PLAN_NODE_TYPE": "SEND"
        },
        {
            "CHILDREN_IDS": [7],
            "ID": 9,
            "INLINE_NODES": [],
            "OUTPUT_SCHEMA": [{
                "COLUMN_ALIAS": "LGN_NAME",
                "COLUMN_NAME": "LGN_NAME",
                "EXPRESSION": {
                    "COLUMN_ALIAS": "LGN_NAME",
                    "COLUMN_IDX": 0,
                    "COLUMN_NAME": "LGN_NAME",
                    "TABLE_NAME": "LOG",
                    "TYPE": "VALUE_TUPLE",
                    "VALUE_SIZE": 100,
                    "VALUE_TYPE": "STRING"
                },
                "SIZE": 100,
                "TABLE_NAME": "LOG",
                "TYPE": "STRING"
            }],
            "PARENT_IDS": [10],
            "PLAN_NODE_TYPE": "PROJECTION"
        },
        {
            "CHILDREN_IDS": [],
            "ID": 7,
            "INLINE_NODES": [{
                "CHILDREN_IDS": [],
                "ID": 8,
                "INLINE_NODES": [],
                "OUTPUT_SCHEMA": [{
                    "COLUMN_ALIAS": "LGN_NAME",
                    "COLUMN_NAME": "LGN_NAME",
                    "EXPRESSION": {
                        "COLUMN_ALIAS": "LGN_NAME",
                        "COLUMN_IDX": 1,
                        "COLUMN_NAME": "LGN_NAME",
                        "TABLE_NAME": "LOG",
                        "TYPE": "VALUE_TUPLE",
                        "VALUE_SIZE": 100,
                        "VALUE_TYPE": "STRING"
                    },
                    "SIZE": 100,
                    "TABLE_NAME": "LOG",
                    "TYPE": "STRING"
                }],
                "PARENT_IDS": [],
                "PLAN_NODE_TYPE": "PROJECTION"
            }],
            "OUTPUT_SCHEMA": [{
                "COLUMN_ALIAS": "LGN_NAME",
                "COLUMN_NAME": "LGN_NAME",
                "EXPRESSION": {
                    "COLUMN_ALIAS": "LGN_NAME",
                    "COLUMN_IDX": 1,
                    "COLUMN_NAME": "LGN_NAME",
                    "TABLE_NAME": "LOG",
                    "TYPE": "VALUE_TUPLE",
                    "VALUE_SIZE": 100,
                    "VALUE_TYPE": "STRING"
                },
                "SIZE": 100,
                "TABLE_NAME": "LOG",
                "TYPE": "STRING"
            }],
            "PARENT_IDS": [9],
            "PLAN_NODE_TYPE": "SEQSCAN",
            "PREDICATE": null,
            "TARGET_TABLE_NAME": "LOG"
        }
    ]
}
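
One way to confirm the duplication in the plan above without reading the JSON by eye is to list every node of a given type and note whether it is a regular plan node or an inline node. This is a sketch under assumptions: PlanNode, Occurrence, findNodes, and examplePlan are hypothetical names, the struct keeps only the fields needed here, and the plan is built by hand instead of being parsed from the JSON.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, trimmed-down mirror of one PLAN_NODES entry.
struct PlanNode {
    int id;                            // ID
    std::string type;                  // PLAN_NODE_TYPE
    std::vector<int> childrenIds;      // CHILDREN_IDS
    std::vector<PlanNode> inlineNodes; // INLINE_NODES
};

// Records where a node of the requested type was found.
struct Occurrence {
    int nodeId;
    int inlinedIntoId; // -1 when the node is a regular plan node
};

// Lists every node of the given type, including one level of inline nodes.
std::vector<Occurrence> findNodes(const std::vector<PlanNode>& plan, const std::string& type) {
    std::vector<Occurrence> found;
    for (const PlanNode& node : plan) {
        if (node.type == type) found.push_back({node.id, -1});
        for (const PlanNode& inl : node.inlineNodes) {
            if (inl.type == type) found.push_back({inl.id, node.id});
        }
    }
    return found;
}

// The three-node plan from the JSON above, with output schemas omitted.
std::vector<PlanNode> examplePlan() {
    PlanNode inlineProjection{8, "PROJECTION", {}, {}};
    return {
        {10, "SEND", {9}, {}},
        {9, "PROJECTION", {7}, {}},
        {7, "SEQSCAN", {}, {inlineProjection}},
    };
}
```

For this plan, findNodes(examplePlan(), "PROJECTION") reports node 9 as a regular node and node 8 as inlined into node 7, which matches the two places listed above.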

Also, WINNER-0.TXT doesn't show the Projection node in the tree:

RETURN RESULTS TO STORED PROCEDURE
 SEQUENTIAL SCAN of "LOG"

Are there other places that need to be disabled to prevent inlining?

The other question I have is related to the EE side. Assuming that
access plan is indeed
SEND
  PROJECTION
     SCAN

And the select is 'select lgn_name from log', where lgn_name is the second
column in the LOG table. The output schema in the ProjectionExecutor
should then consist of one column (LGN_NAME), and its index should be 1,
right?

bool ProjectionExecutor::p_init(AbstractPlanNode *abstractNode,
                                TempTableLimits* limits)
{
  ......
    //
    // Construct the output table
    //
    TupleSchema* schema = node->generateTupleSchema(true);
    m_columnCount = static_cast<int>(node->getOutputSchema().size());
    ....
    // initialize local variables
    all_tuple_array_ptr =
      expressionutil::convertIfAllTupleValues(node->getOutputColumnExpressions());
    all_tuple_array = all_tuple_array_ptr.get();
    .....
}

After the initialization, all_tuple_array should have a single element
with the value 1. The schema of the output table (and of the tuple returned
from the iterator loop) of the SeqScanExecutor should match the target (LOG)
table schema. Is that more or less accurate?
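
To make the expectation concrete, here is a minimal sketch of a projection driven by an array of column indexes, in the spirit of all_tuple_array. It is not the ProjectionExecutor code: StringRow and projectByIndex are hypothetical names, a row is modeled as a vector of strings, and the sample LOG row in the usage note is invented.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a tuple: one string per column.
using StringRow = std::vector<std::string>;

// Copies the input columns named by columnIndexes into a new row, in that order.
// Throws if an index does not exist in the input row.
StringRow projectByIndex(const StringRow& input, const std::vector<int>& columnIndexes) {
    StringRow output;
    output.reserve(columnIndexes.size());
    for (int idx : columnIndexes) {
        if (idx < 0 || static_cast<std::size_t>(idx) >= input.size()) {
            throw std::out_of_range("projection index outside the child's output schema");
        }
        output.push_back(input[static_cast<std::size_t>(idx)]);
    }
    return output;
}
```

If the scan hands up a full LOG row such as {"42", "alice"} and the index array is {1}, the result is the single column "alice". The same index array fails on a row that has already been narrowed to one column.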

Thanks,
Mike

Paul Martel

Apr 20, 2012, 6:54:22 PM
to voltd...@googlegroups.com
Mike,

It seems like you are running into the effects of AbstractPlanNode.generateOutputSchema, which, in the absence of an inline projection, creates one based on the node's m_tableScanSchema (if not empty).
I'm not sure what an empty/non-empty m_tableScanSchema signifies.
In the process, this code also re-sorts the non-empty m_tableScanSchema columns into the order of their position in the m_tableSchema (which I take to mean the underlying table?).
It's not immediately clear to me how much of this processing is correct and/or essential in the presence of a non-inlined parent projection node if we eliminate the inline projection node. It may be correct but have implications (e.g. affect column index number settings) on the parent projection node. More below.
 
> The other question I have is related to the EE side. Assuming that
> access plan is indeed
> SEND
>   PROJECTION
>      SCAN
>
> And select is 'select lgn_name from log' where lgn_name is the second
> column in the LOG table the output schema in the projectionexecutor
> should consist of one column (LGN_NAME) and its index should be 1,
> right?

This SOUNDS reasonable, assuming that index "0" is valid and has no special meaning.

> bool ProjectionExecutor::p_init(AbstractPlanNode *abstractNode,
>                                 TempTableLimits* limits)
> {
>   ......
>     //
>     // Construct the output table
>     //
>     TupleSchema* schema = node->generateTupleSchema(true);
>     m_columnCount = static_cast<int>(node->getOutputSchema().size());
>     ....
>     // initialize local variables
>     all_tuple_array_ptr =
>       expressionutil::convertIfAllTupleValues(node->getOutputColumnExpressions());
>     all_tuple_array = all_tuple_array_ptr.get();
>     .....
> }
>
> After the initialization all_tuple_array should have a single element
> of 1. The schema of the output table (and tuple returned from the
> iterator loop) of the seqscanexecutor should match the target (LOG)
> table schema. Is it more or less accurate?

This is what I would expect, which suggests that the m_tableScanSchema should not be allowed to affect the output schema at all when we eliminate the inline projection.


> Thanks,
> Mike

I have been deferring my reply to your previous message, expecting a post to your github repo that would allow a more concrete discussion.
I'm not looking for "finished product" in any sense, just a basis for discussion.
I think a focus on this one simple query is a good starting point.
Let me know if you have more immediate questions or if you think this is not a good plan.

--paul

Michael Alexeev

Apr 20, 2012, 10:04:06 PM
to voltd...@googlegroups.com
Hi Paul,

> It seems like you are running into the effects of
> AbstractPlanNode.generateOutputSchema which in the absence of an inline
> projection creates one based on the node's m_tableScanSchema (if not empty).
> I'm not sure what an empty/non-empty m_tableScanSchema signifies.
> In the process, this code also, re-sorts the non-empty m_tableScanSchema
> columns into the order of their position in the m_tableSchema (?== in the
> underlying table?).
> It's not immediately clear to me how much of this processing is correct
> and/or essential in the presence of a non-inlined parent projection node if
> we eliminated the inline projection node -- it may be correct but have
> implications (e.g. affect column index number settings) on the parent
> projection node. More below.

This is it! I was running into exactly this problem: the projection node
indexes were messed up. The planner resolves them based on the output
schema of the send node without the inlined projection (the full target
table), but then the scan schema gets reset and the top projection node's
indexes end up out of sync. I will probably hack something quick and
dirty just to get around it for now. If you have an idea how it should
be done, please let me know.
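
As a possible shape for such a stopgap, the sketch below checks whether a parent's column indexes still line up with the child's actual output schema, and re-resolves them by column name when they do not. Everything in it is an assumption for illustration: ColumnRef, indexesInSync, and resolveByName are hypothetical names, schemas are modeled as lists of column names, and the LGN_ID column in the usage note is invented. It does not reflect how the planner resolves indexes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reference from a parent node to a column of its child's output.
struct ColumnRef {
    std::string name;
    int index; // position previously resolved in the child's output schema
};

// Returns true when every reference points at a column of the same name in childSchema.
bool indexesInSync(const std::vector<ColumnRef>& refs,
                   const std::vector<std::string>& childSchema) {
    for (const ColumnRef& ref : refs) {
        if (ref.index < 0 || static_cast<std::size_t>(ref.index) >= childSchema.size()) {
            return false;
        }
        if (childSchema[static_cast<std::size_t>(ref.index)] != ref.name) return false;
    }
    return true;
}

// Re-resolves each reference by name; the index is left at -1 when the name is absent.
void resolveByName(std::vector<ColumnRef>& refs,
                   const std::vector<std::string>& childSchema) {
    for (ColumnRef& ref : refs) {
        ref.index = -1;
        for (std::size_t i = 0; i < childSchema.size(); ++i) {
            if (childSchema[i] == ref.name) {
                ref.index = static_cast<int>(i);
                break;
            }
        }
    }
}
```

With a reference {"LGN_NAME", 1}, the check passes against a full schema {"LGN_ID", "LGN_NAME"} and fails against a narrowed schema {"LGN_NAME"}. After resolveByName the index becomes 0 and the check passes again. Matching by name alone would break on duplicate column names, so this is only a diagnostic aid.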

>
>>
>> The other question I have is related to the EE side. Assuming that
>> access plan is indeed
>> SEND
>>   PROJECTION
>>      SCAN
>>
>> And select is 'select lgn_name from log' where lgn_name is the second
>> column in the LOG table the output schema in the projectionexecutor
>> should consist of one column (LGN_NAME) and its index should be 1,
>> right?
>>
> This SOUNDS reasonable, assuming that index "0" is valid and has no special
> meaning.

"0" is a valid index of the first column.

>
>> bool ProjectionExecutor::p_init(AbstractPlanNode *abstractNode,
>>                                 TempTableLimits* limits)
>> {
>>   ......
>>     //
>>     // Construct the output table
>>     //
>>     TupleSchema* schema = node->generateTupleSchema(true);
>>     m_columnCount = static_cast<int>(node->getOutputSchema().size());
>>     ....
>>     // initialize local variables
>>     all_tuple_array_ptr =
>>       expressionutil::convertIfAllTupleValues(node->getOutputColumnExpressions());
>>     all_tuple_array = all_tuple_array_ptr.get();
>>     .....
>> }
>>
>> After the initialization all_tuple_array should have a single element
>> of 1. The schema of the output table (and tuple returned from the
>> iterator loop) of the seqscanexecutor should match the target (LOG)
>> table schema. Is it more or less accurate?
>
> This is what I would expect, which suggests that the m_tableScanSchema
> should not be allowed to affect the outputschema at all when we eliminate
> the inline projection.

Right!

>
>>
>> Thanks,
>> Mike
>
>
> I have been deferring my reply to your previous message, expecting a post to
> your github repo that would allow a more concrete discussion.
> I'm not looking for "finished product" in any sense, just a basis for
> discussion.
> I think a focus on this one simple query is a good starting point.
> Let me know if you have more immediate questions or if you think this is not
> a good plan.

I want to resolve this schema problem (if I can) and then post, so that I
have something presentable. On the other hand, it shouldn't affect the
executor side at all. I will commit the prototype over the weekend,
then.

Mike

>
> --paul
>

Michael Alexeev

Apr 21, 2012, 11:28:08 PM
to voltd...@googlegroups.com
Paul,

> I have been deferring my reply to your previous message, expecting a post to
> your github repo that would allow a more concrete discussion.
> I'm not looking for "finished product" in any sense, just a basis for
> discussion.
> I think a focus on this one simple query is a good starting point.
> Let me know if you have more immediate questions or if you think this is not
> a good plan.
>
> --paul
>

I just pushed the prototype. I accidentally used 'git commit -a' and
a few 'extra' files were added:

bin/voltcompiler
src/frontend/org/voltdb/planner/QueryPlanner.java

Don't pay attention to them.

I haven't figured out the SeqScanPlanNode schema problem yet.

Looking forward to your comments.

Mike
