Heikki Linnakangas
unread,Aug 24, 2017, 2:34:23 AM8/24/17Sign in to reply to author
Sign in to forward
You do not have permission to delete messages in this group
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to Greenplum Developers, Daniel Gustafsson
Hi,
Coverity is pointing out an access to an uninitialized variable in
ExecUpdate. With some irrelevant details omitted, it looks like this:
TransactionId update_xmax = InvalidTransactionId;
ItemPointerData update_ctid;
if (rel_is_heap)
{
result = heap_update(resultRelationDesc, tupleid, tuple,
&update_ctid, &update_xmax, ...);
}
else if (rel_is_aorows)
{
result = appendonly_update(resultRelInfo->ri_updateDesc, tuple,
(AOTupleId *) tupleid, &aoTupleId);
}
else if (rel_is_aocols)
{
result = aocs_update(resultRelInfo->ri_updateDesc,
partslot, (AOTupleId *) tupleid, &aoTupleId);
}
switch (result)
{
case HeapTupleSelfUpdated:
/* already deleted by self; nothing to do */
return;
case HeapTupleMayBeUpdated:
break;
case HeapTupleUpdated:
if (!ItemPointerEquals(tupleid, &update_ctid))
{
TupleTableSlot *epqslot;
Assert(update_xmax != InvalidTransactionId);
epqslot = EvalPlanQual(estate,
resultRelInfo->ri_RangeTableIndex,
&update_ctid,
update_xmax);
if (!TupIsNull(epqslot))
{
*tupleid = update_ctid;
partslot = ExecFilterJunk(estate->es_junkFilter, epqslot);
tuple = ExecFetchSlotHeapTuple(partslot);
goto lreplace;
}
}
/* tuple already deleted; nothing to do */
return;
default:
elog(ERROR, "unrecognized heap_update status: %u", result);
return;
}
Coverity is pointing out, that if we're updating an appendonly or an
AOCS table, the update_ctid variable is not set, but it checked in the
"case HeapTupleUpdated" above. Moreover, if we reach that EvalPlanQual()
above, update_xmax will still be invalid, and we will trip the assertion.
So clearly this is wrong, but I wonder what we should do here? To recap,
this is the code that handles UPDATEs in READ COMMITTED mode. In a
normal heap table, if you try to UPDATE a row, but it's been
concurrently updated by another transaction, the system will follow the
ctid chain to find the latest version of the row, re-evaluate the plan
quals (i.e. the WHERE clause) against the latest version, and if it
still passes the quals, it will update the latest version instead.
That's in READ COMMITTED mode; in REPEATABLE READ / SERIALIZABLE mode,
you get a "could not serialize access due to concurrent update" error
instead.
But tuples in append-only tables don't have CTID pointers, so we cannot
do that. If you find that a tuple has been concurrently updated, there's
no way to find the latest version of the same row. So what should we do?
I can see two feasible options:
1. Skip the update. This is what we do for DELETEd tuples. In essence,
this would mean treating an UPDATEd tuple the same as an DELETE+INSERT.
2. Throw an error, like in SERIALIZABLE mode. This has the problem that
I don't think we can even tell an updated and a deleted tuple apart. So
we would have to throw an error even on deleted tuples. That's what
happens in a serializable transaction too.
I'm leaning towards option 1. But I wonder if we have this behavior
documented somewhere already? If not, this really needs more comments.
- Heikki