
[GENERAL] Re: Failed Statements within Transactions


Tim Kientzle

Dec 29, 2000, 9:54:18 PM
> ... if a query fails inside a transactions block,
> PostgreSQL "RollBack"s the whole transaction ...

In my experience, it's unusual for SELECT statements
(queries) to cause errors. If they do, the statements
themselves are likely broken. (A SELECT that returns
zero rows is not an error.) Likewise, many
other SQL statements (UPDATE, DELETE) rarely
cause errors; they usually "fail" by modifying
zero rows, which does not affect the transaction
state.

However, I do agree that a failed INSERT should
not automatically rollback the entire transaction.
INSERTs often fail due to integrity constraints.
The common suggestion of doing a test SELECT
first to determine whether to INSERT or UPDATE
is simple nonsense; that just creates a race
condition. If you lose the race, you have
to replay the entire transaction, which
is a waste of CPU cycles. (Worse, you're more
likely to lose the race on a more heavily loaded
system, which is exactly when you most want
to avoid replaying complex transactions.)

The INSERT/UPDATE problem is an inherent part
of the SQL language. There is no particularly
direct way in standard SQL to ensure that a particular
(unique) piece of data is in a table. The most
consistently reliable way to handle this is to have an
integrity constraint, try the INSERT, and then possibly
do an UPDATE if the INSERT fails. I'm told this is the
preferred strategy on other DB systems, and it's the only
one that avoids any kind of race condition. Furthermore,
it is usually much faster than doing a pre-SELECT. (If the data
isn't already there, the INSERT is clearly fastest; if it is,
the INSERT will typically fail and return an error more
quickly than a SELECT would complete. Other strategies
win only when the data is usually already
there.)
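A minimal sketch of that try-INSERT-then-UPDATE strategy, in Python with SQLite standing in for PostgreSQL purely for illustration (SQLite, unlike the PostgreSQL of this era, does not abort the surrounding transaction on a constraint error, which is exactly the behavior being argued for here; the table `foo` and the helper `ensure_row` are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (key INTEGER PRIMARY KEY, value TEXT)")

def ensure_row(conn, key, value):
    """Try the INSERT; if the integrity constraint rejects it,
    the row must already exist, so fall back to an UPDATE."""
    try:
        conn.execute("INSERT INTO foo (key, value) VALUES (?, ?)",
                     (key, value))
    except sqlite3.IntegrityError:
        conn.execute("UPDATE foo SET value = ? WHERE key = ?",
                     (value, key))
    conn.commit()

ensure_row(conn, 1, "Hello")      # row absent: the INSERT succeeds
ensure_row(conn, 1, "Goodbye")    # row present: falls back to UPDATE
```

Note that there is no window between the test and the modification: the unique constraint itself is the test, so no race is possible.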

PostgreSQL's behavior essentially forces you to
do BOTH of the following:
* SELECT first to see whether to INSERT or UPDATE
* AND be prepared to replay the _entire_ transaction
when you lose the race.
You will sometimes lose the race, so the second
step is not optional. I suspect many complex
PostgreSQL applications suffer infrequent failures
precisely because they aren't prepared to replay entire
transactions (with the correct INSERTs converted to
UPDATEs). For a complex transaction that must ensure
certain data exists in several tables, this gets
pretty ugly.
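Sketched in Python (again with SQLite as a stand-in, and hypothetical table and function names), the pattern described above amounts to a retry loop around the whole transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (key INTEGER PRIMARY KEY, value TEXT)")

def upsert_with_replay(conn, key, value, max_attempts=3):
    """SELECT first to choose INSERT vs UPDATE; if a concurrent
    writer wins the race and the INSERT draws a constraint error,
    replay the entire transaction from the top."""
    for _ in range(max_attempts):
        try:
            with conn:  # one transaction per attempt; rolls back on error
                row = conn.execute("SELECT 1 FROM foo WHERE key = ?",
                                   (key,)).fetchone()
                if row:
                    conn.execute("UPDATE foo SET value = ? WHERE key = ?",
                                 (value, key))
                else:
                    conn.execute("INSERT INTO foo (key, value) VALUES (?, ?)",
                                 (key, value))
            return
        except sqlite3.IntegrityError:
            continue  # lost the race: replay from the top
    raise RuntimeError("gave up after repeated race losses")

upsert_with_replay(conn, 1, "Hello")
upsert_with_replay(conn, 1, "Goodbye")  # takes the UPDATE branch
```

In a real application the replayed "transaction" may touch several tables, which is where this gets ugly: every statement, not just the upsert, has to be re-run on each attempt.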

Basically, the PostgreSQL developers have decided
that any integrity violation is a serious error;
therefore, PostgreSQL does not really permit
tentative INSERTs within transactions. This violates
SQL conventions that are pretty well-established
in some circles, needlessly complicates
applications that use complex transactions,
and introduces a fairly minor performance issue.

- Tim

Mike Mascari

Dec 29, 2000, 11:17:22 PM
Tim Kientzle wrote:
> ...

> However, I do agree that a failed INSERT should
> not automatically rollback the entire transaction.
> INSERTs often fail due to integrity constraints.
> The common suggestion of doing a test SELECT
> first to determine whether to INSERT or UPDATE
> is simple nonsense; that just creates a race
> condition...
> ...

> PostgreSQL's behavior essentially forces you to
> do BOTH of the following:
> * SELECT first to see whether to INSERT or UPDATE
> * AND be prepared to replay the _entire_ transaction
> when you lose the race.
> You will sometimes lose the race, so the second
> step is not optional.
> ...

How about:

CREATE TABLE foo (key int4, value text);

Then, in the client application,

INSERT INTO foo SELECT 1, 'Hello' WHERE NOT EXISTS (SELECT
key FROM foo WHERE key = 1);

The INSERT will then add either (1) a single row, if the
row did not already exist, or (2) no rows, if the row already
existed. In the latter case, you may then modify the row with
an UPDATE. An UPDATE that affects zero rows would suggest
that the row was deleted between the time of the INSERT and
the time of the UPDATE, so one would either have to loop or
take the appropriate action for when another user deletes
the row. Even that possibility depends upon the
selected TRANSACTION ISOLATION LEVEL. But it would *not*
cause an error in a transaction block forcing a rollback of
the entire transaction.
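Run twice against an initially empty table, the statement behaves as described; here is a sketch in Python with SQLite standing in for PostgreSQL purely for illustration (SQLite happens to accept the same FROM-less SELECT syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (key INTEGER, value TEXT)")

stmt = ("INSERT INTO foo SELECT 1, 'Hello' "
        "WHERE NOT EXISTS (SELECT key FROM foo WHERE key = 1)")

conn.execute(stmt)  # row absent: the subselect is empty, one row inserted
n_after_first = conn.execute("SELECT COUNT(*) FROM foo").fetchone()[0]

conn.execute(stmt)  # row present: the WHERE clause filters everything out
n_after_second = conn.execute("SELECT COUNT(*) FROM foo").fetchone()[0]
```

Within a single session this never raises an error; whether it is race-free under concurrency is a separate question, taken up in the reply.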

Mike Mascari

Tom Lane

Dec 30, 2000, 1:39:27 AM
Mike Mascari <mas...@mascari.com> writes:
> How about:

> INSERT INTO foo SELECT 1, 'Hello' WHERE NOT EXISTS (SELECT
> key FROM foo WHERE key = 1);

Cute idea, but it doesn't eliminate the race-condition problem.
If two backends execute this code concurrently, both will find
that the subselect returns no rows, and both will try to do the
INSERT. If you have a unique index on the table, then one will
succeed and the other will get an error (after waiting to see
if the first commits or not).

There's no doubt that savepoints within transactions would be a
good improvement to Postgres, and I believe Vadim has hopes of
making that happen in 7.2 or 7.3. But in the meantime, transactions
without savepoints are a lot better than "transactions" that fail to
notice errors at all. Any MySQL partisans who contend otherwise
just plain Don't Get It.

regards, tom lane
