performance of updates

Ed Prochak

unread,

Dec 18, 2008, 9:53:36 AM12/18/08

to

I am in the midst of some performance improvements. I have never
thought about this aspect before but has anyone tested the difference
in performance between

UPDATE mytable SET
status = new_status
WHERE mykey = thiskey;

and

UPDATE mytable SET
status = new_status
WHERE mykey = thiskey
AND status <> new_status;

???

I haven't run my own test yet. The actual UPDATE is in a trigger but I
don't think that matters. The argument a coworker is making is that we
should update only if it makes a difference (ie if the value changed).
Since this is an update to one row, I'm not sure if the overhead of
checking additional columns is worth the minimal savings of avoiding
writing the data.

The difference seems to me like it would be minimal. Anyone have
evidence either way?

Ed

Charles Hooper

unread,

Dec 18, 2008, 10:38:15 AM12/18/08

to

What seems like a small improvement now may become a larger problem
later as the decision not to limit the rows to be updated is reused in
other applications. You might want to review the following:
http://jonathanlewis.wordpress.com/statspack-distractions/
http://jonathanlewis.wordpress.com/2007/01/02/superfluous-updates/

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

Tim X

unread,

Dec 18, 2008, 11:06:00 PM12/18/08

to

Ed Prochak <edpr...@gmail.com> writes:

There are too many variables to give an answer that would always be
true. A lot depends on the table structure, the size of columns, the
available keys and indexes and possibly the version of Orcle, the platform its
running on and the type of storage.

The only way to really know the right answer is to measure the
difference.
Personally, the first thing I'd be asking is whether having a trigger to
do the insert is actually the right design. Triggers are something which
are often very useful for specific problems, but too often are used when
it would be better to do it more explicitly another way - triggers that
do inserts and updates are possibly the worst in the sense that your
design is now relying on what could be called side effects - often a
source of major problems and errors in later maintenance work where the
people doing the maintenance are no the ones who designed the system and
may not be as familiar with all these side effects.

Tim

--
tcross (at) rapttech dot com dot au

joel garry

unread,

Dec 19, 2008, 2:36:34 PM12/19/08

to

On Dec 18, 8:06 pm, Tim X <t...@nospam.dev.null> wrote:
>
> The only way to really know the right answer is to measure the
> difference.
> Personally, the first thing I'd be asking is whether having a trigger to
> do the insert is actually the right design. Triggers are something which
> are often very useful for specific problems, but too often are used when
> it would be better to do it more explicitly another way - triggers that
> do inserts and updates are possibly the worst in the sense that your
> design is now relying on what could be called side effects - often a
> source of major problems and errors in later maintenance work where the
> people doing the maintenance are no the ones who designed the system and
> may not be as familiar with all these side effects.
>

Sometimes it's even worse than that. I'm passively sitting back and
watching a development effort which will be sucking off my production
system, done by the vendor of the system. Seems other implementations
have issues with this taking 9 hours, so I keep seeing all sorts of
stupid ideas flying about - the latest is someone wants to add a
trigger that will update date columns added to a bunch of tables.
Here's a quote:

"I would keep it very simple, like any activity on the record, even an
inquiry, would trigger an update to the field."

Okaaaaaaaayyyyy....

jg
--
@home.com is bogus.
"To our sincere regret, however, it has now emerged that the text
contains deeper levels of meaning, which are not immediately
accessible to a non-native speaker."
http://www.smh.com.au/news/home/technology/eminent-scientific-journal-gets-hit-for-sex/2008/12/11/1228584998876.html

robert

unread,

Dec 20, 2008, 12:00:53 AM12/20/08

to

Ed Prochak wrote:
> I am in the midst of some performance improvements. I have never
> thought about this aspect before but has anyone tested the difference
> in performance between
>
> UPDATE mytable SET
> status = new_status
> WHERE mykey = thiskey;
>
> and
>
> UPDATE mytable SET
> status = new_status
> WHERE mykey = thiskey
> AND status <> new_status;
>
> ???
>
> I haven't run my own test yet. The actual UPDATE is in a trigger but I
> don't think that matters. The argument a coworker is making is that we
> should update only if it makes a difference (ie if the value changed).

I did take some measurements some years ago when I must have had too
much time on my hands :) I could not demonstrate enough difference to
make it worth the effort. On the other hand if you already have the
trigger on the view anyway, I don't see how it can hurt either
(performance, that is) in any measurable degree, to have the "status"
clause tagging along.

The numbers themselves would probably be less than relevant given
today's technology, even had I kept them around (which I didn't).

What might be still relevant would be perhaps the lessons learned from
the exercise.

Consider these three points:

1) If this check were worth making, and under the assumption that plain
jane vanilla DML is going on, then why wouldn't Oracle be making the
same comparison behind the scenes? Maybe they are not -- I don't know
-- but I DO know that out of all the ways to improve performance, trying
to outsmart the Oracle engine has never been a very productive approach,
at least not in my experience.

2) Supposing the argument is that you are saving a physical write by not
updating the row...now if there is enough activity to make any
difference at all, then we have to consider the odds that some other row
on the same page or buffer got updated _anyway_, forcing the page to be
written out no matter what, and so we didn't really save any physical
writes after all.

3) Possibly the assumptions behind 1) and 2) are false, meaning there is
a lot more going on than just a simple row update. In this case, I
would not be surprised if you had bigger fish to fry.

I can think of two examples:

a) business requirements force you to provide some kind of status date,
that is given say, systime, each time the status is updated.
b) more generally developers sometimes insist on an indexed "update
date" column, that gets updated on every update of the row, presumably
so they can do some date-range extracts.

In both a) and b) updating a global index on every row update is a big
waste in best case, and hideously expensive in worst cases. I could
understand trying to avoid it. But in both cases I would argue that
there are better ways to do things and avoid the need for such indexes
altogether, which is what I mean by bigger fish to fry...

> Since this is an update to one row, I'm not sure if the overhead of
> checking additional columns is worth the minimal savings of avoiding
> writing the data.
>
> The difference seems to me like it would be minimal. Anyone have
> evidence either way?
>

just my 2c

> Ed

-- robert

Michael Austin

unread,

Dec 20, 2008, 12:11:19 PM12/20/08

to

From the OP Given:

1) Some change is made to table1
2) A trigger is evaluated - not sure if it is an insert or delete update
trigger - this can make a difference.

Actually the UPDATE in a trigger CAN matter. If mykey is properly
indexed and the number of rows where mykey=thiskey is significant-
compared to the overall data, then adding a second "status" <>
thisstatus MAY be faster especially if you have a compound mythis,status
index. If mykey is somewhat or completely unique AND is properly
indexed then adding status would not impact performance.

I have had databases (not my design - I was called in to fix
performance) where the trigger on one table would cause IIRC 11 other
table triggers to evaluate/execute. Because they did not do the initial
trigger properly, this caused severe performance when trying to load
thousands of records/sec. By ensuring the proper "trigger" was in place,
(in this case - don't update status unless status actually changed to
'somevalue' as was intended) I was able to get them about a 50% boost in
performance.

The KEY is for the DBA to know and understand the data and the logical
view of the database.

I know that over the past 5 years RAD (rapid application development)
has significantly decreased the ability of the development staff
(programmers, DBA, Sysadmin, SANadmin, and Network admins) to have any
understanding of applications and how they should function. Once
deployed, they then spend most of their waking hours trying to achieve
performance that can only be gained by designing performance into the
app. No amount of physical tuning will ever overcome a poorly designed
(or should I say thrown together) application and database. I was
recently asked to review a database that was developed on a PC under
someones desk with Access and quickly converted to single CPU SQL Server
and deployed to support >10K users/min - and they were simply amazed
that it didn't work.

Ed Prochak

unread,

Dec 22, 2008, 1:28:33 PM12/22/08

to

On Dec 20, 12:11 pm, Michael Austin <maus...@firstdbasource.com>
wrote:

mykey is indeed unique, my issue should not be improved with the
additional condition (but not hurt either).
This looks like what I am seeing so far.

>
> I have had databases (not my design - I was called in to fix
> performance) where the trigger on one table would cause IIRC 11 other
> table triggers to evaluate/execute. Because they did not do the initial
> trigger properly, this caused severe performance when trying to load
> thousands of records/sec. By ensuring the proper "trigger" was in place,
> (in this case - don't update status unless status actually changed to
> 'somevalue' as was intended) I was able to get them about a 50% boost in
> performance.

Impressive.
In my case if there was a trigger chain, I would be at the tail end
of it.

>
> The KEY is for the DBA to know and understand the data and the logical
> view of the database.

Separate story. I won't start into that now.

>
> I know that over the past 5 years RAD (rapid application development)
> has significantly decreased the ability of the development staff
> (programmers, DBA, Sysadmin, SANadmin, and Network admins) to have any
> understanding of applications and how they should function. Once
> deployed, they then spend most of their waking hours trying to achieve
> performance that can only be gained by designing performance into the
> app. No amount of physical tuning will ever overcome a poorly designed
> (or should I say thrown together) application and database. I was
> recently asked to review a database that was developed on a PC under
> someones desk with Access and quickly converted to single CPU SQL Server
> and deployed to support >10K users/min - and they were simply amazed
> that it didn't work.

Well, I'll repeat my performance mantra here: the first three steps to
improving performance in Oracle are:
1. analyze
2. Analyze
3. ANALYZE
The fact that after running an analyze in the test database recently
broke lots of things, may give you a hint at my situation.

Thanks ALL!
Ed