Any way to reduce waits over the link? (SQL*Net message from dblink)

NetComrade

unread,

Sep 10, 2003, 12:41:19 PM9/10/03

to

The queries are fine (not pulling whole tables)
The network is fine (running at about half the capacity)
Db's are not direct-connected, traffic goes over a switch, 100Mbps
ethernet.
When looking at trace files, it looks like oracle does some kind of
checking over the link to make sure the objects are valid, that might
be a contributor to these waits, any way to get rid of this?
Some queries have already been 'tuned' to use snapshots, but some
other queries require instanteneous updates to local tables, so
snapshots won't work.

My last statspack for an hour shows:
Avg
Time TotalWait wait Waits
Event Waits outs Time(cs) (ms) /txn
--------------------------- -------- ---- -------- ---- ------
SQL*Net message from dblink 1,428,648 55 454,719 3 7.1
db file sequential read 588,295 0 440,978 7 2.9
log file sync 94,051 5 67,230 7 0.5
db file parallel write 31,395 0 44,485 14 0.2

454,719cs/100/60=75mins of wait time on a 10cpu machine
the file reads cannot really be tuned, unless i increase the buffer
cache a bit, but it's running at 99% hit rate.
log file sync's can't be tuned either, and they don't bother me much,
redo is running on dedicated disks, striping didn't help.
.......
We use Oracle 8.1.7.4 on Solaris 2.7 boxes
remove NSPAM to email

Brian Peasland

unread,

Sep 10, 2003, 2:40:15 PM9/10/03

to

> The queries are fine (not pulling whole tables)

Are you sure? Have you taken the query that is sent to the remote
database and tuned it on that remote database? What percentage of your
data are you pulling across your database link as opposed to the amount
of data you actually need on the local database? Maybe its a good idea
to do a full table scan instead of an index lookup! One never knows for
sure until they tune the query on the remote database.

> The network is fine (running at about half the capacity)

Just because a network runs at 50% capacity or 10% capacity does not
necessarily mean that data is moving fast across the network. Granted,
if the network is at 100% capacity, data will move slower....

> Avg
> Time TotalWait wait Waits
> Event Waits outs Time(cs) (ms) /txn
> --------------------------- -------- ---- -------- ---- ------
> SQL*Net message from dblink 1,428,648 55 454,719 3 7.1

This one event contributes approximately 50% of your total wait time.
This event will only occur if you are waiting on communication from a
remote database, initiated at your local database. i.e. a database link
is involved here between two databases. Have you tuned your distributed
query?

> db file sequential read 588,295 0 440,978 7 2.9

You might want to see if this event can be tuned too since it accounts
for nearly the other 50% of your total wait time.

> log file sync 94,051 5 67,230 7 0.5
> db file parallel write 31,395 0 44,485 14 0.2
>
> 454,719cs/100/60=75mins of wait time on a 10cpu machine
> the file reads cannot really be tuned, unless i increase the buffer
> cache a bit, but it's running at 99% hit rate.

Why can't the file reads be tuned? You may be requesting more logical
reads than you really need, which *may* translate to more physical reads
than you need. The BCHR at 99% doesn't mean that this wait event is a
waste of time to tune. This one event contributes 44% of your total wait
time. It may be a good idea to investigate this further. Is there I/O
contention on the disks or in the controllers? Are these disks fast
enough? Your BCHR of 99% does not mean that you don't have I/O tuning
(logical and/or physical) to do. That type of reasoning is a myth.

> log file sync's can't be tuned either, and they don't bother me much,
> redo is running on dedicated disks, striping didn't help.

HTH,
Brian

--
===================================================================

Brian Peasland
dba@remove_spam.peasland.com

Remove the "remove_spam." from the email address to email me.

"I can give it to you cheap, quick, and good. Now pick two out of
the three"

Tanel Poder

unread,

Sep 10, 2003, 3:09:56 PM9/10/03

to

Hi!

My first question would be, what's the problem? Is your application slow or
are you just tuning everything what appears in statspack top-5 list? (In
other words - CTD is a disease probably even worse than SARS, thousands of
DBAs have left their lives and spend all their time tuning until they die
(or the sama happens to all of their databases) ;)

Note that when you post a statspack top-5 output taken in some timeframe,
you should also post the information how much CPU was used during this
timeframe, this way we can see the ratio between service & wait times and
recommend whether it'd be appropriate to concentrate on waits at all. There
should be statistic called CPU used ... or something in SP report as well.

But anyway, your stats report shows that during this hour all of your
sessions were spending totally 12,5% on waiting for data from dblink. You
have about 7 waits per transaction and average wait lasts for 3ms. You have
two options here, either reduce number of waits (thus reduce number of
roundtrips to remote server by tuning sql etc or reduce the average duration
of a wait).
I can't help you with sql, but wait time "more data from dblink" generally
breaks to two: 1) network latency(wait time) 2) query execution&fetching
(service time).

1) network latency can be tuned using TCP parameters, using faster protocol
(named pipes in windows maybe), using dedicated network, using faster
network, etc..
2) query execution & fetching - this is normal query response time tuning,
just in remote server.

> The queries are fine (not pulling whole tables)

When latency is a problem, it is wiser to pull more data with less requests.
This is design & SQL issue.

> The network is fine (running at about half the capacity)
> Db's are not direct-connected, traffic goes over a switch, 100Mbps
> ethernet.
> When looking at trace files, it looks like oracle does some kind of
> checking over the link to make sure the objects are valid, that might
> be a contributor to these waits, any way to get rid of this?

If you're joining tables in remote location, then create views with required
join conditions there and select from them instead of drectly from tables.
That way you can save sqlnet roundtrips required for checking validity (or
whatever) of referenced objects. That way you only evaluate your view, not
any underlying objects. Also, note that selecting the first time over
database link in a session does more roundtrips than subsequent operations.
Thus if you got a web application which always spawns a new session, the
overhead could come there..

Tanel.

Tanel Poder

unread,

Sep 10, 2003, 3:42:36 PM9/10/03

to

Here's a little proof to my post (that selecting from a view in remote database which is doing join of some tables, causes fewer sqlnet roundtrips than doing the join directly, without a view):

C:\Work\Oracle>sqlplus "admin/admin"

SQL*Plus: Release 9.2.0.4.0 - Production on K Sep 10 22:15:25 2003

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production

SQL> create table t1 (id number, name varchar(10));

Table created.

SQL> create table t2 (id number, name varchar(10));

Table created.

I create a database link which actually references to my own schema, but it will do the job.

SQL> create database link l connect to admin identified by admin using 'orcl';

Database link created.

SQL> select n.name, s.value from v$sesstat s, v$statname n where n.statistic# = s.statistic#
2 and s.sid = (select sid from v$mystat where rownum < 2)
3 and n.name like 'SQL*Net roundtrips to/from dblink';

NAME VALUE
---------------------------------------------------------------- ----------
SQL*Net roundtrips to/from dblink 0

SQL> select t1.name name1, t2.name name2 from t1@l, t2@l where t1.id = t2.id;

no rows selected

SQL> select n.name, s.value from v$sesstat s, v$statname n where n.statistic# = s.statistic#
2 and s.sid = (select sid from v$mystat where rownum < 2)
3 and n.name like 'SQL*Net roundtrips to/from dblink';

NAME VALUE
---------------------------------------------------------------- ----------
SQL*Net roundtrips to/from dblink 12

First select to these 2 tables causes 12 sqlnet roundtrips

SQL> select t1.name name1, t2.name name2 from t1@l, t2@l where t1.id = t2.id;

no rows selected