Oracle hangs on select statement on certain computers

Dan

unread,

Feb 20, 2006, 5:35:12 PM2/20/06

to

I'm not really sure where to post this as I'm not really sure where the
problem lies, but I THINK it may be an oracle problem:

I have a problem with certain select statements hanging (either
retrieving several hundred rows than just stopping on the same row
every time, or just hanging before retrieving anything.) These exact
statements do work on most of our computers and up until a few weeks
ago ran fine on the few that now hang. And even on the computers that
have the problem, it's only 3 particular databases (2 of which are on
the same server but all of which belong to the same client and are
accessed via VPN.) This problem is consistantly reproducable on the
same computers with the same statements with the same databases. The
statements are being run through an application that we created in
PowerBuilder and I have also tried them in a database utility that we
also created in PowerBuilder, SQL Plus, & TOAD for Oracle. On the
computers that have the problem, the statements freeze in both our
PowerBuilder apps, and in a version of TOAD (I forget what) prior to
8.6.0.38; they work in SQL Plus (which I'm told DOESN'T use the oracle
API) and TOAD 8.6.0.38. We have recently upgraded TOAD on all machines
which is why I'm not sure what the version was that was failing, but in
the previous version the statements DID return fine when run as single
statements (which retrieves a few rows into a table and retrieves the
rest when you scroll down); they just hung when executing as a script
(which retrieves everything at once.)

Not everyone has the exact same versions of everything (particularly
Oracle and Windows-we all use XP Pro but there still are version
differences..) or the exact same programs installed. I have tried two
computers with the same Oracle version though and one worked while one
did not. One strange thing about this problem is that the 3 computers
that have the problem (there are 3 that I know of and they all started
having the problem a few weeks ago) are the 3 that access the
particular databases the most; this may or may not be coincidence. As
I mentioned, the databases are accessed by us over VPN but the client
(who accesses them directly) reported to us the other day a problem
that they had in which it hangs when the app runs certain insert or
delete statements. We are aware that locks on the tables could cause
these problems but we tried and the client tried when everything was
unlocked and they hung, locking the tables, on all their databases
while only two of the three hung for us.

All three of their servers are at oracle 9.2.0.6.0; I just now thought
of asking if this upgrade was made recently; if so, I'll post that info
in here tomorrow.

Oh, and by the way, not only does the structure (ie. columns/tables/etc
used) affect which statements hang, but the data used in them (ie.
"where flag='x'" vs. "where flag=' '"), which doesn't change the
execution plan, is a factor.

So does anyone have any suggestions?

Thanks, Dan

Sybrand Bakker

unread,

Feb 20, 2006, 6:25:00 PM2/20/06

to

On 20 Feb 2006 14:35:12 -0800, "Dan" <everk...@gmail.com> wrote:

> On the
>computers that have the problem, the statements freeze in both our
>PowerBuilder apps, and in a version of TOAD (I forget what) prior to
>8.6.0.38; they work in SQL Plus (which I'm told DOESN'T use the oracle
>API) and TOAD 8.6.0.38.

sql*plus definitely DOES use the Oracle Call Interface, so what you
call 'the oracle API'
My take on this is simple: if it works in sql*plus it is NOT a
database problem, but likely a *transport* problem, either
- not using array fetching, causing a roundtrip between every 2
records being fetched
- sqlnet fragmenting packets because of a mismatch of Session Data
Unit and MTU of the network card
- network card fragmenting packets.
- and then of course the assorted mess of various drivers you seem to
support

- the (non)use of array fetching can be established by tracing an
Oracle session
- sqlnet fragmentation can be observed by tracing sqlnet
- network fragmentation can be observed by a sniffer

I would start by issuing DOS pings and tracert to the database server
from the affected clients and observe the differences.

--
Sybrand Bakker, Senior Oracle DBA

Mark D Powell

unread,

Feb 20, 2006, 7:53:05 PM2/20/06

to

Dan, different systems would have different data. Have you looked at
the statistics for the problem databases and compared the statistics to
the actual data to make sure your application is not running into any
type of skewed data accessing problem?

Is the alert log and user trace file directories for each affected
database clean?

If these applications use an Oracle client does the version of the
client installed match the version of the target database?

HTH -- Mark D Powell --

Dan

unread,

Feb 21, 2006, 11:51:13 AM2/21/06

to

I ran a trace on this through my sqlnet.ora file and it seems to do a
small amount of data processing and then it gets stuck in an infinite
loop of the folowing:

[21-FEB-2006 10:31:02:568] nsdo: entry
[21-FEB-2006 10:31:02:568] nsdo: cid=0, opcode=85, *bl=0, *what=0,
uflgs=0x0, cflgs=0x3
[21-FEB-2006 10:31:02:568] nsdo: rank=64, nsctxrnk=0
[21-FEB-2006 10:31:02:568] nsdo: nsctx: state=8, flg=0x100420d, mvd=0
[21-FEB-2006 10:31:02:568] nsdo: gtn=127, gtc=127, ptn=10, ptc=2047
[21-FEB-2006 10:31:02:568] nsdo: switching to application buffer
[21-FEB-2006 10:31:02:568] nsrdr: entry
[21-FEB-2006 10:31:02:568] nsrdr: recving a packet
[21-FEB-2006 10:31:02:568] nsprecv: entry
[21-FEB-2006 10:31:02:568] nsprecv: reading from transport...
[21-FEB-2006 10:31:02:568] nttmrd: entry
[21-FEB-2006 10:31:02:568] ntt2err: entry
[21-FEB-2006 10:31:02:568] ntt2err: exit
[21-FEB-2006 10:31:02:568] nttmrd: socket 384 had bytes read=0
[21-FEB-2006 10:31:02:568] nttmrd: exit
[21-FEB-2006 10:31:02:568] nsprecv: transport read error
[21-FEB-2006 10:31:02:568] nsprecv: error exit
[21-FEB-2006 10:31:02:568] nserror: entry
[21-FEB-2006 10:31:02:568] nsrdr: error exit
[21-FEB-2006 10:31:02:568] nsdo: *what=0, *bl=0
[21-FEB-2006 10:31:02:568] nsdo: nsctxrnk=0
[21-FEB-2006 10:31:02:568] nsdo: error exit
[21-FEB-2006 10:31:02:568] nioqrc: wanted 1 got 0, type 0
[21-FEB-2006 10:31:02:568] nioqer: entry
[21-FEB-2006 10:31:02:568] nioqer: incoming err = 12151
[21-FEB-2006 10:31:02:568] nioqce: entry
[21-FEB-2006 10:31:02:568] nioqce: exit
[21-FEB-2006 10:31:02:568] nioqer: returning err = 3123
[21-FEB-2006 10:31:02:568] nioqer: exit
[21-FEB-2006 10:31:02:568] nioqrc: exit
[21-FEB-2006 10:31:02:678] nioqrc: entry

Anyone seen this?

ianal Vista

unread,

Feb 21, 2006, 9:52:27 PM2/21/06

to

"Dan" <everk...@gmail.com> wrote in news:1140540673.893089.63120
@o13g2000cwo.googlegroups.com:

> I ran a trace on this through my sqlnet.ora file and it seems to do a
> small amount of data processing and then it gets stuck in an infinite
> loop of the folowing:
>

> [21-FEB-2006 10:31:02:568] nioqer: incoming err = 12151
> [21-FEB-2006 10:31:02:568] nioqce: entry
> [21-FEB-2006 10:31:02:568] nioqce: exit
> [21-FEB-2006 10:31:02:568] nioqer: returning err = 3123
> [21-FEB-2006 10:31:02:568] nioqer: exit
> [21-FEB-2006 10:31:02:568] nioqrc: exit
> [21-FEB-2006 10:31:02:678] nioqrc: entry
>
>
> Anyone seen this?
>
>

12151, 00000, "TNS:received bad packet type from network layer"
// *Cause: Internal error.
// *Action: Not normally visible to the user. For further details, turn
// on tracing and reexecute the operation. If error persists, contact
// Worldwide Customer Support.

03123, 00000, "operation would block"
// *Cause: This is a status code that indicates that the operation
// cannot complete now.
// *Action: None; this is not an error. The operation should be retried
// again for completion.

Colum Mylod

unread,

Feb 22, 2006, 7:23:43 AM2/22/06

to

On Wed, 22 Feb 2006 02:52:27 GMT, ianal Vista
<ianal...@hotmail.com> wrote:

>12151, 00000, "TNS:received bad packet type from network layer"
>// *Cause: Internal error.
>// *Action: Not normally visible to the user. For further details, turn
>// on tracing and reexecute the operation. If error persists, contact
>// Worldwide Customer Support.

I've seen this on a system with an intermittent network cable that
would cause irregular hiccups on large data. In other words, the OP
should probably check the layers below Oracle right down to the metal.
About one of the worst problems to track down!

--
New anti-spam address cmylod at despammed dot com

steve

unread,

Feb 24, 2006, 5:52:48 AM2/24/06

to

On Wed, 22 Feb 2006 00:51:13 +0800, Dan wrote
(in article <1140540673....@o13g2000cwo.googlegroups.com>):

try changing the network cable between the computer nic & your wall socket.
i had the same, when the cable flapped about it caused the oracle to hang.