First off, I'm a network guy. I don't speak cognos or oracle. Speak
servers of any flavour on any switch and I'll understand all but sql
stuff boggles my small brain.
Problem:
For over a year our business analysts have been complaining of returned
TNS errors. They are convinced it is a network issue. I have done
everything from rebuilding the servers to tweaking the routers and
switches. I am convinced that it is an application issue.
This is a typical error we receive on a daily basis.
DS-DBMS-E306: UDA driver error connecting to 'CMIS'.
[PROGRESS - 06:25:26] pid 2164 DimBuild Node 25
'D_INVENTORY_LOCATION'; reported the following:
DS-DBMS-E402: UDA driver reported the following:
DMS-E-GENERAL, A general exception has occurred during operation
'attach database'.
ORA-12535: TNS:operation timed out
here's another:
PowerPlay Transformer(7.1.513.0 ) Wed Jun 07 06:33:53 2006
LogFileDirectory=\\servername\corprpt\Projects\Logs\
ModelSaveDirectory=\\servername\corprpt\Projects\Source\Model\
DataSourceDirectory=C:\Program Files\Cognos\cer3\bin\
CubeSaveDirectory=\\servername\corprpt\Projects\Source\Cube\Stage\
DataWorkDirectory=d:\temp\;d:\temp1\
ModelWorkDirectory=d:\temp\
MaxTransactionNum=500000
ReadCacheSize=32768
WriteCacheSize=65536
Temporary UDA directory=C:\WINNT\Temp
06-06-07 06:34:17 AM 2 00000000 DMS-E-GENERAL, A
general exception has occurred during operation 'attach database'.
06-06-07 06:34:17 AM 2 00000000 ORA-12535:
TNS:operation timed out
06-06-07 06:34:17 AM 2 00000000
PowerPlay Transformer Wed Jun 07 06:34:19 2006
I have edited the actual name of our servers with "servername".
Can anyone out there help us figure this one out??
Thank you,
Scott
ORA-12535 is a network or firewall issue. PERIOD.
There is nothing to investigate at the application side, there is
everything to investigate at the network side like
- is there any firewall in play
- is there any Network Address Translation into play
- have the correct ports been opened in the firewall
- is Oracle using random return ports (by default Oracle always uses 2
tcp/ip ports, the first one fixed, and configured in tnsnames.ora, the
actual communication takes place on a radom port. If that is a problem
at your site you need to fix that)
- You need to run netstat -r to identify which ports are actually used.
- Maybe you should consider using a sniffer to identify further
problems.
My experience is, that apart from stubborn network and/or firewall
administrators involved, the issue can always be resolved. I have
posted the usual methods to resolve this many times, and I am aware of
the fact most current posters here do not make any attempt to research
the archives at Google. Yet I am not going to post them again. They are
easy to find, and they are also documented in the Net administrators
manual. The small number of people responding to queries should stop
rewarding laziness. If this will make this forum go extinct, so be it.
--
Sybrand Bakker
Senior Oracle DBA
> Hi Cognos and Oracle gurus,
>
> First off, I'm a network guy. I don't speak cognos or oracle. Speak
> servers of any flavour on any switch and I'll understand all but sql
> stuff boggles my small brain.
Great! Welcome to this forum. I'll try to help you out. I have no clue
about Cognos, though.
>
> Problem:
>
> For over a year our business analysts have been complaining of returned
> TNS errors. They are convinced it is a network issue. I have done
> everything from rebuilding the servers to tweaking the routers and
> switches. I am convinced that it is an application issue.
I tend to agree with you. It's probably Oracle*Net configuration.
>
> This is a typical error we receive on a daily basis.
>
> DS-DBMS-E306: UDA driver error connecting to 'CMIS'.
> [PROGRESS - 06:25:26] pid 2164 DimBuild Node 25
> 'D_INVENTORY_LOCATION'; reported the following:
> DS-DBMS-E402: UDA driver reported the following:
> DMS-E-GENERAL, A general exception has occurred during operation
> 'attach database'.
> ORA-12535: TNS:operation timed out
Here is the error message for ORA-12535:
$ oerr ora 12535
12535, 00000, "TNS:operation timed out"
// *Cause: The requested operation could not be completed within the time out
// period.
// *Action: Look at the documentation on the secondary errors for possible
// remedy. See SQLNET.LOG to find secondary error if not provided explicitly.
// Turn on tracing to gather more information.
$
The follow-up questions are:
1) Do you have sqlnet.log file in the directory from which the application
was started?
2) Was client trace enabled on the client? Oracle*Net has extensive
tracing facilities which should be enabled in cases like yours.
3) With all due respect to the network guys, where is the DBA? The DBA
would know how to enable trace on the client and server sides and
collect the necessary information.
Yes
> 2) Was client trace enabled on the client? Oracle*Net has extensive
> tracing facilities which should be enabled in cases like yours.
Yes, but not being a DBA, I'm still searching on the servers where the
trace logs are located!!!
> 3) With all due respect to the network guys, where is the DBA? The DBA
> would know how to enable trace on the client and server sides and
> collect the necessary information.
The DBA is relatively new at the game. He is the only one available in
our large enviro. These DBA's are a dying breed it seems. The poor
guy is totally overworked.
>
> --
> http://www.mgogala.com
Thanks Mladen
What does it say? Is there anything sensible in the file? You can put
something like this in your sqlnet.ora:
TRACE_DIRECTORY_CLIENT = d:\tmp
TRACE_LEVEL_CLIENT = ADMIN
That should put all the necessary information in d:\tmp or any
other directory of your choice. Server trace can, if needed, be
turned on from the listener control utility (lsnrctl).
>
>> 2) Was client trace enabled on the client? Oracle*Net has extensive
>> tracing facilities which should be enabled in cases like yours.
>
> Yes, but not being a DBA, I'm still searching on the servers where the
> trace logs are located!!!
If the server is Unix, logs are located on $ORACLE_HOME/network/log,
while trace files are located on $ORACLE_HOME/network/trace
>
>> 3) With all due respect to the network guys, where is the DBA? The DBA
>> would know how to enable trace on the client and server sides and
>> collect the necessary information.
>
> The DBA is relatively new at the game. He is the only one available in
> our large enviro. These DBA's are a dying breed it seems. The poor
> guy is totally overworked.
Yes, not many people appreciate us these days. We should probably work
for peanuts, that should make us more popular.
--
Mladen Gogala
http://www.mgogala.com
In addition to the other excellent advice you've received, it could be
something like the processes are not disconnecting cleanly, hogging
ports, so at random times sqlnet can't get a port and times out. I've
seen certain specific versions of certain unix servers fix this problem
simply by reducing tcp timeouts - the root cause in one case was poor
application design that doesn't properly handle people closing the app
on their pc uncleanly, but try to get the vendor to fix it...
We might be able to help more if you give the platforms and versions of
all tiers (OS, Oracle, etc). Those backslashes look windowish, the pid
looks unixish... no wait I see \\servername... maybe you need to make
all those local...
Also, ask the DBA if he is using MTS. There might be a file called
tnsnames.ora on the clients that can give a clue as to which server is
being used. It is also possible that certain options in the clients
sqlnet.ora affect how the attach is being made, so let us know what is
in that.
This may be version dependent:
$ oerr ora 12535
12535, 00000, "TNS:operation timed out"
// *Cause: The requested connection could not be completed within the
timeout
// period specified by the CONNECT_TIMEOUT parameter in listener.ora.
This
// error arises from the tnslsnr.
// *Action: Either reconfigure CONNECT_TIMEOUT to be 0, which means
// wait indefinitely, or reconfigure CONNECT_TIMEOUT to be some higher
// value. Or, if the timeout is unacceptably long, turn on tracing
// for further information.
You should also tell your dba to give you access to
metalink.oracle.com, and see Note: 119706.1among others:
"+ The TNS-12535 or ORA-12535 error is normally a timeout error
associated
with Firewalls or slow Networks.
+ It can also be an incorrect listener.ora parameter setting for the
CONNECT_TIMEOUT_<listener_name> value specified.
+ In essence, the ORA-12535/TNS-12535 is a timing issue between the
client and
server."
See Note:68652.1 if you have NT servers.
jg
--
@home.com is bogus.
http://www.banthisfilth.com/
<snip>
> This is a typical error we receive on a daily basis.
>
> DS-DBMS-E306: UDA driver error connecting to 'CMIS'.
> [PROGRESS - 06:25:26] pid 2164 DimBuild Node 25
> 'D_INVENTORY_LOCATION'; reported the following:
> DS-DBMS-E402: UDA driver reported the following:
> DMS-E-GENERAL, A general exception has occurred during operation
> 'attach database'.
> ORA-12535: TNS:operation timed out
<snip>
> Can anyone out there help us figure this one out??
>
> Thank you,
>
> Scott
Scott,
Have you resolved this issue? My experience is not with the Cognos
product. When do these TNS errors occur:
* Immediately when the application is started/user logs into the
program
* When displaying reports
* While using the program - enter information into a field, press tab,
and instead of information being retrieved, this error message is
returned.
* At a middle tier - a server sitting between the client and the Oracle
database.
Other questions:
* Has this error message only appeared on only a couple computers, and
never on others? Or, are all computers affected at the same time?
* What hardware and software platform is in use for the Oracle database
and the clients?
* Are the clients connecting through a VPN, or are multiple subnets
involved (you mentioned a router).
* Are there any firewalls between the client and server, or on the
client or server?
* What release of Oracle is in use? What release of the Oracle client
software is installed?
It is hard to say what you need to check based on the limited amount of
information provided. Possible scenarios:
* Windows clients running NT 4, 95, 98, ME can connect to the database
without problem, but the Windows 2000, XP, and 2003 server clients
cannot. This could be an indication that the wrong IP address is
specified in the DNS server for the database. If the NT 4, 95, 98, ME
clients cannot connect, but the Windows 2000 and above computers can
connect, and you have no internal DNS server, this may be a sign that
your Internet connection dropped temporarily and the ISP's DNS server
could not be contacted to report that it does not know of your server's
IP address. In this case, you could try hard coding the IP address for
the server in the TNSNAMES.ORA file on each of the client computers.
* There is a VPN connection or some type of WAN link between the server
and the client. The VPN or WAN link connection go down temporarily,
and the client tries to connect to the database (or display a report).
After a 40 second delay, or whatever the timeout is, the client reports
ORA-12535: TNS:operation timed out.
* Something on the server is preventing the listener from responding.
It could be another database on the server is having a bad day -
listener tries to hand off a connection to the database that is having
a bad day, and never returns from the handoff. If this happens, you
will likely see many entries in the alert logs for the databases, and
possibly several trace files.
Suggestions:
* Get specifics of when the errors occur.
* Locate the listener.log file on the server. It should be in the
Oracle installation directory, buried in directories network\log.
Examine the tail end of the log file for errors.
* Search inside the Oracle installation directory for files ending with
.log or .trc. Examine these files to see if any problems are reported.
* Search the client computers for sqlnet.ora. Examine the files for
clues.
* When this problem happens, run a packet capture utility
(Ethereal/WireShark) on the client and server, and then try to
reproduce the problem. This should tell you if the client and the
server are seeing each other, and are able to exchange multiple
packets.
Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc.
>* There is a VPN connection or some type of WAN link between the server
>and the client. The VPN or WAN link connection go down temporarily,
>and the client tries to connect to the database (or display a report).
>After a 40 second delay, or whatever the timeout is, the client reports
>ORA-12535: TNS:operation timed out.
>
>* Something on the server is preventing the listener from responding.
>It could be another database on the server is having a bad day -
>listener tries to hand off a connection to the database that is having
>a bad day, and never returns from the handoff. If this happens, you
>will likely see many entries in the alert logs for the databases, and
>possibly several trace files.
Sqlnet uses *two* ports, one to get in touch, one for the actual
communications.
The second port is determined *randomly*.
If that port isn't opened in the firewall (and why should it), you get
12535.
Listeners don't have a 'bad day'.
There are three workarounds
- set up connection manager on the database server
- set up mts on the database server to a fixed port
- set use_shared_sockets = true
--
Sybrand Bakker, Senior Oracle DBA
Regarding the "bad day" reference, this was an attempt to over-simplify
for the sake of understanding. I stated that there is a possibility
that another database instance on the server is having a "bad day" that
causes the listener on the server that is handling multiple databases
to stop responding.
One of the unfortunate common traits of people who are experts in their
field, it that they tend to communicate using terminology that only
other experts in their field understand. The person who originally
posted the request for assistance is a network administrator, not an
Oracle DBA. That is a little bit like a user of Microsoft Word asking
a C programmer on Windows how to make a dot on the screen a little more
green. And the C programmer replying to just use getpixel, perform a
binary OR with 00FF00, and then setpixelv to update the screen. The C
programmer may be right, but does it help the user of Microsoft Word?
With a properly configured firewall, the three work arounds that you
posted are unnecessary to allow the connection to pass through a
stateful firewall to a VPN connection.
I am aware of how the connection between the client and server jumps
from one set of IP ports to another. I will, however, diagree with
your assertion that "Listeners don't have a 'bad day'". Perhaps the
best description of what could cause a listener to have a bad day can
be found on page 61 of Tom Kyte's "Expert Oracle Database
Architecture":
"Now that the client software knows where to connect to, it will open a
TCP/IP socket connection to the server with the hostname
localhost.localdomain on port 1521. If the DBA for our server has
installed and configured Oracle Net, and has the listener listening on
port 1521 for connection requests, this connection may be accepted. In
a network environment, we will be running a process called the TNS
listener on our server... When it receives the inbound connection
request... [to a] dedicated server connection, the listener will create
a dedicated server for us. On UNIX, this is achieved via a fork() and
exec() system call... The new dedicated server process inherits the
connection established by the listener, and we are now physically
connected to the database. On Windows, the listener process requests
the database process to create a new thread for a connection. Once the
thread is created, the client is "redirected" to it, and we are
physically connected."
There are various other examples in the "Oracle Database Concepts 10g
Release 2" manual, "Expert Oracle Database 10g Administration", and
several other sources. What happens if a physical or artificial limit
is encountered during the fork/exec call or the request to create a new
thread. Hitting a physical limit on a 32 bit operating system is not
out of the question.