Most busy process

Dheeraj K

Nov 26, 2010, 8:04:58 AM

Hi,

I am aware of the viewsys utility for checking the percentage of CPU
busy. However, I am not sure whether there is a utility that shows
which process is keeping a CPU busiest.

Is there any way to check this on Tandem for a particular CPU?

Regards
DK

wbreidbach

Nov 26, 2010, 9:00:02 AM

There are several utilities available. I am using "mini-MOMI" from
blackwood-systems.com. It is free, easy to install and provides a lot
of information.
As long as the busy process is an application process, you will get the
information you need. The same is true of all the other free
tools. On the Yahoo forum you can find another tool to download.
The problems begin if your busiest process is a discprocess. Then
you have to find out which process is keeping it busy, and as far as I
know there is no freeware tool that gives that information.
I wrote a monitoring toolbox myself. One part of it is a program
that monitors CPUs and processes and can show you the active processes.
If a discprocess is too active, it finds out which process, and which
file or table accessed by that process, is causing the activity.
We use that toolbox together with the open-source monitoring
tool Nagios. A sample Nagios screen from that monitoring can be found
on slide 27 in
http://www.innug.org/images/stories/2010_NOW_PPT/03.b-wolfgang.pdf (in
German). All information on that screen is collected by our toolbox.
We intend to do complete monitoring of all NonStop resources with that
toolbox.

We are not a vendor and at the moment there are no plans to sell that
toolbox. A test version is available. If there is interest, I could
provide a short description here.

Dheeraj K

Nov 26, 2010, 10:09:54 AM

On Nov 26, 7:00 pm, wbreidbach <wolfgang.breidb...@bv-

Sure, that would be great. I also found that we can do the same using
Measure (the MEASCOM utility).

Regards
DK

wbreidbach

Nov 26, 2010, 10:30:01 AM


In fact, everything besides CPU busy and process busy depends on
Measure in some way. Internally, we use Measure to find the
process that is stressing the disk.
Products like MOMI or PROGNOSIS rely on Measure. But if you
want something like a graphical interface or automatic
detection of very busy processes, you have to buy such a tool.

So, as proposed, here is the short description. The toolbox is
multilingual; at the moment messages are available in German and
English. I have to maintain 4 systems, so I created an installation
procedure that usually takes no more than 30 minutes to complete. Did
I tell you that I am a lazy guy? All necessary parameters are
collected during the installation process.

Monitoring NonStop

The new monitoring was necessary because we wanted to integrate our
NonStop systems into our Nagios monitoring. Nagios is an open-source
monitoring tool running on Unix-based systems. Unfortunately there is
no NonStop client for Nagios, so we were forced to develop our own
solution.
The basic idea was to create a product that needs very little
configuration and maintenance. This product should be able to deliver
all the relevant events on the NonStop systems to Nagios, or at least
to prepare the data so that Nagios can fetch it.

Some sample events are:

High CPU usage by a process
Long CPU queues over an extended period
Very high memory usage
Missing processes
Files or tables running full
Line problems
Too many or too few Pathway server processes
Problems with TMF

The application has many parameters for adjusting its behaviour. Each
of them has a default based on our experience.
Because of those defaults the application usually needs very few
parameters to be set; it can be started "out of the box".

The whole application is based on an SQL database that includes the
event history.
For each of the so-called "subsystems", such as PATHWAY or CPU, there are
configuration tables and event tables. With a few exceptions, the
configuration tables are created automatically. They contain all
information about the objects. For an X.25 line, for example, the
table records which port on which SWAN box is used, which number, and
so on. In addition, space has been reserved for manual documentation.

When an event occurs, a table entry with the current timestamp is
created. When the problem is solved, the entry is marked as deleted
with the new timestamp. So you have a history of all events.
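
The event-history idea can be sketched roughly as follows. This is only an illustration in Python with SQLite, not the toolbox's real schema: the table name, column names, and function names are all assumptions.

```python
import sqlite3
import time


def open_db():
    """Create an in-memory event table (illustrative schema only)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE events ("
        " id INTEGER PRIMARY KEY,"
        " subsystem TEXT NOT NULL,"
        " event_no INTEGER NOT NULL,"
        " object TEXT NOT NULL,"
        " raised_at REAL NOT NULL,"
        " solved_at REAL)"  # solved_at stays NULL while the problem is active
    )
    return db


def raise_event(db, subsystem, event_no, obj, now=None):
    """Insert a new active event and return its row id."""
    now = time.time() if now is None else now
    cur = db.execute(
        "INSERT INTO events (subsystem, event_no, object, raised_at)"
        " VALUES (?, ?, ?, ?)",
        (subsystem, event_no, obj, now),
    )
    return cur.lastrowid


def solve_event(db, event_id, now=None):
    """Mark the event as solved instead of deleting it, so history is kept."""
    now = time.time() if now is None else now
    db.execute(
        "UPDATE events SET solved_at = ? WHERE id = ? AND solved_at IS NULL",
        (now, event_id),
    )


def active_events(db):
    """Return all events that have not been solved yet."""
    return db.execute(
        "SELECT id, subsystem, event_no, object FROM events"
        " WHERE solved_at IS NULL ORDER BY id"
    ).fetchall()
```

Because solved rows are only stamped and never removed, a plain SELECT over the table gives the complete history.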

In addition, we collect some statistical data. The size of
the monitored files is written to a table once a day, and the mapping
between ANSI names and Guardian names is kept in a table, so the
application people do not need to know the Guardian names.
Another table contains the dates of the last SQL/MP statistics, the
last SQL/MX statistics, and the last reload.

For the CPU data we maintain a table containing hourly and daily
averages and maximums.
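
As a rough illustration of how such hourly figures could be derived from raw samples (a Python sketch; the sample layout and the function name are assumptions, not the toolbox's actual design):

```python
from collections import defaultdict
from datetime import datetime


def hourly_stats(samples):
    """Aggregate CPU-busy samples per CPU and hour.

    samples: iterable of (timestamp, cpu_number, busy_percent).
    Returns {(cpu_number, hour_start): (average, maximum)}.
    """
    buckets = defaultdict(list)
    for ts, cpu, busy in samples:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(cpu, hour)].append(busy)
    return {
        key: (sum(values) / len(values), max(values))
        for key, values in buckets.items()
    }
```

Daily figures would follow the same pattern with the timestamp truncated to the date instead of the hour.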

The event tables are read by the Message-Collector. The program reads
all events that are still active and all events that have been solved
but not yet acknowledged. Every event is acknowledged with a timestamp.
The solving of a problem is acknowledged, too.
Every event contains a subsystem and an event number. Based on this
information, a text message is created from a template. The keywords
in the template start with # and are replaced by the actual values.
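
A minimal Python sketch of this kind of keyword replacement follows. The template text and the keyword names (#PROCESS, #BUSY, #CPU) are invented for illustration; the toolbox's real templates are not shown in this thread.

```python
import re


def render(template, values):
    """Replace #KEYWORD markers in a template with values from a dict.

    Keywords without a matching value are left unchanged, so a missing
    value stays visible in the resulting message.
    """
    def substitute(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return re.sub(r"#([A-Za-z_][A-Za-z0-9_]*)", substitute, template)
```

For example, `render("Process #PROCESS uses #BUSY% of CPU #CPU", {"PROCESS": "$APP1", "BUSY": 93, "CPU": 2})` yields `Process $APP1 uses 93% of CPU 2`.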

During development we identified several additional benefits. If the
Pathway monitoring finds out that too few static servers are running,
it can submit a start command.

Another tool is the automatic reload. The program checks whether the
monitored files need a reload, depending on given parameters.
It checks not only for free space but also for
fragmentation of the file or table, similar to TRA
(TandemReloadAnalyzer). If the program finds that a reload is
necessary, the reload is started automatically in a waited way. This
program has reduced the manual checking to nearly zero.
Of course the program can handle 32k blocking as well.
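
The decision logic might look something like the Python sketch below. The thresholds, field names, and the way fragmentation is expressed as a percentage are assumptions made for illustration; the real program's criteria and parameters are not described here.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    name: str
    free_pct: float           # remaining free space in percent
    fragmentation_pct: float  # hypothetical fragmentation measure in percent


def reload_reasons(stats, min_free_pct=10.0, max_fragmentation_pct=30.0):
    """Return the reasons why a reload looks necessary (empty list if none)."""
    reasons = []
    if stats.free_pct < min_free_pct:
        reasons.append("low free space")
    if stats.fragmentation_pct > max_fragmentation_pct:
        reasons.append("fragmented")
    return reasons
```

A caller would start the reload only when the returned list is not empty, and could log the reasons alongside the event.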

The Monitoring NonStop package contains several programs. All processes
have to be defined as controlled by the Persistence-Monitor. All
programs accept parameters from the central parameter table CHECKPAR,
and in addition parameters are read from the startup message. The
parameters from the startup message override the table data, but on
a refresh command the table entries are activated. In the
startup message the parameter names must have a leading dash, for
example -INTERVAL; the parameter name in the table is INTERVAL.
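
The precedence rule can be illustrated with a small Python sketch. It assumes a startup message of the form "-NAME value -NAME value", which is a guess at the format; the class and function names are hypothetical.

```python
def parse_startup(message):
    """Parse '-NAME value' pairs from a startup message (assumed format)."""
    tokens = message.split()
    params = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("-") and i + 1 < len(tokens):
            # Strip the leading dash so names match the table entries.
            params[token[1:].upper()] = tokens[i + 1]
            i += 2
        else:
            i += 1
    return params


class Parameters:
    """Table values, overridden by startup-message values until a refresh."""

    def __init__(self, table, startup_message=""):
        self._table = dict(table)
        self._overrides = parse_startup(startup_message)

    def get(self, name):
        return self._overrides.get(name, self._table[name])

    def refresh(self, table):
        """On refresh the table entries become active and overrides are dropped."""
        self._table = dict(table)
        self._overrides.clear()
```

With a table value of INTERVAL = 60 and a startup message "-INTERVAL 30", `get("INTERVAL")` returns "30" until `refresh()` is called with the current table contents.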

As a side-effect, the following information is detected automatically
and stored in tables:
CPU-information
OS-information (creating a complete history of versions)
Line-configuration
TCP/IP-configuration
PATHWAY-configuration
RDF-configuration
Netbatch-configuration
Spooler-configuration
Device-configuration

Additional functionality will be added in the near future.

Connection to Nagios

There are 2 ways in which current events are transported to Nagios:

1. A flat file containing all the active events and performance data
2. SNMP-traps for critical messages

The flat file is fetched periodically by Nagios via an SSH connection,
so all the events can be seen by Nagios.
The SNMP traps are generated as soon as a critical event occurs.
Nagios then creates an alarm and sends out e-mails and an SMS to the
operator on standby. (We do not have 24x7 operations; during the night
and most of the weekend people are on standby.)
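
The split between the two channels could be sketched like this in Python. The semicolon-separated line format, the "CRITICAL" severity label, and the dictionary keys are assumptions for illustration; the actual flat-file layout and trap contents are not documented in this post, and sending the trap itself is left out.

```python
def route_events(events):
    """Split active events between the flat file and the SNMP-trap channel.

    events: list of dicts with 'severity', 'subsystem' and 'text' keys.
    Returns (flat_file_lines, trap_events): every event becomes one line of
    the flat file, and critical events are additionally selected for a trap.
    """
    lines = []
    traps = []
    for event in events:
        lines.append(
            f"{event['severity']};{event['subsystem']};{event['text']}"
        )
        if event["severity"] == "CRITICAL":
            traps.append(event)
    return lines, traps
```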

MicroTech

Nov 28, 2010, 10:38:34 PM

On Nov 26, 9:04 pm, Dheeraj K <dhiraj.ka...@gmail.com> wrote:

> ... (most busy) process on Tandem for a particular CPU?

Hello Dheeraj!

Keeping things simple is always good... Here's a simple TACL macro,
using Measure to produce what you want:

== cut here...
?SECTION procbusy MACRO
#frame
#push vfile
#set vfile [ #createprocessname ]
#set vfile [ #charget vfile 2 for 4 ]
#set vfile meas[ vfile ]
#set #inlineecho 0
#push #inlineprefix
#set #inlineprefix +
meascom / inline /
+ add process *
+ start [ vfile ], interval 10 seconds
#output Measuring process activity (30 sec, interval 10, file [ vfile ])...
status *,user
delay 30 seconds
+ stop [ vfile ]
+ add [ vfile ]
+ list process *, by cpu-busy (descending), if cpu-num = %1%
#inlineeof == Stop process list...
#inlineeof == Stop Meascom...
#pop #inlineprefix
delay 1 second == Give measfh time to close the datafile...
purge [ vfile ]
#unframe
== Cut here, save as Tandem text file <macro name>...

After the TACL command "LOAD/KEEP 1/<macro name>", at the same TACL
prompt, enter:

> procbusy <cpu number>

And the busiest process (in terms of IPU load) will be listed.

On a related note, it is sometimes desirable to find out programmatically
which node is the *least* busy (earlier post to this group). For a
freebie TAL function that does just that, feel free to visit my
website at tinyurl.com/TNS-freebies (function LeastBusyNode())!

Cheers,

Henry Norman
MicroTech Consulting
sites.google.com/site/microtechnonstop

wbreidbach

Dec 2, 2010, 12:10:21 PM

I am not sure that is what he would like to have.
There are some older tools like cpubusy or viewcpu (part of viewproc) that
would do the trick out of the box without using Measure. As mentioned
before, what happens if the busiest process is a discprocess? You
have to set up a measurement for that disc, and usually by the time you
have managed to set it up everything is quiet again.
For those who do not know viewproc or viewcpu: it is a little tool that
runs through one CPU or through the whole system, picks up the busiest
processes, and displays them in full-screen mode on a 6530. The display
is updated automatically at configurable intervals and looks like this:

.pid.. .......process name....... . name. ... home terminal... pri owner   %busy
00,000 ** MONITOR **              $:0:0:8 I#CLCI               201 255,255 00.6
00,320 $SYSTEM.SYS01.TSYSDP2      $DATA02 T2.$ZHOME            220 255,255 00.2

The original program was written in pre-D-release times. I made a
lot of modifications, using the current procedures and extending the
number of processes; it runs fine on H-series. The home terminal
display has been changed to suit our needs. The program is able to decode
OSS filenames.
If somebody is interested, feel free to contact me for a free copy.

MicroTech

Dec 3, 2010, 3:32:52 AM

On Dec 3, 1:10 am, wbreidbach <wolfgang.breidb...@bv-
zahlungssysteme.de> wrote:

> what happens if the most busy process is a discprocess?

The TACL macro is easily modified to display the top x consumers...
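
The "top x" selection itself is simple once per-process busy figures are available. Here is a Python sketch of the idea, independent of TACL and Measure; the tuple layout and the function name are assumptions.

```python
import heapq


def top_consumers(samples, cpu, n=5):
    """Return the n busiest processes on one CPU, busiest first.

    samples: iterable of (cpu_number, pin, process_name, busy_percent).
    """
    on_cpu = [sample for sample in samples if sample[0] == cpu]
    return heapq.nlargest(n, on_cpu, key=lambda sample: sample[3])
```

Looking at more than the single busiest entry helps when that entry is a discprocess, because the next entries may hint at which application process is driving it.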

> There are some older tools like CPUBUSY or VIEWCPU

Also, we used to have "Offender" (this little gem still runs fine on
most of my client systems here in Asia). Is it no longer distributed
with new systems?

> If somebody is interested, feel free to contact me for a free copy.

I'd very much like to have a copy (henry.k...@gmail.com)! Many
thanks!

I downloaded your migration presentation: Again thanks! Great Job!
Well done, Wolfie!

Cheers,

Henry Norman
MicroTech Consulting
https://sites.google.com/site/microtechnonstop/