
Connect:Direct (NDM) CPU Usage


Kelman, Tom
May 2, 2008, 11:53:22 AM
We run Connect:Direct (used to be called NDM) from Sterling Commerce.
Yesterday afternoon the started task, CDNDM, took 50-60% of an
engine on our z9 BC for almost 2 hours while it transferred a large file.
This caused us to hit our softcap and affected other tasks in the
system. We have the CDNDM task running in our STCLO service class which
is set for Vel=50 and an importance level of 4. It was still running at
a high DP and grabbed the CPU. I can't understand why a task that is
basically transmitting a file over the network should need this much
CPU. My only explanation might be that it is compressing the data
before putting it on the network and the compression algorithm isn't the
most efficient in the world.

Has anyone else had this kind of a problem running Connect:Direct?
What, if anything, did you do to control it?

Tom Kelman

Commerce Bank of Kansas City

(816) 760-7632


Mansell, George R.
May 2, 2008, 12:00:21 PM
I have not seen this problem, but NDM has different levels of
compression. Do you know what level of compression was used? I think
level 1 is the default.

George Mansell
UMB Bank
816-860-1149
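
As an illustration of the level trade-off George mentions, the sketch
below times Python's zlib at a few levels. zlib is only a stand-in here;
the thread never identifies C:D's actual compression algorithm.

# Illustrative only: zlib stands in for Connect:Direct's compressor,
# whose algorithm and levels aren't identified in this thread.
import time
import zlib

data = open(__file__, "rb").read() * 200   # any repetitive input will do

for level in (1, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(data)} -> {len(out)} bytes in {elapsed * 1000:.1f} ms")

Higher levels shrink the data more but burn noticeably more CPU per
megabyte, which is why the default level matters for a transfer this large.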

Dave Thorn
May 2, 2008, 12:08:05 PM
Tom, we have had similar issues where it impacted performance too. What
we did was assign it to a resource group to keep it from hurting
others while still giving it enough resources to get its work done.

Dave Thorn * Senior Technology Analyst * SunGard Computer Services * 600
Laurel Oak Road, Voorhees, NJ, 08043
Office 856 566-5412 * Mobile 609 781-0353 * Fax 856 566-3656


Mark Zelden
May 2, 2008, 12:08:55 PM
On Fri, 2 May 2008 10:53:02 -0500, Kelman, Tom
<Thomas...@COMMERCEBANK.COM> wrote:

>We run Connect:Direct (used to be called NDM) from Sterling Commerce.
>Yesterday afternoon the started task, CDNDM, took 50-60% of an
>engine on our z9 BC for almost 2 hours while it transferred a large file.
>This caused us to hit our softcap and affected other tasks in the
>system. We have the CDNDM task running in our STCLO service class which
>is set for Vel=50 and an importance level of 4. It was still running at
>a high DP and grabbed the CPU. I can't understand why a task that is
>basically transmitting a file over the network should need this much
>CPU. My only explanation might be that it is compressing the data
>before putting it on the network and the compression algorithm isn't the
>most efficient in the world.
>
>

Hello... VEL=50 basically means half an engine! (I have no idea how many
engines you have.) Of course, this assumes other more important work
is getting done. But my point is that VEL=50 is very high for an IMP=4
workload (IMO). Are you sure other higher-importance work wasn't
meeting its goals? (What do the RMF reports or RMF Monitor III tell you?)
If it wasn't, then you probably shouldn't have seen NDM's DP higher
than the work that was missing goals. I say "probably"... not definitely,
because WLM won't make the DP higher if it doesn't think it will help
(other factors causing the delay).

>
>Has anyone else had this kind of a problem running Connect:Direct?
>What, if anything, did you do to control it?
>

There are lots of things you can do or try: lower the velocity, use Imp=5
or discretionary, or use a resource group with a MAX. Maybe the work that
had problems is also not classified correctly (as opposed to this work).

Mark
--
Mark Zelden
Sr. Software and Systems Architect - z/OS Team Lead
Zurich North America / Farmers Insurance Group - ZFUS G-ITO
mailto:mark....@zurichna.com
z/OS Systems Programming expert at http://expertanswercenter.techtarget.com/
Mark's MVS Utilities: http://home.flash.net/~mzelden/mvsutil.html

Craddock, Chris
May 2, 2008, 12:17:59 PM
> Hello... VEL=50 basically means half an engine! (I have no idea how many
> engines you have.)

Half an engine? No. It basically means that if you sample the work over
a period of time, 50% of the time that the work was eligible to be
dispatched, it actually was dispatched.

Arguably this is an extremely crude and ill-conceived way of defining a
performance goal, but it's the one we were given :-(

> But my point is VEL=50 is very high for an IMP=4
> workload (IMO).

Often true, but not necessarily. (playing devil's advocate :-)

CC
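
To make the sampling arithmetic concrete, here is a minimal sketch; the
function name and the sample counts are illustrative, not WLM internals.

# Velocity is a ratio of samples, so VEL=50 means "dispatched half the
# time the work was ready to run" -- independent of how many engines
# exist or how much CPU the work actually burned.
def execution_velocity(using_samples: int, delay_samples: int) -> float:
    return 100.0 * using_samples / (using_samples + delay_samples)

# Ready 1,000 times, dispatched on 500 of them -> velocity 50.
print(execution_velocity(using_samples=500, delay_samples=500))   # 50.0

That is why a task with low CPU demand can post a high velocity on a
lightly loaded box, and why VEL=50 is not the same thing as half an engine.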

Pinnacle
May 2, 2008, 12:39:03 PM
----- Original Message -----
From: "Craddock, Chris" <Chris.C...@CA.COM>
Newsgroups: bit.listserv.ibm-main
Sent: Friday, May 02, 2008 12:17 PM
Subject: Re: Connect:Direct (NDM) CPU Usage


>> Hello... VEL=50 basically means half an engine! (I have no idea how many
>> engines you have.)
>
> Half an engine? No. It means basically that if you sample the work over
> a period of time, that 50% of the time that the work was eligible to be
> dispatched it actually was dispatched.
>
> Arguably this is an extremely crude and ill-conceived way of defining a
> performance goal, but it's the one we were given :-(
>
>> But my point is VEL=50 is very high for an IMP=4
>> workload (IMO).
>
> Often true, but not necessarily. (playing devil's advocate :-)
>
> CC
>

With apologies to Crash, I gotta go with Z. Velocity 50 for an importance 4
workload is way high. I would also guess that nothing else was impacted, so
WLM just gave the CPU to NDM. It's not a problem unless you had other
higher-importance workload affected. Make sure you have all the APARs on
for WLM. If higher-importance workload was affected, then open a PMR.

Regards,
Tom Conley

Ted MacNEIL
May 2, 2008, 12:41:34 PM
> Hello... VEL=50 basically means half an engine! (I have no idea how many
> engines you have.)

Not even close if you have I/O included in the definition of velocity.
(A viable option since they removed disconnect time as a component for
calculating velocity, around OS/390 2.8, IIRC.)

-
Too busy driving to stop for gas!

Ulrich Krueger
May 2, 2008, 12:45:29 PM
Is it possible that NDM needs to be tuned (region size, internal config
parameters, etc.) to better handle large files?
Lots of CPU usage for an extended period of time is, to me, an indicator of a
problem with the application. Even the process of compressing a file should
not take lots of CPU for an extended period of time, as you indicated.
Which brings up a question: how big was the file? Have you tried compressing
it outside of NDM and then sending the compressed file as-is, without NDM
performing any further compression on it?
You might want to talk to the software vendor... are there any patches
addressing this issue that should be installed?

Regards,
Ulrich Krueger

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-...@BAMA.UA.EDU] On Behalf
Of Kelman, Tom

Tom Schmidt
May 2, 2008, 12:47:44 PM
On Fri, 2 May 2008 12:38:52 -0400, Pinnacle wrote:

>----- Original Message -----
>From: "Craddock, Chris" <Chris.C...@CA.COM>
>Sent: Friday, May 02, 2008 12:17 PM
>
>> Half an engine? No. It means basically that if you sample the work over
>> a period of time, that 50% of the time that the work was eligible to be
>> dispatched it actually was dispatched.
>>
>> Arguably this is an extremely crude and ill-conceived way of defining a
>> performance goal, but it's the one we were given :-(
>>
>>> But my point is VEL=50 is very high for an IMP=4
>>> workload (IMO).
>>
>> Often true, but not necessarily. (playing devil's advocate :-)
>

>With apologies to Crash, I gotta go with Z. Velocity 50 for an importance 4
>workload is way high. I would guess also that nothing else was impacted, so
>WLM just gave the CPU to NDM. It's not a problem unless you had other
>higher importance workload affected.


Well, I disagree with your "it's not a problem" statement, because the OP said:

"This caused us to hit our softcap and affected other tasks in the system."

The softcap issue could be resolved by Dave Thorn's suggestion of putting it
into a resource group with a max specification.

Other sites may not necessarily have the bandwidth to support NDM eating
half an engine; if the bottleneck is in the pipe then they won't necessarily see
ugly CPU consumption.

--
Tom Schmidt

Scott Ford
May 2, 2008, 1:39:33 PM
I disagree with the claim that compression doesn't take a lot of CPU
seconds; it does. I worked in XCOM 6.2 support at Legent. Trust me, we told
customers running large pipes (i.e., T1s, T3s, or channel extenders) not to
compress: it buys you nothing and just eats CPU seconds. We had about
3,000+ customers on MVS/VM/VSE.

Scott Ford
Senior Host Developer | Forging Enterprise Identity | IdentityForge.com
(Main) 678.266.3399 x304 | (Cell) 609.346.0399 | (Fax) 678.266.3399
scott...@identityforge.com


Pinnacle
May 2, 2008, 1:45:50 PM
>> WLM just gave the CPU to NDM. It's not a problem unless you had other
>> higher importance workload affected.
>
> Well, I disagree with your "it's not a problem" statement, because the OP said:
> "This caused us to hit our softcap and affected other tasks in the system."
>
> The softcap issue could be resolved by Dave Thorn's suggestion of putting it
> into a resource group with a max specification.

I should have been clearer by saying it's not a problem with WLM. If the
softcap is the problem, then I agree that you should use a resource group to
limit CPU. That will also cause NDM to run longer. We turned off
compression in our file transfer product (not NDM) because we found that the
transmit-time savings were negligible compared to the CPU used to
compress/decompress the data.

Regards,
Tom Conley
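
As a back-of-envelope check on that trade-off, the sketch below times a
compressor against the wire time it saves. zlib stands in for the product's
compressor, and the payload and link speeds are assumptions, not
Connect:Direct measurements.

# Compression pays only if the wire time saved exceeds the CPU time
# spent (and this ignores the receiver's decompression CPU entirely).
import time
import zlib

payload = b"ACCT0001 DEPOSIT  000123.45 " * 4096    # repetitive, compresses well

start = time.perf_counter()
squeezed = zlib.compress(payload, 1)                # level 1, per George's note upthread
cpu_seconds = time.perf_counter() - start

for mbit in (1.544, 100.0, 1000.0):                 # T1, fast Ethernet, gigabit
    wire_saved = (len(payload) - len(squeezed)) * 8 / (mbit * 1_000_000)
    verdict = "worth it" if wire_saved > cpu_seconds else "skip compression"
    print(f"{mbit:8.3f} Mb/s: saves {wire_saved:.4f}s wire vs {cpu_seconds:.4f}s CPU -> {verdict}")

On a slow link the wire savings dominate; as the pipe gets faster the CPU
cost stays fixed while the savings shrink toward nothing.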

Craddock, Chris
May 2, 2008, 1:53:15 PM
Tom said

> With apologies to Crash, I gotta go with Z. Velocity 50 for an importance
> 4 workload is way high. I would guess also that nothing else was impacted,
> so WLM just gave the CPU to NDM.

In theory an application that has low cpu demands can achieve a
relatively high velocity -unless- the higher importance work is using
all of the processor resource.

Now I agree that in reality it is more often the case that higher
importance work -does- use all of the processor resource and lower
importance work struggles to get anything at all, but it really does
depend on the demands the work is making on the system.

> It's not a problem unless you had other
> higher importance workload affected.

If higher importance work is getting hurt by lower importance work then
there's some sort of APAR-able defect at play.

CC

Patrick Falcone
May 2, 2008, 2:07:10 PM
Well, I like the resource group solution, but I have to wonder how well
STCLO normally does with Vel=50 at importance 4. I might think this is a
never-achieving service class, given everything that might be in STCLO, but
of course I could be way wrong too.

I'd put a resource group max on Connect:Direct, and also look at STCLO to
see if maybe it needs a tweak.

John S. Giltner, Jr.
May 2, 2008, 8:09:22 PM
Kelman, Tom wrote:
> We run Connect:Direct (used to be called NDM) from Sterling Commerce.
> Yesterday afternoon the started task, CDNDM, took 50-60% of an
> engine on our z9 BC for almost 2 hours while it transferred a large file.
> <SNIP>
>
> Has anyone else had this kind of a problem running Connect:Direct?
> What, if anything, did you do to control it?

What is the bandwidth between your site and the other site?
Which model z9 BC?

Depending on the model, I would make a wild guess that a link with enough
available bandwidth to keep a CPU on most z9 BCs 50-60% busy for 2 hours
would be fast enough not to need compression.

Does Connect:Direct have an option to encrypt the data on the fly?
Could that be turned on?
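
A back-of-envelope calculation supports that guess. The file size below is a
pure assumption, since the thread never gives one:

# How long a file ties up the link at various line rates. 50 GB is a
# made-up "large file"; GB -> megabits at 8,000 Mb per (decimal) GB.
FILE_GB = 50.0

for name, mbit in [("T1", 1.544), ("T3", 44.736),
                   ("100 Mb Ethernet", 100.0), ("Gigabit", 1000.0)]:
    hours = FILE_GB * 8000.0 / mbit / 3600.0
    print(f"{name:>16}: {hours:7.2f} hours")

On that assumption, anything much slower than a T3 could not even finish a
transfer that size in 2 hours, which is consistent with the link being fast
enough to make compression pointless.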

Hal Merritt
May 6, 2008, 4:30:26 PM
An early foray into file encryption showed this behavior, IIRC. Some
enciphering algorithms are said to be really heavy CPU hitters.

If you are encrypting, then you need to be careful with compression:
encrypted data is generally not compressible.


Ted MacNEIL
May 6, 2008, 4:35:28 PM
>If you are encrypting then you need to be careful with compression.
>Encrypted data is said to be generally not compressible.

Compress first.
Then encrypt.
If I recall correctly from my basic computer courses in the mid-1970s.

-
Too busy driving to stop for gas!
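
The reason for that order is easy to demonstrate. In the sketch below a toy
XOR keystream stands in for a real cipher (an assumption for illustration
only, not something to use for real transfers): ciphertext is statistically
random, so a compressor run after encryption finds nothing to squeeze.

# Compress-then-encrypt vs. encrypt-then-compress, with a toy cipher.
import hashlib
import os
import zlib

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    # XOR with a SHA-256 counter keystream -- illustrative only.
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

plaintext = b"some highly repetitive mainframe record " * 1000
key = os.urandom(32)

compress_then_encrypt = toy_encrypt(zlib.compress(plaintext), key)
encrypt_then_compress = zlib.compress(toy_encrypt(plaintext, key))

print(f"original:              {len(plaintext):>6} bytes")
print(f"compress then encrypt: {len(compress_then_encrypt):>6} bytes")  # small
print(f"encrypt then compress: {len(encrypt_then_compress):>6} bytes")  # no gain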


Scott Ford
May 6, 2008, 4:39:29 PM
Compressing across a large-bandwidth pipe isn't the best solution. Testing
bears out that very little is gained with compression across, say, a T1 or
bigger.


Scott Ford
Senior Host Developer | Forging Enterprise Identity | IdentityForge.com
(Main) 678.266.3399 x304 | (Cell) 609.346.0399 | (Fax) 678.266.3399
scott...@identityforge.com

Ted MacNEIL
May 6, 2008, 4:46:21 PM
>Compressing across a large-bandwidth pipe isn't the best solution. Testing bears out that very little is gained with compression across, say, a T1 or bigger.

I agree.
All I meant was if you are going to do both:
Compress
Then encrypt.

I haven't seen value in compression for years.

Hal Merritt
May 6, 2008, 5:08:31 PM
My test results differ. I've seen upwards of 60% over some very large
pipes. And I have seen completely different rates over supposedly equal
pipes.

Could be that network people talk only in terms of pipe size and not net
throughput (the actual elapsed time for a byte to travel from point A to point B).
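
That distinction is easy to quantify. With a fixed window, a transfer moves
at most one window per round trip, so added latency throttles it no matter
how fat the pipe is. The window and RTT figures below are illustrative
assumptions.

# Effective throughput is the lesser of the line rate and window/RTT.
def effective_mbps(link_mbps: float, window_bytes: int, rtt_ms: float) -> float:
    window_limit = window_bytes * 8 / (rtt_ms / 1000.0) / 1e6
    return min(link_mbps, window_limit)

for rtt in (5.0, 30.0, 80.0):   # few hops vs. many hops
    rate = effective_mbps(100.0, 64 * 1024, rtt)
    print(f"RTT {rtt:4.1f} ms: {rate:6.2f} Mb/s on a 100 Mb/s pipe")

Two "equal" 100 Mb/s pipes with different round-trip times will therefore
post very different transfer rates.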


Thompson, Steve
May 6, 2008, 6:04:30 PM

Hal Merritt wrote:

> Could be that network people talk only in terms of pipe size and not net
> throughput (the actual elapsed time for a byte to travel from point A to point B).

<SNIP>

The more hops, the more time is spent in transit.

Regards,
Steve Thompson

-- All opinions expressed by me are my own and may not necessarily
reflect those of my employer. --

Hal Merritt
May 7, 2008, 11:36:11 AM
I tried to explain that and they laughed me out of the room. Sigh.

Steve Thompson wrote:

> The more hops, the more time is spent in transit.


Dave Barry
May 21, 2008, 10:14:06 AM
Not to mention that NDM has multiple channels (subtasks) sending and receiving at any given time, but the velocity goal applies to the address space as a whole. That is the fundamental conundrum when using velocity goals.

We had the same problem of being occasionally overwhelmed by NDM on one system. However, our first shot at using a resource group involved setting a maximum so low that our file transfer people couldn't get their work done even when spare capacity was available.

The way we solved it was:

1. Create a service class "NDM" with importance 5, velocity 1.
2. Associate the NDM service class with a resource group of the same name that has a relatively low minimum capacity.
3. Assign the NDM started task to the NDM service class.

It works by ensuring an acceptable NDM workflow during peak hours, while allowing it to increase as much as possible during off hours without impacting production work. We haven't heard a peep out of either the application or support staff since. The technique worked so well, we went on to use it for another hard-to-manage started task, the NFS z/OS client.

db
