I'm trying to get greater speeds from our VMS application running on Itanium. I've had moderate success going from using the lock manager (sys$enqw) to using spinning bitlocks, but feel there is more to be gained in other areas too. However, PCA does not have system service analysis implemented on Itanium and when I use PCA to analyse processor time in functions, 90% is spent in "SYSTEM$SPACE", which doesn't tell me a great deal.
So, does anyone have any recommendations/suggestions on application performance analysis tools that work on Itanium, please? I'm using VMS v8.3 (ia64) with PCA v4.9.
pin...@gmail.com wrote: > I'm trying to get greater speeds from our VMS application running on > Itanium. I've had moderate success going from using the lock manager > (sys$enqw) to using spinning bitlocks, but feel there is more to be > gained in other areas too. However, PCA does not have system service > analysis implemented on Itanium and when I use PCA to analyse > processor time in functions, 90% is spent in "SYSTEM$SPACE", which > doesn't tell me a great deal.
> So, does anyone have any recommendations/suggestions on application > performance analysis tools that work on Itanium, please? I'm using VMS > v8.3 (ia64) with PCA v4.9.
Since you didn't mention what you've already done, I'll mention the top three things to look for on I64 to improve performance, for you and others.
Look at MONITOR ALIGN. If it is not showing 0 (or close to 0), you can have PCA collect alignment fault data for your image (assuming it is YOUR image that is causing the faults). We might have to look for other faulting processes, however.
If MONITOR ALIGN says you are clean, then you can do some PC sampling inside of SDA (start at SDA> PCS for quick help).
-- John Reagan OpenVMS Pascal/Macro-32/COBOL Project Leader Hewlett-Packard Company
On Apr 26, 5:01 am, "pin...@gmail.com" <pin...@gmail.com> wrote:
> I'm trying to get greater speeds from our VMS application running on > Itanium. I've had moderate success going from using the lock manager > (sys$enqw) to using spinning bitlocks, but feel there is more to be > gained in other areas too. However, PCA does not have system service > analysis implemented on Itanium and when I use PCA to analyse > processor time in functions, 90% is spent in "SYSTEM$SPACE", which > doesn't tell me a great deal.
> So, does anyone have any recommendations/suggestions on application > performance analysis tools that work on Itanium, please? I'm using VMS > v8.3 (ia64) with PCA v4.9.
How about using SDA extensions?
$ MCR SDA *
SDA> PRF LOAD
Others may be of interest -- IO, SPL, LCK, etc., depending on what exactly you are looking for.
On Apr 26, 5:01 am, "pin...@gmail.com" <pin...@gmail.com> wrote:
> I'm trying to get greater speeds from our VMS application running on > Itanium. I've had moderate success going from using the lock manager > (sys$enqw) to using spinning bitlocks, but feel there is more to be > gained in other areas too. However, PCA does not have system service > analysis implemented on Itanium and when I use PCA to analyse > processor time in functions, 90% is spent in "SYSTEM$SPACE", which > doesn't tell me a great deal.
> So, does anyone have any recommendations/suggestions on application > performance analysis tools that work on Itanium, please? I'm using VMS > v8.3 (ia64) with PCA v4.9.
Sorry, may be a duplicate...
How about using SDA extensions?
$ MCR SDA *
SDA> PRF LOAD
Other extensions may be of interest -- IO, SPL, LCK, LNM, etc., depending on what you are tracking or what your application is using.
>> I'm trying to get greater speeds from our VMS application running on >> Itanium. I've had moderate success going from using the lock manager >> (sys$enqw) to using spinning bitlocks, but feel there is more to be >> gained in other areas too. However, PCA does not have system service >> analysis implemented on Itanium and when I use PCA to analyse >> processor time in functions, 90% is spent in "SYSTEM$SPACE", which >> doesn't tell me a great deal.
>> So, does anyone have any recommendations/suggestions on application >> performance analysis tools that work on Itanium, please? I'm using VMS >> v8.3 (ia64) with PCA v4.9.
> Since you didn't mention what you've already done, I'll mention the > top three things to look for on I64 to improve performance, for you > and others.
> Look at MONITOR ALIGN, if it is not showing 0 (or close to 0), then you > can have PCA collect alignment fault data for your image (assuming it is > YOUR image that is causing the faults). We might have to look for other > faulting processes however.
> If MONITOR ALIGN says you are clean, then you can do some PC sampling > inside of SDA (start at SDA> PCS for quick help).
I just ran a user mode (only) application (WASD) in several different activities (main image only - no other process(es) involved), exercised from another system using Apache Bench. Kernel mode faults were consistently higher than user mode. Super and exec modes were consistently zero (even though RMS calls featured in at least some of the activity).
What can be inferred from such a snapshot?
Why is there no MONITOR ALIGN on Alpha, where such issues have been emphasized from the beginning?
TIA.
-- I wish one and all long and happy lives, no matter what may become of them afterwards. Use sunscreen! Don't smoke cigarettes. Cigars, however, are good for you ... Firearms are also good for you. Gunpowder has zero fat and zero cholesterol. That goes for dumdums, too. [Kurt Vonnegut; God Bless You, Dr Kevorkian]
> I just ran a user mode (only) application (WASD) in several different > activities (main image only - no other process(es) involved), exercised > from another system using Apache Bench. Kernel mode faults were > consistently higher than user mode. Super and exec modes consistently > zero (even though RMS calls featured in at least some of the activity).
> What can be inferred from such a snapshot?
> Why not MONITOR ALIGN on Alpha where such issues have been emphasized > from the beginning?
> TIA.
1) Some application is generating alignment faults.
2) Some piece of kernel code is also generating alignment faults. Could be OpenVMS itself (I would call that a bug that we need to fix once we identify the guilty party); could be some piece of your application if you have kernel code.
3) On Alpha, alignment faults are fixed up quickly by the PAL code. Since the PAL code understands the page table entries and has direct access to the machine below OpenVMS, it can quickly fix up the alignment fault and make sure that no other CPU is in the middle of deleting the address space at the same time. On I64, however, all that happens is that the chip interrupts OpenVMS. The OS has to go take some spinlocks, etc. to prevent other CPUs from playing with the memory in question while the alignment fault is fixed up. On Alpha, an alignment fault might cost 100 instructions, give or take. On I64, it could be 10,000 to 15,000 instructions, give or take (my SWAG, not really measured with any level of confidence).
So you can do things like:
1) Use the SDA FLT extension to figure out which process/image is faulting, along with PC values. Using those, plus the .MAP and .LIS files, you can go back to the source.
2) PCA will collect fault data and plot it for you.
3) The debugger lets you say SET BREAK/UNALIGNED so it will stop at alignment faults.
4) You can call things like SYS$PERM_REPORT_ALIGN_FAULT which will generate messages to SYS$OUTPUT for alignment faults. It is process-wide and will survive image rundown. You have to call SYS$PERM_DIS_ALIGN_FAULT_REPORT (don't get me started on the confusing naming scheme) to turn them off.
-- John Reagan OpenVMS Pascal/Macro-32/COBOL Project Leader Hewlett-Packard Company
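[Editorial illustration, not part of the original posts.] The PC values that FLT or SET BREAK/UNALIGNED report usually lead back to a source line that dereferences a multi-byte type at an address that is not a multiple of its size. A minimal C sketch of that pattern and one common way around it follows. The helper names (is_aligned, load_u32, store_u32) are made up for this example, and it assumes a compiler that provides the C99 <stdint.h> header.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if p sits on an `align`-byte boundary.
   `align` is assumed to be a power of two. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* The pattern that tends to fault is a cast such as
       uint32_t v = *(uint32_t *)(buf + 1);
   which asks for a 4-byte load from an odd address.
   Copying the bytes instead lets the compiler pick a
   sequence that is safe for any address. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store counterpart of load_u32: write 4 bytes to any address. */
void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

Whether the copy is worth it depends on how hot the path is. Where the data layout is under your control, moving the field to a naturally aligned offset avoids both the fault and the copy.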
On Apr 26, 4:01 am, "pin...@gmail.com" <pin...@gmail.com> wrote:
> I'm trying to get greater speeds from our VMS application running on > Itanium. I've had moderate success going from using the lock manager > (sys$enqw) to using spinning bitlocks, but feel there is more to be > gained in other areas too. However, PCA does not have system service > analysis implemented on Itanium and when I use PCA to analyse > processor time in functions, 90% is spent in "SYSTEM$SPACE", which > doesn't tell me a great deal.
> So, does anyone have any recommendations/suggestions on application > performance analysis tools that work on Itanium, please? I'm using VMS > v8.3 (ia64) with PCA v4.9.
I concur with John and the others: alignment faults carry a BIG penalty on IA64 and are prime candidates for removal.
>> I just ran a user mode (only) application (WASD) in several different >> activities (main image only - no other process(es) involved), >> exercised from another system using Apache Bench. Kernel mode faults >> were consistently higher than user mode. Super and exec modes >> consistently zero (even though RMS calls featured in at least some of >> the activity).
>> What can be inferred from such a snapshot?
>> Why not MONITOR ALIGN on Alpha where such issues have been emphasized >> from the beginning?
>> TIA.
> 1) Some application is generating alignment faults.
> 2) Some piece of kernel code is also generating alignment faults. Could > be OpenVMS itself (I would call that a bug that we need to fix once we > identify the guilty party); could be some piece of your application if > you have kernel code.
No elevated modes, all user.
> 3) On Alpha, alignment faults are fixed up quickly by the PAL code. > Since the PAL code understands the page table entries and has direct > access to the machine below OpenVMS, it can quickly fix up the alignment > fault and make sure that no other CPU is in the middle of deleting the > address space at the same time. On I64 however, all that happens is > that the chip interrupts OpenVMS. The OS has to go take some > spinlocks, etc. to prevent other CPUs from playing with the memory in > question while the alignment fault is fixed up. On Alpha, an alignment > fault might cost 100 instructions give or take. On I64, it could be > 10,000 to 15,000 instructions give or take (my SWAG, not really measured > with any level of confidence).
Orders of magnitude at any rate.
> So you can do things like:
> 1) Use the SDA FLT extension to figure out which process/image is > faulting along with PC values. Using those, plus the .MAP and .LIS > files, you can go back to the source.
Application mainly doing network and file I/O, along with some internal processing. Approx two minutes duration.
SDA> LOAD FLT
SDA> FLT START TRACE
[do some processing]
SDA> FLT STOP TRACE
SDA> FLT SHOW TRACE /SUMM
Can this be understood in general terms (without needing to be an internals specialist)?
> 2) PCA will collect fault data and plot it for you.
> 3) The debugger lets you say SET BREAK/UNALIGNED so it will stop at > alignment faults.
> 4) You can call things like SYS$PERM_REPORT_ALIGN_FAULT which will > generate messages to SYS$OUTPUT for alignment faults. It is > process-wide and will survive image rundown. You have to call > SYS$PERM_DIS_ALIGN_FAULT_REPORT (don't get me started on the confusing > naming scheme) to turn them off.
Thanks for the useful explanations.
-- So it goes. [Kurt Vonnegut; Slaughterhouse-Five]
> pin...@gmail.com wrote: > > I'm trying to get greater speeds from our VMS application running on > > Itanium. I've had moderate success going from using the lock manager > > (sys$enqw) to using spinning bitlocks, but feel there is more to be > > gained in other areas too. However, PCA does not have system service > > analysis implemented on Itanium and when I use PCA to analyse > > processor time in functions, 90% is spent in "SYSTEM$SPACE", which > > doesn't tell me a great deal.
> > So, does anyone have any recommendations/suggestions on application > > performance analysis tools that work on Itanium, please? I'm using VMS > > v8.3 (ia64) with PCA v4.9.
> Since you didn't mention what you've already done, I'll mention the > top three things to look for on I64 to improve performance, for you > and others.
And I think #4 is exception handling, no? If you do a lot of setjmp() / longjmp(), it can make a difference to compile C code with /DEFINE=__FAST_SETJMP. However, I don't know how common it is to be limited by that. I built Perl this way, thinking it might make a difference because every Perl opcode is a setjmp() / longjmp() sequence, but it only made a difference of a few seconds in an hour-long test (in other words, in the noise). In a raw (but artificial) test consisting of nothing but setjmp() calls, however, it was about 10,000 times faster to use __FAST_SETJMP. It saved something like 50,000 nanoseconds per call, but 50,000 nanoseconds still isn't very long. This on td183.testdrive.hp.com.
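[Editorial illustration, not part of the original posts.] A rough way to see whether setjmp()/longjmp() cost matters on a given system is to time a tight loop of round trips, building the same source once with and once without /DEFINE=__FAST_SETJMP. The sketch below is not the test described above (that one used only setjmp() calls). The function names are invented here, and clock() resolution is assumed to be good enough for a large iteration count.

```c
#include <setjmp.h>
#include <time.h>

static jmp_buf env;

/* Perform n setjmp()/longjmp() round trips and return how many completed.
   `done` is volatile because it is modified between setjmp() and longjmp(). */
long round_trips(long n)
{
    volatile long done = 0;

    if (setjmp(env) != 0) {
        done++;              /* we arrived here via longjmp() */
    }
    if (done < n) {
        longjmp(env, 1);     /* jump back to the setjmp() above */
    }
    return done;
}

/* Average CPU seconds per round trip, measured with clock(). */
double seconds_per_round_trip(long n)
{
    clock_t start = clock();
    long done = round_trips(n);
    clock_t stop = clock();

    if (done <= 0) {
        return 0.0;
    }
    return ((double)(stop - start) / CLOCKS_PER_SEC) / (double)done;
}
```

Calling seconds_per_round_trip() with a few million iterations from a small main program in each build should make any per-call difference visible. Whether that difference matters for a real application is a separate question, as the Perl experience above suggests.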
Beware that alignment faults showing up as kernel mode faults may very well be caused by user mode code passing unaligned parameters to system services.
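[Editorial illustration, not part of the original posts.] One way user code can avoid handing an unaligned address to a service is to let the service write into a naturally aligned local and copy the result into the packed record afterwards. In the sketch below, fake_service is a hypothetical stand-in (not a real OpenVMS routine), and the record layout is invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a system service that returns a
   32-bit status through a pointer. Not a real OpenVMS routine. */
void fake_service(uint32_t *status_out)
{
    *status_out = 1;
}

/* Invented record layout: a 1-byte tag followed by a 4-byte
   status, so the status field starts at an odd offset. */
enum { STATUS_OFFSET = 1, RECORD_SIZE = 5 };

/* Passing (uint32_t *)(record + STATUS_OFFSET) would hand the
   service an unaligned address. Instead, give it an aligned
   local and copy the bytes into the record afterwards. */
void fill_status(unsigned char record[RECORD_SIZE])
{
    uint32_t tmp;                 /* naturally aligned local */

    fake_service(&tmp);           /* service only ever sees an aligned address */
    memcpy(record + STATUS_OFFSET, &tmp, sizeof tmp);
}
```

The same idea applies to any argument the service reads or writes by reference. The extra copy happens in user mode and is typically far cheaper than a fault taken inside the service.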
In article <1331dhi6tv1e...@corp.supernews.com>, Mark Daniel <mark.dan...@vsm.com.au> writes:
> What can be inferred from such a snapshot?
If you have sufficient RAM, then I'd raise the system working set a little to see if the kernel faults go away. On highly restricted RAM systems I used to lower the system working set until those faults just barely started to happen.
There is no such thing as a user-mode only program. You can't load a program from the disk without using the kernel to catch the page faults and do the disk I/O. (There's no separate program loader in VMS. The OS just maps the pages and calls the entry point, and the process will probably experience a hard page fault during the call instruction.)
> I just ran a user mode (only) application (WASD) in several different > activities (main image only - no other process(es) involved), exercised > from another system using Apache Bench.
There's no such thing as a user mode only application. You can't even load an application into RAM without using kernel services. And I would suspect WASD does a good bit of I/O, which requires the kernel at some point.
On Apr 27, 9:00 am, koeh...@eisner.nospam.encompasserve.org (Bob Koehler) wrote:
> In article <1331dhi6tv1e...@corp.supernews.com>, Mark Daniel <mark.dan...@vsm.com.au> writes:
> > What can be inferred from such a snapshot?
> If you have sufficient RAM, then I'd raise the system working set > a little to see if the kernel faults go away.
BZZZZ....
Mark cut the MONI ALIGN screen a little short. The header would have read: "ALIGNMENT FAULT STATISTICS". Those faults are NOT page faults, but alignment faults... and a lot of them! Increasing memory will have zero effect on this, as you now surely realize.
In article <1177685406.887896.207...@t38g2000prd.googlegroups.com>, Hein RMS van den Heuvel <heinvandenheu...@gmail.com> writes:
> On Apr 27, 9:00 am, koeh...@eisner.nospam.encompasserve.org (Bob > Koehler) wrote: >> In article <1331dhi6tv1e...@corp.supernews.com>, Mark Daniel <mark.dan...@vsm.com.au> writes:
>> > What can be inferred from such a snapshot?
>> If you have sufficient RAM, then I'd raise the system working set >> a little to see if the kernel faults go away.
> BZZZZ....
Yeah, I realized that right after I posted and canceled the post, but I suppose it got to some servers anyhow.
The exception PC symbolization comes out as SDA$SHARE+xxxxxxx since you are in SDA.
Look at the SHOW TRACE without the /SUMMARY. Find one of the lines with 147CB0 and find the process index. Now do a SET PROC/INDEX with that number and repeat the SHOW TRACE/SUMMARY. The symbolization will be more informative.
To figure out the image that the process is running, you can do things like:
SHOW PROC/CHANNEL
SHOW PROC/IMAGE
Once you have the name of the image that the process is running, the symbolization, plus the .MAP and .LIS files, should eventually point you to the right place.
-- John Reagan OpenVMS Pascal/Macro-32/COBOL Project Leader Hewlett-Packard Company
> Beware that alignment faults showing up as kernel mode faults may very > well be caused by user mode code by passing unaligned parameters to > system services.
> Jur.
Ah yes. Forgot about that.
-- John Reagan OpenVMS Pascal/Macro-32/COBOL Project Leader Hewlett-Packard Company