
SQLRPGLE performance problems

1,660 views

SanderP

Jul 12, 2007, 6:02:20 AM
All,

At the company I work for we're facing some performance problems with
our third-party-written RPG programs.
We are running an i520, P10 with 8 GB of memory on V5R3. We'll
go to V5R4 next Sunday, by the way.

All programs are written in SQLRPGLE over an SQL-created (DDL)
database.
Unfortunately, the programs do not use service programs, stored
procedures, etc.
All files are journaled.

After some investigation I found out that all programs are compiled
with the CLOSQLCSR(*ENDMOD) parameter.
I have read some documents about this parameter which say that
programs called many times within a job are better compiled with
CLOSQLCSR(*ENDACTGRP).

Let's say we have a SQLRPGLE program which only does an embedded SQL
SELECT statement to retrieve a customer's name.
At the end of all these programs the LR indicator is set on.
Will compiling this program with CLOSQLCSR(*ENDACTGRP) result in
faster performance?

Another funny thing is that the programmers decided to give the
cursors beautiful names like C00, C01, C02, etc. So program A and
program B, which it calls, can both contain cursors with the same name.

I also found out that all programs are compiled with the COMMIT(*CHG)
parameter, even when no UPDATEs or INSERTs are done in the program.
Is it better for these kinds of programs to be compiled with
COMMIT(*NONE) to achieve better performance?

Besides the above "problems", I know that there are cursors defined
over big files for which no correct index is available. So I think
some performance can be gained just by creating the correct indexes.

Hopefully some of you can give some answers to the above questions.
Thanks in advance,

Sander

walker.l2

Jul 12, 2007, 9:14:44 AM
If the third party still supports the programs, raise your issues with
them. (Just imagine how much trouble you would be in with your boss if
you had written that code yourself!) If the third party no longer
provides support, go ahead and create the correct indexes - that is
likely to improve things dramatically and should be fairly straight-
forward (recompiling / rewriting programs is something for the long
run).

Graybeard

Jul 12, 2007, 10:32:37 AM
No service pgms, no procedures, huh? Are you running in Activation
Groups, or the "Default Activation Group" ?

If you call a pgm and end with LR on, the pgm reloads on every call.
If you leave LR off, the next call uses the still-open pgm (variables
are not initialized, files are already open, etc). This can have a
major impact on performance. The effect will vary depending on the
answer to the activation group question.

As far as using SQL for everything: Keep in mind that SQL is designed
for returning SETS of data. A lookup of a customer name using SQL
will never be as fast as using CHAIN (of course, you need an indexed
file or LF for this).
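Roughly like this (file and field names are invented; just a sketch of
the two styles):

```
     // RLA: keyed read against CUSTMF (keyed on CMCUSTNO)
     chain (custNo) custmf;
     if %found(custmf);
        name = cmname;
     endif;

     // SQL: the same single-record lookup, but through the query optimizer
     exec sql SELECT CMNAME INTO :name
                FROM CUSTMF
               WHERE CMCUSTNO = :custNo;
```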

I agree, though, with walker: Because it's 3rd party software, I would
first build the indexes I think it needs. This is non-invasive to the
software and could get dramatic results. V5R4 should also improve
performance.

Also, check for a file QUSRSYS/QAQQINI (not in QSYS) and look for a
value for IGNORE_DERIVED_INDEX set to *YES. This cut some of my
larger SQL requests from 59 minutes to 5 minutes. Whether it helps
depends on whether you have derived indexes (such as LFs with SELECT/
OMIT criteria).
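If you don't have the file yet, the usual recipe (from memory - check
it on your release) is to copy the shipped template and then update the
option row:

```
/* Copy the shipped template into QUSRSYS */
CRTDUPOBJ OBJ(QAQQINI) FROMLIB(QSYS) OBJTYPE(*FILE) TOLIB(QUSRSYS) DATA(*YES)

/* Then set the option via SQL; QQPARM/QQVAL are the option/value columns */
UPDATE QUSRSYS/QAQQINI
   SET QQVAL = '*YES'
 WHERE QQPARM = 'IGNORE_DERIVED_INDEX'
```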

Jonathan Ball

Jul 12, 2007, 11:15:01 AM

This (indexes) will probably provide the biggest
performance improvement, by far.

COMMIT(*CHG) should have no impact on programs that
only do SELECT and FETCH.

You need to be careful about changing the CLOSQLCSR
parameter, because if the programs aren't doing
explicit closes on cursors, you could end up with the
programs attempting to open cursors that are already
open, which could cause problems.

walker.l2

Jul 12, 2007, 11:31:22 AM
> A lookup of a customer name using SQL will never be as fast as using CHAIN
>
I've seen that assertion several times before, and always from people
who know a lot more about RPG than I do, so I'm scarcely in a position
to judge whether it is true or not; but conceptually I don't
understand why SQL should be slower. What is the difference between
open LF then key-read LF, compared with open SQL index then select on
primary key? Why should one be much faster than the other?


Thad Rizzi

Jul 12, 2007, 1:40:17 PM

Correct indexing should help performance greatly. Run the programs in
debug and look at the job log. You should see some suggestions on the
access paths you need to create.
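Something like this (MYPGM is a placeholder; the exact message wording
varies by release):

```
STRDBG UPDPROD(*YES)   /* debug mode makes the optimizer log its decisions */
CALL PGM(MYPGM)        /* run the slow program                             */
DSPJOBLOG              /* look for "access path suggestion" type messages  */
ENDDBG
```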

HTH,

Thad Rizzi

Hau...@sss-software.de

Jul 12, 2007, 2:36:13 PM
Correct indexes are the most important thing with embedded SQL.

Commit(*CHG) is the default value and does not have any impact if no
inserts, updates, or deletes are executed.

When using CLOSQLCSR(*ENDMOD) no ODPs can be reused. Instead, each time
the module is called a full optimization must be performed, i.e.:
1. An access plan must either be built or at least validated; that
means all access paths built over the physical files (or SQL tables)
must be evaluated, the statistics must be consulted, and the optimizer
must decide which temporary objects must be created and used (for
example hash tables or relative record lists).
2. The data path must be opened (ODP); that means the temporary
objects must be built and filled with data.

When the module ends the ODP gets hard closed, i.e. all temporary
objects are deleted. Next time the full optimization must be
performed again.

With CLOSQLCSR(*ENDACTGRP) a full optimization is only performed
the first and second time the same SQL statement is executed in
the same activation group. After the second execution the ODP stays
open (if reusable). That means after the second call only the data in
the temporary objects must be refreshed (pseudo open); steps 1 and 2
listed above are not executed any more (as long as the activation
group is not closed).

When using CLOSQLCSR(*ENDACTGRP) and activation group *NEW the ODP
will be hard closed (i.e. deleted) at the end of the program, which is
not much better than doing a hard close at the end of the module.
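So, as a sketch (object and source names are invented; check the
precompiler parameters on your release), the recompile would look
something like:

```
CRTSQLRPGI OBJ(MYLIB/MYPGM) SRCFILE(MYLIB/QRPGLESRC) SRCMBR(MYPGM)
           CLOSQLCSR(*ENDACTGRP) COMMIT(*NONE)
```

And make sure the programs run in a named activation group rather than
*NEW, otherwise the hard close happens at program end anyway.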

Birgitta

DBDriver

Jul 13, 2007, 1:43:42 AM

"walker.l2" <walk...@ukonline.co.uk> wrote in message
news:1184254282....@o61g2000hsh.googlegroups.com...

>> A lookup of a customer name using SQL will never be as fast as using
>> CHAIN
>>
> I've seen that assertion several times before, and always from people
> who know a lot more about RPG than I do, so I'm scarcely in a position
> to judge whether it is true or not; but conceptually I don't know
> understand why SQL should be slower.
>

The performance difference probably reflects record access being a native
process on the iSeries, with hooks into the database because of the
system's heritage. But seriously, no other RDBMS suppliers push native
record access approaches anymore. SQL is the language. As a consequence
these vendors have all worked to optimise their products for full-time SQL
access. There are not many major programming languages left that are
not geared towards SQL and recordset processing (I'm thinking Java, .NET,
Ruby, PHP, etc).

Perhaps the lack of focus by programming teams that are predominantly RPG
based has let IBM off the hook on this system. Their DB2 UDB products fly
with SQL access, but to be honest I've never really looked at single record
access through SQL on an iSeries.

Rj.


Dieter Bender

Jul 12, 2007, 2:22:04 PM
Hi,

the cursor names have the scope of the module, and duplicate names within
the same scope don't compile.
CLOSQLCSR: we are talking about milliseconds.
Commit: we are talking about milliseconds, as long as you don't have
deadlocks; then we are talking about 60 seconds.
Missing indexes: we are talking about seconds to minutes.

Dieter Bender

walker.l2

Jul 13, 2007, 4:50:14 AM
> The performance difference probably reflects on record access being a native
> process on the Series I, having hooks into the database because of the
> system's heritage.

But on the current versions of OS/400, isn't SQL access equally native?

Dieter Bender

Jul 13, 2007, 7:14:58 AM
Hi,

that's marketing; don't believe any benchmark you didn't fake yourself.
A synchronous read would be faster by RLA compared to SQL; SQL is far better
in caching and prefetching (except physical sequential processing in RLA).

Dieter Bender

walker.l2

Jul 13, 2007, 8:54:28 AM
> thats marketing, don't believe in any benchmark you didn't fake by yourself.

Good advice. Our dataset probably isn't large enough (by an order of
magnitude or two) to show up any but the most blatant performance
differences, so I certainly wouldn't be prepared to put any money on
the question. :-)
Our choice (RLA or SQL) varies from program to program, and is based
on which method would be the simplest to write and maintain since the
performance of both is "good enough".

> a synchronous read would be faster by RLA compared to SQL

But why should that be the case? Doesn't it all end up with the same
MI in the end? (Note: I don't know whether it does or doesn't - this
level of detail is well beyond my comfort-zone.)

I can imagine sequential processing might be faster via RLA than SQL,
but I don't see why random access should be any different.

Dieter Bender

Jul 13, 2007, 10:39:55 AM
>> a synchronous read would be faster by RLA compared to SQL
>
> But why should that be the case? Doesn't it all end up with the same
> MI in the end? (Note: I don't know whether it does or doesn't - this
> level of detail is well beyond my comfort-zone.)
>
> I can imagine sequential processing might be faster via RLA than SQL,
> but I don't see why random access should be any different.

RLA transfers the data by format, SQL at field level.
RLA checks the data by format level id (level check); SQL is strongly typed.
RLA is tighter coupled to the data at compile time; SQL does some binding at
runtime.

RLA is limited to only two access methods (sequential or by keylist).
SQL has many features RLA doesn't have (different null value handling, BLOB
support...) and you don't get anything without paying for it.

To decide which of the two is faster, there are other, more important
aspects than the speed of a single s y n c h r o n o u s read operation. I
have designed the load process of a data warehouse with terabytes of data
and millions of transactions every night, and we use SQL exclusively for
all database operations; speed is a critical issue for this.

Dieter Bender


Jonathan Ball

Jul 13, 2007, 10:59:30 AM
walker.l2 wrote:
>> A lookup of a customer name using SQL will never be as fast as using CHAIN
>>
> I've seen that assertion several times before, and always from people
> who know a lot more about RPG than I do, so I'm scarcely in a position
> to judge whether it is true or not; but conceptually I don't know
> understand why SQL should be slower.

If it's even still true, and I don't know that it is,
it has nothing to do with RPG. I seem to recall from
other posts you've made that you're a COBOL guy, and
the same is, or was, true for COBOL.

walker.l2

Jul 13, 2007, 12:32:37 PM
> If it's even still true, and I don't know that it is,
> it has nothing to do with RPG. I seem to recall from
> other posts you've made that you're a COBOL guy, and
> the same is, or was, true for COBOL.
>
Yeah, I'm a COBOL and Java guy. You are saying that a COBOL read is no
different to an RPG chain? (I can believe that.) And that the issue is
not RPG versus SQL, but RLA versus SQL then? Dieter seemed to be
making that point too, and it makes sense. (I guess RPGers are just
used to saying RPG rather than RLA because other "legacy" language
code is very much a minority in AS/400-land.)

I do remember now that one other difference between RLA and SQL is
when field validation is carried out - on the write or on the read (I
can't remember offhand which method checks when, just that they do it
at different times) - so depending on whether your application is
heavily write-biased or heavily read-biased, one access method might
outperform the other even with random access.

DBDriver

Jul 12, 2007, 6:47:44 PM

"Dieter Bender" <dieter...@t-online.de> wrote in message
news:refkm4-...@eiffel.bender-dv.de...

>
> RLA transfer the data by Format, SQL at field level
> RLA checks the data by format level id (level check) SQL is strongly typed


See, here's where it gets interesting though. How many other RDBMSs even
support the notion of formats? From experience, formats tend to be just a
lazy means of mixing header and detail records in a single file for faster
sequential access in traditional processing. Very much not a normalised
approach to table structures.

>
> To decide who of both is faster, there are other more important aspects,
> than the speed of a single s y n c h r o n o u s read operation. I have
> designed the load process of a data warehous with Terrabyte of data and
> millions of transactions every night and we use sql exclusivly for all
> database operations, and speed is a critical issue for this.
>

And this code would tend to be fairly readable to most SQL programmers no
matter what their RDBMS heritage. This opens up all sorts of additional
benefits - for example, ease of training new staff (from a very large
experience pool) and even portability of the code (especially with a Java
or .NET frontend).

I'd say the real speed problem with single record access on the iSeries
normally comes down to the conversion or upgrading of traditional code that
has repetitive reads or updates. I've seen a couple of instances of loops
"upgraded" with SQL UPDATE statements that really take no advantage of
SQL's set approach to data updates.
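For instance (table and columns invented), a fetch-and-update loop
collapses into one set-based statement:

```
-- One statement instead of a row-at-a-time FETCH / UPDATE WHERE CURRENT OF loop
UPDATE ORDERS
   SET STATUS = 'SHIPPED'
 WHERE STATUS = 'OPEN'
   AND SHIPDATE IS NOT NULL
```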

Rj.


Jonathan Ball

Jul 14, 2007, 10:56:40 AM
walker.l2 wrote:
>> If it's even still true, and I don't know that it is,
>> it has nothing to do with RPG. I seem to recall from
>> other posts you've made that you're a COBOL guy, and
>> the same is, or was, true for COBOL.
>>
> Yeah, I'm a COBOL and Java guy. You are saying that a COBOL read is no
> different to an RPG chain?

They both get translated to MI in the process of
encapsulation, and since a COBOL READ KEY IS... and an
RPG CHAIN are doing fundamentally the same thing (and
so do a COBOL START and an RPG SETLL), I doubt IBM
wrote different bits of MI for each. I do both RPG and
COBOL, and over the years I've done little comparisons
between test code in both languages, and I see no
performance difference. The RPG-vs-COBOL battles faded
away into irrelevance a long time ago, but back in the
day, RPGers used to claim that RPG was the "native"
language for the System/38 and AS/400, which of course
was a load of malarkey. I think it was Paul Conte who
pointed out that MI was the true native language for
the platform.

Lou

Jul 15, 2007, 7:37:37 AM
I see a lot of theories and discussion. I do not see a description of
the performance problem being measured.

Sander, would you post the details of the performance problem(s) you
have measured?

SanderP

Jul 16, 2007, 4:54:37 AM
First of all, thanx for your answers, but I have to agree with Lou:
the discussion got a bit off topic.

Since yesterday we're on V5R4. This, together with last week's upgrade
from 4 GB to 8 GB of memory, resulted in quite a big speed-up of
the system.

After some more investigation I found out that the biggest performance
problem must be the lack of correct SQL indexes, which will be quite
easy to fix.

Some tests have also been done with the CLOSQLCSR(*ENDACTGRP)
parameter. This also resulted in somewhat faster performance, but we
have to be careful with this change, I think.

One more question concerning this CLOSQLCSR parameter: will it only
affect cursors defined in programs, or does it also affect programs
which only do "normal" SELECT statements over files, just to retrieve
data from one record?
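I mean singleton statements like this one (names invented):

```
exec sql SELECT CUSTNAME INTO :custName
           FROM CUSTOMER
          WHERE CUSTNO = :custNo;
```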

Thanx again


Thomas

Jul 17, 2007, 4:08:21 AM
Jonathan Ball wrote:

> The RPG-vs-COBOL
> battles faded away into irrelevance a long time ago, but back in the
> day, RPGers used to claim that RPG was the "native" language for the
> System 38 and AS/400, which of course was a load of malarkey. I think
> it was Paul Conte who pointed out that MI was the true native language
> for the platform.

Note that according to IBM, RPG was perhaps the _defining_ language of
the System/38. In the IBM System/38 Technical Developments manual (1978
ISBN 0-933186-00-2) that described the S/38 in a series of technical
papers, the article titled 'Introduction to IBM System/38
architecture', there is a topic 'System function' that begins:

"System/38 consists of a machine and three major IBM licensed
programs: Control Program Facility (CPF), RPG III, and Interactive
Data Base Utilities (IDU)."

Now, that's not at all the same as saying that RPG is the "native"
language; but it might help explain why some could make the leap to
such a statement.

Also note that MI hasn't seemed to be the 'true native language' under
OS/400 for quite a few years, at least since the ILE was brought to us.
AFAIK, the ILE compilers no longer translate to MI -- MI is for OPM.

Purely comments for discussion.

--
Tom Liotta
http://zap.to/tl400

Thomas

Jul 17, 2007, 4:11:52 AM
DBDriver wrote:

> "Dieter Bender" wrote in message

> news:refkm4-...@eiffel.bender-dv.de...
>> RLA transfer the data by Format, SQL at field level RLA checks
>> the data by format level id (level check) SQL is strongly typed
>>
> See here's where it gets interesting though. How many other RDBMS
> even support the notion of formats? From experience formats tend
> to be just a lazy means of mixing Header and Detail records in a
> single file for faster sequential access in traditional
> processing. Very much not a normalised approach to table
> structures.

??? I'm completely in the dark about how 'record formats' and
'mixing Header and Detail records in a single file' have any
relationship to each other, except that 'records' in externally
described files have formats.

A 'record format' seems to be a database object under OS/400 (and
i5/OS). They're created at a level apparently inside TIMI since, AFAIK,
there is no object type that's externalized for programmer use.
E.g., you can list objects of type *PGM or *FILE, but there is no
type [*RCDFMT] or whatever they might be called.

A given record format will be unique within each ASP. If a new file
is created that resolves to the same format, the single format is
shared.

I've seen no info on what causes a format object (or compound object
possibly) to be deleted. They might always exist or might be deleted
when no files reference them.

That's about as far as I can discuss format objects because there isn't
enough widely publicized info about them. I'd love to see more detail
from anyone who can supply it.

Dieter Bender

Jul 17, 2007, 5:44:25 AM
With my statement:

>>> RLA transfer the data by Format, SQL at field level RLA checks
>>> the data by format level id (level check) SQL is strongly typed

I tried to express that RLA checks the format level id of file and program
and, if it matches, the data is treated as binary and transferred to the
program. SQL checks field by field and transfers the data field by field.

In my understanding, DBDriver's statement:

>> From experience formats tend
>> to be just a lazy means of mixing Header and Detail records in a
>> single file for faster sequential access in traditional
>> processing. Very much not a normalised approach to table
>> structures.

is talking about tables with multiple formats. I don't know how this is
implemented in OS/400; in some cases formats seem to be no longer unique
(CREATE TABLE AS ... twice and looking at the format level ids, or was it
CREATE TABLE LIKE ..., or maybe a bug, I don't know?), but this was not in
scope of the discussion (why is a single synchronous read by RLA faster
than by SQL?).

Dieter

DBDriver

Jul 17, 2007, 5:55:33 AM

"Dieter Bender" <dieter...@t-online.de> wrote in message
news:pkfum4-...@eiffel.bender-dv.de...
> with my statement:

>
> is talking about tables with multiple formats, I don't know how this is
> implemented in os400, in some cases formats seem to be no longer unique
> (create table as .... twice and loooking to the format level ids, or was
> it
> create table like ..., or maybe a bug, I don't know?), but this was not in
> scope of the discussion (why is a single synchronous read by rla faster as
> by sql.
>

Sorry, I was speculating on the additional physical file complexity that
the DB2/400 environment has to cater for, which other SQL-reliant RDBMSs
don't typically need to worry about. The concept of record formats is
completely alien to other vendors, as can be seen most obviously in the
variance in the connection strings for, say, ADO, between DB2/400 and the
other databases.

Rj.


CRPence

Jul 26, 2007, 7:52:43 PM
Most of the following details are available directly from DMPOBJ of a
database *FILE object.
The "format" is an object of type *FMT. As an 'internal object
type', the format object is not exposed [externalized to a user] in a
library. As an object-based system, this allows the database to
encapsulate the definition of the columns rather than storing them
externally [with respect to the TABLE or VIEW] in the catalog; the *FILE
being a composite object. Having a format to define the columns enables
just the one sub-object to represent, without redundancy, multiple
composites. The format object is tracked in a database directory object
of type *DBDIR when more than one composite references that format.

Another RDBMS which does not support this self-describing
object-based nature would not need a concept of a 'format', as they can
just extract the list of fields that define the layout [aka format] of
the data. The i5/OS also supports effectively that other way to define
the layout, using the data dictionary object as is done for [linked]
S/36 files even on the S/36 -- the *DTADCT for SQL in pre-v3r1m0 was
effectively redundant [now deprecated], only used prior to the existence
of active tracking of columns to the system-wide dictionary [file
QADBIFLD].
The DB2 for i5/OS SQL only allows a single format [aka layout] for a
TABLE or a VIEW, so the reference to combining header and detail data is
not applicable there.

When a database file is deleted and its format is either not tracked
to a directory or otherwise would be the last file to 'own' that format,
the format is destroyed.

Regards, Chuck
--
All comments provided "as is" with no warranties of any kind
whatsoever and may not represent positions, strategies, nor views of my
employer

Thomas

Jul 29, 2007, 2:32:59 AM
CRPence wrote:

> Most of the following details are available directly from DMPOBJ of a
> database *FILE object.

'Ask and ye shall receive.'

Thanks, Chuck. It's good to read significantly authoritative details
once in a while. Maybe if I retire someday, I'll find the time to
decompose *FILE objects in detail. For now, this kind of stuff is plenty.
