
Enscribe Files


Jay

Oct 2, 2001, 12:23:12 AM

Hi All,

Does anyone know how to boost the performance of writes to an Enscribe
(key-sequenced) file? Currently we have partitioned the database and
tried to perform updates by bundling about 500 in one Begin-end
transaction. Someone mentioned bulk I/O or fast I/O to do the writes;
maybe that will speed up performance.

Can anyone shed more light on bulk I/O? Are there any other suggestions
people have for trying to increase transaction rates?

Thanks,
Jay

Steve Boettger

Oct 2, 2001, 7:44:21 AM

"Jay" <jay_a...@hotmail.com> wrote in message
news:727da039.01100...@posting.google.com...

Jay -
Have you searched 'TIM', in particular the 'Enscribe Programmers Manual',
chapters 4 & 6? Here is some information from Chapter 4 of that manual -

With the bulk transfer feature disabled, you can transfer from 0 to 4096
bytes in a single operation. With the bulk transfer feature enabled, you
can transfer up to 30K bytes in a single operation. The amount of data
transferred must be a multiple of 2K bytes. Note that with the bulk
transfer feature enabled, the only data transfer I/O operations that are
allowed are READX, READUPDATEX, WRITEX, and WRITEUPDATEX.
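A quick way to sanity-check a transfer size against those rules (a sketch in Python; the limits come from the manual text above, and the helper name is made up):

```python
def valid_bulk_transfer(nbytes: int, bulk_enabled: bool) -> bool:
    """Check a transfer size against the Enscribe bulk-transfer rules quoted above."""
    if not bulk_enabled:
        # bulk transfer disabled: 0 to 4096 bytes per operation
        return 0 <= nbytes <= 4096
    # bulk transfer enabled: up to 30K bytes, in multiples of 2K
    return 0 < nbytes <= 30 * 1024 and nbytes % 2048 == 0
```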

Additionally, are you using larger data blocks? Larger blocks help
prevent data-block splits, improve disk usage, reduce the number of
index records, and allow larger data transfers, all of which improves
performance. Of course, you will have to take into consideration
whether the file is audited or non-audited, and whether it is buffered.
Again, 'TIM' has some very good information and examples.

Good luck,
Steve



Dave Bossi

Oct 2, 2001, 9:54:05 AM

Jay -

Check out the Enscribe Programming Guide for info on bulk I/O. Here's
the pertinent stuff:

... the SETMODE 141 call requires that the file be opened for
unstructured access (bit 2 of open^flags = 1) and for exclusive access
(bits 10 and 11 of open^flags = 2). With the bulk transfer feature
disabled, you can transfer from 0 to 4096 bytes in a single operation.
With the bulk transfer feature enabled, you can transfer up to 30K
bytes in a single operation. The amount of data transferred must be a
multiple of 2K bytes. Note that with the bulk transfer feature
enabled, the only data transfer I/O operations that are allowed
are READX, READUPDATEX, WRITEX, and WRITEUPDATEX.

Doesn't sound too helpful because it's unstructured.

FAST-IO is also available - see the appropriate language manual
(COBOL85 or whatever) for how to enable this feature. The main caveat
is that output is cached, so you must make sure nobody is reading the
data while it's being written; otherwise a reader can get old data.

Also, make sure you have parallel writes configured on your drives.
Any chance of partitioning the application itself to spread the load
and keep the data from traveling too far to get to disk?

Another thought is alternate-key updates. Are you writing any (or
maybe too many), and to what drives?

Dave


jay_a...@hotmail.com (Jay) wrote in message news:<727da039.01100...@posting.google.com>...

Joel Shepherd

Oct 2, 2001, 12:07:02 PM

Checking to see if parallel writes are disabled was my first thought.

A client did a two-week stint at the Benchmark Center earlier this
year. While researching why their write performance was so poor, they
discovered that serial writes are the Gxx default for mirrored
volumes.
After enabling parallel writes, they reported a 30-40% improvement
in write performance. It depends on what else is hammering the
volume at the time; your mileage may vary.


SCF INFO DISK $DATA1, CONFIG

[snip]

*SerialWrites.......................... ENABLED

In article <3dcc870e.01100...@posting.google.com>, dbo...@usa.com
says...

Gilbert Mainz

Oct 2, 2001, 2:14:13 PM

If you have input data, allocate it to another disk. That disk should
preferably run on a primary CPU different from the primary CPU of the
target disk. The same applies to all sorts of system-related load
balancing.

If your application is doing a lot of internal processing (especially
referencing internal tables or arrays of data) before the write, the
bad performance can also originate from the compiled code. You can
optimise the compiled code with the AXCEL (Accelerator) program.
Easiest usage: AXCEL <compiled code>, <compiled code>
N.B.: You can of course AXCEL to another object file.

> Also, make sure you have parrallelwrite configured on your drives.
> Any chance of partitioning the application itself to spread the load
> and keep the data from traveling too far to get to disk?

Just to add to this: Check the load of the drive. If other
applications are heavily using the drive, you will obviously not get
all available resources to your application.

Another thing to try system-wise is the cache settings of the drive. I
highly recommend reading about disk cache settings before changing
anything there. As far as I remember from the old D-series versions,
you can cause a CPU crash by allocating more cache than the available
memory on the CPU running the disk process.

Gilbert

Bill Honaker

Oct 2, 2001, 4:22:50 PM

Jay,

You don't mention much about either your platform (model, # CPUs, #
disks) or the application, so this may be off the wall.

Based loosely on what you have tried so far, it appears that this is a
bulk loading application (as opposed to a high volume OLTP one). You
mentioned partitioning the database, but that doesn't help much if you
only run one copy of the application. In cases where the source data
allows it, we have had good luck in the past running multiple copies
of the loading application, each one targeting a different partition.
That was on SQL/MP instead of Enscribe, so your results may not be as
good. (All Enscribe inserts must pass through the disk process of
the primary partition, which is different from SQL inserts.)

More input on what you're running on and how successful you have been
to date could help in getting more valuable responses.

Bill Honaker
Chief Technology Officer
XID Software, Inc.

Doug Miller

Oct 2, 2001, 9:27:51 PM

In article <3dcc870e.01100...@posting.google.com>, dbo...@usa.com (Dave Bossi) wrote:
>Jay -
>
>Check out the Enscribe Programming Guide for info on bulk-io. Here's
>the pertinant stuff:
>
[snip]

>
>Doesn't sound too helpful because it's unstructured.

Exactly right -- he's trying to write a key-sequenced file. If he's going to
do that using bulk I/O, then he's responsible for building and updating bitmap
and index blocks, and the block header and internal structure of the data
blocks. Not especially practical.

>
>FAST-IO is available - see your appropriate language manual like
>COBOL85 or whatever to enable this feature. The main issue here is
>that nobody is reading the data while it's being written because it is
>cached on output and can cause someone to get old data.
>

Cache is the first thing I'd look at, actually -- to make sure the disk volume
in question has enough cache configured to keep the *index* blocks for this
file in cache all (or most) of the time.

FAST IO is of use only on sequential I/O operations. For random reads or
writes, there is no benefit.

>Also, make sure you have parrallelwrite configured on your drives.
>Any chance of partitioning the application itself to spread the load
>and keep the data from traveling too far to get to disk?
>

Partitioning also can reduce the index levels of the file, providing an
additional performance boost.

>Another thought is alternate key updates. Are you writing any (or
>maybe too many) and to what drives.
>

Good point -- in an application I once worked on, the file with the heaviest
write activity had eleven alternate-key files, and updating those files was
actually the single biggest performance bottleneck in the entire app.

Doug Miller

Oct 2, 2001, 9:32:18 PM

In article <mh9krt8lqh2qno02b...@4ax.com>, Bill Honaker <nospam_...@xidsoftware.com> wrote:
>Jay,
>
>You don't mention much about either your platform (model, # CPUs, #
>disks) or the application, so this may be off the wall.
>
>Based loosely on what you have tried so far, it appears that this is a
>bulk loading application (as opposed to a high volume OLTP one).

Which brings up another idea. If in fact this is a bulk loading application,
doing random inserts, HUGE performance gains can be achieved by sorting the
data first (preferably using parallel FASTSORT running in all CPUs). Since the
data will be inserted in sequence, FAST I-O can be enabled -- and block splits
will be eliminated.
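The effect Doug describes is easy to demonstrate with a toy model (a Python sketch; it simulates generic key-sequenced data blocks, not real disk-process behavior - the block capacity and split policy are simplified assumptions):

```python
import bisect
import random

def random_insert_load(keys, cap=4):
    """Insert keys in arrival order into sorted blocks, splitting full blocks."""
    blocks, splits = [[]], 0
    for k in keys:
        firsts = [b[0] for b in blocks if b]
        i = max(bisect.bisect_right(firsts, k) - 1, 0)  # block whose range covers k
        bisect.insort(blocks[i], k)
        if len(blocks[i]) > cap:                        # block split: two half-full blocks
            mid = len(blocks[i]) // 2
            blocks[i], tail = blocks[i][:mid], blocks[i][mid:]
            blocks.insert(i + 1, tail)
            splits += 1
    return blocks, splits

def sorted_load(keys, cap=4):
    """Sort first, then pack blocks completely full: no splits at all."""
    ks = sorted(keys)
    return [ks[i:i + cap] for i in range(0, len(ks), cap)], 0

random.seed(1)
keys = random.sample(range(100_000), 2_000)
rand_blocks, rand_splits = random_insert_load(keys)
seq_blocks, seq_splits = sorted_load(keys)
print(f"random order: {rand_splits} splits, {len(rand_blocks)} blocks")
print(f"sorted load:  {seq_splits} splits, {len(seq_blocks)} blocks")
```

Sorted loading produces zero splits and fully packed blocks; random insertion splits constantly and leaves blocks partly empty.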

Jay

Oct 4, 2001, 12:29:58 AM

Thank you all for your input. Let me explain a bit more about my
program.
We currently run on a ServerNet S74000, G06.11 OS with 4G memory.
The application is OLTP, not batch, and we do have multiple copies of
the process running in each CPU. The file is a key-sequenced file with
one alternate-key file that has two alternate keys. The record size is
1.5K and we are using 4K blocks to help in caching.

The high-level overview goes like this:

We have a process A which receives messages from $RECEIVE, writes them
to the new messagedb (partitioned across multiple physical drives),
and forwards each message to process B to do some data manipulation.
Once the manipulation is done the message is sent to process C, which
moves the record from the messagedb to another db which is identical
in its setup.

To help performance, the cache was increased and we increased the
number of disk processes. From what I read in the manuals it seems
that if I use FILE_OPEN_ I get bulk I/O, or I could issue a SETMODE
141. I am not clear whether this helps me because I have an alternate
key. I have not tried the parallel writes yet (hopefully this will
help), but are there any other suggestions?

I wrote a small program to see if converting to queue files instead
would help, but it actually looked like key-sequenced went faster??

Edwin Earl Ross

Oct 4, 2001, 1:23:40 AM

Updating 500 records in one transaction locks many records, and may delay
I/O done by other processes. In this situation, performance can be
data-sensitive. For example, updating 500 consecutive records in each
process will tend to minimize interference, especially if each process is
updating records in different partitions. On the other hand, updating
random records in each process may result in overall performance
degradation due to record or file locking. If possible, test performance
with bundles of 1, 5, 25, 100, and 500 records, and choose an optimal
bundle size based on the results.
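TMF itself can't be scripted here, but the shape of that experiment can be sketched with SQLite standing in for the audited file (a Python sketch; the record layout and bundle sizes are illustrative only, and absolute timings will not resemble a NonStop system):

```python
import sqlite3
import time

def time_bundle_sizes(n_records=2_000, bundle_sizes=(1, 5, 25, 100, 500)):
    """Insert the same workload with different transaction bundle sizes."""
    results = {}
    for bundle in bundle_sizes:
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE msg (k INTEGER PRIMARY KEY, payload TEXT)")
        start = time.perf_counter()
        for lo in range(0, n_records, bundle):
            rows = [(k, "x" * 100) for k in range(lo, min(lo + bundle, n_records))]
            with conn:  # one BEGIN .. COMMIT per bundle
                conn.executemany("INSERT INTO msg VALUES (?, ?)", rows)
        results[bundle] = time.perf_counter() - start
        conn.close()
    return results

for bundle, secs in sorted(time_bundle_sizes().items()):
    print(f"bundle {bundle:4d}: {secs:.4f}s")
```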

FUP COPY 500 records into your indexed file; FUP does efficient I/O.
Compare the time it takes FUP to insert 500 records with the time it takes
your application to insert (update) 500 records. If FUP runs much faster,
the bottleneck may not be related to updates to the indexed Enscribe file.

Partition widely to ensure that a FUP INFO, STAT shows only two index
levels. The difference in performance between two and three index levels
can be 25%.
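A back-of-the-envelope way to see how partitioning affects index levels (a Python sketch; the 32-byte key and 4-byte block pointer are assumptions - substitute your real key length, and check the result against FUP INFO, STAT):

```python
import math

def index_levels(n_records, rec_bytes=1536, key_bytes=32, block_bytes=4096):
    """Estimate B-tree index levels for a key-sequenced file (or one partition)."""
    recs_per_block = max(1, block_bytes // rec_bytes)  # data records per block
    fanout = max(2, block_bytes // (key_bytes + 4))    # index entries per block
    blocks = math.ceil(n_records / recs_per_block)
    levels = 0
    while blocks > 1:                                  # build levels up to a single root
        blocks = math.ceil(blocks / fanout)
        levels += 1
    return levels

# Splitting 16 million records across 16 partitions drops a level:
print(index_levels(16_000_000))        # → 4  (whole file in one partition)
print(index_levels(16_000_000 // 16))  # → 3  (per-partition record count)
```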

Ensure that the disks containing the file are not overactive, using Measure
or another performance monitor. If there is too much activity on these
disks, move the file to disks that are less busy.

"Jay" <jay_a...@hotmail.com> wrote in message
news:727da039.01100...@posting.google.com...

> ...we do have multiple copies of the process running in each CPU.
> ... tried to perform updates by bundling about 500 in one Begin-end
> transaction.

Ian Hadley

Oct 4, 2001, 4:11:35 AM

Jay,
Which process (A, B, or C) is the slow one? Is it the data manipulation or
the messagedb handling?

If the pre and post messagedbs have identical partition definitions, the
same disc process will be used for the lifecycle of each message.
Depending on the key proximity of each message, it might be an idea to
offset the volumes used in the partition definitions by one.

At which stage (A, B or C) are you buffering the 500 updates? Are the
messages actually sent (waited?) between the processes, or are the data
manipulation requests handled through the messagedb? It seems there might
be some locking issues here also.

What percentage increase in throughput are you after?

Large block sizes are not necessarily ideal for random I/O. Depending on
the total file size and the number of volumes available for
partitioning, a smaller (and more sheltered) blocksize can be better
suited to what you are trying to achieve.

Ian Hadley
OneStop Tandem Performance
http://www.austral.se/onestop

Roy Nicholas

Oct 4, 2001, 7:38:18 AM

From your description, both message database files are structured with
alternate keys, so the bulk I/O routines aren't going to be directly
useful, since they would force you to use unstructured access. With a
record size of 1.5K, a 4K blocksize is the only one that makes sense.
Even then, you can expect a lot of block splits because of all that
insertion and deletion activity. The default file attribute for TMF
audited files is BUFFERED, but double-check to make sure it is really
set. If not, each logical write and delete is getting flushed to disk
before the transaction can proceed.

One thing I find unclear in your messages is whether process A starts the
TMF transaction and that transaction is passed along to process B and
process C, or if both process A and process C are starting separate TMF
transactions for their work. As Ian mentioned, it sounds like lockwait
contention may be slowing you down. Have you looked at detailed Measure
information? At minimum, I'd look at the CPU, DISK, PROCESS, FILE and
DISKFILE entities during the peak load.

Although the database design you describe is pretty common, I think it tends
to perform poorly. Every record is inserted twice and deleted once, when in
reality it probably needs to be inserted just once. It should be at least
three times faster to receive the message, reformat it in memory and then
write it (once) to the final database. But that's a pretty major
application redesign. At minimum, I'd look to see if the alternate keys on
the new messagedb can be eliminated.


Roy

"Jay" <jay_a...@hotmail.com> wrote in message
news:727da039.01100...@posting.google.com...

Dave Bossi

Oct 5, 2001, 4:46:16 PM

Jay -

1) Although I agree with the statement that two writes and a delete is a
lot of overhead, I will assume you do not want to change the
application, so...

2) Split your alt-keys into two files on different disks and configure
for parallel writes.

3) What does your KEY look like? Is it a timestamp or order number
(i.e., is it sequential)? If so, partitioning doesn't buy you anything,
since the records all go into the same partition until it rolls over to
the next partition. In that case, you need to hash the key so that it
is randomized across the partitions. I know of an expensive ($$$) but
effective way to do this - we just saw it using Escort-SQL, where we
can hash the key on the fly and unhash it on the way back without
touching the application, but - like I said - it's an expensive
solution for just this problem.
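The hashing idea itself is cheap to sketch without a product like Escort-SQL (a Python sketch; the partition count and digest choice are arbitrary assumptions):

```python
import hashlib

NUM_PARTITIONS = 8

def hash_key(key: str) -> str:
    """Prefix the key with a digest-derived partition digit, keeping the key intact."""
    digest = hashlib.md5(key.encode()).hexdigest()
    partition = int(digest, 16) % NUM_PARTITIONS
    return f"{partition}{key}"

def unhash_key(stored: str) -> str:
    """Reverse the transform: drop the prefix digit to recover the original key."""
    return stored[1:]

# Sequential order numbers now scatter across partitions instead of
# piling into the last one:
for order in ("ORD000001", "ORD000002", "ORD000003"):
    print(hash_key(order))
```

Because the original key is kept intact after the prefix, the transform is exactly reversible, which is what lets a tool do this "without touching the application."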

Dave

Jay

Oct 10, 2001, 1:27:01 AM

Hi All,
Just to clear up a few things. Process A's transaction is not carried
though process b and c. That is the Begin-end transaction belong to
process A, process C performs another begin -end transactions. The
keys are randem and also process will never update a record that
process A has not committed. This means that there should not be a
case for process C to hang waiting for process A to complete. Wow that
was lengthy.

We have looked at making Process C just delete the records and not
write to the other dbs just to see whats the increase. Obviously we
get some but not enough, about 50% increase. We are looking for a
minimum of 100%.

As for which process is the bottleneck its really the disk processes,
process B runs on its own CPU set.

One good news is that I did turn on parrelwrites and my performance
did increase by 90% ! Thanks for the Tip. Reading the manuals I am
wondering if I should increase the buffer for the audit disk, i.e.
increase the AuditTrailBuffer ? Currently its set to 0Mb, the manual
says its for RDF performance increase but thats it. Can anyone explain
a little on what this does ?

Fred Stiening

Oct 10, 2001, 10:41:01 PM

Another thing that you might want to try. It involves a code change,
but may buy you significant performance improvements. How much you
want to enhance this idea is up to you :)

Instead of a key-sequenced file, define the file as relative-record,
and take the primary key that you were using and define it as a unique
alternate key. The main code change (which could be in one place or a
hundred, depending on the coding style) is to change KEYPOSITION calls
to use the two-character alternate-key id instead of 0 (which indicates
a search using the primary key).

Unless you want to do your own round-robining or other load balancing,
this means all your records get written to the same drive - but
buffered relative-record writes are extremely fast. The overhead is
in the maintenance of the indexes. Since the primary key is now in a
file that essentially contains only the index values and a relative
record number, there will be fewer index levels to maintain and,
perhaps more importantly, the index is being maintained by a
different CPU than the one maintaining the data records. Of course,
the alternate-key files can be partitioned as well as the primary,
further spreading the load of the index maintenance.

Of course, there are other complications depending on the application.
Eventually you'll need to recover the deleted record space from the
relative-record file - you can use the insert option that inserts new
records in the first empty slot, or, if your application permits down
time, you can copy the file and reload the indexes.

It's not the solution to every problem, but it is an option worth
having in your toolbag.
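An in-memory caricature of that layout (a Python sketch; a list stands in for the relative-record file and a dict for the unique alternate-key file - none of this is Guardian API):

```python
records = []    # relative-record "file": the slot number is the record address
key_index = {}  # unique alternate-key "file": primary key -> slot number

def insert(key: str, payload: dict) -> int:
    """Append to the data file (fast, sequential), then maintain the index."""
    records.append(payload)
    slot = len(records) - 1
    key_index[key] = slot
    return slot

def read_by_key(key: str) -> dict:
    """Two hops, as in the scheme above: index lookup, then a by-address read."""
    return records[key_index[key]]

def delete_by_key(key: str) -> None:
    """Free the slot; a real file would later reuse or reclaim the space."""
    records[key_index.pop(key)] = None

insert("MSG001", {"body": "hello"})
print(read_by_key("MSG001"))  # → {'body': 'hello'}
```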

Frans Jongma

Oct 12, 2001, 9:18:14 AM

<snip>


> One piece of good news is that I did turn on parallel writes and my
> performance did increase by 90%! Thanks for the tip. Reading the
> manuals, I am wondering if I should increase the buffer for the audit
> disk, i.e. increase the AuditTrailBuffer? Currently it's set to 0MB;
> the manual says it's for RDF performance increase but that's it. Can
> anyone explain a little about what this does?

When TMF writes to the audit trail, it must write to disk, because a
transaction cannot commit before its data is safely stored in the
audit trail.
The AuditTrailBuffer is used by the extractor process of RDF (Remote
Duplicate Database Facility).
When this buffer size is > 0, the extractor can get data from this memory
buffer; it does not need to read the data off the disk and will not be in
the way of TMF when it wants to write audit data.
Sorry - no help for your application.

frans.


Douglas Miller

Oct 13, 2001, 8:45:23 AM

In article <hn0ast8amteq2m95d...@4ax.com>, Fred Stiening <fr...@findanisp.com> wrote:
>Another thing that you might want to try. It involves a code change,
>but may buy you significant performance improvements. How much you
>want to enhance this idea is up to you :)

[snip]

I think this scheme is likely to *degrade* performance rather than
enhance it, because the necessity of updating an additional file (the
alternate-key file) increases the number of writes that must take place
whenever the file is updated - and every read operation will require
twice as much I/O as before.

--
alphageek/at/milmac/dot/com
