Why do we even have READV?


Bob Dubery

Jun 22, 2022, 6:31:58 AM
to Pick and MultiValue Databases
I've just seen a piece of code that used, in 3 consecutive lines, 3 READVs using the same file and the same key. 

This is in a loop. So there is significant redundant IO.

OK... this code can be refactored quite quickly, but it prompted a question in my mind.

Why do we even have READV?

It seems to just be shorthand for a READ followed by an extract.

I'm guessing that sometime in the past, probably in the early days of Pick, it was a slightly faster way to do a read and a single extract, and that ever since it's been kept for backwards compatibility. Otherwise I can't see a reason for it.
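
For illustration, a minimal mvBASIC sketch of the equivalence (file and variable names hypothetical):

    * READV: fetch a single attribute directly
    READV NAME FROM F.CUSTOMERS, CUST.ID, 1 ELSE NAME = ''

    * the long-hand equivalent: read the whole item, then extract
    READ REC FROM F.CUSTOMERS, CUST.ID THEN
       NAME = REC<1>
    END ELSE NAME = ''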

Wols Lists

Jun 22, 2022, 9:24:23 AM
to mvd...@googlegroups.com
On 22/06/2022 08:15, Bob Dubery wrote:
>
> Why do we even have READV?
>
> It seems to just be shorthand for a READ followed by an extract.
>
> I'm guessing that sometime in the past, probably in the early days of
> Pick, it was a slightly faster way to do a read and a single extract, and
> that ever since it's been kept for backwards compatibility.
> Otherwise I can't see a reason for it.

I think you've just answered yourself. It's ONE line of code instead of
two. Plus, "READV 0" just does a scan of the keys; it's a "does it
exist" function, not a "read the value" one.

And like all things, it can easily be abused. But Pick probably cached
the record pretty much from day 1, and in your case maybe someone TESTED
the code and found that three READVs were noticeably faster than a
read followed by three extracts.

Oh - and that makes me think. Three READVs mean one copy of the record
in cache, and just the three fields you're interested in in memory.
Depending on the size of the record, back when *core* was kilodollars
per kilobyte, the financial (as well as speed) savings could have been
substantial.

Don't apply today's criteria to yesterday's decisions ...

Cheers,
Wol

Jim Idle

Jun 23, 2022, 10:26:03 PM
to mvd...@googlegroups.com
It's not very useful any more, except for READV 0 I suppose - if it ever was very useful.

When frames were 512 bytes, with 500 usable for data, an item could quite easily spill past a single frame. If that FID was not already in memory, it caused a frame fault, which results in IO. So a READV could, in theory, limit the frame faulting to the first frame of the item. Though that sounds more like bad design to me. I have no idea if that's why it was implemented - it could have just felt like a good idea to someone at the time. Or perhaps they really did think that scenario through and saw it had some practical performance improvement. My bet is on the "good idea" theory, to be honest. Someone else may remember if there was any method behind the madness.

Three READVs in a row sounds more like someone was lazy, or didn't really understand what they were doing - no disgrace really. Personally, I don't think I would ever have used it.

In Reality, the frames were eventually - around 1984 or thereabouts - reverse mapped to the disk sectors. The disk spins only one way, of course, so if it has 64 sectors per track, you can write frame 1 on sector 0, or frame 64 on sector 0, etc. This meant that if you asked for frame 1 and the disk head was on frame 5, the system would read frames 5, 4, 3, 2, 1 into memory.

This was usually a great optimization, because it was quite likely that after using frame 1 a program would immediately ask for frame 2, which would already be in memory. Think about SELECT, READNEXT, and English queries (it depends how frames are mapped to files, of course).

Track reads were also added, so that you got all the frames on a single disk track in one hit. Track reads could actually slow things down on some systems (usually poorly written ones), so you could turn them off. With whole tracks coming into memory anyway, READV had no practical value.

In Unix-based systems, most implementations let the underlying file system and kernel manage the cache. As the algorithms get better and better at predicting disk access, this improves too, of course. And nobody will be using hard drives for main working storage these days anyway. jBASE uses memory mapping, for instance. These days the only performance problems the average system should have are poor algorithms, and they can be fixed.

Jim


Lisa Levsen

Jun 23, 2022, 10:33:56 PM
to mvd...@googlegroups.com
I find it useful that READV still works as designed for a quick and efficient return of a value like a sequence number. Because if it didn't, I'd have about 2000 programs I'd need to rewrite. Regardless of its origin, READV is a valuable tool under appropriate circumstances.
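
Something like this classic counter idiom, sketched with hypothetical names (lock behaviour varies slightly by flavor):

    * fetch the next order number under an update lock, bump it, write it back
    READVU SEQ FROM F.CONTROL, 'NEXT.ORDER', 1 ELSE SEQ = 0
    SEQ = SEQ + 1
    WRITEV SEQ ON F.CONTROL, 'NEXT.ORDER', 1  ;* the write releases the lock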

Lisa Levsen
 


George Gallen

Jun 24, 2022, 3:29:30 PM
to mvd...@googlegroups.com
Not sure if this is a UniVerse-only feature - TRANS().

If you're only reading from the file and don't need to write to it or take locks, use TRANS(); you can read one attribute or the entire record.

Has anyone run into any circumstances where TRANS seems not to do very well?

Run time might be slower, as it will need to open/read/close, not just read.
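
For example (UniVerse syntax; names hypothetical, and the 'X' code returns an empty string if the file or record is missing):

    * read-only fetch of attribute 3 of one record: no file variable, no locks
    NAME = TRANS('CUSTOMERS', CUST.ID, 3, 'X')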

George



Wol

Jun 24, 2022, 4:05:28 PM
to mvd...@googlegroups.com
On 24/06/2022 20:29, George Gallen wrote:
> Not sure if this is a UniVerse-only feature - TRANS().
>
> If you're only reading from the file and don't need to write to it or
> take locks, use TRANS(); you can read one attribute or the entire record.
>
> Has anyone run into any circumstances where TRANS seems not to do very well?
>
> Run time might be slower, as it will need to open/read/close, not just read.
>
Firstly, I think it was a Pr1meism originally.

Secondly, it probably uses READV quite a lot internally :-) (most of
INFORMATION was written in BASIC).

And thirdly, even from the get-go, I believe it cached open file
pointers ...

Cheers,
Wol

Scott Ballinger

Jun 24, 2022, 7:28:34 PM
to Pick and MultiValue Databases
Did not know about readv 0, but on D3/Linux (flashed) readv 1 is the fastest, and readv 0 is slower than read.

:ct bp sb

    sb
001 open "bigfile" to bigfile else stop
002
003 n1 = 10000000
004 n2 = 10100000
005 t = system(12)
006 for n = n1 to n2
007   readv x from bigfile,n,0 else null
008 next n
009 print "readv 0: ":system(12)-t
010
011 t = system(12)
012 for n = n1 to n2
013   readv x from bigfile,n,1 else null
014 next n
015 print "readv 1: ":system(12)-t
016
017 t = system(12)
018 for n = n1 to n2
019   read x from bigfile,n else null
020   y = x<1>
021 next n
022 print "read:    ":system(12)-t

:compile-catalog bp sb
sb
...

[820] Creating FlashBASIC Object ( Level 0 ) ...
[241] Successful compile!   2 frame(s) used.
[244] 'sb' cataloged

:sb
readv 0: 335
readv 1: 139
read:    157
:sb
readv 0: 365
readv 1: 139
read:    158
:sb
readv 0: 348
readv 1: 139
read:    157


/Scott

Parrot Truman

Jun 25, 2022, 10:22:35 AM
to mvd...@googlegroups.com
We have been using TRANS() in UniVerse for a long time, and the results are excellent! You can read an entire record with the -1 option (you need to RAISE() the result), and it's much faster than using OPEN, READ, CLOSE. It seems to be the best option if you need to read a record and don't need to update it.
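
A sketch of that whole-record use (hypothetical names): TRANS() hands the record back with its delimiters lowered, hence the RAISE():

    * read an entire record read-only, without OPEN/READ/CLOSE or locks
    REC = RAISE(TRANS('CUSTOMERS', CUST.ID, -1, 'X'))
    NAME = REC<1>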


Steven Davies-Morris

Jun 25, 2022, 12:44:09 PM
to mvd...@googlegroups.com
Boy does this take me back a long way.  Thanks Jim for reminding me of
my misspent youth!

--

SDM a 21st century schizoid man in SoCal
Systems Theory website www.systemstheory.net
Through The Looking Glass radio show at www.deepnuggets.com
Email "The Cleaner System" at cleaner...@yahoo.com

Will Johnson

Jun 25, 2022, 2:35:08 PM
to Pick and MultiValue Databases
You mean that Prime cached open file pointers?

Wols Lists

Jun 25, 2022, 2:42:34 PM
to mvd...@googlegroups.com
On 25/06/2022 19:35, 'Will Johnson' via Pick and MultiValue Databases wrote:
> You mean that Prime cached open file pointers?
>
It was well known for it ... open/close was slow slow slow ...

Cheers,
Wol

Jim Idle

Jun 30, 2022, 7:11:51 AM
to mvd...@googlegroups.com
You would need much bigger loops to get accurate numbers, and you have to measure the execution time of the loop itself and remove it. Also, cache effects etc. make it a difficult thing to measure. Even more so with jBASE.
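
For instance, in the style of Scott's test, the loop overhead could be timed on its own and subtracted from each result (a sketch; n1 and n2 as in his program):

    * time the empty loop first, then subtract this from each test's figure
    t = system(12)
    for n = n1 to n2
    next n
    print "empty loop: ":system(12)-t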

IMO, if it works, then no need to change it. :)


Jim Idle

Jun 30, 2022, 7:13:03 AM
to mvd...@googlegroups.com
Yes, most systems do it. jBASE uses mmap() though.


Will Johnson

Jun 30, 2022, 2:53:56 PM
to Pick and MultiValue Databases
Yes. I was only curious if Prime invented this, however.
Caching open file pointers (keeping them in memory always) was not always a feature of every Pick system.
The pointer was kept in the working memory of a running program, but the program itself could still be paged out.

Wols Lists

Jun 30, 2022, 3:34:51 PM
to mvd...@googlegroups.com
On 30/06/2022 19:53, 'Will Johnson' via Pick and MultiValue Databases wrote:
> Yes. I was only curious if Prime invented this, however.
> Caching open file pointers (keeping them in memory always) was not
> always a feature of every Pick system.
> The pointer was kept in the working memory of a running program, but the
> program itself could still be paged out.
>
Because opening was so slow, it was normal for user programs to open
files into named common and leave them there ...
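
A minimal sketch of that pattern (file and common-block names hypothetical):

    * opened once per session; later calls find the file variables already set
    COMMON /APP.FILES/ FILES.OPENED, F.CUSTOMERS, F.ORDERS
    IF FILES.OPENED NE 1 THEN
       OPEN 'CUSTOMERS' TO F.CUSTOMERS ELSE STOP 201, 'CUSTOMERS'
       OPEN 'ORDERS' TO F.ORDERS ELSE STOP 201, 'ORDERS'
       FILES.OPENED = 1
    END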

Cheers,
Wol

Peter McMurray

Jun 30, 2022, 7:04:04 PM
to Pick and MultiValue Databases
I agree, Wol. Every application that we have written since 1977 writes the file opens to a common array, and that still shows a dramatic improvement in speed. The array is not only common to all the programs; every program is a subroutine, and every read is a MATREAD. To my mind, designing a large record with a count control that a READV would improve is bad design.
Nowadays D3 has such terrific matrix handling that to import a 30,000-item customer CSV, by far the best way is to read it, count the items, and DIM an array for it, then MAT the individual items as you need them.
For importing daily credit card files, as we do with petrol company cards used all over the country, this is the simplest and cleanest approach.
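
A rough sketch of that approach, with hypothetical file and item names (MATPARSE and delimiter details vary a little by flavor):

    * read the whole CSV item once, size a matrix to fit, then work
    * element by element instead of extracting from a dynamic array
    READ CSV FROM F.IMPORTS, 'DAILY.CARDS' ELSE STOP
    NLINES = DCOUNT(CSV, CHAR(254))  ;* CHAR(254) = attribute mark
    DIM LINES(NLINES)
    MATPARSE LINES FROM CSV          ;* splits on attribute marks by default
    FOR I = 1 TO NLINES
       * ... split LINES(I) on commas and process it ...
    NEXT I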

Jim Idle

Jun 30, 2022, 9:19:01 PM
to mvd...@googlegroups.com
All Unix implementations that used individual FDs had to do awkward things with them, because back in the day there was a limited number of file descriptors available in the system.

I am not sure what platforms you refer to when you say "paged out", or why you think being paged out would affect file descriptors. You mean native systems? Unix?

Jim 


Jim Idle

Jun 30, 2022, 9:24:16 PM
to mvd...@googlegroups.com
Reading a CSV file like that seems tortuous. I don't think D3 matrix handling is as good as you think it is. Every system has benefited from the vast improvements in CPU, memory, and disk speeds, though. I imagine any performance problems people have these days are either algorithmic or platform limits.

Jim 


geneb

Jul 1, 2022, 10:11:32 AM
to Pick and MultiValue Databases
On Thu, 30 Jun 2022, Peter McMurray wrote:

> Nowadays D3 has such terrific matrix handling that to import a 30,000-item
> customer CSV, by far the best way is to read it, count the items, and DIM
> an array for it, then MAT the individual items as you need them.
> For importing daily credit card files, as we do with petrol company cards
> used all over the country, this is the simplest and cleanest approach.
>

I've found (at least with v9) that creating large output files in D3 using
dynamic arrays gets problematic after some point. When I need to create a
large CSV file, I use %fwrite() to write each line out as I create it.
It's dramatically faster than collecting all the data into a single
dynamic array and then writing it out once.
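
A rough sketch of that pattern, assuming FlashBASIC's %-prefixed calls follow the C stdio signatures - treat the names and argument order as assumptions, not gospel:

    * stream each line out as it is built, instead of growing one huge
    * dynamic array (all %-call signatures assumed to mirror C's stdio)
    fp = %fopen("/tmp/export.csv", "w")     ;* assumed fopen() signature
    for i = 1 to nrows                      ;* nrows/rows() are hypothetical
       line = rows(i):char(10)
       r = %fwrite(line, 1, len(line), fp)  ;* assumed fwrite() semantics
    next i
    r = %fclose(fp)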

I don't do any imports, so I've no experience of how fast that might
be. (Again, v9.)

g.


--
Proud owner of F-15C 80-0007
http://www.f15sim.com - The only one of its kind.
http://www.diy-cockpits.org/coll - Go Collimated or Go Home.
Some people collect things for a hobby. Geeks collect hobbies.

ScarletDME - The red hot Data Management Environment
A Multi-Value database for the masses, not the classes.
http://scarlet.deltasoft.com - Get it _today_!

Will Johnson

Jul 1, 2022, 2:54:47 PM
to Pick and MultiValue Databases
Yes, this is good practice, but global named common was not introduced until later in the life of Pick. Once people were made aware of it (I recall some articles from the old Spectrum Tech in the mid 80s, I think), they started writing login routines that would open the most commonly used fifty files into global common, and those pointers would remain open for the entire login session, regardless of what program you were running.

And by "paged out" I mean that memory has a number of pages, and when it's full and a new operation wants memory, the older memory pages get "paged out", i.e. written back to a disk location, and are not in memory at that moment. Then, when that user got the token again, the system would automagically pull those pages back into memory for that user.

Typically, with an active program and a limited number of users, the active program wouldn't actually ever get paged out unless your system was overwhelmed with activity. I guess you have to have a good memory (haha) to remember the days when 32K was a lot of memory. And then of course some programmers have never had to learn this stuff, because they have always worked on their own systems, without the other 125 users battling for service.

We had systems where user programs were constantly getting paged out. I seem to recall something about core-locking a process so that it never got paged out. I think the ABS frames used to be core-locked. I can't recall if that was a setting or just automatic.

Peter McMurray

Jul 1, 2022, 4:27:01 PM
to Pick and MultiValue Databases
Hi. We used common from the beginning, in 1977. Since the original was a switch-operated program (that is, it read parameters for display, input and file control), it effectively core-locked itself, as no other programs were being called, and we mapped the control matrices.
Now I have a master program, and every "program" for an application is a subroutine, so it still works the same way. The result: 100% consistency across an unlimited number of routines. No Day 10,000 or Year 2000 issues, and no problem when Australia switched from sales tax based on cost to GST based on the actual tax-free sales price, since everything is date based.
I can quote a programmer that I showed the DIM method to in 1999: "it's orders of magnitude faster".
Since then, D3 considers a 10,000-element matrix very small. However, I now simply read the CSV, SELECT it for READNEXT, and split each line of the CSV into a suitable item matrix.
MATREAD is always a more efficient way to handle an item. With sensible file design - the latest change is always the first value in an attribute - one can keep thousands of elements by date in one item without degradation.

Will Johnson

Jul 2, 2022, 2:58:33 PM
to Pick and MultiValue Databases
Yes, there is Common, and then named Common, which is a newer animal.

Peter McMurray

Jul 3, 2022, 6:13:03 PM
to Pick and MultiValue Databases
Yes Will, no criticism intended. We simply made the decision to avoid anything that was not universal. So the only time we have had to change anything was a couple of LOCATEs in an early SUN release - and that was the only machine that ever went down from a virus, for 10 days (not from our software). Ah! I forgot: we had a stupid field engineer who went to a tech college around 1985 and then used his disk in our client's machine. The Marihuana virus stopped DOS for a day; again, no loss on our side. The greatest thing that ever happened for us was Windows NT: all the Unix and Linux problems vanished, and nowadays a Windows 10/11 peer-to-peer server is even simpler for the typical small-to-medium user (20 to 50 million dollars in our case). Plus the cloud just takes away any restriction. D3 sings.

Will Johnson

Jul 4, 2022, 8:10:07 PM
to Pick and MultiValue Databases
I was only pointing out that Common existed for several years before Named Common was introduced.