Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Knuth's "Man or Boy"

77 views
Skip to first unread message

Robert AH Prins

unread,
May 10, 2012, 7:35:59 PM5/10/12
to
The following is a PL/I version of Donald Knuth's "Man or Boy" compiler
test program:

morb: proc options (main) reorder;
dcl sysprint file;

dcl i fixed bin (31);

do i = 0 to 30;
put skip edit('i = ', i, a((i), x1, x2, x3, x4, x5))
(a(4), p'z9b', P'---,---,--9');
end;

a: proc(k, y1, y2, y3, y4, y5) returns(fixed bin (31)) recursive;
dcl k fixed bin (31);
dcl (y1, y2, y3, y4, y5) entry returns(fixed bin (31));

b: proc returns(fixed bin(31)) recursive;
k = k - 1;
return(a((k), b, y1, y2, y3, y4));
end b;

if k <= 0 then
return(y4 + y5);
else
return(b);
end a;

x1: proc returns(fixed bin (31)); return(1); end x1;
x2: proc returns(fixed bin (31)); return(-1); end x2;
x3: proc returns(fixed bin (31)); return(-1); end x3;
x4: proc returns(fixed bin (31)); return(1); end x4;
x5: proc returns(fixed bin (31)); return(0); end x5;
end morb;

It works OK, i.e. returned values are the same as those provided by
Wikipedia <http://en.wikipedia.org/wiki/Man_or_boy_test> and Rosetta
Code <http://rosettacode.org/wiki/Man_or_boy_test> up to the value of k=25.

Running the program on z/OS with the old OS (V2R3M0) compiler, the
program ABENDs at k=15 (running with a REGION=0M). Using Enterprise PL/I
V3R9, the limit is upped to k=23. However, using PL/I for Windows V8.0
from RDz on a W7-64 system, the limit seems to be k=26 (A= -21,051,458)
a value that seems not to be documented anywhere. Now despite my recent
goofs, I'm pretty sure that the value is correct, but can someone (try
to) verify this using another compiler and/or language?

Robert
--
Robert AH Prins
robert(a)prino(d)org

robin....@gmail.com

unread,
May 11, 2012, 8:41:31 AM5/11/12
to rob...@prino.org
On Friday, 11 May 2012 09:35:59 UTC+10, Robert AH Prins wrote:

> Running the program on z/OS with the old OS (V2R3M0) compiler, the
> program ABENDs at k=15 (running with a REGION=0M). Using Enterprise PL/I
> V3R9, the limit is upped to k=23. However, using PL/I for Windows V8.0
> from RDz on a W7-64 system, the limit seems to be k=26 (A= -21,051,458)
> a value that seems not to be documented anywhere.

Values up to k=26 are documented in both URLs.

John W Kennedy

unread,
May 11, 2012, 10:22:41 AM5/11/12
to
The program is notorious for its huge stack requirements. Setting of
ISASIZE is critical in the mainframe version.

--
John W Kennedy
Read the remains of Shakespeare's lost play, now annotated!
http://www.SKenSoftware.com/Double%20Falshood

wolfgang kern

unread,
May 11, 2012, 2:41:53 PM5/11/12
to

Robert AH Prins posted:
Ok so far, but I cannot see any relation to ASM here...
So just for the record: what was the question about in the above?

__
wolfgang


Robert AH Prins

unread,
May 11, 2012, 6:50:16 PM5/11/12
to
On 2012-05-11 18:41, wolfgang kern wrote:
> Robert AH Prins posted:
>
>> The following is a PL/I version of Donald Knuth's "Man or Boy" compiler
>> test program:

<snip>

>> It works OK, i.e. returned values are the same as those provided by
>> Wikipedia<http://en.wikipedia.org/wiki/Man_or_boy_test> and Rosetta Code
>> <http://rosettacode.org/wiki/Man_or_boy_test> up to the value of k=25.
>
> Ok so far, but I cannot see any relation to ASM here...
> So just for the record: what was the question about in the above?

Did you miss the

"... but can someone (try to) verify this using another compiler and/or
language?"

I was hoping that someone might have programmed this little piece of
stack-stressing code in x86, and could verify that the (k=26) value is
indeed the one (-21,051,458) returned by PL/I. Seeing how this could be
programmed in assembler would be a bonus, but my main interest is the
correctness of the above.

Shmuel Metz

unread,
May 11, 2012, 11:06:59 AM5/11/12
to
In <a12ql2...@mid.individual.net>, on 05/10/2012
at 11:35 PM, Robert AH Prins <spam...@prino.org> said:

>Running the program on z/OS with the old OS (V2R3M0) compiler,

V2R3M0 of *which* compiler? Presumably not F, which is the old
compiler.

>the program ABENDs at k=15

With what ABEND code and messages?

>(running with a REGION=0M)

Well, if you're using the "optimizing" compiler then it only supports
storage below the line.

>Using Enterprise PL/I V3R9, the limit is upped to k=23.

Using what compiler options and what REGION?

>a value that seems not to be documented anywhere.

That's expected if the problem is running out of space to extend the
stack.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to spam...@library.lspace.org

robin....@gmail.com

unread,
May 11, 2012, 9:42:21 PM5/11/12
to rob...@prino.org
On Saturday, 12 May 2012 08:50:16 UTC+10, Robert AH Prins wrote:

> I was hoping that someone might have programmed this little piece of
> stack-stressing code in x86, and could verify that the (k=26) value is
> indeed the one (-21,051,458) returned by PL/I. Seeing how this could be
> programmed in assembler would be a bonus, but my main interest is the
> correctness of the above.

Both wikipedia and rosetta confirm that value, as I said previously.

robin....@gmail.com

unread,
May 11, 2012, 9:49:09 PM5/11/12
to
On Saturday, 12 May 2012 01:06:59 UTC+10, Seymour J. Shmuel Metz wrote:
> In <a12q........@mid.individual.net>, on 05/10/2012
> at 11:35 PM, Robert AH Prins <spam...@prino.org> said:


> >Using Enterprise PL/I V3R9, the limit is upped to k=23.
>
> Using what compiler options and what REGION?
>
> >a value that seems not to be documented anywhere.
>
> That's expected if the problem is running out of space to extend the
> stack.

It's not so much running out of space; it's that the stack
uses all available RAM, and then begins swapping via disc.
At that point, the program tends never to finish.

John W Kennedy

unread,
May 11, 2012, 10:56:13 PM5/11/12
to
On 2012-05-11 15:06:59 +0000, Shmuel (Seymour J.) Metz said:

> In <a12ql2...@mid.individual.net>, on 05/10/2012
> at 11:35 PM, Robert AH Prins <spam...@prino.org> said:
>
>> Running the program on z/OS with the old OS (V2R3M0) compiler,
>
> V2R3M0 of *which* compiler? Presumably not F, which is the old
> compiler.
>
>> the program ABENDs at k=15
>
> With what ABEND code and messages?
>
>> (running with a REGION=0M)
>
> Well, if you're using the "optimizing" compiler then it only supports
> storage below the line.

For the stack. But the stack would be more likely to fail due to a bad
(for this purpose) ISASIZE than a region failure.

Shmuel Metz

unread,
May 13, 2012, 2:52:28 PM5/13/12
to
In <4fadd14d$0$1899$607e...@cv.net>, on 05/11/2012
at 10:56 PM, John W Kennedy <jwk...@attglobal.net> said:

>For the stack. But the stack would be more likely to fail due to
> a bad (for this purpose) ISASIZE than a region failure.

Regardless of ISASIZE and REGION, you can't get more than 16 MB below
the line. The OP needs to take into account what his compiler
supports, what his compiler options are, what hi run-time options are
and what his installation does with REGION=0M.

Note: REGION=0M requests all available virtual storage in the private
are, except when it doesn't.

John W Kennedy

unread,
May 14, 2012, 5:53:11 PM5/14/12
to
On 2012-05-13 18:52:28 +0000, Shmuel (Seymour J.) Metz said:

> In <4fadd14d$0$1899$607e...@cv.net>, on 05/11/2012
> at 10:56 PM, John W Kennedy <jwk...@attglobal.net> said:
>
>> For the stack. But the stack would be more likely to fail due to
>> a bad (for this purpose) ISASIZE than a region failure.
>
> Regardless of ISASIZE and REGION, you can't get more than 16 MB below
> the line. The OP needs to take into account what his compiler
> supports, what his compiler options are, what hi run-time options are
> and what his installation does with REGION=0M.
>
> Note: REGION=0M requests all available virtual storage in the private
> are, except when it doesn't.

All true, but when ISASIZE hasn't been addressed, it's likely to be the
first point of stack failure, and I'm the only one here who's mentioned
it in this thread, ergo....

Robert AH Prins

unread,
May 15, 2012, 7:37:36 AM5/15/12
to
RV falls in the category of pigheaded people who are never wrong, a
category of people who are so full of themselves that they cannot be
bothered to look one yoctometer past the board in front of their heads.

If he had bothered to look at the history of both Wiki pages, he would
have noticed that the history logs for both pages contain the names of
the users who made changes, and that by clicking the right circles, one
can compare older versions.

And guess who added the A(k=26) values on both pages?

Robert AH Prins

unread,
May 15, 2012, 11:08:54 AM5/15/12
to
On 2012-05-14 21:53, John W Kennedy wrote:
> On 2012-05-13 18:52:28 +0000, Shmuel (Seymour J.) Metz said:
>
>> In <4fadd14d$0$1899$607e...@cv.net>, on 05/11/2012
>> at 10:56 PM, John W Kennedy <jwk...@attglobal.net> said:
>>
>>> For the stack. But the stack would be more likely to fail due to
>>> a bad (for this purpose) ISASIZE than a region failure.
>>
>> Regardless of ISASIZE and REGION, you can't get more than 16 MB below
>> the line. The OP needs to take into account what his compiler
>> supports, what his compiler options are, what hi run-time options are
>> and what his installation does with REGION=0M.
>>
>> Note: REGION=0M requests all available virtual storage in the private
>> are, except when it doesn't.
>
> All true, but when ISASIZE hasn't been addressed, it's likely to be the
> first point of stack failure, and I'm the only one here who's mentioned
> it in this thread, ergo....

PL/I V8.0 for windows:

ISASIZE(1677216k) (aka 16Gb, all my PC has) - ISASIZE doesn't really
matter on doze, without it the programs also stops after k=26.

Link on windows:

ilink /st:0x75fFFFFE,0x7fFFFE morb.obj - yes, it links, but the
executable does not do anything, command prompt returns immediately.

ilink /st:0x74fFFFFE,0x7fFFFE morb.obj - working executable, with the
aforementioned limit of k=26.

Of course Pl/I for windows is 32-bit and that seems to show when
monitoring the program with Process Explorer, just before the "IBM3000☺
Not enough application stack to complete processing." message appears,
the "private bytes" column in PE reaches 1,808,000K, just short of 2Gb,
the limit for signed 32-bit.

As for z/OS, this is some info from Mark Zelden's IPLINFO:

The real storage online at IPL time was 6144M.
The real storage increment size is 1M with 6144 increments installed.
The potential real storage size is 6144M.
The reconfigurable storage size is 0MB.
The private area size <16M is 9216K.
The private area size >16M is 1767M.
The CSA size <16M is 3776K.
The CSA size >16M is 200072K.
The SQA size <16M is 1528K.
The SQA size >16M is 14836K.
The maximum V=R region size is 128K.
The default V=R region size is 64K.
The maximum V=V region size is 9192K.

Compiler: Enterprise PL/I V3.R9.M0 (Built:20100923)
No ISASIZE specified: ABEND at k=23
ISASIZE(4096M): ABEND at k=23
In both cases the program is run with a REGION=0M

In other words, the ISASIZE doesn't seem to matter at all.

John W Kennedy

unread,
May 15, 2012, 9:14:34 AM5/15/12
to
"No ISASIZE specified" does not mean "No ISA".

Robert AH Prins

unread,
May 15, 2012, 11:59:20 AM5/15/12
to
I know, it uses the defaults,

STACK(128K,128K,ANYWHERE,KEEP,512K,128K)

Robert AH Prins

unread,
May 16, 2012, 9:16:15 AM5/16/12
to
On 2012-05-10 23:35, Robert AH Prins wrote:
> The following is a PL/I version of Donald Knuth's "Man or Boy" compiler test
> program:

<snip>

> Running the program on z/OS with the old OS (V2R3M0) compiler, the program
> ABENDs at k=15 (running with a REGION=0M). Using Enterprise PL/I V3R9, the limit
> is upped to k=23. However, using PL/I for Windows V8.0 from RDz on a W7-64
> system, the limit seems to be k=26 (A= -21,051,458) a value that seems not to be
> documented anywhere. Now despite my recent goofs, I'm pretty sure that the value
> is correct, but can someone (try to) verify this using another compiler and/or
> language?

The A(k=26) value has been confirmed by Ruud Koot, using a program written in
Haskell, see <http://rosettacode.org/wiki/User_talk:Ruud_Koot>. His attempt to
go further also failed due to a lack of memory.

Shmuel Metz

unread,
May 15, 2012, 8:54:42 PM5/15/12
to
In <a1f5pj...@mid.individual.net>, on 05/15/2012
at 03:59 PM, Robert AH Prins <spam...@prino.org> said:

>STACK(128K,128K,ANYWHERE,KEEP,512K,128K)

That's tiny; what happens when you increase it?

Shmuel Metz

unread,
May 15, 2012, 8:51:37 PM5/15/12
to
In <a1f2r2...@mid.individual.net>, on 05/15/2012
at 03:08 PM, Robert AH Prins <spam...@prino.org> said:

>The maximum V=V region size is 9192K.

That's the maximum below the line. I suspect that the maximum above
the line is 32MiB, but you need to check with your systems programmers
to be sure.

robin....@gmail.com

unread,
May 16, 2012, 10:16:59 AM5/16/12
to rob...@prino.org
On Tuesday, 15 May 2012 21:37:36 UTC+10, Robert AH Prins wrote:

> RV falls in the category of pigheaded people who are never wrong, a
> category of people who are so full of themselves that they cannot be
> bothered to look one yoctometer past the board in front of their heads.

Prins can't help himself. Prins repeatedly makes derogatory
remarks.

> If he had bothered to look at the history of both Wiki pages, he would
> have noticed that the history logs for both pages contain the names of
> the users who made changes, and that by clicking the right circles, one
> can compare older versions.

If Prins had bothered to look at his listing, he would have found that
he specified OPT(0) instead of full optimisation.

I have better things to do than look at who changed what in wiki.
More often than not, people use pseudonyms or are anonymous.

> And guess who added the A(k=26) values on both pages?

A boofhead?

Prins, you want to take a good look at yourself instead of denigrating others.

Robert AH Prins

unread,
May 16, 2012, 3:04:51 PM5/16/12
to
On 2012-05-16 14:16, robin....@gmail.com wrote:
> On Tuesday, 15 May 2012 21:37:36 UTC+10, Robert AH Prins wrote:
>
>> RV falls in the category of pigheaded people who are never wrong, a
>> category of people who are so full of themselves that they cannot be
>> bothered to look one yoctometer past the board in front of their heads.
>
> Prins can't help himself. Prins repeatedly makes derogatory
> remarks.
>
>> If he had bothered to look at the history of both Wiki pages, he would
>> have noticed that the history logs for both pages contain the names of
>> the users who made changes, and that by clicking the right circles, one
>> can compare older versions.
>
> If Prins had bothered to look at his listing, he would have found that
> he specified OPT(0) instead of full optimisation.

And Robert Prins has no problem publicly apologizing for his mistakes, unlike
some other person, care to reread this
<http://groups.google.com/group/comp.lang.pl1/msg/44ec4d4828de6637?hl=en>, and I
quote the final paragraph:

"Please note, I am certain that if you queried the active and less visible
participants in this forum, your technical knowledge and expertise would be
commonly acknolwedged. HOWEVER, if you also queried the viewers and
participants in the forum, I think that you would find a majority (a VERY
LARGE majority) that would confirm that you have a "online personality" that
seems totally incapable of accepting correction, criticism, or even
suggestions of other views."

> I have better things to do than look at who changed what in wiki.
> More often than not, people use pseudonyms or are anonymous.

Of course, but if you see the name "prino", only a complete Robin Vowels would
fail to make to connection.

>> And guess who added the A(k=26) values on both pages?
>
> A boofhead?
>
> Prins, you want to take a good look at yourself instead of denigrating others.

robin....@gmail.com

unread,
May 17, 2012, 11:23:28 PM5/17/12
to rob...@prino.org
On Thursday, 17 May 2012 05:04:51 UTC+10, Robert AH Prins wrote:
> On 2012-05-16 14:16, robin.vow....@gmail.com wrote:

> >> If he had bothered to look at the history of both Wiki pages, he would
> >> have noticed that the history logs for both pages contain the names of
> >> the users who made changes, and that by clicking the right circles, one
> >> can compare older versions.
> >
> > If Prins had bothered to look at his listing, he would have found that
> > he specified OPT(0) instead of full optimisation.
>
> And Robert Prins has no problem publicly apologizing for his mistakes, unlike
> some other person,

Your response is typical of those who don't like being challenged,
which is, to attack the challenger, and to denigrate him.

Quoting some uninformed source won't get you anywhere.

As for your mistakes, it's not the first time that you have
shot your mouth off in a big way about software,
calling manufacturers incompetent or their works incompetent
(or words to that effect), only to find that the incompetent is
your own self.

> > I have better things to do than look at who changed what in wiki.
> > More often than not, people use pseudonyms or are anonymous.
>
> Of course, but if you see the name "prino", only a complete Robin Vowels would
> fail to make to connection.

As I said, I don't waste my time looking at who wrote/corrected what in wiki.

> >> And guess who added the A(k=26) values on both pages?
> >
> > A boofhead?
> >
> > Prins, you want to take a good look at yourself instead of denigrating others.

You're obviously suffering from delusions of grandeur,
thinking that anyone would be interested in knowing that
you had added some trivia to wiki.

Richard Russell

unread,
May 18, 2012, 5:22:19 AM5/18/12
to
On May 11, 12:35 am, Robert AH Prins <spamt...@prino.org> wrote:
> However, using PL/I for Windows V8.0
> from RDz on a W7-64 system, the limit seems to be k=26 (A= -21,051,458)
> a value that seems not to be documented anywhere.

The Rosetta Code page now lists values for up to k = 30
(-512,016,658).

Richard.
http://www.rtrussell.co.uk/

Robert AH Prins

unread,
May 18, 2012, 7:30:09 AM5/18/12
to
On 2012-05-18 09:22, Richard Russell wrote:
> On May 11, 12:35 am, Robert AH Prins<spamt...@prino.org> wrote:
>> However, using PL/I for Windows V8.0
>> from RDz on a W7-64 system, the limit seems to be k=26 (A= -21,051,458)
>> a value that seems not to be documented anywhere.
>
> The Rosetta Code page now lists values for up to k = 30
> (-512,016,658).

You're not going to tell us that you did this with BBC Basic, aren't you? ;)

Richard Russell

unread,
May 18, 2012, 5:52:27 AM5/18/12
to
On May 18, 12:30 pm, Robert AH Prins <spamt...@prino.org> wrote:
> You're not going to tell us that you did this with BBC Basic, aren't you? ;)

No - I can only manage up to k=20 that way! The contributor of the
high values was somebody called Weinholt who comments that they were
"found with Vicare Scheme". To go higher than k=30 with a
conventional recursive method would need about 150 Gbytes of heap/
stack!

Richard.
http://www.rtrussell.co.uk/

Rod Pemberton

unread,
May 18, 2012, 6:18:25 AM5/18/12
to
"Robert AH Prins" <spam...@prino.org> wrote in message
news:a1mj4k...@mid.individual.net...
> On 2012-05-18 09:22, Richard Russell wrote:
> > On May 11, 12:35 am, Robert AH Prins<spamt...@prino.org> wrote:
> >> However, using PL/I for Windows V8.0
> >> from RDz on a W7-64 system, the limit seems to be k=26 (A= -21,051,458)
> >> a value that seems not to be documented anywhere.
> >
> > The Rosetta Code page now lists values for up to k = 30
> > (-512,016,658).
>
> You're not going to tell us that you did this with BBC Basic, aren't you?
> ;)
>

FYI, I tried this in DOS, specifically MS-DOS v7.10. There were 3 C
versions on Rosetta code website. I tried the C version which was
supposedly ANSI C. I got upto k=24 with OpenWatcom (v1.3) using Japheth's
HDPMI DPMI server. The limitation is the OW stub can only allocate 1GB of
stack. (wcl386/l=dos4g -k1024m) I got upto k=25 with DJGPP (GCC v3.4.1)
using Charles Sandmann's CWSDPMI (v5) DPMI server. The limitation is the
DJGPP stub can only allocate 2GB of stack. (gcc -Wall -pedantic -O2,
w/STUBEDIT 0x7FFFFFFF stack size, without -ansi flag since it errors on
va_list ...) My machine has 4GB of memory and Japheth's HIMEMX EMS client
which will support that in DOS, but about 512KB is shared with an old PCI
video card since my "new" (5 years old) PCIE card died. That leaves just
over 3.5GB. So, it still might not have had enough space for k=26, since
the amount of stack space required seemed to double above k=10 or so. With
a different compiler (No, I don't know which that could possibly be.), video
card, amount of memory (i.e., 6GB or 8GB), and motherboard, you might be
able to get to and above k=26 in DOS.


Rod Pemberton




Richard Russell

unread,
May 18, 2012, 9:05:49 AM5/18/12
to
On May 11, 11:50 pm, Robert AH Prins <spamt...@prino.org> wrote:
> I was hoping that someone might have programmed this little piece of
> stack-stressing code in x86

To bring this properly on-topic I have listed an assembler version
below; it's called with the k-value in eax. You will need an awful
lot of stack for the larger k values, and as there's no overflow
checking it tends to crash spectacularly if you exceed the limit! If
somebody has a really huge amount of RAM (or virtual memory) it might
be interesting to convert it to 64-bit code and see how high you can
go.

Richard.
http://www.rtrussell.co.uk/

start: mov esi,eax ; k
mov ebp,pt1 ; x1 = 1
mov ebx,pt_1 ; x2 = -1
mov edx,pt_1 ; x3 = -1
mov ecx,pt1 ; x4 = 1
mov eax,pt0 ; x5 = 0
fna: cmp esi,0
jg kgt0
push ecx
call [eax]
xchg eax,[esp]
call [eax]
pop ebx
add eax,ebx
ret
;
kgt0: mov edi,fnb
pushad
mov eax,esp
call fnb
add esp,32
ret
;
fnb: dec dword [eax+4]
mov esi,[eax+4]
mov ebp,eax
mov ebx,[eax+8]
mov edx,[eax+16]
mov ecx,[eax+20]
mov eax,[eax+24]
jmp short fna
;
fn0: xor eax,eax
ret
pt0: dd fn0
;
fn1: mov eax,1
ret
pt1: dd fn1
;
fn_1: mov eax,-1
ret
pt_1: dd fn_1

Terje Mathisen

unread,
May 18, 2012, 11:35:39 AM5/18/12
to
Who do you need to push every register?

> mov eax,esp
> call fnb
> add esp,32
> ret
> ;
> fnb: dec dword [eax+4]

You are reordering the register values here?

> mov esi,[eax+4]
> mov ebp,eax
> mov ebx,[eax+8]
> mov edx,[eax+16]
> mov ecx,[eax+20]
> mov eax,[eax+24]
> jmp short fna
> ;
> fn0: xor eax,eax
> ret
> pt0: dd fn0
> ;
> fn1: mov eax,1
> ret
> pt1: dd fn1
> ;
> fn_1: mov eax,-1
> ret
> pt_1: dd fn_1

Since the maximum return values (so far) are 0.5e9, there's still room
for 2 or 3 more doublings before you need 64-bit working registers.

Using 32-bit registers should still be supported in 64-bit mode, but
CALL/RET will use 64-bit values afaik?

The best setup would seem to be a 64-bit stack pointer and every other
register used as a 32-bit value, in which case you'd probably have to
synthesize the stack operations...

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Richard Russell

unread,
May 18, 2012, 1:36:00 PM5/18/12
to
On May 18, 4:35 pm, Terje Mathisen <"terje.mathisen at
tmsw.no"@giganews.com> wrote:
> Who do you need to push every register?

I don't, but it's only esp that doesn't need to be pushed so I'm only
wasting 4 bytes of stack by using 'pushad'. It would be
straightforward to modify the code to push just the seven registers,
in order to squeeze as much out of the stack space as possible.

> You are reordering the register values here?

I'm not too sure what you mean by "reordering". An important aspect
of the Man or Boy algorithm is shifting the parameters along (x1->x2,
x2->x3 etc., discarding x5) which governs which registers I load from
where.

> Since the maximum return values (so far) are 0.5e9, there's still room
> for 2 or 3 more doublings before you need 64-bit working registers.

True, but we know the answers up to k=30 so the only really worthwhile
application for the code would be to find the answer for k=31, which
definitely needs more than a 4 Gbyte stack.

> Using 32-bit registers should still be supported in 64-bit mode, but
> CALL/RET will use 64-bit values afaik?

Yes. I expect the only vital change would be to replace references to
'esp' with 'rsp'.

Richard.
http://www.rtrussell.co.uk/

Terje Mathisen

unread,
May 18, 2012, 4:00:27 PM5/18/12
to
Richard Russell wrote:
> On May 18, 4:35 pm, Terje Mathisen<"terje.mathisen at
>> You are reordering the register values here?
>
> I'm not too sure what you mean by "reordering". An important aspect
> of the Man or Boy algorithm is shifting the parameters along (x1->x2,
> x2->x3 etc., discarding x5) which governs which registers I load from
> where.

Sorry, I haven't studied the original Knuth challenge algorithm, I guess
I'd have to do so in order to argue intelligently about how to implement
it. :-)
>
>> Since the maximum return values (so far) are 0.5e9, there's still room
>> for 2 or 3 more doublings before you need 64-bit working registers.
>
> True, but we know the answers up to k=30 so the only really worthwhile
> application for the code would be to find the answer for k=31, which
> definitely needs more than a 4 Gbyte stack.

The 4G+ stack size is fine, I'm worried about _values_ (in registers)
overflowing 32 bits!
>
>> Using 32-bit registers should still be supported in 64-bit mode, but
>> CALL/RET will use 64-bit values afaik?
>
> Yes. I expect the only vital change would be to replace references to
> 'esp' with 'rsp'.

Along with modifying any register which RSP is copied to/from...

Richard Russell

unread,
May 19, 2012, 4:33:46 AM5/19/12
to
On May 18, 9:00 pm, Terje Mathisen <"terje.mathisen at
tmsw.no"@giganews.com> wrote:
> The 4G+ stack size is fine, I'm worried about _values_ (in registers)
> overflowing 32 bits!

It was *you* who proposed changing only the pointers to 64 bits,
leaving the data registers as 32 bits. I would have thought changing
everything to 64 bits was more straightforward and safer.

Richard.
http://www.rtrussell.co.uk/

Terje Mathisen

unread,
May 19, 2012, 12:53:40 PM5/19/12
to
I wouldn't put it that strongly, I just wondered (out loud) if maybe
that would be possible. :-)

Going to all-64 code would immediately double the required stack size. :-(

(At that point my 8 GB laptop would still be in trouble...)

Rod Pemberton

unread,
May 19, 2012, 7:57:17 PM5/19/12
to
"Terje Mathisen" <"terje.mathisen at tmsw.no"@giganews.com> wrote in message
news:kpvk89-...@ntp6.tmsw.no...
> Richard Russell wrote:
> > On May 18, 9:00 pm, Terje Mathisen<"terje.mathisen at
> > tmsw.no"@giganews.com> wrote:
> >> The 4G+ stack size is fine, I'm worried about _values_ (in registers)
> >> overflowing 32 bits!
> >
> > It was *you* who proposed changing only the pointers to 64 bits,
> > leaving the data registers as 32 bits. I would have thought changing
> > everything to 64 bits was more straightforward and safer.
>
> I wouldn't put it that strongly, I just wondered (out loud) if maybe
> that would be possible. :-)
>
> Going to all-64 code would immediately double the required stack size. :-(
>
> (At that point my 8 GB laptop would still be in trouble...)
>

Why does this algorithm consume SO much stack space?

If one is just looking for the computational results, i.e., k=xx, instead of
attempting to prove that a compiler is a "man" or "boy", then there is no
need to use the original algorithm. A different algorithm with the same
computational results and far less stack consumption could be used.


Rod Pemberton



Göran Weinholt

unread,
May 20, 2012, 11:09:52 AM5/20/12
to
On 20 Maj, 01:57, "Rod Pemberton"
<do_not_h...@nospicedham.notemailntt.cmm> wrote:

> If one is just looking for the computational results, i.e., k=xx, instead of
> attempting to prove that a compiler is a "man" or "boy", then there is no
> need to use the original algorithm.  A different algorithm with the same
> computational results and far less stack consumption could be used.

It turns out that Knuth showed how to compute this function faster way
back in the 1960s. Here's an implementation in Scheme, if anyone's
interested:

http://weinholt.se/hacks/man-or-boy.scm

(This is not the code I used to compute the numbers I added to
the Rosetta Code wiki. For that I used the code from the wiki.)

Rugxulo

unread,
May 22, 2012, 2:32:21 PM5/22/12
to
Hi, Rod,

On May 18, 5:18 am, "Rod Pemberton"
<do_not_h...@nospicedham.notemailntt.cmm> wrote:
>
> FYI, I tried this in DOS, specifically MS-DOS v7.10.  There were 3 C
> versions on Rosetta code website.  I tried the C version which was
> supposedly ANSI C.

http://rosettacode.org/wiki/Man_or_boy_test#C

Which would be most efficient: C99, GCC, C89?

> I got upto k=24 with OpenWatcom (v1.3) using Japheth's
> HDPMI DPMI server.  The limitation is the OW stub can only allocate 1GB of
> stack.  (wcl386/l=dos4g -k1024m)  I got upto k=25 with DJGPP (GCC v3.4.1)
> using Charles Sandmann's CWSDPMI (v5) DPMI server.

Heh, you're still stuck to oldies-but-goodies versions.

> The limitation is the
> DJGPP stub can only allocate 2GB of stack. (gcc -Wall -pedantic -O2,
> w/STUBEDIT 0x7FFFFFFF stack size, without -ansi flag since it errors on
> va_list ...)

I'm surprised it can even do that much.

> My machine has 4GB of memory and Japheth's HIMEMX EMS client

Typo, surely you meant XMSv3 here.

> which will support that in DOS, but about 512KB is shared with an old PCI
> video card since my "new" (5 years old) PCIE card died.  That leaves just
> over 3.5GB.

I only get 2.9 GB ("largest allocatable block") here.

> So, it still might not have had enough space for k=26, since
> the amount of stack space required seemed to double above k=10 or so.  With
> a different compiler (No, I don't know which that could possibly be.), video
> card, amount of memory (i.e., 6GB or 8GB), and motherboard, you might be
> able to get to and above k=26 in DOS.

I hate to admit, but this PC has 6 GB. Not sure how you intend to use
that in DOS. There is the very rough / experimental DOS/32A PAE hack,
but I'm not sure how to (correctly) utilize it. At least DJGPP's libc
(I'm told) is limited to 2 GB, so you'd have to call sbrk() directly
like twice in a row to get 3 GB. (DMC's X32 used to claim to reach 3
GB, but I never tried. Anyways, often you run into weird issues
anywhere near that, or even far lower, due to lack of testing.)

Well, I guess I should reboot and try for myself in DOS, but I'm not
optimistic (to say the least).

Rod Pemberton

unread,
May 22, 2012, 3:55:57 PM5/22/12
to
"Rugxulo" <rug...@nospicedham.gmail.com> wrote in message
news:6e36c884-1c47-42c6...@m24g2000yqh.googlegroups.com...
> On May 18, 5:18 am, "Rod Pemberton"
<do_not_h...@nospicedham.notemailntt.cmm> wrote:
> >

> > FYI, I tried this in DOS, specifically MS-DOS v7.10. There were
> > 3 C versions on Rosetta code website. I tried the C version which
> > was supposedly ANSI C.
>
> Which would be most efficient: C99, GCC, C89?

You mean in terms of stack space? I don't know. I'm not intending on
checking either. The version using GCC extensions seemed to be faster
computationally, but you need to patch in argv, argc, k from the other
examples to change which value it computes.

> > My machine has 4GB of memory and Japheth's HIMEMX EMS client
>
> Typo, surely you meant XMSv3 here.

Sigh ...

Yes, XMS client.

Expanded memory came first back in the 1980's as a form of page swapping.
Extended memory above 1MB came next in x86 PCs. That's why even though both
begin with 'EX', EMS begins with the 'E' and XMS begins with 'X'. I know
this. But, I'm not sure why I still continue to mix them up ... :-)


Rod Pemberton



Rugxulo

unread,
May 22, 2012, 7:05:49 PM5/22/12
to
Hi again,

On May 22, 1:32 pm, Rugxulo <rugx...@nospicedham.gmail.com> wrote:
>
> On May 18, 5:18 am, "Rod Pemberton"
> <do_not_h...@nospicedham.notemailntt.cmm> wrote:
>
> > FYI, I tried this in DOS, specifically MS-DOS v7.10.  There were 3 C
> > versions on Rosetta code website.  I tried the C version which was
> > supposedly ANSI C.
> >
> > So, it still might not have had enough space for k=26, since
> > the amount of stack space required seemed to double above k=10 or so.
> > With a different compiler (No, I don't know which that could possibly
> > be.), video card, amount of memory (i.e., 6GB or 8GB), and motherboard,
> > you might be able to get to and above k=26 in DOS.
>
> this PC has 6 GB. (Often you run into weird issues
> anywhere near that, or even far lower, due to lack of testing.)
>
> Well, I guess I should reboot and try for myself in DOS, but I'm not
> optimistic (to say the least).

No better luck, though I only tested (newer) OW and DJGPP. Best I
could get was k=25, similar to your results.

I expect I could try again, esp. with Win32 output + HX or X32 via
Digital Mars. Dunno, does Win32 PE grow the stack or must it be pre-
allocated? But at this point, I'm even more pessimistic.

P.S. Obviously I think that this is a pointless exercise and
incredibly wasteful of RAM, heh.

P.P.S. (naive question) Which Scheme is Weinholt using, R6RS?

Richard Russell

unread,
May 24, 2012, 6:11:23 PM5/24/12
to
On May 23, 12:05 am, Rugxulo <rugx...@nospicedham.gmail.com> wrote:
> No better luck, though I only tested (newer) OW and DJGPP. Best I
> could get was k=25, similar to your results.

Using the assembler code I listed I can get up to k=24 with 200 Mbytes
of stack space. So I would have expected k=27 to be possible with a 2
Gbytes stack.

Richard.
http://www.rtrussell.co.uk/

Brent C

unread,
May 26, 2012, 1:07:39 AM5/26/12
to
wow.. that one got up to k=39...
too bad I can't understand enough of it to make an assembly language version
of this one


"Göran Weinholt" <wein...@nospicedham.gmail.com> wrote in message
news:5c168b60-1942-4be6...@d6g2000vbe.googlegroups.com...

Thomas Womack

unread,
May 26, 2012, 5:16:20 AM5/26/12
to
In article <5c168b60-1942-4be6...@d6g2000vbe.googlegroups.com>,
=?ISO-8859-1?Q?G=F6ran_Weinholt?= <wein...@nospicedham.gmail.com> wrote:
>On 20 Maj, 01:57, "Rod Pemberton"
><do_not_h...@nospicedham.notemailntt.cmm> wrote:
>
>> If one is just looking for the computational results, i.e., k=3Dxx, inste=
>ad of
>> attempting to prove that a compiler is a "man" or "boy", then there is no
>> need to use the original algorithm. =A0A different algorithm with the sam=
>e
>> computational results and far less stack consumption could be used.
>
>It turns out that Knuth showed how to compute this function faster way
>back in the 1960s. Here's an implementation in Scheme, if anyone's
>interested:
>
>http://weinholt.se/hacks/man-or-boy.scm
>
>(This is not the code I used to compute the numbers I added to
>the Rosetta Code wiki. For that I used the code from the wiki.)

Umm, as far as I and the pari-gp implementation of LLL can tell it
spends a little while warming up and then computes the recursion

s_{n+5} = 5s_{n+4}-10s_{n+3}+11s_{n+2}-6s_{n+1}+s_n

whose values can be found really quite efficiently by matrix
exponentiation.

Tom

Göran Weinholt

unread,
May 27, 2012, 2:39:10 PM5/27/12
to
On 26 Maj, 07:07, "Brent C" <x87...@nospicedham.gmail.com> wrote:
> wow.. that one got up to k=39...
> too bad I can't understand enough of it to make an assembly language version
> of this one

I should have mentioned this explicitly: it goes a lot higher than
k=39: A(k=10000)=-19928164276191803466..[lots of numbers]..96034899
This was computed in milliseconds.

Let me show you the code in a different language (unfortunately
neither assembler nor PL/I). The C code below is like my R6RS Scheme
code, except it does neither memoization nor bignums, so it will be
slower and will give the wrong result at higher k values. Call
A(31,1,-1,-1,1,0) to compute A(k=31). Hopefully this is easier to
read. The algorithm is, as previously noted, due to Knuth.

int c1(k) {
switch (k) {
case 0: return 0;
case 1: return 0;
case 2: return 0;
case 3: return 1;
case 4: return 2;
case 5: return 3;
default: return c1(k - 1) + c2(k - 1);
}
}

int c2(k) {
switch (k) {
case 0: return 0;
case 1: return 0;
case 2: return 1;
case 3: return 1;
case 4: return 1;
case 5: return 2;
default: return c2(k - 1) + c3(k - 1);
}
}

int c3(k) {
switch (k) {
case 0: return 0;
case 1: return 1;
case 2: return 1;
case 3: return 0;
case 4: return 0;
case 5: return 1;
default: return c3(k - 1) + c4(k);
}
}

int c4(k) {
switch (k) {
case 0: return 1;
case 1: return 1;
case 2: return 0;
case 3: return 0;
case 4: return 0;
case 5: return 0;
default: return c1(k - 1) + c4(k - 1) - 1;
}
}

int c5(k) {
switch (k) {
case 0: return 0;
default: return 1;
}
}

int A(k, x1, x2, x3, x4, x5) {
return c1(k) * x1 + c2(k) * x2 + c3(k) * x3 + c4(k) * x4 + c5(k) *
x5;
}

Terje Mathisen

unread,
May 28, 2012, 8:08:40 AM5/28/12
to
Göran Weinholt wrote:
> I should have mentioned this explicitly: it goes a lot higher than
> k=39: A(k=10000)=-19928164276191803466..[lots of numbers]..96034899
> This was computed in milliseconds.

My perl version takes a few seconds, but I got the same results for
k=10K. :-)
>
> Let me show you the code in a different language (unfortunately
> neither assembler nor PL/I). The C code below is like my R6RS Scheme
> code, except it does neither memoization nor bignums, so it will be
> slower and will give the wrong result at higher k values. Call
> A(31,1,-1,-1,1,0) to compute A(k=31). Hopefully this is easier to
> read. The algorithm is, as previously noted, due to Knuth.

Thanks for the code, it was trivial to translate into a memoizing bignum
perl version:

#!perl -w

use strict;
use bignum;

my @c1 = (0,0,0,1,2,3);
sub c1
{
my ($k) = @_;
$c1[$k] = c1($k - 1) + c2($k - 1) unless (defined($c1[$k]));
return $c1[$k];
}

my @c2 = (0,0,1,1,1,2);
sub c2
{
my ($k) = @_;
$c2[$k] = c2($k - 1) + c3($k - 1) unless (defined($c2[$k]));
return $c2[$k];
}

my @c3 = (0,1,1,0,0,1);
sub c3
{
my ($k) = @_;
$c3[$k] = c3($k - 1) + c4($k) unless (defined($c3[$k]));
return $c3[$k];
}

my @c4 = (1,1,0,0,0,0);
sub c4
{
my ($k) = @_;
$c4[$k] = c1($k - 1) + c4($k-1) - 1 unless (defined($c4[$k]));
return $c4[$k];
}

sub c5
{
my ($k) = @_;
return $k ? 1 : 0;
}

sub A
{
my ($k, $x1, $x2, $x3, $x4, $x5) = @_;
return c1($k) * $x1 + c2($k) * $x2 + c3($k) * $x3 +
c4($k) * $x4 + c5($k) * $x5;
}

my $max = shift;
if (defined($max)) {
my $mob = 1;
# Initialize the memoization tables by calls to every 20th value,
# this limits the recursion depth:
for (my $i = 0; $i < $max; $i += 20) {
$mob = A($i,1,-1,-1,1,0);
}
printf("%3d %40s\n", $max, A($max,1,-1,-1,1,0));
}
else { # No input parameter, so print all the values up to 100:
for (my $i = 0; $i <= 100; $i++) {
printf("%3d %40s\n", $i, A($i,1,-1,-1,1,0));
0 new messages