Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Brainfuck in ANS Forth

389 views
Skip to first unread message

Mark Wills

unread,
May 1, 2014, 11:31:38 AM5/1/14
to
Just for fun...

> Increment the pointer.
< Decrement the pointer.
+ Increment the byte at the pointer.
- Decrement the byte at the pointer.
. Output the byte at the pointer.
, Input a byte and store it in the byte at the pointer.
[ Jump forward past the matching ] if the byte at the pointer is zero.
] Jump backward to the matching [ unless the byte at the pointer is zero.



create array
8000 chars allot
variable pointer
array pointer !

: reset ( -- )
\ zero bf memory and reset pointer
array 8000 0 fill
array pointer ! ;

: peek ( -- )
pointer @ c@ ;

: poke ( -- )
pointer @ c! ;

: > ( -- )
\ increment the pointer
pointer @ 1+ pointer ! ;

: < ( -- )
\ decrement the pointer
pointer @ 1- pointer ! ;

: + ( -- )
\ increment the byte at the pointer
peek 1+ poke ;

: - ( -- )
\ decrement the byte at the pointer
peek 1- poke ;

: . ( -- )
\ output the byte at the pointer
peek . ;

: , ( -- )
\ input a byte and store it in the byte at the pointer
pad 3 expect pad 3 number drop poke ;

: [ \ ( -- )
\ jump forward past the matching ] if the byte at the pointer is zero
[compile] begin compile peek [compile] while ; immediate

: ] ( -- )
\ jump backward to the matching [ unless the byte at the pointer is zero
[compile] repeat ; immediate

: hw
+ + + + + + + + [ > + + + + [ > + + > + + + > + + + > + < < < < - ] > +
> + > - > > + [ < ] < - ] > > . > - - - . + + + + + + + . . + + + . > >
. < - . < . + + + . - - - - - - . - - - - - - - - . > > + . > + + . ;


Test:

hw
72 101 108 108 111 32 87 111 114 108 100 33 10

Note: BrainFuck can't emit characters, only numbers. So the Hello World! example above (taken from WikiPedia) outputs the ASCII *values* of Hello World! complete with a carriage return.

http://en.wikipedia.org/wiki/Brainfuck

Alex McDonald

unread,
May 1, 2014, 1:54:42 PM5/1/14
to
on 01/05/2014 16:31:35, Mark Wills wrote:
[snip]

> \ jump backward to the matching [ unless the byte at the pointer is
> zero [compile] repeat ; immediate


Question; why [compile] and not postpone? Which Forth is this?

Rod Pemberton

unread,
May 1, 2014, 4:08:22 PM5/1/14
to
On Thu, 01 May 2014 11:31:38 -0400, Mark Wills <markwi...@gmail.com>
wrote:

> Just for fun...
>
>> Increment the pointer.
> < Decrement the pointer.
> + Increment the byte at the pointer.
> - Decrement the byte at the pointer.
> . Output the byte at the pointer.
> , Input a byte and store it in the byte at the pointer.
> [ Jump forward past the matching ] if the byte at the pointer is zero.
> ] Jump backward to the matching [ unless the byte at the pointer is
> zero.
>
>
>
> create array
> 8000 chars allot

The array is usually 32KB.

> variable pointer
> array pointer !
>
> : reset ( -- )
> \ zero bf memory and reset pointer
> array 8000 0 fill
> array pointer ! ;

0 fill. Good. Officially, BF doesn't need that. But, I found that
some BF programs wont' work without it, and all seem to work with it.

> : peek ( -- )
> pointer @ c@ ;
> : poke ( -- )
> pointer @ c! ;
...

> : > ( -- )
> \ increment the pointer
> pointer @ 1+ pointer ! ;

How common is +! ? It seems I picked it up from fig-Forth and one
critical word needs it.

> : < ( -- )
> \ decrement the pointer
> pointer @ 1- pointer ! ;
>
...

> : + ( -- )
> \ increment the byte at the pointer
> peek 1+ poke ;
> : - ( -- )
> \ decrement the byte at the pointer
> peek 1- poke ;

If you're using peek and poke, why not words for
'pointer @' and 'pointer !' ...

> : . ( -- )
> \ output the byte at the pointer
> peek . ;

. or EMIT ? Shouldn't it be EMIT ? It's a value for a character.

> : , ( -- )
> \ input a byte and store it in the byte at the pointer
> pad 3 expect pad 3 number drop poke ;

Um, did ANS Forth remove KEY ?

> : [ \ ( -- )
> \ jump forward past the matching ] if the byte at the pointer is zero
> [compile] begin compile peek [compile] while ; immediate
> : ] ( -- )
> \ jump backward to the matching [ unless the byte at the pointer is
> zero
> [compile] repeat ; immediate

...

> : hw
> + + + + + + + + [ > + + + + [ > + + > + + + > + + + > + < < < < - ]
> > +
> > + > - > > + [ < ] < - ] > > . > - - - . + + + + + + + . . + + + .
> > >
> . < - . < . + + + . - - - - - - . - - - - - - - - . > > + . > + + . ;
>

Ha! Each one has a space after them for Forth space parsing. And, it's
a colon-def. Awkward.

> Test:
>
> hw
> 72 101 108 108 111 32 87 111 114 108 100 33 10
>
> Note: BrainFuck can't emit characters, only numbers. So the Hello World!
> example above (taken from WikiPedia) outputs the ASCII *values* of Hello
> World! complete with a carriage return.
>
> http://en.wikipedia.org/wiki/Brainfuck

No. You've misunderstood something.

It's supposed to emit ASCII text "Hello World!" with linefeed (ASCII LF
0x0A).
The . in BF emits the ASCII character for the value. If you look at the C
equivalent for . and , on the Wikipedia page, they use getchar() and
putchar()
in C which fetches a char and emits a char, respectively.


Rod Pemberton

Mark Wills

unread,
May 1, 2014, 6:20:20 PM5/1/14
to
Ah. Sorry. It looks like I posted the wrong version. I have a version that uses postpone. Will post it in the morning.

Mark Wills

unread,
May 1, 2014, 6:22:47 PM5/1/14
to
Ah okay. That's my misunderstanding. Thanks for the correction. I'll post a tidier version on the morning. BF is good fun but boy is it a BF!

Elizabeth D. Rather

unread,
May 1, 2014, 6:31:10 PM5/1/14
to
On 5/1/14 10:08 AM, Rod Pemberton wrote:
> On Thu, 01 May 2014 11:31:38 -0400, Mark Wills <markwi...@gmail.com>
> wrote:
>
...
>> : > ( -- )
>> \ increment the pointer
>> pointer @ 1+ pointer ! ;
>
> How common is +! ? It seems I picked it up from fig-Forth and one
> critical word needs it.

It's in the CORE wordset for ANS Forth and Forth 200x.

>> : < ( -- )
>> \ decrement the pointer
>> pointer @ 1- pointer ! ;


or -1 pointer +!


> ...
>
>> : + ( -- )
>> \ increment the byte at the pointer
>> peek 1+ poke ;
>> : - ( -- )
>> \ decrement the byte at the pointer
>> peek 1- poke ;
>
> If you're using peek and poke, why not words for
> 'pointer @' and 'pointer !' ...

That's presumably what peek and poke are.

>> : . ( -- )
>> \ output the byte at the pointer
>> peek . ;
>
> . or EMIT ? Shouldn't it be EMIT ? It's a value for a character.

Unless outputting the string of numbers is the fun part, which OP seems
to imply.

>> : , ( -- )
>> \ input a byte and store it in the byte at the pointer
>> pad 3 expect pad 3 number drop poke ;
>
> Um, did ANS Forth remove KEY ?

Certainly not!

...
>>
>> Note: BrainFuck can't emit characters, only numbers. So the Hello World!
>> example above (taken from WikiPedia) outputs the ASCII *values* of Hello
>> World! complete with a carriage return.
>>
>> http://en.wikipedia.org/wiki/Brainfuck
>
> No. You've misunderstood something.
>
> It's supposed to emit ASCII text "Hello World!" with linefeed (ASCII LF
> 0x0A).
> The . in BF emits the ASCII character for the value. If you look at the C
> equivalent for . and , on the Wikipedia page, they use getchar() and
> putchar()
> in C which fetches a char and emits a char, respectively.
>

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Mark Wills

unread,
May 2, 2014, 4:09:04 AM5/2/14
to
Here's a tidied up version.

32000 constant mem_size
create array
mem_size chars allot
variable pointer
array pointer !

: reset ( -- )
\ zero bf memory and reset pointer
array mem_size 0 fill
array pointer ! ;

: peek ( -- n )
pointer @ c@ ;

: poke ( n -- )
pointer @ c! ;

: > ( -- )
\ increment the pointer
1 pointer +! ;

: < ( -- )
\ decrement the pointer
-1 pointer +! ;

: + ( -- )
\ increment the byte at the pointer
peek 1+ poke ;

: - ( -- )
\ decrement the byte at the pointer
peek 1- poke ;

: . ( -- )
\ output the byte at the pointer
peek emit ;

: , ( -- )
\ input a byte and store it in the byte at the pointer
pad 3 accept 0. pad 3 >number 2drop drop poke drop ;

: [ \ ( -- )
\ jump forward past the matching ] if the byte at the pointer is zero
postpone begin postpone peek postpone while ; immediate

: ] ( -- )
\ jump backward to the matching [ unless the byte at the pointer is zero
postpone repeat ; immediate

: hw \ Hello World! in BrainFuck
reset
+ + + + + + + + [ > + + + + [ > + + > + + + > + + + > + < < < < - ] > +
> + > - > > + [ < ] < - ] > > . > - - - . + + + + + + + . . + + + . > >
. < - . < . + + + . - - - - - - . - - - - - - - - . > > + . > + + . ;

Regarding Rod's comment of the BF code requiring spaces - yes, you're right. I'm doing it "the Forth way"... I.e. the above code is not a BF interpreter (as a lot of implementations seem to be), it's a BF compiler. Of course, in Forth, it's stupidly easy to do - each "command" is an entry in the Forth dictionary. As far as Forth is concerned, BF is now Forth! Of course, the "down-side" is that you need spaces between "words" to satisfy the forth compiler. The up-side is that you get compiled BF code as an output :-)

Och... It's just a toy, but it was a fun 20-minute deviation from the daily grind ;-)

Hans Bezemer

unread,
May 2, 2014, 8:18:53 AM5/2/14
to
Mark Wills wrote:
> Regarding Rod's comment of the BF code requiring spaces - yes, you're
> right. I'm doing it "the Forth way"... I.e. the above code is not a BF
> interpreter (as a lot of implementations seem to be), it's a BF compiler.
> Of course, in Forth, it's stupidly easy to do - each "command" is an entry
> in the Forth dictionary. As far as Forth is concerned, BF is now Forth! Of
> course, the "down-side" is that you need spaces between "words" to satisfy
> the forth compiler. The up-side is that you get compiled BF code as an
> output :-)
Having done about five BF thingies, I can tell you from experience you can
do pretty neat things:
(1) Straight interpreter
(2) Compiler (to a kind of bytecode)
(3) Optimizing compiler (again, bytecode)
(4) Forth generator (generates 4tH code)
(5) Optimizing Forth generator

The point is that you can balance out < and > calls and + and - calls.
Instead of calling "+" 5 times (+++++) you can do +(5). And "+-" cancel
each other out.

The straight Forth generator creates something that is very similar to your
solution.

Hans Bezemer

Mark Wills

unread,
May 2, 2014, 12:06:52 PM5/2/14
to
Darn it Hans. I thought I was done with it. Now you've gone and started something ;-)

Mark Wills

unread,
May 2, 2014, 6:04:31 PM5/2/14
to
On Friday, 2 May 2014 17:06:52 UTC+1, Mark Wills wrote:
> Darn it Hans. I thought I was done with it. Now you've gone and started something ;-)

This is apparently a complete BF interpreter coded in C.

char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r;
r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c==
'+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(2,p,1),c-','||read(2,p,1));}

Wow!

Rod Pemberton

unread,
May 2, 2014, 10:37:01 PM5/2/14
to
On Fri, 02 May 2014 18:04:31 -0400, Mark Wills <markwi...@gmail.com>
wrote:
Ugh...

I tested every BF interpreter and compiler I could find a few years
back. Usually, the short BF interpreters have something which doesn't
work perfectly, i.e., boundary conditions, undefined behavior change, etc.
I found five that are really good.

What I did was write a BF to C converter in C which optimizes additions,
subtracts, pointer shifts. Everyone does this. But, I also eliminate
deadcode and optimize clears and setting of cells. It also manipulates
a pointer and value separately. It does this because some BF programs
can do many adjustments to a single cell, i.e., without moving the pointer.
This reduces pointer dereferences in C and allows the value to be placed
in a register. It does all that in a single pass, emitting C directly,
ANSI C, without allocating any memory. It just uses a handful of
variables.

I've been interested in three other optimizations to BF. One of them
basically requires a powerful expression match and rewriting capabilities,
e.g., program directed text editor. These had to do with cell compression
or compaction. I.e., if only the first and third cells are used, the code
would be rewritten to use the first and second.

The third was conversion to statically named cells in C, i.e., treatment
of each cell as a named variable in C. This only works if pointer movement
within loops is balanced. Unbalanced movement of the pointer within a loop
- ending on a different cell - means the next time through the loop, a
different set of cells is accessed. That, of course, means that if each
cell is named, a different set of names would be needed for each time
through
the loop. That's a problem in C (without using pointers). Unbalanced
loops
happen for any of the "search" sequences for BF.

I also did two trivial analyses of common BF sequences, to determine what
the main operations are. Because operators in BF can be repeated, I used
some compression to reduce to a basic form. I'd have to check my results
for the exact number. I'll just say there was about a half-dozen to a
dozen
primary sequences, i.e., not too many. So, 14 to 20 operations when you
take into account the original 8.

BF needs a few things like setting values and chars directly, strings,
larger
integers to be more useful. It would've been nice if BF had one more
operation,
to clear a cell. That would eliminate trivial loops allowing optimizers
to work on loops that do something. So, I use '~' tilde to do a cell clear
instead of '[-]' or '[+]'. I'd recommend not using the '[+]' to clear
since
you don't know how integer wrapping is handled for a specific interpreter.

Anyway, this is what my C program outputs for the "Hello World!" program.
The specific one you posted. There are many... Notice the initial setting
of value 'v' to 8 and the save value, adjust pointer, reload value
sequences,
and other optimizations.

Obviously, there are no optimizations to emit a complete or partial
strings ...
Obviously, the assembly for it is still longer and more complicated than it
should be, but that's mostly due to the language simplicity.

Compile BF "Hello World!" to C conversion using 'gcc -Wall -ansi -pedantic
-O2'.
It should compile clean and work with GCC (Linux, DJGPP, etc).


#include <stdio.h>
#define SZ 32768

int main(void)
{
unsigned char a[SZ],*p,v;

for(p=a;p<&a[SZ];*p=0,p++);
p=a;

v=8;
while(1)
{
if(!v)break;
*p=v;p++;v=*p;
v+=4;
while(1)
{
if(!v)break;
*p=v;p++;v=*p;
v+=2;
*p=v;p++;v=*p;
v+=3;
*p=v;p++;v=*p;
v+=3;
*p=v;p++;v=*p;
v++;
*p=v;p-=4;v=*p;
v--;
}
*p=v;p++;v=*p;
v++;
*p=v;p++;v=*p;
v++;
*p=v;p++;v=*p;
v--;
*p=v;p+=2;v=*p;
v++;
while(1)
{
if(!v)break;
*p=v;p--;v=*p;
}
*p=v;p--;v=*p;
v--;
}
*p=v;p+=2;v=*p;
putchar(v);
*p=v;p++;v=*p;
v-=3;
putchar(v);
v+=7;
putchar(v);
putchar(v);
v+=3;
putchar(v);
*p=v;p+=2;v=*p;
putchar(v);
*p=v;p--;v=*p;
v--;
putchar(v);
*p=v;p--;v=*p;
putchar(v);
v+=3;
putchar(v);
v-=6;
putchar(v);
v-=8;
putchar(v);
*p=v;p+=2;v=*p;
v++;
putchar(v);
*p=v;p++;v=*p;
v+=2;
putchar(v);

return(0);
}


Rod Pemberton

Hans Bezemer

unread,
May 3, 2014, 6:02:47 AM5/3/14
to
Rod Pemberton wrote:

This is a "bf2c" in BF. It does its job quite nicely:

---8<---
+++[>+++++<-]>>+<[>>++++>++>+++++>+++++>+>>+<++[++<]>---]

>++++.>>>.+++++.>------.<--.+++++++++.>+.+.<<<<---.[>]<<.<<<.-------.>++++.
<+++++.+.>-----.>+.<++++.>>++.>-----.

<<<-----.+++++.-------.<--.<<<.>>>.<<+.>------.-..--.+++.-----<++.<--[>+<-]
>>>>>--.--.<++++.>>-.<<<.>>>--.>.

<<<<-----.>----.++++++++.----<+.+++++++++>>--.+.++<<<<.[>]<.>>

,[>>+++[<+++++++>-]<[<[-[-<]]>>[>]<-]<[<+++++>-[<+++>-[<-->-[<+++>-
[<++++[>[->>]<[>>]<<-]>[<+++>-[<--->-[<++++>-[<+++[>[-[-[-[->>]]]]<[>>]<<-]
>[<+>-[<->-[<++>-[<[-]>-]]]]]]]]]]]]]

<[
-[-[>+<-]>]
<[<<<<.>+++.+.+++.-------.>---.++.<.>-.++<<<<.[>]>>>>>>>>>]
<[[<]>++.--[>]>>>>>>>>]
<[<<++..-->>>>>>]
<[<<..>>>>>]
<[<<..-.+>>>>]
<[<<++..---.+>>>]
<[<<<.>>.>>>>>]
<[<<<<-----.+++++>.----.+++.+>---.<<<-.[>]>]
<[<<<<.-----.>++++.<++.+++>----.>---.<<<.-[>]]
<[<<<<<----.>>.<<.+++++.>>>+.++>.>>]
<.>
]>
,]

<<<<<.<+.>++++.<----.>>---.<<<-.>>>+.>.>.[<]>++.[>]<.
>[Translates brainfuck to C. Assumes EOF->0 or no-change-on-EOF.
Generated C does no-change-on-EOF, and uses unistd.h read and write calls.
Daniel B. Cristofani (cris...@hevanet.com).]
---8<---

Those interested in BF should also take a look at AWIB:
https://code.google.com/p/awib/

Hans Bezemer

Rod Pemberton

unread,
May 3, 2014, 9:21:32 AM5/3/14
to
On Sat, 03 May 2014 06:02:47 -0400, Hans Bezemer
<the.bee...@gmail.com> wrote:
> Rod Pemberton wrote:

> This is a "bf2c" in BF. It does its job quite nicely:
>
> [...]
> Daniel B. Cristofani (cris...@hevanet.com).]
> ---8<---
>

Yes, he's good.

The four of five programs I use for testing BF are by him:

bcci.c by Daniel B. Cristofani
pbrain.c by Daniel B. Cristofani
qdb.c by Daniel B. Cristofani
sbi.c by Daniel B. Cristofani
bff4.c by Oleg Mazonka

> Those interested in BF should also take a look at AWIB:
> https://code.google.com/p/awib/
>

One of the most powerful BF compilers:
(in Python, emits BF code as C)

Esotope-bfc
http://code.google.com/p/esotope-bfc/

Be sure to click the "Optimization" and
"Comparison" links too.


Rod Pemberton

Assad Ebrahim

unread,
May 3, 2014, 5:59:40 PM5/3/14
to
On Fri, 2 May 2014 01:09:04 -0700 (PDT), Mark Wills
<markwi...@gmail.com> wrote:

>Here's a tidied up version.
>

Nice!

>
>: hw \ Hello World! in BrainFuck
> reset
> + + + + + + + + [ > + + + + [ > + + > + + + > + + + > + < < < < - ] > +
> > + > - > > + [ < ] < - ] > > . > - - - . + + + + + + + . . + + + . > >
> . < - . < . + + + . - - - - - - . - - - - - - - - . > > + . > + + . ;
>
>Regarding Rod's comment of the BF code requiring spaces - yes, you're right. I'm doing it "the Forth way"... I.e. the above code is not a BF interpreter (as a lot of implementations seem to be), it's a BF compiler. Of course, in Forth, it's stupidly easy to do - each "command" is an entry in the Forth dictionary. As far as Forth is concerned, BF is now Forth! Of course, the "down-side" is that you need spaces between "words" to satisfy the forth compiler. The up-side is that you get compiled BF code as an output :-)

Presumably one could get by this by essentially loading non-spaced BF
code into memory, then reading a byte (char) at a time, and finding
and executing the command?

- Assad

Wolfgang Allinger

unread,
May 3, 2014, 7:39:00 PM5/3/14
to

On 03 May 14 at group /comp/lang/forth in article dhpam9pia9dd9oodi...@4ax.com
<assad....@alum.swarthmore.edu> (Assad Ebrahim) wrote:

>Presumably one could get by this by essentially loading non-spaced BF
>code into memory, then reading a byte (char) at a time, and finding
>and executing the command?

Why not converting non-spaced-BF to FORTH style spaced-BF?

Prepare a 64kB buffer (BF is maximal 32kB or?)
read a byte from non-spaced-BF to the buffer and put a space to the next
byte until the non-spaced-BF is read.

Then you can go ahead with the existing FORTH implementation.



Saludos (an alle Vern�nftigen, Rest sh. sig)
Wolfgang

--
Wolfgang Allinger, anerkannter Trollallergiker :) reply Adresse gesetzt!
Ich diskutiere zuk�nftig weniger mit Idioten, denn sie ziehen mich auf
ihr Niveau herunter und schlagen mich dort mit ihrer Erfahrung! :p
(lt. alter usenet Weisheit) iPod, iPhone, iPad, iTunes, iRak, iDiot

Mark Wills

unread,
May 6, 2014, 11:14:17 AM5/6/14
to
Well, I bit the bullet and hacked together an optimising Brainfuck compiler. It reads text files of BF code, and compiles a Forth representation (optimising as it goes) directly to memory.

It's written for my Forth system (Forth 83). I'm not sure if it's possible to take the approach I have taken in an ANS system, especially with systems that generate machine code.

It reads a line of text from the file, and then starts processing. When instructions repeat (e.g. ++++) it counts the number of repetitions and only emits code for it (LIT 4 BF+) when a new instruction arrives - so a change in instruction causes the previous instruction to be emitted/flushed.

There's some 'look-back' type of optimisation going on with reference to a single instruction sequence: [-] this sequence decrements the current location to 0 and is obviously ripe for optimising. When it sees ] it looks back at the (compiled) code to see if we've just compiled [-] and if so it backs up HERE and overwrites the code with a call to a ~ instruction, which zeros the current location. I don't know if this technique is doable on a native compiling system? It's easy on an ITC, you can read back the code you just compiled and compare them to CFAs and identify a code-sequence. Not so easy when you compile Forth, but read back machine code, which is what happens on a native system. Some other system would have to be employed.

I think for a native code generating Forth system, the simplest approach for this type of thing is to write an interpreter, rather than a compiler. It then becomes very similar to an ITC system; you're compiling tokens to memory space, which will be read and looked up with a CASE...ENDCASE or similar.

Ah well, I think that ends my temporary interest in BF!

forget bf>
1 constant bf> \ Increment the pointer.
2 constant bf< \ Decrement the pointer.
3 constant bf+ \ Increment the byte at the pointer.
4 constant bf- \ Decrement the byte at the pointer.
5 constant bf. \ Output the byte at the pointer.
6 constant bf, \ Input a byte and store it in the byte at the pointer.
7 constant bf[ \ Jump forward past the matching ] if the byte at pointer=0
8 constant bf] \ Jump backward to the matching [

8000 constant bfArraySize

forget file-in
fbuf: file-in

variable bfLoops
variable bfPointer
variable bfOpCount
variable bfState
variable pad_addr

: pointer+ ( n -- )
\ optimised implementation of + and -.
\ Optimises multiple +'s into a single instruction that takes the amount
\ from the stack at runtime.
bfPointer +! ;

: pointer++ ( -- )
\ runtime implementation of +. add 1 to the bf array pointer
1 bfPointer +! ;

: pointer-- ( -- )
\ runtime implementation of -. subtract 1 from the bf array pointer
-1 bfPointer +! ;


: array+ ( n -- )
\ optimised implementation of < and >. Optimises multiple +'s into a single
\ instruction that takes the number amount from the stack at runtime.
\ add n to the array value pointed at by the bf array pointer
bfPointer @ dup c@ rot + swap c! ;

: array++ ( -- )
\ runtime implementation of >
\ add 1 to the array value pointed at by the bf array pointer
bfPointer @ dup c@ 1+ swap c! ;

: array-- ( -- )
\ runtime implementation of <
\ subtract 1 from the array value pointed at by the bf array pointer
bfPointer @ dup c@ 1- swap c! ;

: bf. ( -- )
\ take a byte from the array and display as ascii character
bfPointer @ c@ dup 10 = if cr drop else emit then ;

: bfIf ( -- flag )
\ return the byte at the array. used for [ (looping) implementation
bfPointer @ c@ ;

: bf~ ( -- )
\ zero the byte at pointed to by the bf array pointer
0 bfPointer @ c! ;


: flushCode ( -- )
bfState @ case
bf> of
bfOpCount @ dup 1 = if
compile pointer++ drop
else
compile lit , compile pointer+
then
endof

bf< of
bfOpCount @ dup -1 = if
compile pointer-- drop
else
compile lit , compile pointer+
then
endof

bf+ of
bfOpCount @ dup 1 = if
compile array++ drop
else
compile lit , compile array+
then
endof

bf- of
bfOpCount @ dup -1 = if
compile array-- drop
else
compile lit , compile array+
then
endof
endcase

bfOpCount 0!
bfState 0! ;

: bfOpCount++ ( -- )
1 bfOpCount +! ;

: bfOpCount-- ( -- )
-1 bfOpCount +! ;

: flushCode? ( opcode -- )
bfState @ = not if flushCode then ;


: do> ( -- )
bf> flushCode? bf> bfState ! bfOpCount++ ;

: do< ( -- )
bf< flushCode? bf< bfState ! bfOpCount-- ;

: do+ ( -- )
bf+ flushCode? bf+ bfState ! bfOpCount++ ;

: do- ( -- )
bf- flushCode? bf- bfState ! bfOpCount-- ;

: do. ( -- )
flushCode compile bf. ;

: do, ( -- )
." , " ;

: do[ ( -- )
flushCode 1 bfLoops +!
[compile] begin compile bfIf [compile] while ;


: do] ( -- )
flushCode -1 bfLoops +!
[compile] repeat
\ check if optimisation is possible:
\ look for [-] (which sets an array element to 0 by repeatedly decrementing
\ to zero. If detected, replace the sequence with a ~ command (which just
\ sets the element pointed at by the pointer to 0).
\ First, see if previous instruction compiled was an array-- instruction:
here 3 cells - @ ['] array-- = if
\ yes, it was. now look back to see if there's a bfIf instruction
\ prior to it. The code in memory would look like this:
\ bfIf 0BRANCH xxxx array-- BRANCH xxxx <here>
here 6 cells - @ ['] bfIf = if
\ yes, bfIf detected. We can optimise this sequence away into
\ a single ~ instruction
here 6 cells - h ! \ set Forth compiler compilation address
compile bf~ \ compile a ~ instruction
." optimised [-] to ~ "
then
then ;


: compile-bf ( addr -- )
\ compiles a line of bf code
count dup if ( not a blank line)
swap pad_addr !
0 do
pad_addr @ i + c@ case
ascii > of do> endof
ascii < of do< endof
ascii + of do+ endof
ascii - of do- endof
ascii . of do. endof
ascii , of do, endof
ascii [ of do[ endof
ascii ] of do] endof
endcase
loop
else ( blank line) 2drop then ;


: bf ( "filespec" -- )
\ open, interpret and compile bf program
s" dsk1.1234567890.txt DV80SI"
>r >r r@ 19 32 fill
r@ swap cmove
r> r> file-in file
cr file-in #open abort" Can't open file"

bfOpCount 0! bfLoops 0! page
$8320 , \ compile DOCOL
begin
file-in #eof? not while
pad file-in #get if
." Can't read from file" file-in #close
r> h ! abort
then
pad compile-bf
cr ." DEPTH=" depth . ." HERE=" HERE $.
repeat flushCode \ write any pending opcodes
$832C , \ compile EXIT
file-in #close
cr bfLoops @ if
." Un-closed loops detected:" bfLoops @ .
else
." Compiled successfully."
then cr ;


: run ( -- )
\ executes the bf program
page \ clear screen
ffailm @ bfPointer ! \ set address of bf array
bfPointer @ bfArraySize 0 fill \ initialise the bf array
$2000 execute \ run the compiled bf code
;

marker erase-bf

s" DSK1.bf-test2.txt" bf

Rod Pemberton

unread,
May 6, 2014, 4:53:36 PM5/6/14
to
On Tue, 06 May 2014 11:14:17 -0400, Mark Wills <markwi...@gmail.com>
wrote:

> Ah well, I think that ends my temporary interest in BF!
>

Yeah, that's what I thought five years ago. Then, I kept noticing
the simplicity, the potential, the similarity to Forth, etc.
So, it gets put aside for a while, but then I return to it ...


RP

Gerry Jackson

unread,
May 22, 2014, 5:13:54 PM5/22/14
to
On 06/05/2014 16:14, Mark Wills wrote:
> Well, I bit the bullet and hacked together an optimising Brainfuck compiler.. It reads text files of BF code, and compiles a Forth representation (optimising as it goes) directly to memory.
>
> It's written for my Forth system (Forth 83). I'm not sure if it's possible to take the approach I have taken in an ANS system, especially with systems that generate machine code.
>
> It reads a line of text from the file, and then starts processing. When instructions repeat (e.g. ++++) it counts the number of repetitions and only emits code for it (LIT 4 BF+) when a new instruction arrives - so a change in instruction causes the previous instruction to be emitted/flushed.
>
> There's some 'look-back' type of optimisation going on with reference to a single instruction sequence: [-] this sequence decrements the current location to 0 and is obviously ripe for optimising. When it sees ] it looks back at the (compiled) code to see if we've just compiled [-] and if so it backs up HERE and overwrites the code with a call to a ~ instruction, which zeros the current location. I don't know if this technique is doable on a native compiling system? It's easy on an ITC, you can read back the code you just compiled and compare them to CFAs and identify a code-sequence. Not so easy when you compile Forth, but read back machine code, which is what happens on a native system. Some other system would have to be employed.
>
> I think for a native code generating Forth system, the simplest approach for this type of thing is to write an interpreter, rather than a compiler. It then becomes very similar to an ITC system; you're compiling tokens to memory space, which will be read and looked up with a CASE...ENDCASE or similar.
>
> Ah well, I think that ends my temporary interest in BF!
>
[...]

Perhaps so but here's a non-optimising BF compiler in ANS Forth which is
a bit more compact. I'll show how to implement an optimiser after the
code. Incidentally this follows the desciption in the Wikipedia article
and ignores any character in the BF program that isn't one of the 8
commands. Hence the 2 versions of Hello World below, both of whch work.

A BF program is contained within bf[ and ]
The final ] is unmatched and therefore illegal in BF and is used to
terminate compilation.

32000 constant bf-mem_size
create bf-mem bf-mem_size chars allot
variable pointer
variable #nested \ Count the number of nested [ ] pairs
variable bf-end \ Flag to indicate end of BF program

: reset ( -- ) \ zero bf memory and reset pointer
bf-mem bf-mem_size 0 fill
bf-mem pointer ! 0 #nested ! 0 bf-end !
;

: refill? ( caddr u -- caddr u | caddr' u' )
begin
dup 0=
while
2drop refill 0= abort" Early end of file"
source
repeat
;

wordlist constant commands \ For definition of the BF commands

: next-cmd ( caddr u -- caddr' u' xt )
begin
refill? 1 /string
over 1 chars - 1 commands search-wordlist
until
;

: peek ( -- n ) pointer @ c@ ;
: poke ( n -- ) pointer @ c! ;

: (>) ( -- ) 1 pointer +! ; \ increment the pointer
: (<) ( -- ) -1 pointer +! ; \ decrement the pointer
: (+) ( -- ) peek 1+ poke ; \ increment the byte at the pointer
: (-) ( -- ) peek 1- poke ; \ decrement the byte at the pointer
: (.) ( -- ) peek emit ; \ output the byte at the pointer
: (,) ( -- ) key poke ; \ input a byte, store at the pointer

\ The BF commands in a separate wordlist
get-current commands set-current

: > ( -- ) postpone (>) ;
: < ( -- ) postpone (<) ;
: + ( -- ) postpone (+) ;
: - ( -- ) postpone (-) ;
: . ( -- ) postpone (.) ;
: , ( -- ) postpone (,) ;

: [ ( caddr u ) \ jump past the matching ] if the byte at the pointer = 0
1 #nested +!
2>r postpone begin postpone peek postpone while 2r>
;

\ An unmatched ] indicates the end of the BF program
: ] ( caddr u ) \ jump backward to the matching [ or end of program
#nested @
if -1 #nested +! 2>r postpone repeat 2r> else -1 bf-end ! then
;

set-current

: bf[ ( -- ) \ Start of BF program
reset :noname ( -- xt )
source >in @ /string ( -- xt caddr u )
begin
next-cmd execute bf-end @
until
source rot - >in ! 2drop \ Update >in for the Forth interpreter
postpone ; cr execute
;

bf[ \ Hello world compressed
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
]

bf[ \ Hello World with comments
+++++ +++ Set Cell #0 to 8
[
>++++ Add 4 to Cell #1; this will always set Cell
#1 to 4
[ as the cell will be cleared by the loop
>++ Add 4*2 to Cell #2
>+++ Add 4*3 to Cell #3
>+++ Add 4*3 to Cell #4
>+ Add 4 to Cell #5
<<<<- Decrement the loop counter in Cell #1
] Loop till Cell #1 is zero
>+ Add 1 to Cell #2
>+ Add 1 to Cell #3
>- Subtract 1 from Cell #4
>>+ Add 1 to Cell #6
[<] Move back to the first zero cell you find;
this will
be Cell #1 which was cleared by the
previous loop
<- Decrement the loop Counter in Cell #0
] Loop till Cell #0 is zero

The result of this is:
Cell No : 0 1 2 3 4 5 6
Contents: 0 0 72 104 88 32 8
Pointer : ^

>>. Cell #2 has value 72 which is 'H'
>---. Subtract 3 from Cell #3 to get 101 which is 'e'
+++++ ++..+++. Likewise for 'llo' from Cell #3
>>. Cell #5 is 32 for the space
<-. Subtract 1 from Cell #4 for 87 to give a 'W'
<. Cell #3 was set to 'o' from the end of 'Hello'
+++.----- -.----- ---. Cell #3 for 'rl' and 'd'
>>+. Add 1 to Cell #5 gives us an exclamation point
>++. And finally a newline from Cell #6
]

\ --- End of Code -----

You've said elsewhere you can't see the point of quotations, well the
definition of is:

: + ( -- ) postpone (+) ;
where (+) is defined earlier. If we use a quotation we can write

: + ( -- ) [: peek 1+ poke ;] compile, ;

which looks a lot clearer to me.

Optimisation can be added in this way e.g. to optimise a sequence of
+'s, add this code in the appropriate places

: n+ ( n -- ) peek + poke ; \ Optimised sequence of +'s

0 value +xt
\ Replace the original definition of + with
: + ( caddr u -- caddr' u' ) \ Optimises sequence of +'s
0 >r
begin
next-cmd dup +xt =
while
drop r> 1+ >r
repeat
r> ?dup if 1+ postpone literal postpone n+ else postpone (+) then
execute
;

get-order commands swap 1+ set-order
' + to +xt
previous

--
Gerry

Coos Haak

unread,
May 23, 2014, 9:27:09 AM5/23/14
to
Op Thu, 22 May 2014 22:13:54 +0100 schreef Gerry Jackson:

<his brainfuck compiler>

My version
---
32000 constant bf-mem_size
create bf-mem bf-mem_size allot
variable pointer
variable #nested \ Count the number of nested [ ] pairs
variable bf-end \ Flag to indicate end of BF program

: reset ( -- ) \ zero bf memory and reset pointer
bf-mem bf-mem_size erase
bf-mem pointer ! 0 #nested ! 0 bf-end !
;

\ No need for a string on the stack to keep the state.
: ?refresh ( -- )
begin source nip >in @ = \ Line exhausted?
while refill 0= abort" Early end of file"
repeat
;

: next-token ( -- c-addr u ) \ String of one character
?refresh
source drop >in @ + \ point to character
1 \ length of string
1 >in +! \ bup pointer
;

wordlist constant commands \ For definition of the BF commands

: next-cmd ( -- xt flag )
begin next-token commands search-wordlist ?dup
until
;

: peek ( -- n ) pointer @ c@ ;
: poke ( n -- ) pointer @ c! ;

\ The BF commands in a separate wordlist
get-current commands set-current

\ You may freely use names here that are used in Forth,
\ Its only effect is that the compiler may warn for duplicate definitions.

\ 'normal' bf words are not immediate
: > ( -- ) 1 pointer +! ; \ increment the pointer
: < ( -- ) -1 pointer +! ; \ decrement the pointer
: + ( -- ) peek 1+ poke ; \ increment the byte at the pointer
: - ( -- ) peek 1- poke ; \ decrement the byte at the pointer
: . ( -- ) peek emit ; \ output the byte at the pointer
: , ( -- ) key poke ; \ input a byte, store at the pointer
\ Speed up compilation ;-)
: \ ( -- ) postpone \ ; immediate

\ 'special' bf words are immediate
: [ ( -- ) \ jump past the matching ] if the byte at the pointer = 0
1 #nested +!
postpone begin postpone peek postpone while
; immediate

\ An unmatched ] indicates the end of the BF program
: ] ( -- ) \ jump backward to the matching [ or end of program
#nested @
if -1 #nested +! postpone repeat else -1 bf-end ! then
; immediate

set-current

\ A nice feature of search-wordlist that is next to the xt gives a number,
\ 1 for immediate and -1 for normal words, you can use this here:
: bf[ ( -- ) \ Start of BF program
reset :noname ( -- xt )
begin
next-cmd 0<
if compile,
else execute
then
bf-end @
until
postpone ; cr execute
;

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html

Gerry Jackson

unread,
May 26, 2014, 4:38:18 PM5/26/14
to
But \ is not a comment in a BF program, so it doesn't hide any BF
commands on the same line, it should be ignored

>
> \ 'special' bf words are immediate
> : [ ( -- ) \ jump past the matching ] if the byte at the pointer = 0
> 1 #nested +!
> postpone begin postpone peek postpone while
> ; immediate
>
> \ An unmatched ] indicates the end of the BF program
> : ] ( -- ) \ jump backward to the matching [ or end of program
> #nested @
> if -1 #nested +! postpone repeat else -1 bf-end ! then
> ; immediate
>
> set-current
>
> \ A nice feature of search-wordlist that is next to the xt gives a number,
> \ 1 for immediate and -1 for normal words, you can use this here:
> : bf[ ( -- ) \ Start of BF program
> reset :noname ( -- xt )
> begin
> next-cmd 0<
> if compile,
> else execute
> then
> bf-end @
> until
> postpone ; cr execute
> ;
>

Yes quite a bit neater, thanks. If optimisation is wanted then more of
the BF commands become immediate of course.

--
Gerry

Ian Osgood

unread,
Jun 16, 2014, 6:35:02 PM6/16/14
to
The Rosetta Code wiki also has a Brainfuck task, for which I contributed a simple bf-to-forth compiler.

http://rosettacode.org/wiki/Execute_Brain****/Forth

Ian
0 new messages