Wimps don't read this message (long)

87 views
Skip to first unread message

Louis Howell

unread,
Feb 15, 1991, 11:18:19 PM2/15/91
to

Well, people, I've come to a conclusion that many of you may find
difficult to accept. Nevertheless, I think it's true. INTERCAL is
really a pretty sissy language. It tries hard to be different, but
when you get right down to its roots, what do you find? You find bits,
that's what. Plain old ones and zeroes, in groups of 16 and 32, just
like every other language you've ever heard of. And what operations
can you perform on these bits? The INTERCAL operators may arrange and
permute them in weird and wonderful ways, but at the bit level the
operators are the same AND, OR and XOR you've seen countless times
before.

Once the prospective INTERCAL programmer masters the unusual syntax,
she finds herself working with the familiar Boolean operators on
perfectly ordinary unsigned integer words. Even the constants she uses
are familiar. After all, who would not immediately recognize #65535
and #32768? It may take a just a moment more to figure out #65280,
and #21845 and #43690 could be puzzles until she notices that they
sum to #65535, but basically she's still on her home turf. The 16-bit
limit on constants actually works in the programmer's favor by insuring
that very long anonymous constants can not appear in INTERCAL programs.
And this is in a language that is supposed to be different from any
other!

It is in the interest of overcoming these shortcomings that I now
introduce, not a change to INTERCAL itself, but rather a new dialect
sharing most of the features of the old language. Presenting...


The TriINTERCAL Supplemental Programmer's Manual

1. ABANDON ALL HOPE...

Standard INTERCAL is based on variables consisting of ordinary bits
and familiar Boolean operations on those bits. In pursuit of uniqueness,
it seems appropriate to provide a new dialect, otherwise identical to
INTERCAL, which instead uses variables consisting of trits, i.e. ternary
digits, and operators based on tritwise logical operations. This is
intended to be a separate dialect, rather than an extension to INTERCAL
itself, for a number of reasons. Doing it this way avoids word-length
conflicts, does not spoil the elegance of the Spartan INTERCAL operator
set, and dodges the objections of those who might feel it too great an
alteration to the original language. Primarily, though, giving INTERCAL
programmers the ability to switch numeric base at will amounts to
excessive functionality. So much better that a programmer choose a base
at the outset and then be forced to stick with it for the remainder of
the program.

2. COMPILER OPERATION

The same compiler, ick, supports both INTERCAL and TriINTERCAL.
This has the advantage that future bug fixes and additions to the
language not related to arithmetic immediately apply to both versions.
The compiler recognizes INTERCAL source files by the extension '.i',
and TriINTERCAL source files by the extension '.3i'. It's as simple
as that. There is no way to mix INTERCAL and TriINTERCAL source in
the same program, and it is not always possible to determine which
dialect a program is written in just by looking at the source code.

3. DATA TYPES

The two TriINTERCAL data types are 10-trit unsigned integers and
20-trit unsigned integers. All INTERCAL syntax for distinguishing
data types is ported to these new types in the obvious way. Small
words may contain numbers from #0 to #59048, large words may contain
numbers from #0$#0 to #59048$#59048. Errors are signaled for constants
greater than #59048 and for attempts to WRITE IN numbers too large
for a given variable or array element to hold.
Note that though TriINTERCAL considers all numbers to be unsigned,
nothing prevents the programmer from implementing arithmetic operations
that treat their operands as signed. Three's complement is one obvious
choice, but balanced ternary notation is also a possibility. This
latter is a very pretty and symmetrical system in which all 2 trits
are treated as if they had the value -1.

4. OPERATORS

The binary operators carry over from the original language with only
minor changes. The MINGLE operator ($) creates a 20-trit word by
alternating trits from its two 10-trit operands. The SELECT operator (~)
is a little more complicated, since the ternary tritmask may contain 0, 1,
and 2 trits. If we observe that the SELECT operation on binary operands
amounts to a bitwise AND and some rearrangement of bits, it seems
appropriate to base the SELECT for ternary operands on a tritwise AND in
the analogous fashion. We therefore postpone the definition of SELECT
until we know what a tritwise AND looks like.

The unary operators in INTERCAL are all derived from the familiar
Boolean operations on single bits. To extend these operations to trits,
we first ask ourselves what the important properties of these operations
are that we wish to be preserved, then design the tritwise operators so
that they behave in a similar fashion.

Let's start with AND and OR. To begin with, these can be considered
"choice" or "preference" operators, as they always return one of their
operands. AND can be described as wanting to return 0, but returning 1
if it is given no other choice, i.e., if both operands are 1. Similarly,
OR wants to return 1 but returns 0 if that is its only choice. From
this it is immediately apparent that each operator has an identity
element that "always loses", and a dominator element that "always wins".
AND and OR are commutative and associative, and each distributes
over the other. They are also symmetric with each other, in the sense
that AND looks like OR and OR looks like AND when the roles of 0 and 1
are interchanged (De Morgan's Laws). This symmetry property seems to be
a key element to the idea that these are logical, rather than arithmetic,
operators. In a three-valued logic we would similarly expect a three-
way symmetry among the three values 0, 1 and 2 and the three operators
AND, OR and (of course) BUT.
The following tritwise operations have all the desired properties:
OR returns the greatest of its two operands. That is, it returns 2 if
it can get it, else it tries to return 1, and it returns 0 only if both
operands are 0. AND wants to return 0, will return 2 if it can't get
0, and returns 1 only if forced. BUT wants 1, will take 0, and tries
to avoid 2. The equivalents to De Morgan's Laws apply to rotations
of the three elements, e.g., 0 -> 1, 1 -> 2, 2 -> 0. Each operator
distributes over exactly one other operator, so the property
"X distributes over Y" is not transitive. The question of which way
this distributivity ring goes around is left as an exercise for the
student.
In TriINTERCAL programs the '@' (whirlpool) symbol denotes the unary
tritwise BUT operation. You can think of the whirlpool as drawing
values preferentially towards the central value 1. Alternatively,
you can think of it as drawing your soul and your sanity inexorably
down...

On the other hand, maybe it's best you NOT think of it that way.

A few comments about how these operators can be used. OR acts like
a tritwise maximum operation. AND can be used with tritmasks. 0's
in a mask wipe out the corresponding elements in the other operand,
while 1's let the corresponding elements pass through unchanged. 2's
in a mask consolidate the values of nonzero elements, as both 1's and
2's in the other operand yield 2's in the output. BUT can be used to
create "partial tritmasks". 0's in a mask let BUT eliminate 2's from
the other operand while leaving other values unchanged. Of course,
the symmetry property guarantees that the operators don't really
behave differently from each other in any fundamental way; the apparent
differences come from the intuitive view that a 0 trit is "not set"
while a 1 or 2 trit is "set".

At this point we can define SELECT, since we now know what the
tritwise AND looks like. SELECT takes the binary tritwise AND of
its two operands. It shifts all the trits of the result corresponding
to 2's in the right operand over to the right (low) end of the result,
then follows them with all the output trits corresponding to 1's in
the right operand. Trits corresponding to 0's in the right operand,
which are all 0 anyway, occupy the remaining space at the left end of
the output word. Both 10-trit and 20-trit operands are accepted,
and are padded with zeroes on the left if necessary. The output
type is determined the same way as in standard INTERCAL.

Now that we've got all that settled, what about XOR? This is
easily the most-useful of the three unary INTERCAL operators,
because it combines in one package the operations ADD WITHOUT CARRY,
SUBTRACT WITHOUT BORROW, BITWISE NOT-EQUAL, and BITWISE NOT. In
TriINTERCAL we can't have all of these in the same operator, since
addition and subtraction are no longer the same thing. The solution
is to split the XOR concept into two operators. The ADD WITHOUT CARRY
operation is represented by the new '^' (sharkfin) symbol, while the
old '?' symbol represents SUBTRACT WITHOUT BORROW. The reason for
this choice is so that '?' will also represent the TRITWISE NOT-EQUAL
operation.
Note that '?', unlike the other four unary operators, is not
symmetrical. It should be thought of as rotating its operand one trit
to the right (with wraparound) and then subtracting off the trits of
the original number. These subtractions are done without borrowing,
i.e., trit-by-trit modulo 3.

5. EXAMPLES

That discussion of the TriINTERCAL operators has probably left you
all gasping for breath, but they really aren't all that bad once you
get used to them. Let's look at a few examples to show how they can
be used in practice. In all of these examples the input value is
contained in the 10-trit variable .3.
In INTERCAL, single-bit values often have to be converted from
{0,1} to {1,2} for use in RESUME statements. Examples of how to do
this appear in the original manual. In TriINTERCAL the expression
"^.3$#1"~#1 sends 0 -> 1 and 1 -> 2. If the 1-trit input value can
take on any of its three possible states, however, we will also have
to deal with the 2 case. The expression "V.3$#1"~#1 sends {0,1} -> 1
and 2 -> 2. To test if a bit is set, we can use "V'"&.3$#2"~#1'$#1"~#1,
sending 0 -> 1 and {1,2} -> 2. To reverse the test we use
"?'"&.3$#2"~#1'$#1"~#1, sending 0 -> 2 and {1,2} -> 1. Note that we
have not been taking full advantage of the new SELECT operator. These
last two expressions can be simplified into "V!3~#2'$#1"~#1 and
"?!3~#2'$#1"~#1, which perform exactly the same mappings. Finally, if
we need a 3-way test, we can use "@'"^.3$#7"~#4'$#2"~#10, which
obviously sends 0 -> 1, 1 -> 2, and 2 -> 3.
For an unrelated example, the expression "^.3$.3"~"#0$#29524"
converts all of the 1-trits of .3 into 2's and all of the 2-trits
into 1's. In balanced ternary, where 2-trits represent -1 values,
this is the negation operation.

6. BEYOND TERNARY...

While we're at it, we might as well extend this multiple bases
business a little farther. The ick compiler actually recognizes
filename suffixes of the form '.Ni', where N is any number from 2
to 7. 2 of course gives standard INTERCAL, while 3 gives TriINTERCAL.
We cut off before 8 because octal notation is the smallest base used
to facilitate human-to-machine communication, and this seems quite
contrary to the basic principles behind INTERCAL. The small data
types hold 16 bits, 10 trits, 8 quarts, 6 quints, 6 sexts, or 5 septs,
and the large types are always twice this size.
As for operators, '?' is always SUBTRACT WITHOUT BORROW, and '^'
is always ADD WITHOUT CARRY. 'V' is the OR operation and always
returns the max of its inputs. '&' is the AND operation, which chooses
0 if possible but otherwise returns the max of the inputs. '@' is BUT,
which prefers 1, then 0, then the max of the remaining possibilities.
Rather than add more special symbols forever, a numeric modifier may
be placed directly before the '@' symbol to indicate the operation
that prefers one of the digits not already represented. Thus in files
ending in '.5i', the permitted unary operators are '?', '^', '&', '@',
'2@', '3@', and 'V'. Use of such barbarisms as '0@' to represent '&'
are not permitted, nor is the use of '@' or '^' in files with either
of the extensions '.i' or '.2i'. Why not? You just can't, that's why.
Don't ask so many questions.

As a closing example, we note that in balanced quinary notation,
where 3 means -2 and 4 means -1, the negation operation can be written
as either

DO .1 <- "^'"^.3$.3"~"#0$#3906"'$'"^.3$.3"~"#0$#3906"'"~"#0$#3906"

or as

DO .1 <- "^.3$.3"~"#0$#3906"
DO .1 <- "^.1$.1"~"#0$#3906"

These work because multiplication by -1 is the same as multiplication
by 4, modulo 5.

Now go beat you head up against the wall for a while.

-----END OF TriINTERCAL SUPPLEMENTAL MANUAL-----

By the way, I have already implemented all of the above operations in
my version of the compiler except the bounds checking on constants and
WRITE IN's. All expressions mentioned above compile and run
successfully, and I have tested my arbitrary-base versions of the
operations on some standard INTERCAL programs as well by forcing
use of the general versions even in base 2. In normal operation the
old base 2 versions are used for standard INTERCAL since they are *much*
faster, like by a factor of 30. I see no reason why this stuff should
not be available in the next release, unless Steve violently objects
to the changes.


--
Louis Howell (naz...@llnl.gov)

"The word is EPTIFY.
Don't look in the dictionary.
It's too new for the dictionary.
But you'd better learn what it implies.
EPTIFY.
We do it to you."

Reply all
Reply to author
Forward
0 new messages