How does ghostscript make ints and reals equivalent as dictionary keys??!!!

luser- -droog

unread,

Aug 23, 2013, 1:38:14 AM8/23/13

to

I've marveled at this behavior before. But I just can't imagine how
they do that!

I suppose in many languages (PERL, Lua, JavaScript), the distinction
between ints and reals is very blurry all the way down the pipeline.
But postscript makes them distinct types for many operations
(array and string indexing, stack indexing, various other 'count' or
ordinal data), but blurs them during math. And C makes the distinction
very explicit, but permits 'usual implicit conversions' to provide
automatic promotions where appropriate.

All of this is fine, and where an operator needs one or the other
it can check and signal an error; or when an operator can accept
either, it can check and promote if necessary before merging
the int and real cases. But...

But how do you make them hash equally so you can look them up in a dict?

My only guess is that ghostscript must merge them somehow into a single
canonical form: some kind of extended floating-integer hybrid super-duper hackjob type.

But then, how do you do the 'super-duper' part?

luser- -droog

unread,

Aug 23, 2013, 3:06:19 AM8/23/13

to

I think this can work. I'm gonna use IEEE 754 as a guide,
but as much as possible, I'd like to avoid assuming any
particular representation of reals and perform all transformations
arithmetically.

IEEE 754 single precision divides 32 bits into
. 1 bit sign
. 8 bit exponent
. 23 bit significand

And I've got 48 bytes of payload available in an object
(64 - 16 bit tag). So I think I can extend the significand to
31 bits, and make a 32-bit sign/mag integer with 8 bit exponent.
This form seems easy to convert an int into as well.

Helge Blischke

unread,

Aug 23, 2013, 10:37:37 AM8/23/13

to

Integers and reals don#t seem to be equivalent as dictionary keys.
For instance, if you define (in Ghostscript)

/1 (Heureka) def

currentdict/1 .knownget

puts
(Heureka) true
onto the operand stack, but

currentdict/1.0 .knownget

yields only

false

on the operand stack.

Helge

luser- -droog

unread,

Aug 23, 2013, 12:15:53 PM8/23/13

to

The slash creates a nametype object.
Without the slashes, your example works.

$ gsnd
GPL Ghostscript 9.06 (2012-08-08)
Copyright (C) 2012 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
GS>1 (Heureka) def
GS>currentdict 1 .knownget
GS<2>pstack
true
(Heureka)
GS<2>clear
GS>currentdict 1.0 .knownget
GS<2>pstack clear
true
(Heureka)
GS>

Ross Presser

unread,

Aug 23, 2013, 1:40:20 PM8/23/13

to

On Friday, August 23, 2013 1:38:14 AM UTC-4, luser- -droog wrote:
> My only guess is that ghostscript must merge them somehow into a single
> canonical form: some kind of extended floating-integer hybrid
> super-duper hackjob type.

I would think that promoting integers to reals, before hashing as a key,
would be sufficient?

luser- -droog

unread,

Aug 23, 2013, 1:54:38 PM8/23/13

to

That would get the basic features readily, I agree. But I think
you'd lose some of the precision on integers as they approach the
top of the range. Which wouldn't let you do this:

GS>16#7fffffff (value) def
GS>16#7fffffff load =
value
GS>

luser- -droog

unread,

Aug 23, 2013, 1:57:57 PM8/23/13

to

Wait, I suppose that would still work.
But then other close integers would match the same key.

16#7ffffffe load =
Error: /undefined in --load--
Operand stack:
--nostringval-- 2147483646
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- %loop_continue --nostringval-- --nostringval-- false 1 %stopped_push .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval--
Dictionary stack:
--dict:1168/1684(ro)(G)-- --dict:0/20(G)-- --dict:79/200(L)--
Current allocation mode is local
Current file position is 17

Ross Presser

unread,

Aug 23, 2013, 2:20:46 PM8/23/13

to

On Friday, August 23, 2013 1:57:57 PM UTC-4, luser- -droog wrote:
> Wait, I suppose that would still work.
> But then other close integers would match the same key.

Not seeing why other close integers would be considered equal.
Reals take more bytes than integers, don't they?

luser- -droog

unread,

Aug 23, 2013, 2:49:46 PM8/23/13

to

I'm using 32bit ints and 32bit floats. But yeah, I think
a longer float type would hold both without loss.

What I have now is:

typedef uint16_t word;
typedef uint32_t dword;
typedef int32_t integer;
typedef float real;

typedef struct {
word tag;
word pad;
integer val;
} int_;

typedef struct {
word tag;
word pad;
real val;
} real_;

So my idea is to add a third type to be used as dictionary keys:

typedef struct {
word tag;
word sign_exponent;
dword fraction;
} extended_;

So a float converts into this type by

* extract sign and set sign bit in sign_exponent
* extract 8bit exponent, widen, and "or" into sign_exponent
* extract 23bit fraction, widen, and store as 32bit fraction

And an integer converts into this type by

* extract sign and set sign bit
* set exponent to bit-length of the integer
* shift int left until top bit falls off, store in fraction

I'm sure there are lots of edge cases I haven't considered yet.
I don't quite know whether I need to bias the exponent like IEEE
does, or if it would be better not to widen and just use 9 bits
of sign_exponent. But the rough idea seems sound.

luser- -droog

unread,

Aug 23, 2013, 3:08:09 PM8/23/13

to

Here's an illustration.

GS>16#7fffffff cvr =
2.14748e+09
GS>16#7ffffffe cvr =
2.14748e+09
GS>16#7fffffff cvr 16#7fffffff cvr eq =
true
GS>16#7fffffff cvr 16#7ffffffe cvr eq =
true
GS>

The floating type has less precision than the integer type.
IEEE 754 single-width float type has: 1-bit sign, 8-bit exponent, 23-bit fraction. With an additional implied bit, there's 24-bits
of integral precision in a float.

luser- -droog

unread,

Aug 24, 2013, 1:57:52 AM8/24/13

to

Fleshed-out some. Untested. Decided not to extend the exponent.
For float, just copy the sign+exponent, and shift up the
mantissa. For integers, the shortcut I found is to convert
to double and then extract the pieces (adjusting the width
of the exponent).

object extend_real (object R) { //assume IEEE 754 float
unsigned long r = *(unsigned long *)&R.real_.val;
extended_ e;
e.tag = extended;
e.sign_exp = (r>>23)&0x1FF;
e.fraction = r<<9;
return (object) e;
}

object extend_int (object I) { //assume IEEE 754 double
double d = I.int_.val;
unsigned long long r = *(unsigned long long *)&d;
extended_ e;
e.tag = extended;
e.sign_exp = r&0x8000000000000000?
0x100 : 0;
e.sign_exp |= (((r>>52)&0x7FF) - 1023) + 127;
e.fraction = r>>20;
return (object) e;
}

I'll need to set a flag to indicate the original type
so dictforall can return int or real as appropriate.
The flags in the tag are suppressed before hashing.

luser- -droog

unread,

Aug 24, 2013, 3:24:07 PM8/24/13

to

On Saturday, August 24, 2013 12:57:52 AM UTC-5, luser- -droog wrote:
> On Friday, August 23, 2013 1:49:46 PM UTC-5, luser- -droog wrote:
>
> > On Friday, August 23, 2013 1:20:46 PM UTC-5, Ross Presser wrote:
>
> >
>
> > > On Friday, August 23, 2013 1:57:57 PM UTC-4, luser- -droog wrote:
>
> >
>
> > >
>
> > > > Wait, I suppose that would still work.
>
> > >
>
> > > > But then other close integers would match the same key.
>
> > >
>
> > >
>
> > >
>
> > > Not seeing why other close integers would be considered equal.
>
> > >
>
> > > Reals take more bytes than integers, don't they?
>
> >
>
> > I'm using 32bit ints and 32bit floats. But yeah, I think
>
> > a longer float type would hold both without loss.
>

Hurray!

^test itp.c
loading init.ps...
loading err.ps...
$error
Xpost3 version 0.0
PS>1 /this def
PS>1.0 load
PS<1>=
this
PS><operator 1 exch 0x407b85>bye!

luser- -droog

unread,

Aug 26, 2013, 1:38:13 AM8/26/13

to

On Saturday, August 24, 2013 2:24:07 PM UTC-5, luser- -droog wrote:
>
> Hurray!
>
>
> ^test itp.c
> loading init.ps...
> loading err.ps...
> $error
> Xpost3 version 0.0
> PS>1 /this def
> PS>1.0 load
> PS<1>=
> this
> PS><operator 1 exch 0x407b85>bye!

The way I finally implemented it was a little different than the
way I described while planning it.

The extended number type is essentially an IEEE 754 double,
with the bottom 20 bits discarded. The 11-bit sign and exponent
portion is stored in the spare 16-bit word in the object, and
the 53 bit fraction truncated and stored in the 32-bit payload word.
Integers and reals are converted to this type when performing
dictionary lookup (at the same point where strings are converted
to names), and then converted back to the original number type
when enumerated by `forall`.

The conversion back to the original type is simply the reverse of
the above. 11-bite from the spare word shifted to the top of
a 64-bit value, 32-bits from the payload word ORed-in, and the
bottom 20-bits left zero. Then the 64-bit value is treated as
an IEEE 754 double and coerced to int or float according to
the flag set when the extended object was created.

There are still 5 unused bits in the spare word. It might be
simpler (with increased accuracy) to ignore the internal structure
of the double and just take the top 48 bits....