On 10/23/2013 11:25 PM, Ivan Godard wrote:
> On 10/23/2013 10:30 PM, Andy (Super) Glew wrote:
>> On 10/21/2013 3:05 PM, Ivan Godard wrote:
>
>
> <snip>
>
>>> There is no requirement that the checking be used at all, and it won't
>>> be unless code is added to the allocator to set a granule.
>>
>> Blah, blah, blah.
Sorry, I was grumpy last night. Annoyed at you explaining obvious things
to me, and not answering my question. Although perhaps you did:
I was thinking that you were more willing than I am to say "If you want
to get good checking, round your data structures up to power-of-2 sizes".
I think that instead you are saying "power-of-2 checking only: if you
don't round your data structures up, (1) some errors may not be detected
at all, and (2) some errors may be detected very late".
Although I suspect that you are going to tell me once again that I have
totally misunderstood you, and that I am not allowed to comment on your
proposal.
>>
>> Do you think that overflowing a non-power-of-2 object within a
>> power-of-2 granule, overflowing into another object within the same
>> granule, without detecting an overflow is acceptable?
>
> Acceptable? When the alternative is no checking at all?
I suppose the example of guard pages shows that people are willing to
live with checking that catches errors potentially a very long time
after the actual overflow.
> The proposal is unfortunately minimal, I agree. But something is better
> than nothing IMO, and I personally welcome all the checking I can get,
> complete or not. However, I am not the target market. Fr the market, I'm
> much less worried about what we don't catch than I am about user
> reactions when we *do* catch something in their "working" programs.
>
> The checking alternative that I know of I consider unacceptable to the
> market:
You forgot the MPX scheme, which is pretty well proven not to break
programs or OSes.
Although I agree with you that Intel's chosen implementation requires a
bit too much compiler involvement for my taste.
Actually, feeling bad about my grumpy "blah, blah, blah" leads me to
suspect that there may be a possibility of tighter checking.
One of the design principles behind my work that led to MPX... (I'm
getting tired of that circumlocution - I called it "Cp/PL". I think that
there is no harm in letting that be known. "Cp" stood for "checked
pointers"; "PL" was short for a longer codename in the Intel geographic
pattern. I won't say that.)
One of the design principles behind my work that led to MPX or Cp/PL was
"no struct descriptors".
Capability ISAs are not so bad, whether using power-of-2 bounded
pointers or wide pointers with [lwb,upb) (especially if the wide bits
are stored non-adjacent).
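As a sketch (my own names, no particular ISA), the wide-pointer check
is just two comparisons, with the bounds carried separately from the
pointer value itself:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch of a wide ("fat") pointer: the pointer plus separate bounds.
// The bounds words could well be stored non-adjacent to the pointer.
struct Fat_Pointer {
    std::uintptr_t ptr;  // current pointer value, may move around
    std::uintptr_t lwb;  // lower bound, inclusive
    std::uintptr_t upb;  // upper bound, exclusive
};

// Check an access of `size` bytes at the current pointer value.
bool in_bounds(const Fat_Pointer& p, std::size_t size) {
    return p.ptr >= p.lwb && p.ptr + size <= p.upb;
}
```

Because the pointer is separate from [lwb,upb), it can legally sit
anywhere in (or even outside) the bounded region between checks.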
---+ Struct and array descriptors considered bad
But many capability ISAs eventually have array descriptors - not simple
[lwb,upb) descriptors, but
[base_address,elem_type,ndim,dim1,dim2...].
Worse still, many capability ISAs have struct descriptors:
[struct_type,nfields,field1_type, field2_type,...]
Variable length.
Not something I want to manage in hardware.
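For concreteness, a sketch (my reconstruction, hypothetical names) of
the shape such an array descriptor takes - the point being that its
size varies with ndim, which is exactly what makes it unpleasant as a
hardware-managed object:

```cpp
#include <cassert>
#include <cstddef>

// [base_address, elem_type, ndim, dim1, dim2, ...]
// The fixed header is followed in memory by ndim extent words.
struct Array_Descriptor_Header {
    void*    base_address;
    unsigned elem_type;   // hypothetical type code
    unsigned ndim;
    // ...followed by ndim extents
};

// Total descriptor size is a function of ndim - i.e. variable length.
std::size_t descriptor_bytes(unsigned ndim) {
    return sizeof(Array_Descriptor_Header) + ndim * sizeof(unsigned);
}
```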
---+ Living without direct struct descriptors
Part of the thought process that led to Cp/PL / MPX involved:
+ I don't want struct descriptors
+ okay, then how do I protect structs - how do I prevent overflow of
an element within a struct?
Reluctantly applying the adage "any problem in computer science can be
solved by a level of indirection":
Instead of struct descriptors, use pointers to struct elements.
Possibly in a separate struct or array:
struct Target { int a; char b; double c; /* ... */ };
struct Pointers_to_Target_Fields {
    int*    a_ptr;
    char*   b_ptr;
    double* c_ptr;
    // ...
    Pointers_to_Target_Fields(Target& t)
        : a_ptr(&t.a), b_ptr(&t.b), c_ptr(&t.c) {}
};
Basically, this amounts to saying that a compiler can create its own
struct_descriptors, if it really wants to, albeit using this level of
indirection.
Now, considering where Cp/PL ended up, this is largely irrelevant. But
it was nevertheless important to my thought process: I would not have
bothered to continue thinking about Cp/PL if I had not seen how it could
ultimately be extended to security just as good as that of any other
capability system.
---+ What does this have to do with power-of-2 checked pointers?
Most primitive objects understood by the hardware have power-of-2 sizes:
8, 16, 32, 64, 128 bits.
You could probably get away with power-of-2 sized pointers in the thing
that looks like a struct descriptor:
struct Pointers_to_Target_Fields {
    Power_of_2_Checked_Pointer<int*>    a_ptr;
    Power_of_2_Checked_Pointer<char*>   b_ptr;
    Power_of_2_Checked_Pointer<double*> c_ptr;
    // ...
    Pointers_to_Target_Fields(Target& t)
        : a_ptr(&t.a), b_ptr(&t.b), c_ptr(&t.c) {}
};
noting that sizeof(Power_of_2_Checked_Pointer<T*>) == sizeof(T*).
Tight checking.
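As a rough software illustration (my own encoding, templated on the
element type; for clarity it spends a separate word on the log2(size)
field, whereas the real scheme would pack those bits into the pointer
itself so that sizeof stays equal to a raw pointer):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: a pointer plus a log2(granule size) field.
// An access is allowed only if it stays inside the 2^log2_granule-byte,
// 2^log2_granule-aligned granule containing the base address.
template <typename T>
struct Power_of_2_Checked_Pointer {
    T* base;
    unsigned log2_granule;  // granule is 2^log2_granule bytes, base assumed aligned

    Power_of_2_Checked_Pointer(T* p, unsigned n) : base(p), log2_granule(n) {}

    T& at(std::ptrdiff_t i) const {
        auto b = reinterpret_cast<std::uintptr_t>(base);
        auto a = reinterpret_cast<std::uintptr_t>(base + i);
        // In-granule check: bits above the low log2_granule bits must match.
        assert((a >> log2_granule) == (b >> log2_granule));
        return base[i];
    }
};
```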
Unfortunately, this doesn't handle pointing to non-power-of-2 arrays as
elements in structs. Nor does it handle the fact that the
Pointers_to_Target_Fields thing that looks like a struct descriptor is
non-power-of-2.
--
Oh, by the way, this reminds me: one of the biggest reasons not to use
1-extra-bit power-of-two pointers was that such pointers encode only
[base_address, base_address+size), i.e. the pointer is always to the
bottom of the memory region - instead of ptr plus [lwb,upb), where the
pointer is separate from the bounds. C allows pointers to go outside
the bounds in a very limited way, but never to be dereferenced there.
Many C programs use negative offsets. Disallowing negative offsets
breaks far too many programs.
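For instance (my example, not from the thread), the kind of legal
negative-offset code that bottom-anchored bounds would flag:

```cpp
#include <cassert>

// `mid` points into the middle of an array, so mid[-1] is still inside
// the same object - perfectly legal C/C++. A bottom-anchored
// [mid, mid+size) encoding cannot express this; it would flag the
// access even though the program is correct.
int element_before(int* mid) {
    return mid[-1];  // legal as long as mid-1 stays inside the same array
}
```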
If I understand correctly, Ivan's proposal is not my 1-extra-bit trick.
Instead it provides an N-bit field (at one time Ivan described it as
floating point, at another time as a 6-bit field) that indicates how
many low bits to ignore in the 2^N bounds comparison, while still
allowing a pointer into the middle of the region. So it does not have
the problem that my 1-extra-bit power-of-two pointers had - although it
costs 6 extra address bits.
So, see: the 1-extra-bit approach is prior art for power-of-2-aligned
checking. (If it is publicly visible anywhere.) But Ivan's N-bit field
approach is different.
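Sketched, my reading of that comparison (hypothetical names, nothing
from Ivan's actual encoding):

```cpp
#include <cassert>
#include <cstdint>

// The N-bit field in the pointer says how many low address bits to
// ignore, so an access is in bounds when the remaining high bits match
// those of the pointer. The pointer itself may sit anywhere inside the
// 2^ignore_low_bits granule.
bool same_granule(std::uintptr_t a, std::uintptr_t b,
                  unsigned ignore_low_bits) {
    return (a >> ignore_low_bits) == (b >> ignore_low_bits);
}
```

With a 6-bit field, same_granule(0x1000, 0x103F, 6) passes while
same_granule(0x1000, 0x1040, 6) fails.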
(BTW^2: when Cp/PL / MPX started, it was still considered essential to
make it available on a 32-bit machine, so consuming 5 address bits
would not have been acceptable. Even consuming 1 was borderline. On
64 bits, taking away 6 address bits may be acceptable. But then it
interferes with the various other software systems that place tags in
the upper or lower bits of the address - like ARMv8 allowing the upper
8 bits of the address to be ignored by the hardware virtual address
translation facility, reserved for tagged pointers. Therefore for
Cp/PL / MPX we just left the user's pointer bits alone and placed our
metadata somewhere else.)