Unsafe structs

Dominik Pantůček

Dec 20, 2020, 3:34:50 PM
to Racket Users
Hello Racketeers,

there were some discussions about structs' introspection on the IRC
lately and one of the questions that arose was how to get field index
for arbitrary struct's arbitrary field.

Not that it is possible... But the general discussion made me think
about why there are no unsafe variants for structs' accessors and mutators.

And given my work on some performance-demanding projects I started
testing unsafe-struct*-ref and unsafe-struct*-set! in one of my
simulations (yes, that raycasting/raytracing project). It bumped the
speed from roughly 110-120 fps to about 140-150 fps on my laptop. So
definitely worth the try.

But unsafe-struct*-ref/set! degrade structs basically to strange
vectors. And also - in my code I use struct hierarchy to decide what to
do with different kinds of data.

It didn't take long and my unsafe-struct[1] proof-of-concept was created.

The code is still a bit messy. However it keeps all the semantics of
structs and enables usage of unsafe accessors and mutators without any
hassle. Basically, unsafe-struct and unsafe-struct-out are drop-in
replacements for struct and struct-out.

A typical usage would be:

(unsafe-struct my-struct (a b (c #:mutable)) #:transparent)
(define my-val (my-struct 1 2 3))
(unsafe-set-my-struct-a! my-val 4)
(displayln (unsafe-my-struct-b my-val))

Of course, the procedures are inherently unsafe ...

In a typical scenario I am using these immediately after checking the
type of given value using the struct predicate (like my-struct? in this
case) and getting the performance boost without any real un-safety.
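
The guarded pattern described above can be sketched with the stock
operations from racket/unsafe/ops. This is only an illustration under
assumptions: the `ray` struct and `ray-record-hit!` are made-up names, and
the unsafe-struct package itself is not used here.

```racket
#lang racket/base
(require racket/unsafe/ops)

;; Hypothetical struct for illustration; not taken from the unsafe-struct package.
(struct ray (origin direction [hits #:mutable]))

;; The predicate check guards the raw, index-based accesses that follow.
(define (ray-record-hit! v)
  (unless (ray? v)
    (raise-argument-error 'ray-record-hit! "ray?" v))
  ;; Field 2 is `hits`; unsafe-struct*-ref and unsafe-struct*-set!
  ;; address fields by position, much like a vector.
  (unsafe-struct*-set! v 2 (add1 (unsafe-struct*-ref v 2)))
  (unsafe-struct*-ref v 2))
```

Once the predicate has succeeded, the unsafe operations skip the checks
that the regular accessor and mutator would repeat.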

And now some questions (I bet you knew I would have some):

1) Is there anyone else reading this that could actually leverage these
unsafe-structs for something useful?

2) If yes, what do you miss there before I create a package from the
code? (Yes, scribblings, yes, slightly cleaning the code up - but what
else?)

3) In the syntax pattern I was unable to create a sub-pattern to get the
field-mutable for all fields (including those without any field options)
- that's the reason for most of the long let* later on. Is there a
better way? (I will think about this one, but more eyes ...)


Cheers,
Dominik

P.S.: I didn't abandon the flalgebra module, I am actually working on
another approach that should move the resulting "DSL" closer to the
flonums unboxing and register allocation code. More on that later.


[1] https://gitlab.com/racketeer/unsafe-struct

David Storrs

Dec 21, 2020, 12:07:40 PM
to Racket Users
<self-plug>
The struct-plus-plus module also provides reflection, so you might take a look to see if there are any ideas in there that would be useful for your own module.  Accessors are included, as are constructors, rules, wrappers, default values, and predicates.  spp has two primary limitations:  You cannot use a base type and you cannot mark individual fields mutable, only the entire struct.

</self-plug>

--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/90b3e86a-db1d-c658-111b-ae4010c5bea2%40trustica.cz.

George Neuner

Dec 21, 2020, 12:14:34 PM
to racket...@googlegroups.com

On 12/20/2020 3:34 PM, Dominik Pantůček wrote:
> Hello Racketeers,
>
> there were some discussions about structs' introspection on the IRC
> lately and one of the questions that arose was how to get field index
> for arbitrary struct's arbitrary field.
>
> Not that it is possible... But the general discussion made me think
> about why there are no unsafe variants for structs' accessors and mutators.

Have you looked at 'struct-field-info-list' and 'struct-field-index'?

'struct-field-index' works if you know the field name(s), and
'struct-field-info-list' returns fields defined by the struct type - so
if your structs types are flat, you can make do with these.

However, 'struct-field-info-list' returns only fields defined by the
actual type and does not include fields that were inherited.  What I
don't see is any simple way to get at the struct's inheritance hierarchy
- it seems that you have to iterate 'struct-type-info' to enumerate the
supertypes.

So it is "possible" (for some definition).
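
Iterating 'struct-type-info' over the supertypes might look like the
sketch below. The `shape` and `circle` structs and the helper name
`struct-type-chain` are hypothetical; the structs are declared
#:transparent so that the current inspector is allowed to look at them.

```racket
#lang racket/base

;; Hypothetical two-level hierarchy for illustration.
(struct shape (name) #:transparent)
(struct circle shape (radius) #:transparent)

;; Walk from a struct type descriptor up to its root.
;; Returns a list of (type-name . own-field-count), most specific type first.
(define (struct-type-chain st)
  (let loop ([st st] [seen '()])
    (define-values (name init-cnt auto-cnt accessor mutator immutables super skipped?)
      (struct-type-info st))
    ;; The counts cover only the fields this level defines, not inherited ones.
    (define entry (cons name (+ init-cnt auto-cnt)))
    (if super
        (loop super (cons entry seen))
        (reverse (cons entry seen)))))

(struct-type-chain struct:circle)
```

For an opaque struct type this would raise an exception instead, so the
approach only helps where the inspector grants access.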
George

Dominik Pantůček

Dec 21, 2020, 4:33:21 PM
to racket...@googlegroups.com
Yes, this approach would be useful to create the unsafe
accessors/mutators after the structs have been defined. Trouble is, one
would have to traverse the struct hierarchy and filter out those
accessors/mutators that are already defined by a parent struct.

What I wanted to achieve was providing the unsafe variants
"transparently". Honestly, given enough time, the best solution for me
would be adding #:unsafe keyword to the struct form that would define
the unsafe variants.

My usage requires only struct-wide unsafe accessors. I implemented the
rest as an exercise to see what can be done (and to play with more
complex syntax-rules too).

It may be the case that using extract-struct-info and traversing the
structs' hierarchy during the expansion phase would be a more elegant
solution for implementing the whole unsafe-struct syntax. I will
definitely look into it more in the (relatively near) future.

Thank you,
Dominik

Dominik Pantůček

Dec 21, 2020, 4:36:01 PM
to racket...@googlegroups.com

On 21. 12. 20 18:07, David Storrs wrote:
> <self-plug>
> The struct-plus-plus module also provides reflection, so you might take
> a look to see if there are any ideas in there that would be useful for
> your own module. Accessors are included, as are constructors, rules,
> wrappers, default values, and predicates. spp has two primary
> limitations: You cannot use a base type and you cannot mark individual
> fields mutable, only the entire struct.
Nice one! The per-field #:mutable keyword was one of the things that
made me look into it more :)

I will look into the sources and some of the ideas there will help me
implement it in a cleaner way. That said, I am mostly interested in
providing the unsafe accessors/mutators transparently.


Cheers,
Dominik

Robby Findler

Dec 21, 2020, 7:30:35 PM
to Dominik Pantůček, racket...@googlegroups.com
Is Typed Racket able to prove that your use of unsafe accessors is actually safe? 

Robby

Philip McGrath

Dec 21, 2020, 10:40:37 PM
to Robby Findler, Dominik Pantůček, Racket Users
On Mon, Dec 21, 2020 at 7:30 PM Robby Findler <ro...@cs.northwestern.edu> wrote:
> Is Typed Racket able to prove that your use of unsafe accessors is actually safe?

On a similar note, my understanding is that, even without types, in a program like this:
#lang racket
(struct cell (val))
(λ (x)
  (if (cell? x)
      (cell-val x)
      ...))
the compiler can optimize the use of `cell-val` guarded by `cell?` to omit the redundant check. I think the optimized version should be as good as `unsafe-struct-ref`.

If that's the case, the only extra invariant for `unsafe-struct*-ref` is that it does not work on impersonators (including chaperones, IIUC). If you need the extra performance, I think you can get it safely by declaring your structs as `#:authentic` (or equivalently adding `prop:authentic`), which will cause the struct types involved not to support impersonators: `impersonate-struct` and similar functions will raise exceptions. On the other hand, if you need to support impersonators, then it really is unsafe to use `unsafe-struct*-ref`.
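
A small sketch of the `#:authentic` behaviour described above, assuming a
Racket version that supports the keyword. The struct names and the helper
`can-chaperone?` are invented for the example, and it catches any
`exn:fail` rather than relying on a specific exception type.

```racket
#lang racket/base

;; Hypothetical structs: one ordinary, one declared authentic.
(struct plain-cell (val))
(struct auth-cell (val) #:authentic)

;; Try to chaperone v through the given field accessor and report
;; whether the attempt succeeded.
(define (can-chaperone? v accessor)
  (with-handlers ([exn:fail? (λ (e) #f)])
    (chaperone-struct v accessor (λ (self field-val) field-val))
    #t))

(can-chaperone? (plain-cell 1) plain-cell-val)
(can-chaperone? (auth-cell 1) auth-cell-val)
```

Because no chaperone or impersonator of an authentic struct can exist,
an accessor never has to account for one.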

-Philip

Dominik Pantůček

Dec 22, 2020, 2:36:03 AM
to phi...@philipmcgrath.com, Robby Findler, Racket Users


On 22. 12. 20 4:40, Philip McGrath wrote:
> On Mon, Dec 21, 2020 at 7:30 PM Robby Findler <ro...@cs.northwestern.edu
> <mailto:ro...@cs.northwestern.edu>> wrote:
>
> Is Typed Racket able to prove that your use of unsafe accessors is
> actually safe?
>

I can try adding type annotations everywhere, but as I started with
contracts, the results should be the same. It might be interesting to
see how to prove it. There are some cross-procedure type assumptions in
my case so it means fully annotating a few modules. Will look into it.

>
> On a similar note, my understanding is that, even without types, in a
> program like this:
> #lang racket
> (struct cell (val))
> (λ (x)
>   (if (cell? x)
>       (cell-val x)
>       ...))
> the compiler can optimize the use of `cell-val` guarded by `cell?` to
> omit the redundant check. I think the optimized version should be as
> good as `unsafe-struct-ref`.

I hope that in the simple example you provided it is as good as the
unsafe variant. Trouble is that the guards are usually not in the same
scope as the accessors. The major performance improvement I see is when
passing the results of geometry checks from one module to surface
properties handler in another. So definitely the compiler does not see
the guards.

I think Robby's question aims in the right direction. Not only should it
be possible to prove that I _can_ use unsafe accessors/mutators in a given
scope, but the compiler should also be able to recognize this, so that I
do not need them in the first place. I'll look into it more.

Although - after my experience with math/matrix - I expect the Typed
Racket (+ #:authentic, yes) to yield better results, but not as good as
the unsafe accessors.

>
> If that's the case, the only extra invariant for `unsafe-struct*-ref` is
> that it does not work on impersonators (including chaperones, IIUC). If
> you need the extra performance, I think you can get it safely by
> declaring your structs as `#:authentic` (or equivalently adding
> `prop:authentic`), which will cause the struct types involved not to
> support impersonators: `impersonate-struct` and similar functions will
> raise exceptions. On the other hand, if you need to support
> impersonators, then it really is unsafe to use `unsafe-struct*-ref`.

#:authentic improves it only VERY slightly (didn't look at the generated
code, just did a few empirical measurements).

Dominik

Dominik Pantůček

Jan 2, 2021, 5:34:47 PM
to racket...@googlegroups.com
Hello Racketeers (and Robby especially)!

On 22. 12. 20 1:30, Robby Findler wrote:
> Is Typed Racket able to prove that your use of unsafe accessors is
> actually safe?

Short answer: YES.

One question for a start: And what now?

Disclaimer: The following text is by no means intended as critique but
rather it should be viewed as a summary of my experience with typing
middle-sized project in two weeks.

Long rant follows.

When Ben sent the Typed Racket Survey[1] here, my experience with TR was
only minimal. I worked with the math/matrix and basically focused on
performance gains it might yield. So I honestly filled-in whatever I
knew. Now I know much more. Hope it will help.

Let's first sum up where TR helped enormously and actually surprised me
VERY pleasantly:

* Finding cases of (when ...) and similar #<void> returning constructs
where values are to be returned and probably error should be raised
instead of returning nothing. Although the actual erroneous return never
happened in the code in question, those constructs were meant to ensure
correct value is returned... Probably some leftovers from the early days
of this project. TR spotted them immediately.

* Finding improper usage of #f for missing information. When the program
expects some value, #f is not the right choice of fallback value. It is
the right one if you want to use it as a condition. Again, these were
leftovers from various experiments - mostly when loading data. TR to the
rescue again.

* Unintended use of default procedure arguments - nice and unexpected!

And now for the worse part. TR rough edges:

* Higher-order procedures and polymorphic functions in all imaginable
combinations. That was a total disaster. Yes, the documentation clearly
states that - but typing any code using these is like trying to break a
badly designed cipher. Irregularities here and there. Sometimes simple
`inst' was enough. Sometimes casting both the function and the arguments
was necessary. The biggest trouble is that the error messages are very
cryptic and sometimes even do not point to the offending construct but
only the module file. I will provide MWE to show this if someone wants
to look into it.

* Struct: Missing struct generics - when porting code that relies on
them, there's not much that can be done.

* Math: there really is just a natural logarithm and no logarithm with
arbitrary base? Yes, one-line to implement, but why?

* Math/Fixnums/Flonums: All fx+/-/*/... accept two arguments only. No
unary fl-, no variadic-argument fl+ or fxior (this one hurt the most).

* unsafe/ops: unsafe-vector-* resisted a lot - until I just gave up and
require/typed it for my particular types. Maybe casting would help, but
the error messages were different to the classical problems of the
polymorphic functions in TR.

* Classes: My notes say "AAAAAAAAAAAAAAAAAAAA". Which roughly says it
all. Although I managed to back down to only using bitmap% class,
properly typing all procedures using it was a nightmare. By properly I
mean "it compiles and runs".

* with-input-from-file does not accept Path or String, only Path-String
and the conversion rules are either missing or strange at best.
Basically I ended up with just converting to String and casting to
Path-String to make everything work.

* with-input-from-file also revealed that procedure signatures as types
can be very tricky - just passing `read' was not possible, because it
accepts some optional arguments. Wrapping it in thunk helped though.

* order of definitions matters. Not that this is unexpected, it is just
strange when working with larger code-base where it didn't matter.
Actually the error messages were helpful here.

* Type annotations of procedures with variadic arguments. The only place
where I had to put annotations outside of the procedure definition. It
is nothing super-problematic, but it feels inconsistent with the rest.

* More modules need to be required everywhere. If module A provides a
procedure that accepts a type from module B, all modules using that
procedure must also require the module B to know the type. In normal
Racket it does not matter as long as you just pass the opaque data along.

* Syntax macros are extra hard. As I use some syntax trickery to convert
semi-regular code to "futurized" one, I basically gave up and just ran
everything single-threaded. The main issue is passing type information
from the module that uses the macro to the macro and matching it on both
sides. Also the macro must split the type annotation from argument names
in the syntax pattern when it defines new procedures - I did some ugly
hacks to get through it but in the end I just refrained from using them
when possible (except for the unsafe-struct macro, of course).

If anyone actively working on TR reads this, I'd be more than happy to
discuss my experience and share the code (although it is really ugly as
I really only intended to prove the unsafe structs are used safely).
Next Saturday would be a good time to discuss it at the Racket Users
Video Meetup[2]!


Hmmm... some clever closing words...

Robby, thank you! And I mean it both sarcastically (two weeks, 52
commits, 2292-line diff!) and genuinely (it really helped).

Doctor Tobin-Hochstadt: Tear down this wall! (Thanks Jay ;)


Cheers,
Dominik


[1] https://groups.google.com/g/racket-users/c/qfepyERIsYA/m/Tnz3rjcHAwAJ
[2] https://groups.google.com/g/racket-users/c/-_I17qdo3SY/m/QcBV-Aa0CwAJ

Robby Findler

Jan 2, 2021, 6:19:39 PM
to Dominik Pantůček, Racket Users
On Sat, Jan 2, 2021 at 4:34 PM Dominik Pantůček <dominik....@trustica.cz> wrote:
> Hello Racketeers (and Robby especially)!
>
> On 22. 12. 20 1:30, Robby Findler wrote:
> > Is Typed Racket able to prove that your use of unsafe accessors is
> > actually safe?
>
> Short answer: YES.
>
> One question for a start: And what now?


That's great! (And I can sympathize about the sarcastic thank you-- getting a machine to verify a proof of something you know to be true is really a lot of work!)

As for what now, my hope was that if TR could prove that your uses were good then you could rely on TR to follow up with the actual optimizations. Did you find that to be the case? I see you had to give up on the parallelism for other reasons so I guess there is still a ways to go. I'm not going to recommend that you follow this path further, but I guess that if you did there would be grateful people, perhaps starting with Sam.

Robby

Sam Tobin-Hochstadt

Jan 8, 2021, 3:39:12 PM
to Dominik Pantůček, Racket Users
Thanks for this detailed account (and for trying it out). I have some
questions inline:


On Sat, Jan 2, 2021 at 5:34 PM Dominik Pantůček
<dominik....@trustica.cz> wrote:
> And now for the worse part. TR rough edges:
>
> * Higher-order procedures and polymorphic functions in all imaginable
> combinations. That was a total disaster. Yes, the documentation clearly
> states that - but typing any code using these is like trying to break a
> badly designed cipher. Irregularities here and there. Sometimes simple
> `inst' was enough. Sometimes casting both the function and the arguments
> was necessary. The biggest trouble is that the error messages are very
> cryptic and sometimes even do not point to the offending construct but
> only the module file. I will provide MWE to show this if someone wants
> to look into it.

An example would be great here. In general you should only need one
use of `inst` in any of these cases, and you shouldn't ever need a
`cast`.

> * Struct: Missing struct generics - when porting code that relies on
> them, there's not much that can be done.

This is indeed a limitation -- struct generics are complex and we
(mostly Fred Fu) are thinking about them, but it's not there yet.

> * Math: there really is just a natural logarithm and no logarithm with
> arbitrary base? Yes, one-line to implement, but why?

I'm not sure why this didn't get added originally, but it's easy to fix.
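
For reference, the one-liner in question could be sketched as follows in
Typed Racket, assuming flonum arguments. The name `log-base` is
hypothetical and is not something the thread or the library defines.

```racket
#lang typed/racket/base
(require racket/flonum)

;; Logarithm of x in base b via the change-of-base identity
;; log_b(x) = ln(x) / ln(b). Hypothetical helper for illustration.
(: log-base (-> Flonum Flonum Flonum))
(define (log-base b x)
  (fl/ (fllog x) (fllog b)))
```

As with `fllog` itself, non-positive inputs yield infinities or NaN
rather than an error.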

> * Math/Fixnums/Flonums: All fx+/-/*/... accept two arguments only. No
> unary fl-, no variadic-argument fl+ or fxior (this one hurt the most).

These definitely became variadic after the type definitions were
written, but that's of course not an excuse for not updating them.

> * unsafe/ops: unsafe-vector-* resisted a lot - until I just gave up and
> require/typed it for my particular types. Maybe casting would help, but
> the error messages were different to the classical problems of the
> polymorphic functions in TR.

Can you say more about what happened here?

> * Classes: My notes say "AAAAAAAAAAAAAAAAAAAA". Which roughly says it
> all. Although I managed to back down to only using bitmap% class,
> properly typing all procedures using it was a nightmare. By properly I
> mean "it compiles and runs".

More detail here would be helpful as well.

> * with-input-from-file does not accept Path or String, only Path-String
> and the conversion rules are either missing or strange at best.
> Basically I ended up with just converting to String and casting to
> Path-String to make everything work.
>
> * with-input-from-file also revealed that procedure signatures as types
> can be very tricky - just passing `read' was not possible, because it
> accepts some optional arguments. Wrapping it in thunk helped though.

I don't understand what the problem was here; for example this works for me:

#lang typed/racket
(with-input-from-file "/tmp/x.rkt" read)

and `Path-String` is just the union of Path and String.
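
A minimal sketch of what that union means in practice: a function typed
over `Path-String` takes either kind of value directly, and `string?`
narrows the type in each branch. The function name `describe` is made up
for the example.

```racket
#lang typed/racket/base

;; Accepts a String or a Path without any cast at the call site.
(: describe (-> Path-String String))
(define (describe p)
  (if (string? p)
      p                   ; here p is known to be a String
      (path->string p)))  ; here p is known to be a Path

(describe "x.rkt")
(describe (string->path "x.rkt"))
```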

> * order of definitions matters. Not that this is unexpected, it is just
> strange when working with larger code-base where it didn't matter.
> Actually the error messages were helpful here.

What do you mean by "order of definitions matters" here?

> * Type annotations of procedures with variadic arguments. The only place
> where I had to put annotations outside of the procedure definition. It
> is nothing super-problematic, but it feels inconsistent with the rest.

I would encourage type annotations before the procedure definition in
all cases -- the `define` form doesn't really have the right places to
put everything that can go in a function type.
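
As a sketch of that style, a variadic procedure can carry its whole type
in a separate `:` declaration placed before the definition. The name
`sum-all` is hypothetical.

```racket
#lang typed/racket/base

;; One required Integer followed by any number of further Integers.
(: sum-all (-> Integer Integer * Integer))
(define (sum-all x . xs)
  (for/fold ([acc : Integer x])
            ([y (in-list xs)])
    (+ acc y)))

(sum-all 1 2 3)
```

The same declaration form works unchanged for non-variadic procedures,
which keeps the annotations uniform across a module.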

> * More modules need to be required everywhere. If module A provides a
> procedure that accepts a type from module B, all modules using that
> procedure must also require the module B to know the type. In normal
> Racket it does not matter as long as you just pass the opaque data along.

Can you give an example?

> * Syntax macros are extra hard. As I use some syntax trickery to convert
> semi-regular code to "futurized" one, I basically gave up and just ran
> everything single-threaded. The main issue is passing type information
> from the module that uses the macro to the macro and matching it on both
> sides. Also the macro must split the type annotation from argument names
> in the syntax pattern when it defines new procedures - I did some ugly
> hacks to get through it but in the end I just refrained from using them
> when possible (except for the unsafe-struct macro, of course).

An example here would be great too.

Thanks again,
Sam

Sorawee Porncharoenwase

Jan 20, 2021, 1:53:34 AM
to Sam Tobin-Hochstadt, Dominik Pantůček, Racket Users

> However, 'struct-field-info-list' returns only fields defined by the
> actual type and does not include fields that were inherited.  What I
> don't see is any simple way to get at the struct's inheritance hierarchy
> - it seems that you have to iterate 'struct-type-info' to enumerate the
> supertypes.

Yes, that’s a (the?) way to extract field names of supertypes. The reason I designed it in that way is that field name is a concept for each level in the hierarchy, not across levels. You can’t have two identical field names in a level, but you can have identical field names across levels. So lumping field names across levels together doesn’t look like a good idea to me.
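
A short sketch of identical field names across levels; the `parent` and
`child` structs are hypothetical.

```racket
#lang racket/base

;; Both levels define a field called `id`; the accessors stay distinct
;; because each is prefixed with the name of its own struct type.
(struct parent (id) #:transparent)
(struct child parent (id) #:transparent)

(define c (child 1 2))
(parent-id c) ; the `id` field defined at the parent level
(child-id c)  ; the `id` field defined at the child level
```

A flat list of field names for `child` would contain `id` twice, which is
the ambiguity a per-level view avoids.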


Dominik Pantůček

Jan 20, 2021, 5:16:09 PM
to racket...@googlegroups.com
Hi Sam,

I went through all my notes and prepared minimal (sometimes) working
examples for most of the issues I mentioned. Let's go through it one by
one. I assume that some of the complications I encountered were due to
my lack of experience with Typed Racket. I hope some of these examples
will be useful anyway.

All the examples are in a git repository on Gitlab [1].

>> * Higher-order procedures and polymorphic functions in all imaginable
>> combinations. That was a total disaster. Yes, the documentation clearly
>> states that - but typing any code using these is like trying to break a
>> badly designed cipher. Irregularities here and there. Sometimes simple
>> `inst' was enough. Sometimes casting both the function and the arguments
>> was necessary. The biggest trouble is that the error messages are very
>> cryptic and sometimes even do not point to the offending construct but
>> only the module file. I will provide MWE to show this if someone wants
>> to look into it.
>
> An example would be great here. In general you should only need one
> use of `inst` in any of these cases, and you shouldn't ever need a
> `cast`.

Directory "polymorph". In my use-case, there is a lot of structured
S-expression data in the format:

(define tagged-data-example
  '(data-2D
    ((B B B B)
     (B B B))))

9 steps of typing the procedure to calculate the maximum length of the
2D matrix are in this directory. All the errors are documented there.
Yes, many of the intermediate constructs are wrong. The error in
polymorph4.rkt shows the biggest problem I've seen so far. As a separate
example, it is pretty easy to fix - but finding the source of the
problem in a bigger module was not easy at all. Basically I ended up
bi-partitioning the whole source code with procedure granularity and
once I found the offending procedure a linear trial-and-error approach
was the only option that allowed me to find the offending expression.

Yes, the final polymorph9.rkt contains the approach that I actually used
for resolving the issue.

>
>> * unsafe/ops: unsafe-vector-* resisted a lot - until I just gave up and
>> require/typed it for my particular types. Maybe casting would help, but
>> the error messages were different to the classical problems of the
>> polymorphic functions in TR.
>
> Can you say more about what happened here?

I completely mixed things up. It was not unsafe/ops, actually there was
just a minor misunderstanding on my side and after I realized what's
wrong, it worked like a charm.

The actual problem I was referring to here is nicely shown in the
"union" directory and it is about typing hash-union procedure.

I do not know what solution should be considered correct if there were
multiple differently-typed hash-tables in the same module. The explicit
type information in require/typed is definitely not a good idea - but I
found no better way.

>
>> * Classes: My notes say "AAAAAAAAAAAAAAAAAAAA". Which roughly says it
>> all. Although I managed to back down to only using bitmap% class,
>> properly typing all procedures using it was a nightmare. By properly I
>> mean "it compiles and runs".
>
> More detail here would be helpful as well.

The task was so simple and the solution so ugly. See the "classes"
directory with 4 steps of typing a simple program to load a bitmap.

Yes, normally I use (make-object bitmap% filename 'unknown/alpha) and it
just works (racket/draw hides all the problems from the programmer). But
I failed to require/typed it correctly. With explicit specification of
object interface that is used, it worked. The bitmap4.rkt contains the
code that I actually used.

>
>> * with-input-from-file does not accept Path or String, only Path-String
>> and the conversion rules are either missing or strange at best.
>> Basically I ended up with just converting to String and casting to
>> Path-String to make everything work.
>>
>> * with-input-from-file also revealed that procedure signatures as types
>> can be very tricky - just passing `read' was not possible, because it
>> accepts some optional arguments. Wrapping it in thunk helped though.
>
> I don't understand what the problem was here; for example this works for me:
>
> #lang typed/racket
> (with-input-from-file "/tmp/x.rkt" read)

Yes, this works. The example in "input" directory shows the problem I
had. Again, the trouble was I was working with a list where the first
element is different from the rest of elements and I wanted to ensure
type consistency for the rest of the code.

>
> and `Path-String` is just the union of Path and String.

Turns out I must have some other error there as well. I cannot reproduce
it on its own and I wasn't tagging the whole tree when I encountered new
problems. Next time I am tagging the repository so that I can get back
to the exact bigger example of the problem and eventually see what
caused it.

>
>> * order of definitions matters. Not that this is unexpected, it is just
>> strange when working with larger code-base where it didn't matter.
>> Actually the error messages were helpful here.
>
> What do you mean by "order of definitions matters" here?

This one is completely lost. Again - it could have been something
different. The problem was that when I reordered the defines for some
functions and types, suddenly it started working.

>
>> * Type annotations of procedures with variadic arguments. The only place
>> where I had to put annotations outside of the procedure definition. It
>> is nothing super-problematic, but it feels inconsistent with the rest.
>
> I would encourage type annotations before the procedure definition in
> all cases -- the `define` form doesn't really have the right places to
> put everything that can go in a function type.

And what about procedures created with lambda? I thought (define
(proc args ...) body ...) is just syntactic sugar for (define proc
(lambda (args ...) body ...)) - so I thought the type annotations would
be the same as well. Or am I missing something here?

>
>> * More modules need to be required everywhere. If module A provides a
>> procedure that accepts a type from module B, all modules using that
>> procedure must also require the module B to know the type. In normal
>> Racket it does not matter as long as you just pass the opaque data along.
>
> Can you give an example?

I tried in the "modules" directory - but frankly, it works here as I
needed it to work. My problem was that when module B was not typed at
all, and modules A and C were, the type checks failed and I had to type
the whole module B. Of course the actual configuration contained way
more definitions and I didn't investigate what triggered the problem. I
assumed what I have written - that it is not possible. Yet my example
shows that it actually works. Let's see what I find when I encounter it
in the future.

>
>> * Syntax macros are extra hard. As I use some syntax trickery to convert
>> semi-regular code to "futurized" one, I basically gave up and just ran
>> everything single-threaded. The main issue is passing type information
>> from the module that uses the macro to the macro and matching it on both
>> sides. Also the macro must split the type annotation from argument names
>> in the syntax pattern when it defines new procedures - I did some ugly
>> hacks to get through it but in the end I just refrained from using them
>> when possible (except for the unsafe-struct macro, of course).
>
> An example here would be great too.

Two examples are in the "syntax" directory. First one with the three
stages of typing a struct. The need for explicitly extracting the field
names differently than without TR is not very convenient. When I tried
to make a "universal" macro that would handle both typed and untyped
variants, I failed (but I bailed out quickly as that was not my goal).

The second example is my common define-futurized syntax macro. The TS
version shows the :1, :2 and ::: as non-validating syntax template
placeholders (just something that matches the expression but does not do
any actual syntax checking). Is there a better way? (I would think so).


I will definitely play with Typed Racket more now. But ultimately I want
to see the performance benefits from this venture. My usual approach is
to structure my modules in a way that there is always uncontracted
implementation and a wrapper module with contracts. Basically I use the
"design by contract" approach (not strictly, but most of the time).
Statically verified code should be in theory faster in the end - so I am
curious if (or more likely "when") it pays off performance-wise.


Cheers,
Dominik


[1] https://gitlab.com/racketeer/tr-mwe

George Neuner

Jan 21, 2021, 3:59:01 AM
to Sorawee Porncharoenwase, racket users

On 1/20/2021 1:53 AM, Sorawee Porncharoenwase wrote:

> > However, 'struct-field-info-list' returns only fields defined by the
> > actual type and does not include fields that were inherited.  What I
> > don't see is any simple way to get at the struct's inheritance hierarchy
> > - it seems that you have to iterate 'struct-type-info' to enumerate the
> > supertypes.
>
> Yes, that’s a (the?) way to extract field names of supertypes. The reason I designed it in that way is that field name is a concept for each level in the hierarchy, not across levels. You can’t have two identical field names in a level, but you can have identical field names across levels. So lumping field names across levels together doesn’t look like a good idea to me.


Yes, but fields defined in an ancestor are visible in its descendants.  I understand the need to distinguish the individual types, but  'struct-type-info'  does that already.  IMO there should be an easy way to get information on *all* the fields at once - potentially including which supertype(s) defined them.
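
For illustration, the iteration over 'struct-type-info' mentioned above can be sketched at run time as follows. This is only a sketch under stated assumptions: the `animal` and `dog` structs and the `struct-levels` helper are made up here, the structs are declared #:transparent so that the current inspector may look inside them, and run-time introspection yields per-level field counts, not the field names that 'struct-field-info-list' reports.

```racket
#lang racket/base

;; Hypothetical two-level hierarchy, transparent so it can be inspected.
(struct animal (name) #:transparent)
(struct dog animal (breed) #:transparent)

;; Walk the super-type chain of a struct type descriptor and return one
;; (name . field-count) pair per level, most-derived level first.
(define (struct-levels st)
  (let loop ([st st] [acc '()])
    (define-values (name init-cnt auto-cnt ref-proc set-proc immutables super skipped?)
      (struct-type-info st))
    ;; init-cnt and auto-cnt count only the fields declared at this level.
    (define entry (cons name (+ init-cnt auto-cnt)))
    (if super
        (loop super (cons entry acc))
        (reverse (cons entry acc)))))

(struct-levels struct:dog)
```

Each level is reported separately, so identical field names on different levels would not collide; flattening the result into one list is then a decision left to the caller.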

YMMV,
George

Sam Tobin-Hochstadt

Jan 21, 2021, 1:43:24 PM
to Dominik Pantůček, Racket Users
On Wed, Jan 20, 2021 at 5:16 PM Dominik Pantůček
<dominik....@trustica.cz> wrote:
>
> Hi Sam,
>
> I went through all my notes and prepared minimal (sometimes) working
> examples for most of the issues I mentioned. Let's go through it one by
> one. I assume that some of the complications I encountered were because
> of my lack of experience with Typed Racket. I hope some of these examples
> will be useful anyway.
>
> All the examples are in a git repository on Gitlab [1].

This is great, thanks! I've opened several issues for fixing some of
these problems.

Yikes, this is pretty terrible. Here's a nice simple version that works, though:

(define (typed-extract-data
         (tagged-data : (Pairof Symbol (Listof (Listof (Listof Symbol))))))
  : Number
  (define data (car (cdr tagged-data)))
  (define lengths (map (inst length Symbol) data))
  1)

What's going on here is:
1. The type for `cadr` is not quite smart enough -- if it knows that
you have an at-least two element list, then it will produce the second
element type. Otherwise, if you have a (Listof T), it produces T. But
if you have a pair of a T and (Listof S), that ends up with (Union S
T).
2. I think there's a missing `syntax/loc` in the implementation of
`cast`, which resulted in the bad error message you saw.

I also think you're jumping too quickly to use `cast` -- before you do
that, you should try `ann`, for example, with the types of `length`
that you tried, which would have accomplished the same thing and
doesn't hide type errors or impose runtime costs.
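
A minimal sketch of the difference between `ann` and `cast`, assuming a made-up `tagged` value with the same shape as in the example above. The names `tagged`, `data`, `data/cast` and `lengths` are only illustrative.

```racket
#lang typed/racket

;; Hypothetical data: a tag followed by lists of lists of symbols.
(define tagged : (Pairof Symbol (Listof (Listof (Listof Symbol))))
  (cons 'tag (list (list (list 'a 'b) (list 'c)))))

;; `ann` only asks the type checker to confirm a type it can already
;; prove, so there is no runtime check behind it.
(define data (ann (car (cdr tagged)) (Listof (Listof Symbol))))

;; `cast` is checked when the program runs; a wrong type here would
;; pass the type checker and fail only at run time.
(define data/cast (cast (car (cdr tagged)) (Listof (Listof Symbol))))

;; `ann` also works for picking the type at which `length` is used.
(define lengths (map (ann length (-> (Listof Symbol) Index)) data))
```

If the `ann` form is rejected, the type error points at a real mismatch, which is the information `cast` would have hidden.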

> The actual problem I was referring to here is nicely shown in the
> "union" directory and it is about typing hash-union procedure.
>
> I do not know what solution should be considered correct if there were
> multiple differently-typed hash-tables in the same module. The explicit
> type information in require/typed is definitely not a good idea - but I
> found no better way.

Well, first we should make `racket/hash`'s exports work automatically.
But absent that, there are a few things going on here:

First, using `Any` is forgetting information, which is why the version
with `Any` everywhere produced an unsatisfactory result. In general,
one misconception I see regularly with Typed Racket users is using a
general type (such as `Any`) and then expecting Typed Racket to
specialize it when it's used. If you want that, you'd need to write a
polymorphic type, like this:

(hash-union (All (a b)
              (-> (Immutable-HashTable a b)
                  (Immutable-HashTable a b)
                  (Immutable-HashTable a b))))

Unfortunately, while this is the type you want, it can't work here
because contracts for hashes are limited in various ways. There are a
couple options to work around that.
1. Use `unsafe-require/typed` from `typed/racket/unsafe`. This works
with the type as written, but obviously if you make a mistake things
can go poorly.
2. Use a more restrictive type, such as `Any` or `Symbol` for the keys.
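
Option 1 could look roughly like the sketch below. The tables `h1` and `h2` are made up, and only the two-argument use of `hash-union` is covered by this type (no `#:combine`, so overlapping keys would raise an error at run time). Since no contract is generated, a wrong type here goes undetected.

```racket
#lang typed/racket
(require typed/racket/unsafe)

;; No contract is generated for this import; the type is simply trusted.
(unsafe-require/typed racket/hash
  [hash-union (All (a b)
                (-> (Immutable-HashTable a b)
                    (Immutable-HashTable a b)
                    (Immutable-HashTable a b)))])

;; Hypothetical tables with disjoint keys.
(define h1 : (Immutable-HashTable Symbol Integer) (hash 'a 1))
(define h2 : (Immutable-HashTable Symbol Integer) (hash 'b 2))

(define merged (hash-union h1 h2))
```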

> >> * Classes: My notes say "AAAAAAAAAAAAAAAAAAAA". Which roughly says it
> >> all. Although I managed to back down to only using bitmap% class,
> >> properly typing all procedures using it was a nightmare. By properly I
> >> mean "it compiles and runs".
> >
> > More detail here would be helpful as well.
>
> The task was so simple and the solution so ugly. See the "classes"
> directory with 4 steps of typing a simple program to load a bitmap.
>
> Yes, normally I use (make-object bitmap% filename 'unknown/alpha) and it
> just works (racket/draw hides all the problems from the programmer). But
> I failed to require/typed it correctly. With explicit specification of
> object interface that is used, it worked. The bitmap4.rkt contains the
> code that I actually used.

Fortunately, this one has an easy solution: if you require
`typed/racket/draw`, then everything just works. Libraries that are
available like this are described here:
https://docs.racket-lang.org/ts-reference/Libraries_Provided_With_Typed_Racket.html

> >> * with-input-from-file does not accept Path or String, only Path-String
> >> and the conversion rules are either missing or strange at best.
> >> Basically I ended up with just converting to String and casting to
> >> Path-String to make everything work.
> >>
> >> * with-input-from-file also revealed that procedure signatures as types
> >> can be very tricky - just passing `read' was not possible, because it
> >> accepts some optional arguments. Wrapping it in thunk helped though.
> >
> > I don't understand what the problem was here; for example this works for me:
> >
> > #lang typed/racket
> > (with-input-from-file "/tmp/x.rkt" read)
>
> Yes, this works. The example in "input" directory shows the problem I
> had. Again, the trouble was I was working with a list where the first
> element is different from the rest of elements and I wanted to ensure
> type consistency for the rest of the code.

This, unfortunately, is just a case where Typed Racket is making you
face some potential problems that you might want to ignore. `read` can
produce anything, and you have to deal with that in Typed Racket.

That said, I might do things a bit differently than you ended up doing.

For example, you wrote `(with-input-from-file ... (cast read (->
DataDef)))` but I would probably do `(cast (with-input-from-file ...
read) DataDef)` because first-order checks are faster than
higher-order ones, and I also mostly try to use `assert` together with
predicates if you have them, because plain functions are usually
faster than contract system-generated ones.
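
As a rough sketch of both suggestions: the `DataDef` type below is a simplified stand-in (the real one is not shown in this thread), `read-one` is a hypothetical helper, and a string port is used in place of a file so the example is self-contained.

```racket
#lang typed/racket

;; Simplified, hypothetical stand-in for the DataDef type.
(define-type DataDef (Listof Symbol))

;; Hypothetical helper: read one datum from a string instead of a file.
(define (read-one [s : String]) : Any
  (read (open-input-string s)))

;; Cast the value that was read (a first-order check on the data),
;; not the `read` procedure itself (a higher-order check).
(define data (cast (read-one "(a b c)") DataDef))

;; With an existing predicate, `assert` narrows the type by calling
;; a plain function.
(define n (assert (read-one "42") exact-integer?))
```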


> >> * Type annotations of procedures with variadic arguments. The only place
> >> where I had to put annotations outside of the procedure definition. It
> >> is nothing super-problematic, but it feels inconsistent with the rest.
> >
> > I would encourage type annotations before the procedure definition in
> > all cases -- the `define` form doesn't really have the right places to
> > put everything that can go in a function type.
>
> And what about procedures created with lambda? I thought the (define
> (proc args ...) body ...) is just a syntactic sugar around (define proc
> (lambda (args ...) body ...)) - so I thought the type annotations are
> the same as well. Or do I miss something here?

Indeed, using the (proc args ...) style is mostly a simple shortcut (but
see exceptions wrt keyword/optional arguments). This is true for Typed
Racket too. For annotating `lambda`s directly, usually the trickier
cases are less significant. If you run into them, then using `ann`
will produce the same results as the top-level annotation.
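
A small sketch of the two styles for a variadic procedure; `sum-all` and `sum-all/lambda` are made-up names used only for illustration.

```racket
#lang typed/racket

;; Annotation placed before the definition; the rest argument is
;; described by the `Integer *` part of the function type.
(: sum-all (-> Integer * Integer))
(define (sum-all . xs)
  (apply + xs))

;; The same type attached to an anonymous procedure with `ann`.
(define sum-all/lambda
  (ann (lambda xs (apply + xs))
       (-> Integer * Integer)))
```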

> >> * Syntax macros are extra hard. As I use some syntax trickery to convert
> >> semi-regular code to "futurized" one, I basically gave up and just ran
> >> everything single-threaded. The main issue is passing type information
> >> from the module that uses the macro to the macro and matching it on both
> >> sides. Also the macro must split the type annotation from argument names
> >> in the syntax pattern when it defines new procedures - I did some ugly
> >> hacks to get through it but in the end I just refrained from using them
> >> when possible (except for the unsafe-struct macro, of course).
> >> commits, 2292-line diff!) and genuinely (it really helped).
> >
> > An example here would be great too.
>
> Two examples are in the "syntax" directory. First one with the three
> stages of typing a struct. The need for explicitly extracting the field
> names differently than without TR is not very convenient. When I tried
> to make a "universal" macro that would handle both typed and untyped
> variants, I failed (but I bailed out quickly as that was not my goal).

Note that just handling all cases of `struct`, even in untyped Racket,
will require the same thing.

> The second example is my common define-futurized syntax macro. The TS
> version shows the :1, :2 and ::: as non-validating syntax template
> placeholders (just something that matches the expression but does not do
> any actual syntax checking). Is there a better way? (I would think so).

Yes, you want to use the "literals" feature in `syntax-case`. In other
words, you'd write:

(define-syntax (define-futurized stx)
  (syntax-case stx (:)
    ((_ (proc (start : stype) (end : etype)) : RetType body ...)
     .....)))
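
To make the literals idea concrete, here is a self-contained sketch. `define-ranged` is a hypothetical stand-in that only rebuilds a typed `define`; it does not do what the real define-futurized does, and `span` is a made-up procedure.

```racket
#lang typed/racket
(require (for-syntax racket/base))

;; `:` is listed as a literal, so the pattern matches only when the
;; colon itself appears in those positions; no placeholder pattern
;; variables are needed.
(define-syntax (define-ranged stx)
  (syntax-case stx (:)
    [(_ (proc (start : stype) (end : etype)) : ret-type body ...)
     #'(define (proc [start : stype] [end : etype]) : ret-type
         body ...)]))

;; Hypothetical use: a typed two-argument procedure.
(define-ranged (span (lo : Integer) (hi : Integer)) : Integer
  (- hi lo))
```

A use that puts anything other than `:` in the literal positions is rejected as bad syntax, which gives the checking that the placeholder version lacked.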

Dominik Pantůček

Apr 15, 2021, 4:21:25 PM
to racket...@googlegroups.com
Hello Racketeers,

>> * Math/Fixnums/Flonums: All fx+/-/*/... accept two arguments only. No
>> unary fl-, no variadic-argument fl+ or fxior (this one hurt the most).
>
> These definitely became variadic after the type definitions were
> written, but that's of course not an excuse for not updating them.
>

Are these planned for an update anytime soon? I tested with the latest
git and the functions are still two-argument only.

If there are no plans, I can probably look into that and send a PR
if I am successful.


Cheers,
Dominik

Sam Tobin-Hochstadt

Apr 15, 2021, 4:26:27 PM
to Dominik Pantůček, Racket Users
A PR extending the types for those functions would be very welcome.
Let me know if you need any help.

Sam