Typed Racket has lowered my performance

272 views
Skip to first unread message

HiPhish

unread,
Dec 2, 2017, 5:20:24 AM12/2/17
to Racket Users
Hello everyone,

A while ago I had written a MessagePack library Racket:
https://pkgs.racket-lang.org/package/msgpack

Yesterday I made a change to narrow down some of the types from `Any` (which is
wrong) to `Packable`. This has tanked my performance to the point where the
tests for non-scalar types (vectors and hashes) time out on the package server,
and thus fail.
https://gitlab.com/HiPhish/MsgPack.rkt/commit/0b6cdc7115389db97a8de2a5175c1feb4c939f8f

Please let me provide some context first: MessagePack is a serialisation format
similar to JSON, except binary, so it should be smaller and faster. Only
certain types can be packed (serialised) and unpacked (deserialised). Here is
the specification:
https://github.com/msgpack/msgpack/blob/master/spec.md

TL;DR: numbers, strings, byte string, booleans, some null element (I use void),
and arrays and dictionaries of any of these. I grouped the union of these types
into the compound type `Packable`:
https://gitlab.com/HiPhish/MsgPack.rkt/blob/master/packable.rkt#L24

I have also defined a predicate for this new type:
https://gitlab.com/HiPhish/MsgPack.rkt/blob/master/packable.rkt#L36

As you can see the `Packable` type is recursive; if an object is a vector or a
hash table then all its elements have to be packable as well. Could this be the
reason for the performance loss? Another thing I noticed is that the predicate
does not provide any proposition, if it succeeds the type checker is not
informed that the object is of type `Packable`. Another is that Typed Racket
seems unable to narrow down the type of the result. Take a look at this
example:

    > (require msgpack)
    > (define bs (call-with-output-bytes (λ (out) (pack 3 out))))
    > bs
    - : Bytes
    #"\3"
    > (define v (call-with-input-bytes bs (λ (in) (unpack in))))
    > v
    - : Packable
    3
    > (+ v 3)
    ; readline-input:8:3: Type Checker: type mismatch
    ;   expected: Number
    ;   given: Packable
    ;   in: v
    ; [,bt for context]
    > (exact-integer? v)
    - : Boolean
    #t

So the type checker can figure out that v is an integer, but it cannot treat it
as one for addition.

Ben Greenman

unread,
Dec 2, 2017, 12:54:09 PM12/2/17
to HiPhish, Racket Users
> This has tanked my performance to the point where the tests for
> non-scalar types (vectors and hashes) time out on the package
> server, and thus fail.
> https://gitlab.com/HiPhish/MsgPack.rkt/commit/0b6cdc7115389db97a8de2a5175c1feb4c939f8f

The performance is probably because HashTable and Vector values can be
mutable, so Typed Racket needs to do extra work to protect them.

To test this, I cloned msgpack, removed `Vector` from the `Packable`
type, and changed `HashTable` to `Immutable-Hashtable`. Change here:
https://gitlab.com/bennn1/MsgPack.rkt/commit/2a711200ad0efa1b974a3e5454f69d7004a74996

When I run the tests in test/pack/map.rkt, I see:
- 40 seconds with the types on master
- 13 seconds with immutable types
- 9 seconds on commit ac2b005 (before changing the types?)

I'm working on a pull request to add Immutable-Vector to Typed Racket.
That should be ready for the 6.12 release, and then msgpack can make
the `Packable` type immutable.
https://github.com/racket/typed-racket/pull/575

If that PR doesn't solve the performance problem, then I'd be happy to
keep looking for something that does.


> Another thing I noticed is that the predicate
> does not provide any proposition, if it succeeds the type checker is not
> informed that the object is of type `Packable`.

It *should* be possible to make the predicate a proposition by
changing the type of `packable?` to `(-> Any Boolean : Packable)`. But
I couldn't get this to type check.
Here's a simpler case that did work: http://pasterack.org/pastes/92542

But it's probably easier to just use `(define-predicate packable? Packable)`


> Another is that Typed Racket
> seems unable to narrow down the type of the result. Take a look at this
> example:
>
> > (require msgpack)
> > (define bs (call-with-output-bytes (λ (out) (pack 3 out))))
> > bs
> - : Bytes
> #"\3"
> > (define v (call-with-input-bytes bs (λ (in) (unpack in))))
> > v
> - : Packable
> 3
> > (+ v 3)
> ; readline-input:8:3: Type Checker: type mismatch
> ; expected: Number
> ; given: Packable
> ; in: v
> ; [,bt for context]
> > (exact-integer? v)
> - : Boolean
> #t
>
> So the type checker can figure out that v is an integer, but it cannot treat
> it as one for addition.

The type checker doesn't know that `v` is an integer. It just knows
`v` is Packable.

HiPhish

unread,
Dec 2, 2017, 1:35:02 PM12/2/17
to Racket Users
> The performance is probably because HashTable and Vector values can be
> mutable, so Typed Racket needs to do extra work to protect them.
>
> To test this, I cloned msgpack, removed `Vector` from the `Packable`
> type, and changed `HashTable` to `Immutable-Hashtable`. Change here:
> https://gitlab.com/bennn1/MsgPack.rkt/commit/2a711200ad0efa1b974a3e5454f69d7004a74996
Thanks, I will change the hash tables to immutable, but I am going to keep the
vector as it is until immutable vectors land in Typed Racket. A slow
implementation is better than an incomplete implementation.


> When I run the tests in test/pack/map.rkt, I see:
> - 40 seconds with the types on master
> - 13 seconds with immutable types
> - 9 seconds on commit ac2b005 (before changing the types?)

Why is the version with immutable types still slower? Is it because it has to
recursively check the contents of the hash instead of just being satisfied with
anything? See, this is the thing that confuses me about Typed Racket, I thought
that making types more specific should improve performance or at least leave it
as it is because Racket does not have to check types at runtime unless
explicitly requested. Is it because the tests are written in untyped Racket and
specifying the Packable type generates a more expensive contract than the Any
type would?


> I'm working on a pull request to add Immutable-Vector to Typed Racket.
> That should be ready for the 6.12 release, and then msgpack can make
> the `Packable` type immutable.
> https://github.com/racket/typed-racket/pull/575
>
> If that PR doesn't solve the performance problem, then I'd be happy to
> keep looking for something that does.
Racket 6.12 should come out around the end of January, right?



> It *should* be possible to make the predicate a proposition by
> changing the type of `packable?` to `(-> Any Boolean : Packable)`. But
> I couldn't get this to type check.
> Here's a simpler case that did work: http://pasterack.org/pastes/92542
>
> But it's probably easier to just use `(define-predicate packable? Packable)`
I think the problem is that Packable contains mutable types, which according to
the reference manual is not allowed:
https://docs.racket-lang.org/ts-reference/special-forms.html?q=make-predicate#%28form._%28%28lib._typed-racket%2Fbase-env%2Fprims..rkt%29._make-predicate%29%29



> The type checker doesn't know that `v` is an integer. It just knows
> `v` is Packable.
Right, that makes sense. Number and Packable intersect, but Packable is not a
subtype of Number.

HiPhish

unread,
Dec 2, 2017, 6:24:49 PM12/2/17
to Racket Users
Now that I think about it, changing the types to be immutable is not really
correct either. There is no reason users should not be able to serialise a
mutable list, vector or hash table, just as they can serialise any mutable
scalar as well.

The result of unpacking bytes could be immutable, but would that make any
difference?

Sam Tobin-Hochstadt

unread,
Dec 2, 2017, 6:36:16 PM12/2/17
to HiPhish, racket users list
I don't think the mutable/immutable issue should be as significant as it seems here. Using the Any type shouldn't perform better the way you describe, so we need to look more at what the actual contracts are doing.

Sam

--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

HiPhish

unread,
Dec 3, 2017, 9:29:26 AM12/3/17
to Racket Users
Is there anything I can do to help investigate the issue? I have reverted my commit for the time being, and it's a difference like day and night.

Sam Tobin-Hochstadt

unread,
Dec 3, 2017, 9:33:10 AM12/3/17
to HiPhish, Racket Users
Running the contract profiler [1] on your code would be quite helpful.

[1] https://docs.racket-lang.org/contract-profile/index.html

Sam
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users...@googlegroups.com.

HiPhish

unread,
Dec 3, 2017, 12:07:09 PM12/3/17
to Racket Users
Here is what happens when I run one of the array tests with the more
restrictive type specifications:

    OK, passed 100 tests.
    Running time is 70.75% contracts
    75/106 ms
   
    (-> (recursive-contract (or/c (and/c hash? (and/c hash-equal ... 75 ms
    (lib msgpack/pack.rkt):24:9
        pack                                                         75 ms

After reverting the commit I get zero overhead:

    OK, passed 100 tests.
    Running time is 0% contracts
    0/7 ms

The contract makes up 70% of the total runtime. I also looks like there is no
contract generated after reverting. Should I run the profiler on some other
tests as well?

Sam Tobin-Hochstadt

unread,
Dec 3, 2017, 12:11:42 PM12/3/17
to HiPhish, racket users list
Thanks, that's very helpful. It's clear that the contract optimization is working in the old code but not the new code, and we need to fix that.

Sam

--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users+unsubscribe@googlegroups.com.

HiPhish

unread,
Dec 3, 2017, 6:53:12 PM12/3/17
to Racket Users
Anything more I can do?

Sam Tobin-Hochstadt

unread,
Dec 3, 2017, 8:03:33 PM12/3/17
to HiPhish, Racket Users
Ok, looking more at the commit, I think this is not actually a bug
anywhere, but really an unfortunate combination of things that I don't
have an idea for improving at the moment. Sorry for not realizing that
earlier.

First, I think you should just go back to using `Any` in the type for
`pack` and for `pack-hash` and `pack-sequence`. I don't think that
will cause any problems for you or users of your code, and will make
the performance problem go away. For functions where `Packable`
appears in the _result_, such as `unpack`, you should keep using
`Packable` -- that will be both faster and will avoid contract errors
that your users might otherwise encounter.

More generally, what happened here was that you made the types you
export _more restrictive_ by changing `Any` to `Packable` in the
argument types. That is, in the old code, anyone could pass any value
to `pack` and either get the result or a dynamic error. That means
that Typed Racket could generate a very cheap contract for `pack`. By
changing it to a more restrictive type, you get to assume in the body
of `pack` that the input is `Packable`, but Typed Racket then
generates a very expensive contract to check that. From your code, it
doesn't look like you're making use of that additional assumption, so
it's just costing you a lot of performance.

As to why that contract is so expensive, the short answer is that
contracts for mutable data like procedures and hash tables have to
construct wrapper objects, which involves a lot of extra allocation
and indirection, on top of the usual expense of contract checking.
That's why things got faster with Ben's modifications.

Finally, why use `Packable` in the result type of `unpack`? Here, the
contract for `Any` isn't simple and inexpensive, since you're sharing
potentially arbitrary values with untyped code, so Typed Racket
constructs a complicated contract (called `any-wrap/c`) in order to
protect it. That contract will also error in cases where it doesn't
know what to do, which the contract for `Packable` won't.

Sam
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users...@googlegroups.com.

HiPhish

unread,
Dec 4, 2017, 6:56:32 AM12/4/17
to Racket Users
When I change the return type of `unpack` to `Packable` instead of an explicit union of types the map packing test (`test/pack/map.rkt`) hangs.
https://gitlab.com/HiPhish/MsgPack.rkt/blob/master/unpack.rkt#L83
https://gitlab.com/HiPhish/MsgPack.rkt/blob/master/test/pack/map.rkt

Sam Tobin-Hochstadt

unread,
Dec 4, 2017, 5:59:10 PM12/4/17
to HiPhish, Racket Users
I'm not able to replicate that (see transcript below). Is there
something else I should be doing?

[samth@huor:~/tmp/MsgPack.rkt (master) plt] git diff
diff --git a/unpack.rkt b/unpack.rkt
index 30c87a9..3806548 100644
--- a/unpack.rkt
+++ b/unpack.rkt
@@ -81,7 +81,7 @@
[else (error "Unknown tag " tag-var)])]))

(: unpack (-> Input-Port
- (U Void Boolean Integer Real String Bytes (Vectorof
Packable) (Listof Packable) (HashTable Packable Packable) Ext)))
+ Packable))
(define (unpack in)
(define tag (read-byte in))
(cond
[samth@huor:~/tmp/MsgPack.rkt (master) plt] racket -l msgpack/test/pack/map
[samth@huor:~/tmp/MsgPack.rkt (master) plt] raco test -l msgpack/test/pack/map
raco test: (submod "/home/samth/tmp/MsgPack.rkt/test/pack/map.rkt" test)
OK, passed 100 tests.
OK, passed 100 tests.
2 tests passed

HiPhish

unread,
Dec 8, 2017, 6:06:16 PM12/8/17
to Racket Users
No, I did the same thing, and only the first hundred tests work normally, the other 100 hand for several minutes. Maybe my computer is too weak, it's an early 2009 iMac with a 2.66GHz Core2Duo and 8GB of RAM. I also ran `raco setup msgpack` after making the change to the source file to make sure everything gets compiled properly.
Reply all
Reply to author
Forward
0 new messages