Ah, I should have replied to this email before, not the other one. Well,
I'm catching up.
I think the next step is to wait for me to be free to work on it.
IIRC, there were two issues with TR-ing plot:
1. Typing functions that take optional arguments *and* keyword
arguments.
2. Mutable data types in input.
#1 is fixed *enough* now to put an actual TR face on plot without making
users write a bunch of `#f' in every function application. The issue
would be writing all the huge `case->' types. Not fun. TR needs a nicer
way to specify optional arguments---maybe something corresponding to a
`->*' contract.
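To make the size problem concrete, here is a minimal sketch. The function `scale-point' is made up for illustration and is not part of plot. It has one required and two optional arguments, and each optional argument adds another clause to the `case->' type.

```racket
#lang typed/racket

;; Hypothetical function (not from plot): one required argument, two optional.
;; Every optional argument needs its own `case->` clause.
(: scale-point (case-> (Real -> Real)
                       (Real Real -> Real)
                       (Real Real Real -> Real)))
(define (scale-point x [factor 1] [offset 0])
  (+ (* x factor) offset))

(scale-point 2)      ; both defaults
(scale-point 2 3)    ; overrides factor
(scale-point 2 3 1)  ; overrides factor and offset
```

With 2-5 optional arguments per function the clause list grows accordingly, which is the part that gets tedious to write and easy to get wrong.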
An example of #2 is `points', which takes a (Listof (Vectorof Real)) IIRC.
The problem is that you can't send a (Listof (Vectorof Integer)) to a
function that accepts just that type, because vectors are mutable and
could be used as a communication channel. (Plot won't mutate them into
reals, but TR can't know that.) Fixing this requires either waiting for
possibly research-level changes to TR, or changing plot's API.
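A small sketch of the vector problem, assuming a made-up function `first-coords' standing in for something like `points' (its name and type are illustrative, not plot's actual API):

```racket
#lang typed/racket

;; Hypothetical stand-in for a plot function that takes vectors of reals.
(: first-coords (-> (Listof (Vectorof Real)) (Listof Real)))
(define (first-coords vs)
  (map (λ ([v : (Vectorof Real)]) (vector-ref v 0)) vs))

(define real-vec : (Vectorof Real) (vector 1 2))
(define int-vec : (Vectorof Integer) (vector 1 2))

(first-coords (list real-vec))    ; accepted
;; (first-coords (list int-vec))  ; rejected by the type checker:
;;   (Vectorof Integer) is not a subtype of (Vectorof Real)

;; Lists are immutable, so the same idea with lists of lists is accepted.
(: first-coords/list (-> (Listof (Listof Real)) (Listof Real)))
(define (first-coords/list ps)
  (map (λ ([p : (Listof Real)]) (first p)) ps))

(define int-list : (Listof Integer) (list 1 2))
(first-coords/list (list int-list))
```

The commented-out call is the one users would naturally write, which is why the choice is between changes to TR and a change to plot's API.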
I wouldn't mind changing the API; it would be nice to have things like
`points' accept sequences anyway. I haven't put much thought into what
would be in the sequences, though.
So I suppose that would be a nice next step. What's an appropriate type
for pairs/triplets/tuples of numbers? Should we let it accept
(Sequenceof (U (Listof Real) (Vectorof Real)))? (Would that subtype
nicely?) Sequences of sequences?
Neil ⊥
____________________
Racket Users list:
http://lists.racket-lang.org/users
More easily than the other way around. Also, that would work around the
covariance problem with vectors. I could change the contract to accept a
sequence of vectors or lists, and use the type (Sequenceof (List Real
Real)).
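As a rough sketch of what that type would accept, assuming a made-up function `xs-of' (not part of plot) typed with (Sequenceof (List Real Real)):

```racket
#lang typed/racket

;; Hypothetical function: collects the first coordinate of every point.
(: xs-of (-> (Sequenceof (List Real Real)) (Listof Real)))
(define (xs-of ps)
  (map (λ ([p : (List Real Real)]) (first p))
       (sequence->list ps)))

;; A list of integer pairs is accepted.
(xs-of (list (list 1 2) (list 3 4)))

;; A vector of integer pairs is accepted too: the vector is only the
;; outer sequence, and the points themselves are immutable lists.
(define pts : (Vectorof (List Integer Integer))
  (vector (list 1 2) (list 3 4)))
(xs-of pts)
```

Points given as vectors would still have to be converted to lists before reaching a function with this type, which is the job the contract on the untyped side would take on.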
I still need a `->*' type constructor, because most plot functions have
2-5 optional arguments and 10-20 arguments altogether. For the largest
functions, a `case->' type would be huge and easy to get wrong. Also,
I'm morally opposed to types that don't fit on my screen.
I'll get around to submitting a formal request, after I get this Poisson
distribution quantile function working...
On 11/03/2012 09:20 AM, Matthias Felleisen wrote:
> Last night Sam, Tony and I had a discussion on TR/R boundaries
> for his "racket on a router" project. Tony ported his software
> from Racket to Typed Racket and stopped halfway in between. The
> 'framework' (aka kernel) is now in TR and the 'user program' aka
> 'client' lives in R. I had predicted that the boundary between the
> two would cause a severe performance hit, and Tony has now confirmed
> this conjecture. (We are talking factors not small percentages.)
>
> As you get racket/math ready for production, I think you too should
> measure the performance hit from going across the boundary.
I've already done that somewhat. First-order functions are okay; all the
new flonum functions, special functions, bigfloats, number theory, etc.,
run at a decent clip on the untyped side of the contract boundary. For
example, I get 4 million `gamma' applications per second in TR, and 1.2
million per second untyped.
(I think the difference for `gamma' is more about how well Vincent and
Matthew have done with TR's optimizer and the JIT. Thanks to their work,
computing gamma comes down to about 100 flops running right on the CPU.)
Higher-order functions, though, are dog slow. In particular, all the
array functions are higher-order, because an array is just a function
with a rectangular domain; e.g. `array-map' is composition. Here's a
program that times computing the elements of an array:
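#lang racket
(require math/array)
;; 3x3 array whose element at index #(j0 j1) is j0 + j1
(define arr
  (build-array #(3 3) (λ (js)
                        (match-define (vector j0 j1) js)
                        (+ j0 j1))))
arr
;; Five timed runs of 50000 calls to `array-strict' each
(for ([_ (in-range 5)])
  (time (for ([_ (in-range 50000)])
          (array-strict arr))))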
This is the output I get:
(array [[0 1 2] [1 2 3] [2 3 4]])
cpu time: 2680 real time: 2684 gc time: 170
cpu time: 2650 real time: 2659 gc time: 140
cpu time: 2660 real time: 2662 gc time: 170
cpu time: 2650 real time: 2653 gc time: 170
cpu time: 2660 real time: 2660 gc time: 160
Changing the language to "typed/racket", I get this:
(array [[0 1 2] [1 2 3] [2 3 4]])
cpu time: 90 real time: 90 gc time: 20
cpu time: 70 real time: 77 gc time: 0
cpu time: 80 real time: 75 gc time: 10
cpu time: 80 real time: 77 gc time: 10
cpu time: 80 real time: 82 gc time: 20
So here, the contract boundary slows things down about 33x (roughly 2660
ms vs. 80 ms of cpu time per run).
Huh. I just tried `make-array', which creates the array's function in TR
code, and I get 53x. I didn't expect that. I also get 20x for using
distribution objects, which are immutable structs that contain functions
that are only created in TR code.
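The per-call cost of a wrapped function can be sketched in plain Racket. This is only an illustration of the mechanism, not the contract TR actually generates for arrays: `apply-n', `raw' and `wrapped' are made-up names, and the contract is a simple first-order one attached by hand with `contract'.

```racket
#lang racket

;; Hypothetical driver: calls `proc` on 0 .. n-1 and sums the results.
(define (apply-n proc n)
  (for/fold ([acc 0]) ([i (in-range n)])
    (+ acc (proc i))))

(define raw (λ (i) (* 2 i)))

;; `contract` wraps `raw`; every call through `wrapped` checks the
;; argument and the result against the function contract.
(define wrapped
  (contract (-> exact-nonnegative-integer? exact-nonnegative-integer?)
            raw
            'typed-side 'untyped-side))

(time (apply-n raw 1000000))
(time (apply-n wrapped 1000000))
```

Both calls compute the same sum; only the second pays for a check on every application, which is the kind of overhead that multiplies when an array's elements are computed by calling a wrapped function.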
> If it
> is bad, we should consider including both a typed and an untyped
> variant where the latter is generated from the former (I believe
> you are working in TR so that's why I wrote the last sentence).
> That is, when the library is installed the Untyped one should be
> generated by disabling types and type checking.
We should consider it, then, unless there's a way to significantly speed
up a type's generated, higher-order contracts.
I'm a bit confused about how this would help, though. The interface
between the library and the user will still have to be contracted, so
where does the performance gain come from?