Enhanced Primitive Support

Showing 1-234 of 234 messages
Enhanced Primitive Support Rich Hickey 6/17/10 1:13 PM
I've been doing some work to enhance the performance, and unify the
semantics, of primitives, in three branches. I've started to document
this work here:

https://www.assembla.com/wiki/show/clojure/Enhanced_Primitive_Support

Feedback welcome,

Rich
Re: Enhanced Primitive Support Rob Lachlan 6/17/10 2:10 PM
I think the enhanced support for primitives is fantastic.  I'm looking
forward to doing more numerical work in clojure.

Quibble:  Using a multiple-recursive algorithm for calculating
fibonnaci values.

user> (defn fib-2 [n] (if (>= n 1)
                  (loop [i 0 f0 0 f1 1]
                    (if (== i n) f0 (recur (inc i)
                                           f1
                                           (+ f0 f1))))))
#'user/fib-2
user> (time (fib-2 38))
"Elapsed time: 0.12 msecs"
39088169

I understand that it's just for demonstration purposes, but
impressionable people (like me!) might not know better.
(Of course, we can calculate fibonacci values using the closed form
formula, but that's neither here nor there.)

Other than that, looks great.

Rob
Re: Enhanced Primitive Support Rich Hickey 6/17/10 2:15 PM


On Jun 17, 5:10 pm, Rob Lachlan <robertlach...@gmail.com> wrote:
> I think the enhanced support for primitives is fantastic.  I'm looking
> forward to doing more numerical work in clojure.
>
> Quibble:  Using a multiple-recursive algorithm for calculating
> fibonnaci values.
>
> user> (defn fib-2 [n] (if (>= n 1)
>                   (loop [i 0 f0 0 f1 1]
>                     (if (== i n) f0 (recur (inc i)
>                                            f1
>                                            (+ f0 f1))))))

Naive fib is often used as a benchmark, the point being not to do a
fast fib, but comparative, 'what is the perf on naive fib'? Not being
tail-recursive, it tests function calling overhead, and thus here
demonstrates what we gain by not having to box on args and returns.

Rich


Re: Enhanced Primitive Support Rob Lachlan 6/17/10 2:20 PM
Ah -- well that makes sense then.  Bravo!

Rob
Re: Enhanced Primitive Support puzzler 6/17/10 6:55 PM
It's great to be discussing the possibility of enhanced primitive support.

I like most of the proposals.  But this part troubles me:

# For bigints, specify bigints
    * new literal bigint format 42N
# bigints contagious, no more auto-reductions
    * (\+ 0N 21 21) => 42N

One of the big reasons I prefer Clojure over languages like Scala/F#
is that in Clojure, numbers "just work".  I don't have to worry about
overflow errors, and the use of numbers in sets/hash tables is
generally pretty intuitive (it gets a little wonky if you're using
numbers produced by Java methods, but at least you're good if you stay
within Clojure).

As a case in point, consider writing a naive recursive factorial function.

(defn fact [n]
  (if (zero? n) 1 (* n (fact (dec n)))))

In Clojure, (fact 3) produces the Integer 6.  (fact 40) produces the
BigInteger 815915283247897734345611269596115894272000000000.  It just
works.  The answers produced by fact can be used in collections,
compared for equality with other numbers.

In F#, Scala, if you want factorial to not break on something like an
input of 40, you have to essentially use the bigint literal syntax
like what you're proposing:
(defn fact [n]
  (if (zero? n) 1N (* n (fact (dec n)))))

Now, the results of this function are "contaminating bigints" which
can't be used as effectively with other Clojure data structures.  As
you pointed out, it causes problems when the same number can have two
different type representations depending on the sequence of
computations used to get to that number.  This will always be somewhat
of a problem due to Java interop, but it's important to me that all
numbers arrived at through Clojure computations just work with respect
to equality and use in sets/hash tables.  The lack of auto-reduction,
in conjunction with the proposal to make equality take numeric type
into account, will really cripple the utility of bigints.

Even more frustrating, in my experience with languages like F# and
Scala, functions written like this are way, way slower.  The reason is
that Java's bigint arithmetic is sloooooow.  To protect yourself from
overflow, you've forced the function to use bigint arithmetic on
1N*2N*3N*..., well before you even reach a bigint.  For this
particular function, it reaches bigint status quite quickly, so the
performance degradation is fairly minimal.  But I have found that in
general, requiring programmers to explicitly use bigint arithmetic for
small numbers in order to protect from overflow is an absolute
performance killer.  Clojure's technique of auto-promotion to bigint
upon overflow, and then auto-reduction when appropriate, is way, way
faster than using bigints all the time.

In summary:
No auto-reduction would make bigints second-class citizens and make
them difficult to use in other functions and collections.
No auto-promotion to bigints means that programmers who want to
protect from overflow have to use bigints everywhere, which destroys
performance.

So what would be the downside of implementing most of these proposals
(e.g., 1 is a literal for long 1), but retain auto-promotion and
auto-reduction rather than throwing an error for primitive long math?
Seems to me like this would be the best of both worlds.  What am I
missing?

Re: Enhanced Primitive Support Rich Hickey 6/17/10 8:00 PM

You raise several points, I'll take them in turn. First, things are  
not as dire as you state. Certainly one could write a version of fact  
that hardwired bigints as you showed. But you need not - here's the  
original naive definition:

(defn fact [n]
   (if (zero? n) 1 (* n (fact (dec n)))))

Such logic that doesn't specify one of the components (in this case,  
n) is still polymorphic, as are operations with mixed components. So  
the power is in the hands of the supplier of n:

user=> (fact 10)
3628800

user=> (fact 40)
java.lang.ArithmeticException: integer overflow

user=> (fact 40N)
815915283247897734345611269596115894272000000000N

I think you can always write code in this way, such that it will work  
with any of the numeric types, without dictating.

----
It is true that Java's BigIntegers are slow, especially for numbers  
small enough to fit in Long or Integer, so using them for smaller  
numbers has overhead. But there are other options, not yet pursued  
here. For instance, Kawa has a higher performing biginteger that uses  
an internal int component until it overflows. I'm sure no one has a  
particular attachment to BigInteger should we find a better  
alternative. It would be easy to create a big integer class based on  
the Kawa concept, perhaps with a long or int basis overflowing to  
BigInteger itself.

http://groups.google.com/group/jvm-languages/msg/0413ed05d1371907

----------
As far as performance killer, that depends on your perspective. It is  
my belief that overwhelming majority of code *never* uses numbers that  
exceed long, And those people are suffering a 'performance killer' on  
the order of 10-20x by having numbers be boxed all the time.

------
The interaction between equality and the BigInteger schism is  
something I am still thinking about. That is why the branches are  
staged. Only the 'equal' branch has the type-strict =

--------
The thing that you are missing is that there is no single  
representation on the JVM* that can hold either a primitive or a  
reference to an Object like a BigInteger. That is where the current  
semantic difference creeps in - a primitive op sourced by a primitive  
must have as its target a primitive - should it produce an object  
there would be nowhere to put it. And any interface points (like the  
math operations themselves) cannot dynamically take either a primitive  
or an object. So the auto-promotion has an unavoidable cost of the use  
of objects to hold all values, and the allocation thereof. Opting out  
of the objects means a change in semantics. Auto-promotion cannot be  
made a uniform semantic for both primitives and objects.

---------
Another alternative is to provide a second set of operations with auto-
promoting, object based semantics. These would never operate on  
primitives. Since a key argument for auto-promotion is the poor  
behavior of small bigints, making a better bigint seems preferable.

--------
In the end, this does put the interests of two sides of the community  
at odds. Only one set of interests can be the default. Part of the  
point of this discussion is to measure the sides (so speak up people!).

I think the current default doesn't serve the majority well, and has a  
semantic rift. Making ordinary programs that use longs and doubles run  
fast requires extensive and tedious annotation that is easy to get  
wrong. This work proves it could be completely otherwise.

Rich

* There are proposals:

http://blogs.sun.com/jrose/entry/fixnums_in_the_vm

Note that even on mature platforms like CL with very good tagging and  
fixnum systems, actually getting at full-width 32- and 64-bit hardware  
operations requires extensive, tedious and error-prone annotation.  
I.e. a tagged number is not a hardware number.

Re: Enhanced Primitive Support David Nolen 6/17/10 8:47 PM
On Thu, Jun 17, 2010 at 11:00 PM, Rich Hickey <richh...@gmail.com> wrote:
In the end, this does put the interests of two sides of the community at odds. Only one set of interests can be the default. Part of the point of this discussion is to measure the sides (so speak up people!).

I am squarely on the side of prim/num/equals. The number of questions on the ML, #clojure, StackOverflow, etc. for the past two years about proper type-hinting shows how much incidental complexity the current system brings to the table.

Look no further than: 
vs. 

The difference is startling.

In order to write decently fast numeric code in Clojure up until now you had to be Christophe Grand ;) or have a fairly good understanding of macros that walk code to tag primitive types. And even so, the inability to return primitives made many nested recursive algorithms perform poorly. The only way to work around this was to convert your functions to macros and inline everything resulting of course in a considerable loss of modularity. Dirty hacks.

For those that need BigInts (the majority of whom are solving Project Euler problems, god bless them), telling them to add a single N is so much simpler than explaining the current complexities to newcomers who don't understand why floating point arithmetic is so darn slow.

More importantly, I think that deftype/defrecord/protocols/static/prim/num/equals represent an astounding work of *unification*. You no longer need to resort to Java - all the tools for building fast code is right there.

My gut feeling is that many Clojure devs won't think to much about this work - they've got their hands full with the rest of the language or are working in domains where they are under the impression that this unification work does not affect them. But I think this kind of work is paving the way for the community to give more of a helping hand to converting core data structures from Java to Clojure, extending them, patching them, and contributing new ones - all in a language that (unlike Java) naturally guides you to "do the right thing".

So though I don't know how much of a stir this thread will create (because I think it's more of a large under the surface wave), I have no doubt it will pay off in a big, big way.
Re: Enhanced Primitive Support puzzler 6/17/10 8:57 PM
Thanks for the responses.

Going back to the naive factorial function:


(defn fact [n]
  (if (zero? n) 1 (* n (fact (dec n)))))

Right now,
user=> (fact 40)
815915283247897734345611269596115894272000000000

Under the proposed changes,


user=> (fact 40)
java.lang.ArithmeticException: integer overflow

So the same code now risks an arithmetic exception.  What have I, the
programmer, gained from this new level of risk?  The answer, if I'm
the kind of programmer who has no interest in putting in type
annotations into the header of the function, is nothing.  There is no
speed improvement without the additional type annotations.  So all
I've gained is the /potential/ for a speed improvement.

I assume that most Clojure users really like its dynamic nature.  If
this is true, then for most of us, the common case is to NOT annotate
our code with types.  Certainly I like the idea of making it as easy
as possible to write fast code in Clojure.  I want my dynamically
typed code to be as fast as it can be, and it's nice to know that
there is a way to write statically typed code that is even faster.
But I really don't want to incur a penalty (i.e., possibility of
Arithmetic Exception) when I'm not using type annotations.

I agree that the burden on those who want to write type annotated code
is too high.  I agree that this should be made easier.  But
ultimately, I still want the burden to be on them (or me, the 1% time
I need it), not those of us who prefer unannotated code.  I that some
of the changes you have proposed for moving the annotations into the
header, along with literal notation for long (i.e., 0L rather than
(long 0)) would be preferable to the bigint dichotomy.

I agree that an improved bigint type would address the performance
aspect of my argument.  On the other hand, it wouldn't really address
the problem of using these things in sets and hash tables.  I think
it's very important that Clojure have one unique canonical
representation for numbers arrived at through Clojure-based
computations.  It is very difficult to use numbers sanely if that's
not the case.

I see your point that there are a couple of sides to this argument.  I
hope I can persuade some more people to speak up for my side :)

Re: Enhanced Primitive Support David Nolen 6/17/10 9:20 PM
On Thu, Jun 17, 2010 at 11:57 PM, Mark Engelberg <mark.en...@gmail.com> wrote:
I assume that most Clojure users really like its dynamic nature.  If
this is true, then for most of us, the common case is to NOT annotate
our code with types.  Certainly I like the idea of making it as easy
as possible to write fast code in Clojure.

The problem is that it distinctly *not* easy to write fast numeric code in Clojure. It requires expert Clojure knowledge. On the other hand, the mental burden for someone who wants BigInts in the new system is very low - you'll get a trivial to track exception.

And the new system doesn't remove any of the lovely dynamic stuff. There's absolutely no compromise there. In fact in the new system you can write your code initially sans ^:static and fn type hints entirely, you work to get your algorithm right, play around at the REPL as usual, then you add ^:static, type hint only the arglist and return value ...

BOOM, you get raw JVM speed.

But of course I'm biased as you are :) In two years I've never once been interested in BigInts. I have however have mangled and munged my code since 2008 trying to get basic numeric algorithms to perform even remotely in the vicinity to their Java counterparts to no avail.

David
Re: Enhanced Primitive Support Dan 6/17/10 9:22 PM
The problem is that it distinctly *not* easy to write fast numeric code in Clojure. It requires expert Clojure knowledge. On the other hand, the mental burden for someone who wants BigInts in the new system is very low - you'll get a trivial to track exception.



Also, when done, it doesn't look like clojure anymore.

+1 on the new behaviour.
Re: Enhanced Primitive Support Antony Blakey 6/17/10 9:24 PM

On 18/06/2010, at 12:30 PM, Rich Hickey wrote:

> You raise several points, I'll take them in turn. First, things are not as dire as you state. Certainly one could write a version of fact that hardwired bigints as you showed. But you need not - here's the original naive definition:
>
> (defn fact [n]
>  (if (zero? n) 1 (* n (fact (dec n)))))
>
> Such logic that doesn't specify one of the components (in this case, n) is still polymorphic, as are operations with mixed components. So the power is in the hands of the supplier of n:
>
> user=> (fact 10)
> 3628800
>
> user=> (fact 40)
> java.lang.ArithmeticException: integer overflow
>
> user=> (fact 40N)
> 815915283247897734345611269596115894272000000000N

That's fine for fact, but as a consumer of library functions, how do I know when I should pass in bigints? How do I know when an intermediate value, for my particular combination of parameter values, is going to overflow?

Should I just use 'N' on every number to every function? This proposal destroys the black-box nature of functions, and seems to me to be the easy solution rather than the best solution, which, I acknowledge, might be very hard work, possibly involving every such function having 2 polymorphic variants, one generic boxed form and one unboxed form, automatically selected and call-site-cached like OO dispatch.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

If you pick up a starving dog and make him prosperous, he will not bite you. This is the principal difference between a man and a dog.
  -- Mark Twain


Re: Enhanced Primitive Support David Nolen 6/17/10 9:56 PM
On Fri, Jun 18, 2010 at 12:24 AM, Antony Blakey <antony...@gmail.com> wrote:

That's fine for fact, but as a consumer of library functions, how do I know when I should pass in bigints? How do I know when an intermediate value, for my particular combination of parameter values, is going to overflow?

If the library isn't communicating that to you isn't that just a broken API that you don't want to participate in? Tag information on arglists is available:

(defn foo [^BigInteger a b ^BigInteger c]
  (+ a b))
(print (map (fn [xs] (map meta xs)) (:arglists (meta #'foo))))

;(({:tag BigInteger} nil {:tag BigInteger})) 

IDEs can easily (and should) expose this.

David
This message has been hidden because it was flagged for abuse.
Re: Enhanced Primitive Support Richard Newman 6/17/10 10:22 PM
I don't want to commit to a full-fledged response on this issue (yet), but:

On the other hand, the mental burden for someone who wants BigInts in the new system is very low - you'll get a trivial to track exception.

Assuming that I understand your implication (that ArithmeticExceptions are trivial to track): that might be true in Java, with checked exceptions, but it wouldn't be true in Clojure. If some operations overflow some of the time, you need thorough tests… and for all of the libraries you use to have thorough tests. It's not a small mental burden to constantly fear that any library call at all could suddenly overflow at any time. What are we, C programmers?

Automatic type promotion is a good thing, if not without its quirks (map behavior being one of them). One of the fantastic things about Common Lisp is its rich numeric tower. 

That said, if we could have speed and safety…
Re: Enhanced Primitive Support puzzler 6/17/10 10:23 PM
On Thu, Jun 17, 2010 at 9:20 PM, David Nolen <dnolen...@gmail.com> wrote:
> The problem is that it distinctly *not* easy to write fast numeric code in
> Clojure. It requires expert Clojure knowledge.

Right, but with just the prim branch and a shorthand for long literals, you get:
(defn ^:static fib ^long [^long n]
  (if (<= n 1L)
    1L
    (+ (fib (dec n)) (fib (- n 2L)))))

which really isn't that bad.  So my claim is that the prim branch gets
you far enough along the road to easy fast numeric code, without
screwing with the numeric tower.

Re: Enhanced Primitive Support puzzler 6/17/10 10:33 PM
Hmmm, here's an idea:

How about inside of a static function with type annotations, all
literals are automatically converted into long and doubles, otherwise
behavior stays the same.

So:


(defn ^:static fib ^long [^long n]
 (if (<= n 1)
   1
   (+ (fib (dec n)) (fib (- n 2)))))

is automatically converted to


(defn ^:static fib ^long [^long n]
 (if (<= n (long 1))
   (long 1)
   (+ (fib (dec n)) (fib (- n (long 2))))))

whereas
(defn fib [n]
 (if (<= n 1)
   1
   (+ (fib (dec n)) (fib (- n 2)))))
works the way it always did.

This behavior makes a lot of sense because (I think) the only time you
really gain from literals being primitives is when the variables
around it are also hinted.  Otherwise, the polymorphic nature of the
variable will cause the literal to get boxed anyway.

Re: Enhanced Primitive Support Richard Newman 6/17/10 10:43 PM
> ... which, I acknowledge, might be very hard work, possibly  

> involving every such function having 2 polymorphic variants, one  
> generic boxed form and one unboxed form, automatically selected and  
> call-site-cached like OO dispatch.

That's pretty much what Common Lisp compilers do. The compiler (or  
your own compiler macro, if you so choose) can examine the input forms  
and the compiler environment to decide what code to produce.

The following is from Allegro Common Lisp, but all of the input is  
ANSI CL -- part of the spec.

Observe how the compiler switches between different implementations of  
trivial addition as it gets a progressively looser or tighter bound on  
x. (I've elided some of the irrelevant preamble.)

What would be nice is if Clojure and/or the JVM could produce safe  
code/bytecode that wouldn't overflow, but could also be JIT-optimized  
into unboxed primitive arithmetic for known input ranges.

   ;; Range known, optimizing for speed: assembly add instruction.
   cl-user(7): (disassemble
     (compile nil
       (lambda (x)
         (declare (type (integer 0 255) x)
                  (optimize (speed 3)))
         (+ x 2))))
   ...
     15: 48 83 c7 10  add        rdi,$16
     19: f8             clc
     20: 4c 8b 74 24 10 movq        r14,[rsp+16]
     25: c3             ret


   ;; We know that x is a fixnum: the arithmetic doesn't need a  
function call, but
   ;; the result will be a boxed bignum, because fixnum + 2 won't fit  
in a fixnum.
   ;; If this function were inlined elsewhere, the compiler might  
avoid that decision.
   cl-user(8): (disassemble
     (compile nil
       (lambda (x)
         (declare (type fixnum x)
                  (optimize (speed 3)))
         (+ x 2))))
   ...
     21: 74 01          jz        24
     23: 17             (pop ss)        ; sys::trap-signal-hit
     24: 48 c1 ff 03  sar        rdi,$3
     28: 49 c7 c5 02 00 movq        r13,$2
         00 00
     35: 49 03 fd       addq        rdi,r13
     38: 4c 8b ef       movq        r13,rdi
     41: 49 d1 e5       sall        r13,$1
     44: 70 19          jo        71
     46: 49 d1 e5       sall        r13,$1
     49: 70 14          jo        71
     51: 49 d1 e5       sall        r13,$1
     54: 70 0f          jo        71
     56: 49 8b fd       movq        rdi,r13
     59: f8             clc
     60: 48 8d 64 24 78 leaq        rsp,[rsp+120]
     65: 4c 8b 74 24 10 movq        r14,[rsp+16]
     70: c3             ret
     71: 41 ff 57 47  call        *[r15+71]    ; sys::box-to-new-bignum
     75: eb ef          jmp        60
     77: 90             nop


   ;; Nothing known: call to excl::+_2op, the two-arg addition function.
   cl-user(9): (disassemble
      (compile nil
        (lambda (x)
          (declare (optimize (speed 3)))
          (+ x 2))))
   ...
     21: 74 01          jz        24
     23: 17             (pop ss)        ; sys::trap-signal-hit
     24: 40 f6 c7 07  testb        dil,$7
     28: 75 12          jnz        48
     30: 48 83 c7 10  add        rdi,$16
     34: f8             clc
     35: 70 1e          jo        67
     37: 48 8d 64 24 78 leaq        rsp,[rsp+120]
     42: 4c 8b 74 24 10 movq        r14,[rsp+16]
     47: c3             ret
     48: 49 8b af 2f ff movq        rbp,[r15-209]    ; excl::+_2op
         ff ff
     55: 48 c7 c6 10 00 movq        rsi,$16     ; 2
         00 00
     62: ff 53 d0       call        *[rbx-48]  ; sys::tramp-two
     65: eb e2          jmp        37
     67: 49 c7 c5 10 00 movq        r13,$16     ; 2
         00 00
     74: 49 2b fd       subq        rdi,r13
     77: eb e1          jmp        48
     79: 90             nop

Re: Enhanced Primitive Support Antony Blakey 6/17/10 10:44 PM

On 18/06/2010, at 2:26 PM, David Nolen wrote:

> On Fri, Jun 18, 2010 at 12:24 AM, Antony Blakey <antony...@gmail.com> wrote:
>
> That's fine for fact, but as a consumer of library functions, how do I know when I should pass in bigints? How do I know when an intermediate value, for my particular combination of parameter values, is going to overflow?
>
> If the library isn't communicating that to you isn't that just a broken API that you don't want to participate in? Tag information on arglists is available:

No. How do I know whether I need to pass a bigint literal or not? For what possible input values will overflow be an issue? It's easy to see with fact, but that's because the function implementation is clearly visible, and has well known growth properties of a single argument. And it's not just literals - when functions are composed from other functions that have this same problem and return primitive types - hence using the static versions of functions - the question in general becomes impossible to answer, not only because there might be a number of parameters with interdependencies, but because the structure of intermediate results may not be visible to the consumer. And what if you don't have the source to the function? This proposal not only puts the burden of promotion and representation choice onto the user of a function, but it requires that promotion be done before the function is executed, in a precautionary way, rather than iff needed.

This proposal is IMO a very bad idea.

Antony Blakey
--------------------------


CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Lack of will power has caused more failure than lack of intelligence or ability.
 -- Flower A. Newhouse

Re: Enhanced Primitive Support David Nolen 6/17/10 11:01 PM
On Fri, Jun 18, 2010 at 1:44 AM, Antony Blakey <antony...@gmail.com> wrote:
> On Fri, Jun 18, 2010 at 12:24 AM, Antony Blakey <antony...@gmail.com> wrote:
This proposal is IMO a very bad idea.

Why do you need know? You're assumption is built on someone writing a writing a bad library (one that doesn't handle long & BigInt that should). The tools are there for the library to behave correctly regardless of the input.

(defn fact [n]
  (if (zero? n) 1N (* n (fact (dec n)))))

(fact 40)
(fact 40N)

both work. The burden is not on the consumer but the designer of the library. What's the problem?

David
This message has been hidden because it was flagged for abuse.
Re: Enhanced Primitive Support David Nolen 6/17/10 11:15 PM
On Fri, Jun 18, 2010 at 2:10 AM, Mark Engelberg <mark.en...@gmail.com> wrote:
This imposes too high a burden on any programmer who cares about safety.

Rich Hickey's branch work boil down to:

1) double/long crowd get to stop eating bark
2) BigInt crowd loses free lunch

I'd say this is a bona fide community compromise and Rich Hickey saw it coming :)

David
Re: Enhanced Primitive Support David Nolen 6/17/10 11:35 PM
On Fri, Jun 18, 2010 at 2:10 AM, Mark Engelberg <mark.en...@gmail.com> wrote:
Elaborating on Anthony's explanation, let's say you call (fact (foo n)).
 
 
This imposes too high a burden on any programmer who cares about safety.

Don't buy it. That's the whole point of BigInt contagion. If fact and foo are correctly written this will work.

(defn fact [n]
  (if (zero? n) 1N (* n (fact (dec n)))))

(defn foo [n]
  (inc n))

(fact (foo 40))
(fact (foo 40N))

Both work.
Re: Enhanced Primitive Support Richard Newman 6/17/10 11:58 PM
This imposes too high a burden on any programmer who cares about safety.

Don't buy it. That's the whole point of BigInt contagion. If fact and foo are correctly written this will work.

It only takes one library to screw up, and the whole stack falls down.

Screwing up can occur because of omission (fact being written with a "1", not "1N", and not being tested with large inputs), or by design.

This whole debate arose because people care about the speed of their numeric functions! The safety bignums (which make things slower) will be omitted for performance reasons, or by accident, and people won't obey the contract of the function… or won't be able to because of the complex interactions between libraries. I can't control where every value in my program goes.

In Clojure as it stands now, those edge cases would see a small performance penalty as bignums occurred (and then it would go away if they got demoted!). After this change, a one-in-a-million collision of numbers and performance-oriented programming would throw an exception with no way to recover from higher up the stack.

A programmer who really cares about safety would thus indeed have to shoulder a massive burden: either review and get complete test coverage over every library in their dependency stack, or ensure that no non-bignum can ever enter a library function. I guess that also means no strings, no file sizes, no URLs, no values which might get recombined… because any of those values could produce a non-bignum and enter an unsafe Clojure library function, causing an overflow with no possibility of continuing. (This ain't Common Lisp.)

Having digested Rich's notes, pretty much the only thing that I disagree with is the lack of automatic bignum promotion/demotion. It seems like too much of a bad tradeoff, sacrificing one of Clojure's selling points for a little numeric performance. Thus far, Clojure has done pretty well in lexically scoping its "please make this fast and unsafe" bits — primitive calls, loop, transients. This would break that pattern.

I wonder: is there's a middle ground, where primitives are automatic almost everywhere, but bignum promotion on overflow is still possible?
Re: Enhanced Primitive Support Antony Blakey 6/18/10 12:12 AM

+1, what I've been trying to say.

> I wonder: is there's a middle ground, where primitives are automatic almost everywhere, but bignum promotion on overflow is still possible?

That's what I was suggesting by having two versions of functions, one with primitives and one with boxed values. I did some investigation on this for a Smalltalk JIT using LLVM, in the form of partial specialisation e.g. dynamically generating type-specialised versions of functions (methods in ST) and using the types available at the call site to dispatch - which is more obvious in Clojure because we already have multi-method dispatch. For Clojure you wouldn't test all types, it could just be a binary choice of all primitives where expected or not  Of course where a type-specialised function calls another type-specialised function you can avoid the dispatch entirely because you have the type information at the call site. This way, you can pay for the type-driven dispatch once at the start of a computation and end up doing minimal testing (basically dynamically checking that type invariants are maintained e.g. args haven't been promoted in the case under discussion). IMO this is speed + safety.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Reflecting on W.H. Auden's contemplation of 'necessary murders' in the Spanish Civil War, George Orwell wrote that such amorality was only really possible, 'if you are the kind of person who is always somewhere else when the trigger is pulled'.
  -- John Birmingham, "Appeasing Jakarta"


Re: Enhanced Primitive Support Chris Dean 6/18/10 12:36 AM
Richard Newman <holy...@gmail.com> writes:
> Having digested Rich's notes, pretty much the only thing that I
> disagree with is the lack of automatic bignum promotion/demotion. It
> seems like too much of a bad tradeoff,

This nicely summarizes my thoughts.  The other parts of Rich's notes
seem very elegant, but I can see lack of promotion leading to problem in
my own code.  Some external input causing an intermediate result to
overflow would certainly happen.  Probably at the least convenient time
possible.

(Have we discarded the concept of a set of non-promoting math functions
that are only used during optimization? prim-+ prim-/ etc.)

Cheers,
Chris Dean

Re: Enhanced Primitive Support Nicolas Oury 6/18/10 1:44 AM
Dear all,

just finished to read the proposals.

- prim is great and would allow to keep performance in more idiomatic code
- I am pro-num.
  It allows to get read of plenty of annotations or java methods.
 Annotations are difficult for beginners, makes code hard to read, maintain and modify, even for experts.
Moreover, in those days of 64-bits integers, the need for bigint is quite rare, except for Fibonacci or factorial functions ;-b.  
That's a bit provocative, but I think most code that needs bigints are "number-centric" code.
I mean code whose primary interest is the treatment of numbers.

Whereas every single program suffers from annotation or boxing.

Currently, all programs have to pay (annotations, speed, and/or java code).

I feel it is fairer that a few programs, that have high expectation on numbers, should pay.
I think, however, it is important we ensure they have a way to pay. And that it is not too expensive.

Maybe, it could be interesting to keep the old behaviour in safe-+, safe-* and the like?

Or, as Rich said, to make a good big numbers implementation that scales well to small integers.

Maybe, there should be a compiler option for default arithmetic :
* choose from long (the default default), int (for people wanting to run clojure on a mobile phone), FastAndCleverBigNums (for people caring about safety and having unusually big numbers)
* the same could be made for default decimal arithmetic.

One of the great benefit of num, is that it allows:

- equal, which is a big step forward.

Altogether, I like very much the fact that these proposals, together with protocols and datatypes, gives a way to obtain high performance Clojure code in a Java-agnostic way.
It seems a good thing for clojure in clojure and porting to other platforms.

Best regards,

Nicolas. 

Re: Enhanced Primitive Support Nicolas Oury 6/18/10 1:48 AM
On a totally different point:

is there a way of having he equivalent of #^final/#^static for defs?
Sometimes, you know a def won't be dynamically bound/redefined.
Would there be a benefit for the JITed code (less memory barriers or more asm reordering) to give a way to pass this information
to the JVM?

Best,

Nicolas.
Re: Enhanced Primitive Support Christophe Grand 6/18/10 2:44 AM
On Fri, Jun 18, 2010 at 8:58 AM, Richard Newman <holy...@gmail.com> wrote:
> It only takes one library to screw up, and the whole stack falls down.
> Screwing up can occur because of omission (fact being written with a "1",
> not "1N", and not being tested with large inputs), or by design.

It can already happen: a library can screw you by design.

(defn fact [n] (loop [r (int 1) n (int n)] (recur (* r n) (dec n))))
user=> (fact 40)
java.lang.ArithmeticException: integer overflow (NO_SOURCE_FILE:0)

Actually people who want numerical perfs have to do very invasive type
hinting and other interfaces workaround because primitives are...
primitives (not objects).

With contagious bigints (let's nick name them "safeints" and assume
they are not BigInteger but something à la kawa) a single N on
literals or having all your inputs going through a safeint conversion
would trigger safe computations (unless you try to call a static
primitive-hinted fn but the compiler would then bark).

Of course a rogue fn can still internally convert to primitives and
overflow but you can't do anything to protect you from that.

As for mixing several numebr representations in a single collection, I
think it's a problem of normalizing data but most of the time your are
either in safeland or in fastland so your representations will be
homogeneous (safehints or (boxed) primitives).

A middleground may be to keep the semantics proposed by Rich but:
* integer literals default to safeint (unless suffixed by L for long)
* scientific notation and decimal literals default to double (unless
suffixed by M or N)
* implement a better bigint.

Christophe

--
European Clojure Training Session: Brussels, 23-25/6 http://conj-labs.eu/
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.cgrand.net/ (en)

Re: Enhanced Primitive Support Michał Marczyk 6/18/10 3:35 AM
On 18 June 2010 05:57, Mark Engelberg <mark.en...@gmail.com> wrote:
> I assume that most Clojure users really like its dynamic nature.  If
> this is true, then for most of us, the common case is to NOT annotate
> our code with types.  Certainly I like the idea of making it as easy
> as possible to write fast code in Clojure.  I want my dynamically
> typed code to be as fast as it can be, and it's nice to know that
> there is a way to write statically typed code that is even faster.
> But I really don't want to incur a penalty (i.e., possibility of
> Arithmetic Exception) when I'm not using type annotations.
>
> I agree that the burden on those who want to write type annotated code
> is too high.  I agree that this should be made easier.  But
> ultimately, I still want the burden to be on them (or me, the 1% time
> I need it), not those of us who prefer unannotated code.  I that some
> of the changes you have proposed for moving the annotations into the
> header, along with literal notation for long (i.e., 0L rather than
> (long 0)) would be preferable to the bigint dichotomy.

QFT -- I'm in absolute agreement.

It seems to me that in its current shape, the new design -- actually
the "no automatic promotion/demotion" part -- would amount to
sacrificing correctness for performance *by default*; that doesn't
strike me as a good way to go. Not a single language I like using does
that... this of course includes my favourite language, Clojure, as it
currently stands on the master branch. :-)

Part of the reason why I think this is a bad idea is that while it is
all very well to say that people should know what they're doing,
libraries should be well-designed and have excellent test coverage
etc., this is not what happens in practice. (It's also perfectly
possible to write correct concurrent programmes with locks...). And
not necessarily through lack of dilligence; I know I have learned many
a gotcha by getting bitten by it (and rather expect to accumulate
plenty more "bites" in the years to come).

I suppose that's fine as long as one is learning something useful in
the process, but I should think that top numeric performance is
something required only by a certain part of the community -- a
sizeable one, no doubt, but hardly a majority share -- in fact
comprising the people most likely to know what they're doing when
dealing with numbers (why would someone *not* doing involved numeric
computations know what s/he's doing with numbers on a comparable level
to someone who is?). Making things significantly easier for them would
be welcome, of course, but perhaps not at the cost of making
*everyone* deal with entirely new kinds of complexity to support that.

I wouldn't know if the "good" solution Antony Blakey mentions is
feasible currently, but if it isn't, I'd prefer the language to err on
the side of caution, while perhaps offering a better system for
hand-annotating things for top performance (some people have been
using pretty ingenious macros to ease things a little -- perhaps these
could be improved upon to produce an acceptable close-to-universal
solution)?

Right, so these are my initial impressions. I'll continue thinking
about this, the new developments do seem attractive in many ways
(^:static is very cool indeed!) and I'm curious to see if I might be
able to bring myself to see more positive sides to the long/bigint
split (as well as come to an opinion on the equals issue).

All the best,
Michał

Re: Enhanced Primitive Support Carson 6/18/10 3:47 AM
Just tried out num branch and I really like how easy it is to be
fast!  However...

Consider:

(defn fact [n] (if (zero? n) 1 (* n (fact (dec n)))))
(defn twice-fact [n] (fact (fact n)))
(defn bad-twice-fact [n] (fact (-> n fact range last inc)))

user=> (fact (fact 5))
java.lang.ArithmeticException: integer overflow (NO_SOURCE_FILE:3)
user=> (fact (fact 5N))
6689502913449127057588118054090372586752746333138029810295671352301633557244962989366874165271984981308157637893214090552534408589408121859898481114389650005964960521256960000000000000000000000000000N
user=> (twice-fact 5)
java.lang.ArithmeticException: integer overflow (NO_SOURCE_FILE:5)
user=> (twice-fact 5N)
6689502913449127057588118054090372586752746333138029810295671352301633557244962989366874165271984981308157637893214090552534408589408121859898481114389650005964960521256960000000000000000000000000000N
user=> (bad-twice-fact 5)
java.lang.ArithmeticException: integer overflow (NO_SOURCE_FILE:7)
user=> (bad-twice-fact 5N)
java.lang.ArithmeticException: integer overflow (NO_SOURCE_FILE:8)

> Don't buy it. That's the whole point of BigInt contagion. If fact and foo
> are correctly written this will work.

BigInt contagion doesn't help if in some convoluted manner a BigInt's
value is used to construct a primitive sufficiently large that later
causes an overflow.

It'd sure be nice if I could just write ((safe bad-twice-fact) 5) and
get back a BigInt.  Maybe ((safe foo) x) can ensure all arguments and
return values of function foo are boxed, and recursively, ensures
boxing of arguments and return values for all functions called by foo,
etc.  Is that possible? Would that guarantee safety?

Carson

On Jun 17, 11:35 pm, David Nolen <dnolen.li...@gmail.com> wrote:
> On Fri, Jun 18, 2010 at 2:10 AM, Mark Engelberg <mark.engelb...@gmail.com>wrote:
Re: Enhanced Primitive Support Nicolas Oury 6/18/10 4:22 AM
On Fri, Jun 18, 2010 at 11:47 AM, Carson <c.sci...@gmail.com> wrote:

> Don't buy it. That's the whole point of BigInt contagion. If fact and foo
> are correctly written this will work.

BigInt contagion doesn't help if in some convoluted manner a BigInt's
value is used to construct a primitive sufficiently large that later
causes an overflow.


I would like to see that in real examples. And not in something that tend to be mainly a program that crunch numbers.
(I think a program "about numbers" could take a bit of complexity to handle numbers.
 If you write a program to that generates web page, you agree to have an html library and not have "every structure in clojure is html".)

On the other hand, programs that have few number computations - mainly to handle a few stats and indices - but that are spread in a lot of code 
(I think most program are like that) should benefit from good performance without annotating all the code.
Re: Enhanced Primitive Support Carson 6/18/10 4:54 AM
On Jun 18, 2:44 am, Christophe Grand <christo...@cgrand.net> wrote:
> With contagious bigints (let's nick name them "safeints" and assume
> they are not BigInteger but something à la kawa) a single N on
> literals or having all your inputs going through a safeint conversion
> would trigger safe computations (unless you try to call a static
> primitive-hinted fn but the compiler would then bark).

I think the example
(defn bad-twice-fact [n] (fact (-> n fact range last inc)))
shows it takes more than safeint conversion of arguments to the
"surface" function to trigger safe computations.  All functions called
by the "surface" function needs to have their arguments converted to
bigints too, and their return values probably, just in case.  Although
I'm not sure if even that's enough without auto-promotion (if I
understand auto-promotion correctly...).  But note the example doesn't
even use any type hints or explicit conversion to primitives (the
point is the conversion may occur as a legitimate side effect of doing
something else).

*Still*, I'm in favour of fast by default, with an *easy* way to make
things safe as an option.
Re: Enhanced Primitive Support Carson 6/18/10 5:10 AM
On Jun 18, 4:22 am, Nicolas Oury <nicolas.o...@gmail.com> wrote:
> On Fri, Jun 18, 2010 at 11:47 AM, Carson <c.sci.b...@gmail.com> wrote:
> > Don't buy it. That's the whole point of BigInt contagion. If fact and foo
> > > are correctly written this will work.
>
> > BigInt contagion doesn't help if in some convoluted manner a BigInt's
> > value is used to construct a primitive sufficiently large that later
> > causes an overflow.
>
> > I would like to see that in real examples. And not in something that tend
>
> to be mainly a program that crunch numbers.
> (I think a program "about numbers" could take a bit of complexity to handle
> numbers.
>  If you write a program to that generates web page, you agree to have an
> html library and not have "every structure in clojure is html".)

I agree!  Sorry I have no real examples, just my contrived example.
But note the example I used isn't *that* contrived, in the sense that
it doesn't use any type hints or explicit conversion to primitives to
cause an overflow on purpose. The point is that the overflow may occur
as a legitimate side effect of doing something else, whereas it
currently will not.

And *still*, I'm in favour of fast by default, with an *easy* way to
make things safe as an option. The key is to make the safe option
*easy*.
Re: Enhanced Primitive Support Laurent PETIT 6/18/10 5:10 AM
I'm no expert in this field.

What I would like to avoid, is having to become expert in the field to
expect the prevention of more :

  * nightmare debug sessions because of my own code or third party
libraries code
  * crash in production code

which might derive from the choices.

I tend to think that it seems wrong to require more knowledge from
people wanting to write *more* correct code than from people wanting
to write faster/performant code. This is were I would place the
cursor.

2010/6/18 Carson <c.sci...@gmail.com>:> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Laurent PETIT 6/18/10 5:12 AM
2010/6/18 Carson <c.sci...@gmail.com>:

> And *still*, I'm in favour of fast by default, with an *easy* way to
> make things safe as an option. The key is to make the safe option
> *easy*.

The solution is easy: write java code by default, and clojure code
when you want to be safe :-p

Re: Enhanced Primitive Support Lee 6/18/10 5:39 AM

I too think that "correct by default" is a much better idea than "fast by default."

One way I think about this is that the meaning of +, for example, is the sum of its numeric arguments. The performance characteristics are not part of this core meaning, although of course everyone will be happier the better the performance is. Nonetheless, I think that the meaning should be correctly maintained as a first priority, and that means giving the right answer regardless of how many bits are needed.

I frequently produce BigIntegers, even in code that's not particularly math-heavy. It has always galled me that in most programming languages any such code breaks because of low-level implementation issues that I don't want to have to think about most of the time. I usually just want the right answer, and I certainly don't want to have to think about the binary representations every time I write code that handles numbers. If I find that my code is too slow then I'll have to think about it, and that's when it makes sense to have to deal with declarations or type hints or special faster unsafe arguments or whatever. Ideally this will take as little work as possible, but I think it'd be a bad approach to make fast free (default) at the cost of saying "but the meaning of + may not actually be numeric sum, and this depends on the number of bits required to represent your numbers, so you better always be thinking about low level representation stuff whenever you use numbers."

 -Lee

--
Lee Spector, Professor of Computer Science
School of Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspe...@hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438

Check out Genetic Programming and Evolvable Machines:
http://www.springer.com/10710 - http://gpemjournal.blogspot.com/

Re: Enhanced Primitive Support David Nolen 6/18/10 5:45 AM
On Fri, Jun 18, 2010 at 6:47 AM, Carson <c.sci...@gmail.com> wrote:
(defn fact [n] (if (zero? n) 1 (* n (fact (dec n)))))
(defn twice-fact [n] (fact (fact n)))
(defn bad-twice-fact [n] (fact (-> n fact range last inc))) 

Not only is it contrived, under the proposal, this implementation of fact is broken. It needs to be fixed to:

(defn fact [n] (if (zero? n) 1N (* n (fact (dec n)))))

With that change bad-twice-fact is not actually bad.

As Christophe Grand has mentioned one of the main things that is be complained about from the BigInt supporters *is already a problem* in Clojure. Nothing new to see here.

Finally having actually tried out the new branches, this is a good change for me. I'd like to hear more from the BigInt crowd about whether getting their code to work with the new branches is actually causing problems.

So far it just seems like talk.
Re: Enhanced Primitive Support ajuc 6/18/10 5:48 AM
+1 to safe-by-default.

I use clojure for games, but if it will be possible to easily make
computations fast, safe-by-default is more important.
Re: Enhanced Primitive Support Rich Hickey 6/18/10 5:49 AM

This L suffix idea is a non-starter. Here's why:

We're in a reader based language. The reader reads objects. When the  
reader reads 42, it must return an object. It cannot return an int or  
a long primitive. Right now, it returns an Integer. I've also  
conducted experiments where it returned a Long. Certainly 42000000000  
will be a Long. What different thing would the reader return given  
42L, or 42000000000L? Given an int 42 what should the printer print?  
And long 42, Long 42? Would (= 42L (read-string (pr-str 42L)))?

------
The second thing I don't see being acknowledged is that the real  
problem here is the semantic difference. It's not a matter of getting  
at a primitive operation when handed a boxed number - that already  
happens, and pretty efficiently. It is the fact that the results have  
to go somewhere, and the difference in 'where' creates a difference in  
semantics. In master right now:

(+ x y)

and

(+ (identity x) y)

can produce different results. We already have bug reports around  
this, and user confusion. Fixing it requires adopting a single  
semantic for +. Choosing auto-promotion precludes the use of + on  
primitives, a huge loss IMO. Choosing throw-on-overflow precludes auto-
promotion, but preserves polymorphism otherwise, and gives the  
compiler free reign to optimize since it will never be altering the  
semantics by doing so.

I don't see a way around this fundamental dichotomy. The semantics for  
+ should be unified, and the operator that makes the other choice  
needs to be called something else, or placed somewhere else.

Rich

Re: Enhanced Primitive Support Konrad Hinsen 6/18/10 5:56 AM
On 18.06.2010, at 14:49, Rich Hickey wrote:

> I don't see a way around this fundamental dichotomy. The semantics for + should be unified, and the operator that makes the other choice needs to be called something else, or placed somewhere else.

My preference would be "placed somewhere else". We have namespaces, so why not profit from them?

My vote would be for:

1) clojure.core/+ doing bigint promotion and reduction under all circumstances (no optimizations that would break the semantics)

2) clojure.fastmath/+ doing type-specific and optimized addition.

This would retain the "safe by default" approach that I think fits Clojure best, and yet make it easy to use the "fast" version everywhere with nothing to change but the ns form at the top of a module.

Konrad.


Re: Enhanced Primitive Support Rich Hickey 6/18/10 6:11 AM

------
While range could be written differently, this is tantamount to saying  
- if I use a bigint to lookup another number in some mapping, it might  
not be a bigint. Yes, ok, but not a real issue.

-------------
Note - I don't mean to single you out, as others are using the same  
language, but I find this whole characterization of throw-on-overflow  
being somehow 'unsafe' to be a bit specious.

I want it to be perfectly clear to everyone, this is not a choice  
between good math and bad math, i.e. C/Java style truncating/wrapping  
of results. That will never be the default. Such silent truncation/
wrap is truly unsafe, and often the proposed alternative. It was a  
surprising result that optimizations of hot spot, combined with  
pipelining and speculative execution on modern processors, makes the  
overflow checking a very minor overhead above the raw operations, and,  
in my mind, a third, new, viable and safe high-performance option.

Is a program that might throw an exception unsafe? If so, no Java  
program is safe.

Is a vector that throws an exception when asked to store something at  
an index out of range, rather than grow to accommodate it, unsafe?  
This is in many ways similar - you are creating a result that doesn't  
fit, and need to be using a bigger container.

So, let's go easy on the hyperbole. The behavior might not be what you  
desire, but it is not unsafe.

Rich

Re: Enhanced Primitive Support Carson 6/18/10 6:19 AM
On Jun 18, 5:45 am, David Nolen <dnolen.li...@gmail.com> wrote:
> On Fri, Jun 18, 2010 at 6:47 AM, Carson <c.sci.b...@gmail.com> wrote:
> > (defn fact [n] (if (zero? n) 1 (* n (fact (dec n)))))
> > (defn twice-fact [n] (fact (fact n)))
> > (defn bad-twice-fact [n] (fact (-> n fact range last inc)))
>
> Not only is it contrived, under the proposal, this implementation of fact is
> broken. It needs to be fixed to:
>
> (defn fact [n] (if (zero? n) 1N (* n (fact (dec n)))))
>
> With that change bad-twice-fact is not actually bad.

I appreciate what you're saying, and the example above is not broken.
Recall from here:

On Jun 17, 8:00 pm, Rich Hickey <richhic...@gmail.com> wrote:
...
> Certainly one could write a version of fact
> that hardwired bigints as you showed. But you need not - here's the
> original naive definition:
>
> (defn fact [n]
>    (if (zero? n) 1 (* n (fact (dec n)))))
>
> Such logic that doesn't specify one of the components (in this case,
> n) is still polymorphic, as are operations with mixed components. So
> the power is in the hands of the supplier of n:
>
> user=> (fact 10)
> 3628800
>
> user=> (fact 40)
> java.lang.ArithmeticException: integer overflow
>
> user=> (fact 40N)
> 815915283247897734345611269596115894272000000000N
>
> I think you can always write code in this way, such that it will work
> with any of the numeric types, without dictating.

I believe the example (which I have tried on the num branch) shows
providing a BigInt argument does not always make things safe.  That is
all.

Carson
Re: Enhanced Primitive Support Rich Hickey 6/18/10 6:20 AM

Which then begs the questions:

- how will someone 'protect' themselves from libraries written using  
fastmath?

- similar looking code will change semantics when the ns switch is  
made - seems dangerous as it might violate the presumptions of the  
original authors.

Rich


Re: Enhanced Primitive Support Nicolas Oury 6/18/10 6:31 AM
In this situation, I would rather have some kind of toggle *use-bigints-by-default* (defaulted to false) used by the compiler.
Then, if someone needs big integers, he can switch the toggle and recompile everything.
You wouldn't have to change the libraries ns. 

As for being predictable, if you have numbers that can be bigger than 10^19, you would probably know it in advance.
(indices, counters and the like don't grow that big)
Re: Enhanced Primitive Support Laurent PETIT 6/18/10 6:32 AM
2010/6/18 Rich Hickey <richh...@gmail.com>:

> I want it to be perfectly clear to everyone, this is not a choice between
> good math and bad math, i.e. C/Java style truncating/wrapping of results.
> That will never be the default. Such silent truncation/wrap is truly unsafe,

> and often the proposed alternative. It was a surprising result that
> optimizations of hot spot, combined with pipelining and speculative
> execution on modern processors, makes the overflow checking a very minor
> overhead above the raw operations, and, in my mind, a third, new, viable and
> safe high-performance option.
>
> Is a program that might throw an exception unsafe? If so, no Java program is
> safe.
>
> Is a vector that throws an exception when asked to store something at an
> index out of range, rather than grow to accommodate it, unsafe? This is in
> many ways similar - you are creating a result that doesn't fit, and need to
> be using a bigger container.
>
> So, let's go easy on the hyperbole. The behavior might not be what you
> desire, but it is not unsafe.
>

Fair enough.

Re: Enhanced Primitive Support Laurent PETIT 6/18/10 6:35 AM
2010/6/18 Nicolas Oury <nicola...@gmail.com>:

No please, not even in my worst nightmares !
Semantic changing compilation flags ? So I use lib A which needs lib B
compiled with flags C and D, but I also use lib E which needs lib B
compiled with other values for the flags ... I think this just is a
recipe for disaster.

> You wouldn't have to change the libraries ns.
> As for being predictable, if you have numbers that can be bigger than 10^19,
> you would probably know it in advance.
> (indices, counters and the like don't grow that big)

Re: Enhanced Primitive Support Carson 6/18/10 6:39 AM
hmm...didn't see this earlier.
You're right, it's not unsafe, it's undesired behaviour.  My mistake
describing it that way.
Re: Enhanced Primitive Support Nicolas Oury 6/18/10 6:41 AM


On Fri, Jun 18, 2010 at 2:35 PM, Laurent PETIT <lauren...@gmail.com> wrote:

No please, not even in my worst nightmares !
Semantic changing compilation flags ? So I use lib A which needs lib B
compiled with flags C and D, but I also use lib E which needs lib B
compiled with other values for the flags ... I think this just is a
recipe for disaster.

Seen like this, this does not look so great.

However, I would like to note that, in this specific situation, there is a flag that subsumes the other.
I can't see someone relying on overflow (as opposed to promotion) for its behavior...
It would be a bit sick.

Re: Enhanced Primitive Support Rich Hickey 6/18/10 7:04 AM

On Jun 17, 2010, at 11:57 PM, Mark Engelberg wrote:

> Thanks for the responses.
>
> Going back to the naive factorial function:


> (defn fact [n]
>  (if (zero? n) 1 (* n (fact (dec n)))))
>
> Right now,
> user=> (fact 40)
> 815915283247897734345611269596115894272000000000
>
> Under the proposed changes,

> user=> (fact 40)
> java.lang.ArithmeticException: integer overflow
>
> So the same code now risks an arithmetic exception.  What have I, the
> programmer, gained from this new level of risk?  The answer, if I'm
> the kind of programmer who has no interest in putting in type
> annotations into the header of the function, is nothing.  There is no
> speed improvement without the additional type annotations.  So all
> I've gained is the /potential/ for a speed improvement.

I fully acknowledge this is a breaking change that doesn't suit your  
preferences. But again, you are over- and misstating things.

The benefit does not only accrue to those annotating their arguments.

Because many functions and methods return primitives (and more will,  
now that fns can return primitives), any operations on their results  
and/or literals can be optimized. E.g. things like (if (< (dec (count  
x)) 100) ...) can now be optimized into primitives.

>
> I assume that most Clojure users really like its dynamic nature.  If
> this is true, then for most of us, the common case is to NOT annotate
> our code with types.  Certainly I like the idea of making it as easy
> as possible to write fast code in Clojure.  I want my dynamically
> typed code to be as fast as it can be, and it's nice to know that
> there is a way to write statically typed code that is even faster.
> But I really don't want to incur a penalty (i.e., possibility of
> Arithmetic Exception) when I'm not using type annotations.
>
> I agree that the burden on those who want to write type annotated code
> is too high.  I agree that this should be made easier.  But
> ultimately, I still want the burden to be on them (or me, the 1% time
> I need it), not those of us who prefer unannotated code.  I that some
> of the changes you have proposed for moving the annotations into the
> header, along with literal notation for long (i.e., 0L rather than
> (long 0)) would be preferable to the bigint dichotomy.
>
> I agree that an improved bigint type would address the performance
> aspect of my argument.  On the other hand, it wouldn't really address
> the problem of using these things in sets and hash tables.  I think
> it's very important that Clojure have one unique canonical
> representation for numbers arrived at through Clojure-based
> computations.  It is very difficult to use numbers sanely if that's
> not the case.
>

Agreed that this equality bit is a separate issue.


Rich

Re: Enhanced Primitive Support stuart....@gmail.com 6/18/10 7:08 AM
While I enjoy a theoretical debate as much as the next person, it  
would be nice to see

(1) examples of real code that fail on reasonable inputs, if this  
change were made. And how difficult it would be to fix them.

(2) examples of real code that get way faster, and the ease of making  
those changes (if any).

WRT #1, we have tested a few dozen libs, and found only a single  
trivial breakage in their test suites. But I would say tha is still  
just plural anecdotes. More data welcome!

Re: Enhanced Primitive Support Christophe Grand 6/18/10 7:09 AM
On Fri, Jun 18, 2010 at 2:49 PM, Rich Hickey <richh...@gmail.com> wrote:
> This L suffix idea is a non-starter. Here's why:
>
> We're in a reader based language. The reader reads objects. When the reader
> reads 42, it must return an object. It cannot return an int or a long
> primitive. Right now, it returns an Integer. I've also conducted experiments
> where it returned a Long. Certainly 42000000000 will be a Long. What
> different thing would the reader return given 42L, or 42000000000L? Given an
> int 42 what should the printer print? And long 42, Long 42? Would (= 42L
> (read-string (pr-str 42L)))?

But what if the reader reads 42 as a BigInteger (or an alternative
implementation) and 42L as a Long?
then you should be able to tell them apart.

I agree with the goals of the num branch, I'm just unconvinced by the
reader defaulting to Long for integer literals.

Christophe

Re: Enhanced Primitive Support Heinz N. Gies 6/18/10 7:11 AM

On Jun 18, 2010, at 15:11 , Rich Hickey wrote:
> So, let's go easy on the hyperbole. The behavior might not be what you desire, but it is not unsafe.
I agree with both actually it isn't unsafe but in my eyes undesired, met me explain.

In generally it seems we have two options:

1) We make fast default and loose auto promotion in the number stack.

2) We make it more difficult to be fast (as in extra work) but keep promotion.

Lets hope that I got this right otherwise the following argumentation can be skipped since it's based on this understandings. (of cause there might be a way to accomplish both which would - without question the best way to go ;)

Now let see the auto promotion has one very nice effect, it just works. That simple and this is in my eyes hugely important, because it gives people a happy and fluffy feeling to be able to write normal code without special considerations and deep knowledge of types and hinting. Lets face it for 90% of people it does not matter that it is a long or bigint or even a double they expect math to just work, this is one of the reasons why people enjoy languages like Ruby or buy Apple hardware - because it just works (yes we don't play on the same ground but it was a good principle of it). If we go to the point where people need to know that they have to pass a 42N instead of 42 or get an exception it is just plain ugly, it is javaish and that things are not javaish is one thing poeple love about clojure, and just to not be tossed in the bucket, it is not insecure just horrible inconvenient and annoying.

But lets look at the other side of the coin, performance. We all want it - of cause otherwise we'd put a lot of (Thread/sleep 1000) in our code but, again lets face the reality, most of us don't need it. Most of the time we don't need to do fac or fib or very fast math, in 90% of the cases what clojure offers is just fast enough since the bottle necks isn't arithmetics. For the rest 10% of the cases, well if you are in that case you likely have a complex algorithm. This again means that you belong to the people who really know what they are doing, and know what your algorithm works, given that someone in this position should know where to put performance  optimization so the fact that it needs to be explict isn't that horrible in my eyes usually the code that has to be made more complex is alway some kind of focus code where you do way more then just toss a few statics on it and hope that it is fast but try every trick to get it fast - so you'll have the work anyway.

Now what is my conclusion: lets keep the 'it just works' and sacrifice some out of the box performance gains which 90% of the users won't notice anyway and half of the people notice it will only do so because they benchmark for exactly this. Especially since number arithmetics are rarely the bottlenecks, database lookups, string manipulations, IO, network and object allocation those are things that eat much performacne in the most cases.

Just my two (european!) cent :)

That said I find the possible performance gain extraordinary and brilliant, and I'm all for it just not for the sacrifice of 'it just works'. I rather write (long 1) in 2 functions where it is really needed then making sure I write 42N or ^Bigint in every function where I'm not 100% certain (which will likely be all but the two functions where I'd write logn).


Regards,
Heinz

Re: Enhanced Primitive Support stuart....@gmail.com 6/17/10 9:06 PM
> So the same code now risks an arithmetic exception.  What have I, the
> programmer, gained from this new level of risk?  The answer, if I'm
> the kind of programmer who has no interest in putting in type
> annotations into the header of the function, is nothing.  There is no
> speed improvement without the additional type annotations.  So all
> I've gained is the /potential/ for a speed improvement.

It will be much easier for people to write fast library code (including in Clojure and contrib). So if you ever use Clojure libraries, you will get a speed boost.


This message has been hidden because it was flagged for abuse.
Re: Enhanced Primitive Support Nicolas Oury 6/18/10 7:24 AM
I meant:

 (= (* 20 (+ not-a-bottleneck-1 ...  not-a-bottleneck-250)) not-a-bottleneck)
Re: Enhanced Primitive Support Rich Hickey 6/18/10 7:26 AM

On Jun 18, 2010, at 10:09 AM, Christophe Grand wrote:

> On Fri, Jun 18, 2010 at 2:49 PM, Rich Hickey <richh...@gmail.com>  
> wrote:
>> This L suffix idea is a non-starter. Here's why:
>>
>> We're in a reader based language. The reader reads objects. When  
>> the reader
>> reads 42, it must return an object. It cannot return an int or a long
>> primitive. Right now, it returns an Integer. I've also conducted  
>> experiments
>> where it returned a Long. Certainly 42000000000 will be a Long. What
>> different thing would the reader return given 42L, or 42000000000L?  
>> Given an
>> int 42 what should the printer print? And long 42, Long 42? Would  
>> (= 42L
>> (read-string (pr-str 42L)))?
>
> But what if the reader reads 42 as a BigInteger (or an alternative
> implementation) and 42L as a Long?
> then you should be able to tell them apart.
>

And keeps them that way? Hrm. Then most programs, that currently have  
no bigints, will now be full of them. And the boxed math will be  
slower still, as it will be the (possibly improved, but still more  
complex) logic of bigints. And the interop will be trickier. The  
current boxing matches the boxing you will get from things from the  
outside world. This wouldn't. And if this is just a 'way to  
communicate with the compiler' thing, then read/print round-tripping  
will break.

> I agree with the goals of the num branch, I'm just unconvinced by the
> reader defaulting to Long for integer literals.
>

It's important not to get focused on the literals. There are  
primitives coming from other places as well. The semantic dichotomy is  
the driver. The choice of the semantics of + can't be based upon the  
boxed-ness of the argument, as it is now.

Rich

Re: Enhanced Primitive Support Heinz N. Gies 6/18/10 7:47 AM

On Jun 18, 2010, at 16:24 , Nicolas Oury wrote:
> I am not sure (= (* 20 not-a-bottleneck) not-a-bottleneck)
>
> or more precisely:
>  (= (* 20 not-a-bottleneck-1 ...  not-a-bottleneck-250) not-a-bottleneck)

Point is, no one will write 250 variables even less so values by hand, it will be a map, reduce or whatever and then it's pretty easy to type hint it if the performance is an issue at this point and if you write this code you likely will know that this is a bottleneck. (I ignore the * there since you corrected it, otherwise it'd be a nice Palce to point out that it is likely becoming a overflow and an exception w/o automatic promotion :P)

Regards,
Heinz

Re: Enhanced Primitive Support Nicolas Oury 6/18/10 7:50 AM

Point is, no one will write 250 variables even less so values by hand, it will be a map, reduce or whatever and then it's pretty easy to type hint it if the performance is an issue at this point and if you write this code you likely will know that this is a bottleneck. (I ignore the * there since you corrected it, otherwise it'd be a nice Palce to point out that it is likely becoming a overflow and an exception w/o automatic promotion :P)


That would have been really better to have an exception as it was a bug :-b. 
Re: Enhanced Primitive Support Nicolas Oury 6/18/10 8:23 AM
Experiment results :

I switched a program to num.
It was quite straightforward. 
First I had to replace a few unchecked-add by +.
Has unchecked-add changed name? (I use it for indices when itering on a java array, if there is an overflow, you will know it beforehand)

The only funny event is that it was much much much slower at first.
Mainly doing reflection stuff.
I used *warn-on-reflection* and found:


( < val 0) where val is a double. Ugly, bad and wrong. I am not proud. 
(* some-int some-double), that I changed to (* (double some-int) some-double)

Then, I had some results on the benchmark I use to test my change. (I remove 14 seconds to each, because it is the compilation time. As this program is made to run for a long time and I want to test the stationary speed,I put a very low threshold for the jit. I don't know if it is a good idea or not.)
Initial program :                           92 seconds - 14 = 78s
Corrected program with master :  86 seconds - 14 = 72s
Corrected program with num :     80 seconds - 14 = 66s

Note that it is quite heavy on the GC, so there is 5s gc in that, which increase still more the difference.
Altogether, it is around 10% faster, without touching any annotation and static function.
Moreover, it might be difficult to believe if you read the beginning of my mail, but I spend a lot of time adding annotations where I thought it was useful. 
That's difficult, and still there is a nice speedup.
Most of all, I could start added annotations to function. The problem is I use a lot of protocols.
Are there plans for support of prim in protocols?

Last thing, there is no single bottleneck that I could rewrite easily. This 10% is spread over the program.
Result of profiling: 

  Interpreted + native   Method                        
  0.2%     2  +    10    clojure.core__init.load
  0.1%     8  +     0    distribution.FastProductWithinComp.non_local_draw
  0.1%     7  +     0    clojure.core$vector.invoke
  0.1%     7  +     0    java.util.WeakHashMap.get
...
2.8%   177  +    32    Total interpreted (including elided)

     Compiled + native   Method                        
  2.4%   183  +     0    clojure.lang.RT.seqFrom
  2.2%   164  +     1    distribution.FastProductWithinComp.non_local_draw
  1.9%    61  +    81    java.util.WeakHashMap.getTable
  1.4%   103  +     0    clojure.lang.RestFn.invoke
  0.8%    62  +     1    clojure.core.protocols$fn__5284.invoke
  0.7%    51  +     0    clojure.lang.RT.next
  0.4%    31  +     1    clojure.lang.Var.pushThreadBindings
....
14.8%  1019  +   100    Total compiled (including elided)

         Stub + native   Method                        
 81.3%     0  +  6154    java.lang.Object.hashCode

And yes, this 10% are obtained in a code that mainly use HashMaps to retrieve already computed values (81.3% seems spent there).
So I would expect far better speed-up in a more typical program.

(Game : guess what the program does from its profile...)




Re: Enhanced Primitive Support Hugo Duncan 6/18/10 9:48 AM
On Fri, 18 Jun 2010 09:20:35 -0400, Rich Hickey <richh...@gmail.com>  
wrote:

>
> On Jun 18, 2010, at 8:56 AM, Konrad Hinsen wrote:
>
>> On 18.06.2010, at 14:49, Rich Hickey wrote:
>>
>>> I don't see a way around this fundamental dichotomy. The semantics for  
>>> + should be unified, and the operator that makes the other choice  
>>> needs to be called something else, or placed somewhere else.

If you are doing a lot of work with whichever semantic requires decorated  
use, then it is going to seem awkward.  A namespace based solution would  
allow "natural" use of whichever semantic was appropriate for the domain  
you are coding.  It would still require an explicit choice - pulling in a  
different namespace.

> - how will someone 'protect' themselves from libraries written using  
> fastmath?

If I understand correctly, that is still an issue, whatever syntax is  
chosen.

> - similar looking code will change semantics when the ns switch is made

The switch is explicit.

> - seems dangerous as it might violate the presumptions of the original  
> authors.

Not sure I understand this. Presumably if you change your selected  
semantics within a namespace, you had better be sure of how that affects  
all the functions in that namespace.


--
Hugo Duncan

Re: Enhanced Primitive Support puzzler 6/18/10 10:51 AM
On Thu, Jun 17, 2010 at 9:06 PM, Stuart Halloway
<stuart....@gmail.com> wrote:
> It will be much easier for people to write fast library code (including in Clojure and contrib). So if you ever use Clojure libraries, you will get a speed boost.

I am skeptical of this claim.  Since static functions lose some of
their dynamism, I don't think library writers are going to rush to
convert all their code over to the "fast" format.  I think library
writers will care the most about ensuring their functions play well
with Clojure's dynamic ecosystem, and so, as before, type-optimized
code will be written only sparingly.  (I'll admit though, I don't
fully understand yet which dynamic features are lost by static
functions, and how much of an impact it will have, so I could be wrong
about this.  Just wanted to throw this out as something to consider.)

Re: Enhanced Primitive Support Nicolas Oury 6/18/10 11:03 AM
- There is a speedup without static or annotation.
- Static keep the dynamic semantic, if I understood well the proposal (only direct first-order call are static)


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support miner 6/18/10 11:24 AM
On Jun 18, 8:49 am, Rich Hickey <richhic...@gmail.com> wrote:
> Fixing it requires adopting a single  
> semantic for +. Choosing auto-promotion precludes the use of + on  
> primitives, a huge loss IMO. Choosing throw-on-overflow precludes auto-
> promotion, but preserves polymorphism otherwise, and gives the  
> compiler free reign to optimize since it will never be altering the  
> semantics by doing so.
>
> I don't see a way around this fundamental dichotomy. The semantics for  
> + should be unified, and the operator that makes the other choice  
> needs to be called something else, or placed somewhere else.

I'm just a newbie here, but I would prefer the simpler and faster +
(with throw-on-overflow).  I don't want to annotate my arguments
(carefully!) to get primitive performance.  I want the best
performance in the simple cases.  The primitive domain is big enough
for most programs.

I would be willing to use separate function names ("auto-promo-+",
etc.) to get the "safe" auto-promotion semantics.  Perhaps they could
be in a separate namespace so that you can make your + symbol refer to
the auto-promoting one if you want to pay the cost.  By the way, I
also want to know if a library is using auto-promotion so that I can
avoid it unless I really need it.  And, yes, I think I'll know enough
about my domain to make that decision.

Steve Miner
Re: Enhanced Primitive Support Rich Hickey 6/18/10 1:52 PM
I've revised and enhanced the strategy, based upon the feedback here.  
I think it is a nice compromise.

Docs (see update section at the top)

https://www.assembla.com/wiki/show/b4-TTcvBSr3RAZeJe5aVNr/Enhanced_Primitive_Support

Code:

http://github.com/richhickey/clojure/commit/c79d28775e06b196ae1426f6c1446d00b621d2e1

Thanks to all for the feedback, keep it coming!

Rich


On Jun 17, 2010, at 4:13 PM, Rich Hickey wrote:

> I've been doing some work to enhance the performance, and unify the
> semantics, of primitives, in three branches. I've started to document
> this work here:
>
> https://www.assembla.com/wiki/show/clojure/Enhanced_Primitive_Support
>
> Feedback welcome,
>
> Rich
>
>

Re: Enhanced Primitive Support Heinz N. Gies 6/18/10 1:57 PM

On Jun 18, 2010, at 22:52 , Rich Hickey wrote:

> I've revised and enhanced the strategy, based upon the feedback here. I think it is a nice compromise.

I agree this is a very nice compromise :) great work again and it is really cool to see how a community effort shapes clojure itself :D

Regards,
Heinz

Re: Enhanced Primitive Support David Nolen 6/18/10 2:01 PM
On Fri, Jun 18, 2010 at 4:52 PM, Rich Hickey <richh...@gmail.com> wrote:
I've revised and enhanced the strategy, based upon the feedback here. I think it is a nice compromise.

Docs (see update section at the top)

https://www.assembla.com/wiki/show/b4-TTcvBSr3RAZeJe5aVNr/Enhanced_Primitive_Support

Code:

http://github.com/richhickey/clojure/commit/c79d28775e06b196ae1426f6c1446d00b621d2e1

Thanks to all for the feedback, keep it coming!

Rich

Tried it out. Even faster than before. Love it.

David 

Re: Enhanced Primitive Support Paul Moore 6/18/10 9:28 AM

I'm new to clojure, and while I have an interest in fast number
crunching, and in easily using big numbers, I don't have a valid
opinion on the whole matter.

However, I know that some time back, Python changed in precisely the
opposite direction - from separate small and big numbers to a unified
numeric type. Their situation is completely different (not least
because all Python objects are effectively boxed, so the speed benefit
of primitive arithmetic isn't relevant) but there may be real-world
examples of code that demonstrates (1) in the Python discussions. If
the information might be useful, I'd be happy to do some searching of
the archives.

Paul.

Re: Enhanced Primitive Support Daniel 6/18/10 12:47 PM
Save for the new equality semantics (bravo!), most of this can be
viewed as a loss for newcomers to the language, especially those using
project euler to learn the language.  Shame you've been unable to
think of some way to garuntee top performance while keeping overflow
prevention the norm.
Re: Enhanced Primitive Support Daniel 6/18/10 1:02 PM
This also seems to break the principle of make it work, make it right,
make it fast.  Math in Clojure isn't to the point that it works well
yet.
Re: Enhanced Primitive Support Michał Marczyk 6/18/10 2:41 PM
On 18 June 2010 21:47, Daniel <double...@gmail.com> wrote:
> Save for the new equality semantics (bravo!), most of this can be
> viewed as a loss for newcomers to the language, especially those using
> project euler to learn the language.  Shame you've been unable to
> think of some way to garuntee top performance while keeping overflow
> prevention the norm.

Actually the new idea with the + / +' split addresses this, so the
Eulerians (?) among us may be happy again. I take it you've posted
without reading the bottom of the thread or your message was stuck in
the moderation queue...

Incidentally, at first glance, I love the new tip of equal. Really
exciting stuff! :-)

Sincerely,
Michał

Re: Enhanced Primitive Support Heinz N. Gies 6/18/10 2:53 PM

Daniel the decision was to keep things working and add speed as an option by using +' (and friends) specifically. So actually what happened is exactly what you wished for :).


Just tested with the newest code:

user=> (defn fact [n] (if (zero? n) 1 (*' n (fact (dec' n)))))
#'user/fact
user=> (fact 42)


java.lang.ArithmeticException: integer overflow (NO_SOURCE_FILE:3)
user=> (defn fact [n] (if (zero? n) 1 (* n (fact (dec n)))))  
#'user/fact
user=> (fact 42)
1405006117752879898543142606244511569936384000000000N


Regards,
Heinz

Re: Enhanced Primitive Support puzzler 6/18/10 3:27 PM
On Fri, Jun 18, 2010 at 1:52 PM, Rich Hickey <richh...@gmail.com> wrote:
> I've revised and enhanced the strategy, based upon the feedback here. I
> think it is a nice compromise.

Looks good to me.

Re: Enhanced Primitive Support Michał Marczyk 6/18/10 3:31 PM
In connection to a conversation on #clojure (in progress as I write)...

Apparently loop/recur locals are now primitive by default:

(defn fact [n]
  (loop [n n r 1]
    (if (zero? n)
      r
      ;; note the regular * on the next line
      (recur (dec n) (* r n)))))

will throw ArithmeticException because of integer overflow, because r
is a long. The solution provided by Rich is to "hint" r to be a boxed
number: (num r) => gives old behaviour. I'm vaguely in favour of
switching around to boxed-by-default locals, but that's just a very
early initial impression... Just wanted to point this out for the
consideration of others here.

Sincerely,
Michał

Re: Enhanced Primitive Support Heinz N. Gies 6/18/10 3:32 PM

On Jun 18, 2010, at 22:52 , Rich Hickey wrote:

Thanks to all for the feedback, keep it coming!

Okay you asked for it :P

to quote from the IRC:

mmarczyk:
also, (defn fact [n] (loop [n n r 1] (if (zero? n) 1 (recur (dec n) (* r n))))) throws on (fact 40) -- that's with *, not *'

and it trhows: java.lang.IllegalArgumentException: Value out of range for long: 158905670470170624000


and as you explained it is because 1 is a primitive instead of a Number. This is quite frankly not the best behavior in my eyes. Not only is the error message overly cryptic, and if you hadn't explained us what went wrong we likely taken hours to figure it out, but also the behavior is very very strange. This again is not 'it just works' but 'it works most of the time and if not has horrible painful and unpredictable consequences'.

I start this again, I know, but imagine someone starting to code in clojure and reaching this point, they will be super frustrated, toss the language in the next corner and start crying for the next full moon. Don't get me wrong it is not a reason to drop the language, god (or whatever you believe in) forbid, but it is very well a reason not to learn the language. I say it frankly if that would have been one of my first experiences, before I learned what joy and beauty clojure can be, I'd had turned without a second thought - and taking this is a very simple example many newcomers might actually write, this is a not unlikely scenario.

Regards,
Heinz
This message has been hidden because it was flagged for abuse.
Re: Enhanced Primitive Support Daniel 6/18/10 3:23 PM
Sorry guys - missed that latest post.  This new approach is something
I can definitely get behind.  :)
Re: Enhanced Primitive Support Rich Hickey 6/18/10 6:33 PM
Turnabout is fair play, so I've produced a version that swaps the
defaults, still in the 'equal' branch:

Docs:

https://www.assembla.com/wiki/show/b4-TTcvBSr3RAZeJe5aVNr/Enhanced_Primitive_Support

Code:

http://github.com/richhickey/clojure/commit/310534b8e7e7f28c75bb122b4bf1bee320cdae67

You can get the older arbitrary-precision default with this commit:

http://github.com/richhickey/clojure/commit/7652f7e935684d3c7851fbcad8ddce97e510a5a6

I've also temporarily enabled a diagnostic (in both) that tells you
when you have a mismatch between a loop initializer and its recur
form. It goes off over a hundred times in Clojure itself, when using
the arbitrary precision default. In each case, the recur value is
needlessly being boxed, every iteration of a loop, for loops that will
never be bigints; indexes, counters etc. I expect this will be very
typical of most code. But removing that useless overhead would be a
lot of tedious work.

With the defaults swapped, only 2 warnings.

Pay for what you use ...

Rich


On Jun 18, 4:52 pm, Rich Hickey <richhic...@gmail.com> wrote:
> I've revised and enhanced the strategy, based upon the feedback here.  
> I think it is a nice compromise.
>
> Docs (see update section at the top)
>
> https://www.assembla.com/wiki/show/b4-TTcvBSr3RAZeJe5aVNr/Enhanced_Pr...
>
> Code:
>
> http://github.com/richhickey/clojure/commit/c79d28775e06b196ae1426f6c...
Re: Enhanced Primitive Support Aaron Cohen 6/18/10 6:52 PM
> I've also temporarily enabled a diagnostic (in both) that tells you
> when you have a mismatch between a loop initializer and its recur
> form. It goes off over a hundred times in Clojure itself, when using
> the arbitrary precision default. In each case, the recur value is
> needlessly being boxed, every iteration of a loop, for loops that will
> never be bigints; indexes, counters etc. I expect this will be very
> typical of most code. But removing that useless overhead would be a
> lot of tedious work.
>
> With the defaults swapped, only 2 warnings.
>

How often does the warning go off if you go back to
arbitrary-precision by default, but make literals return boxed numbers
again? I suspect your comparison is unfair because lots of the loop
initializers are probably literals.

---

A concern I have is that interop forms that return primitives can
cause a loop var to have a primitive type.

(loop [procs (.availableProcessors Runtime/getRuntime)] ; Current
clojure makes procs a primitive, do people really expect that?

Would it be possible for loop to use boxed numbers always by default,
and only use primitives if some specific metadata is on the
initializer?

So in that world:

(loop [i 5] ; Boxing is always done by the loop statement, even if
initialized with a primitive

(loop [procs (.availableProcessors Runtime/getRuntime)] ; Boxing is
done by the loop statement

(loop [^:primitive procs (.availableProcessors Runtime/getRuntime)] ;
Primitive is used, good because we are expecting it


I'm not sure which order the metadata should go in such a scheme.

(loop [^:primitive i 5]
(loop [i ^:primitive 5]
(loop [i (prim 5)]

--Aaron

Re: Enhanced Primitive Support puzzler 6/18/10 7:09 PM
An idea to consider:

How about keeping the arbitrary-precision default, but add a loop'
construct to the family of +',-',*',inc', and dec' for primitive
optimization?  The loop' construct would bind primitive literals,
whereas the loop construct would keep the literals boxed, so people
who don't want to analyze types rigorously in their loop/recur
constructs won't get mismatch errors.

I see that using primitive literals by default in loop/recur offers
speed benefits for the common case (recur type matches the
loop-declared type), but I worry about how hard it might be to analyze
and make sure that I'm always recurring with a type that exactly
matches what is in the loop.

Any other ideas on how to solve the problem of mismatch errors for
"casual coders"?

Re: Enhanced Primitive Support Rich Hickey 6/18/10 7:12 PM

On Jun 18, 2010, at 9:52 PM, Aaron Cohen wrote:

>> I've also temporarily enabled a diagnostic (in both) that tells you
>> when you have a mismatch between a loop initializer and its recur
>> form. It goes off over a hundred times in Clojure itself, when using
>> the arbitrary precision default. In each case, the recur value is
>> needlessly being boxed, every iteration of a loop, for loops that  
>> will
>> never be bigints; indexes, counters etc. I expect this will be very
>> typical of most code. But removing that useless overhead would be a
>> lot of tedious work.
>>
>> With the defaults swapped, only 2 warnings.
>>
>
> How often does the warning go off if you go back to
> arbitrary-precision by default, but make literals return boxed numbers
> again? I suspect your comparison is unfair because lots of the loop
> initializers are probably literals.
>

It would be easy to make it go away. The point is not the mismatches  
themselves, but they serve as a counter - how many loops had primitive  
init and needed only primitive arithmetic? Each warning is a case  
where it could be automatically fast but isn't. Making the warning go  
away will leave the inefficiency but make it silent. It is just a  
measurable loss to counter all the theoretical arguments being made.

> ---
>
> A concern I have is that interop forms that return primitives can
> cause a loop var to have a primitive type.
>
> (loop [procs (.availableProcessors Runtime/getRuntime)] ; Current
> clojure makes procs a primitive, do people really expect that?
>
> Would it be possible for loop to use boxed numbers always by default,
> and only use primitives if some specific metadata is on the
> initializer?
>
> So in that world:
>
> (loop [i 5] ; Boxing is always done by the loop statement, even if
> initialized with a primitive
>
> (loop [procs (.availableProcessors Runtime/getRuntime)] ; Boxing is
> done by the loop statement
>
> (loop [^:primitive procs (.availableProcessors Runtime/getRuntime)] ;
> Primitive is used, good because we are expecting it
>
>
> I'm not sure which order the metadata should go in such a scheme.
>
> (loop [^:primitive i 5]
> (loop [i ^:primitive 5]
> (loop [i (prim 5)]

Yes, it's easy to imagine a world where people who want efficient code  
have to jump through hoops to get it. OTOH, you can just say (num some-
expr) to force it to be boxed, if you want assurance of an Object  
initializer. Which will be the more common need?

I have to say I'm in the 'pay for what you use' camp - you need a box,  
you ask for one. If I don't (and neither do any of those loops), why  
should I have to do extra work to avoid it?

Rich

Re: Enhanced Primitive Support Mark Fredrickson 6/18/10 7:18 PM
So far most of the action has concerned arithmetic ops (+, -, *, /).
Will these new semantics include the bit-shift operators? I vote yes.
My use cases for bit ops would benefit from primitive ops.

On a related note, my use cases call for silent overflow of bit shifts
(pseudo random number generators). Will there be a way to disable
overflow exceptions (either via a binding or through unchecked-*
functions)?

Thanks,
-Mark


On Jun 18, 8:33 pm, Rich Hickey <richhic...@gmail.com> wrote:
> Turnabout is fair play, so I've produced a version that swaps the
> defaults, still in the 'equal' branch:
>
> Docs:
>
> https://www.assembla.com/wiki/show/b4-TTcvBSr3RAZeJe5aVNr/Enhanced_Pr...
>
> Code:
>
> http://github.com/richhickey/clojure/commit/310534b8e7e7f28c75bb122b4...
>
> You can get the older arbitrary-precision default with this commit:
>
> http://github.com/richhickey/clojure/commit/7652f7e935684d3c7851fbcad...
Re: Enhanced Primitive Support Rich Hickey 6/18/10 7:20 PM

On Jun 18, 2010, at 10:09 PM, Mark Engelberg wrote:

> An idea to consider:
>
> How about keeping the arbitrary-precision default, but add a loop'
> construct to the family of +',-',*',inc', and dec' for primitive
> optimization?  The loop' construct would bind primitive literals,
> whereas the loop construct would keep the literals boxed, so people
> who don't want to analyze types rigorously in their loop/recur
> constructs won't get mismatch errors.
>
> I see that using primitive literals by default in loop/recur offers
> speed benefits for the common case (recur type matches the
> loop-declared type), but I worry about how hard it might be to analyze
> and make sure that I'm always recurring with a type that exactly
> matches what is in the loop.
>

Did you try it? The compiler does that analysis and reports any  
mismatch:

user=> (defn foo [] (loop [x 42] (recur 4.2)))
NO_SOURCE_FILE:2 recur arg for primitive local: x is not matching  
primitive, had: double, needed: long

user=> (defn foo [] (loop [x 42] (recur false)))
NO_SOURCE_FILE:5 recur arg for primitive local: x is not matching  
primitive, had: java.lang.Boolean, needed: long

user=> (defn foo [] (loop [x 4.2] (recur (meta #'first))))
NO_SOURCE_FILE:15 recur arg for primitive local: x is not matching  
primitive, had: java.lang.Object, needed: double

> Any other ideas on how to solve the problem of mismatch errors for
> "casual coders"?
>

Those could become hard errors again. But there are legitimate correct  
cases that would then need annotation:

(defn foo [] (loop [x 42] (recur (:y z))))

Rich

Re: Enhanced Primitive Support Rich Hickey 6/18/10 7:22 PM

On Jun 18, 2010, at 10:18 PM, Mark Fredrickson wrote:

> So far most of the action has concerned arithmetic ops (+, -, *, /).
> Will these new semantics include the bit-shift operators? I vote yes.
> My use cases for bit ops would benefit from primitive ops.
>
> On a related note, my use cases call for silent overflow of bit shifts
> (pseudo random number generators). Will there be a way to disable
> overflow exceptions (either via a binding or through unchecked-*
> functions)?
>

Yes, bit-ops will be made to align with whatever is done here.

Rich

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient  
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support dmiller 6/18/10 8:47 PM
> Yes, it's easy to imagine a world where people who want efficient code
> have to jump through hoops to get it. OTOH, you can just say (num some-
> expr) to force it to be boxed, if you want assurance of an Object
> initializer. Which will be the more common need?
>

From the wiki page "Enhanced Primitive Support":  * Note: this means
that locals initialized with literals will have primitive type, e.g.
(let [x 42] …), and especially: (loop [x 42] …). If you intend to
recur with a non-primitive, init like this, (loop [x (num 42)] …)


I'd like to ask for some consideration on any use of (num x).   On the
CLR side, it is unimplementable as documented, there being no
equivalent in the CLR to Number.  If a proposed use of num can be
satisfied by this definition of num:

(defn num {tag :object} [x] x)

I can manage. I imagine this to be the case, but haven't had time to
read all the new code.

If you really mean to use a type that is a base type for all the
primitive numeric types -- not going to happen on the CLR.

-David
Re: Enhanced Primitive Support MarkSwanson 6/18/10 9:27 PM

> I have to say I'm in the 'pay for what you use' camp - you need a box,  
> you ask for one. If I don't (and neither do any of those loops), why  
> should I have to do extra work to avoid it?

+1

Barely worth mentioning:
:s /non-prime/non-primitive/g

Re: Enhanced Primitive Support MarkSwanson 6/18/10 9:32 PM
> :s /non-prime/non-primitive/g

Oh, nvm. You were referring to '.

Cheers.
Re: Enhanced Primitive Support puzzler 6/18/10 11:03 PM
I'm confused.  In the latest scheme, what is the eventual plan for
handling a recur to a loop element that was initialized with a
primitive?  Are boxed numbers automatically coerced to the primitive?
Would a long be coerced to a double, or double to long?  Under what
scenarios will you get a warning, and when will you get an error?

I think what troubles me the most about this loop/recur issue is that
the semantics of recur depend completely on whether a variable is
assigned a primitive or boxed numeric type.  To know what the recur
will do, you must know the type of the assigned value.  This has
always been the case, but in the current version of Clojure, provided
you aren't using Java interop, it's relatively straightforward to
know.  Unless the variable or literal is explicitly cast to a
primitive, it's not a primitive.  But now, with the advent of static
functions which can return primitives, it becomes more likely to not
easily be able to determine this fact.  If I say (loop [x (foo 2)]...)
 I have no way of knowing what's going to happen when I recur.  Yes, I
realize that you can run the function and use warnings/errors to find
out.  But I find it unsettling to have a commonly used code form with
semantics I can't predict just by looking at it.

It's attractive that the current proposal gives speed benefits to
common cases with no additional annotation.  But I personally place a
higher value on being able to understand the semantics of my code when
reading through it.  I would prefer a version where loop requires some
sort of explicit annotation on a variable if you want it to require a
primitive upon recur.  (Ideally, I'd probably want the annotation on
the variable using a syntax that mirrors static functions, rather than
the old technique of typecasting the initializer, since the whole
typecasting system makes less sense now that literals and certain
return values are already primitives.  Unifying the syntax and
semantics of using primitives in loops with the syntax and semantics
of using primitives as inputs to static functions makes a lot of sense
to me.)

Re: Enhanced Primitive Support Tim Daly 6/18/10 11:50 PM
Is it possible to follow the Common Lisp convention
of proclaim/declaim/declare in order to specify types?

Mark Engelberg wrote:
> Thanks for the responses.
>
> Going back to the naive factorial function:


> (defn fact [n]
>   (if (zero? n) 1 (* n (fact (dec n)))))
>
> Right now,
> user=> (fact 40)
> 815915283247897734345611269596115894272000000000
>
> Under the proposed changes,
> user=> (fact 40)
> java.lang.ArithmeticException: integer overflow
>
> So the same code now risks an arithmetic exception.  What have I, the
> programmer, gained from this new level of risk?  The answer, if I'm
> the kind of programmer who has no interest in putting in type
> annotations into the header of the function, is nothing.  There is no
> speed improvement without the additional type annotations.  So all
> I've gained is the /potential/ for a speed improvement.
>
> I assume that most Clojure users really like its dynamic nature.  If
> this is true, then for most of us, the common case is to NOT annotate
> our code with types.  Certainly I like the idea of making it as easy
> as possible to write fast code in Clojure.  I want my dynamically
> typed code to be as fast as it can be, and it's nice to know that
> there is a way to write statically typed code that is even faster.
> But I really don't want to incur a penalty (i.e., possibility of
> Arithmetic Exception) when I'm not using type annotations.
>
> I agree that the burden on those who want to write type annotated code
> is too high.  I agree that this should be made easier.  But
> ultimately, I still want the burden to be on them (or me, the 1% time
> I need it), not those of us who prefer unannotated code.  I that some
> of the changes you have proposed for moving the annotations into the
> header, along with literal notation for long (i.e., 0L rather than
> (long 0)) would be preferable to the bigint dichotomy.
>
> I agree that an improved bigint type would address the performance
> aspect of my argument.  On the other hand, it wouldn't really address
> the problem of using these things in sets and hash tables.  I think
> it's very important that Clojure have one unique canonical
> representation for numbers arrived at through Clojure-based
> computations.  It is very difficult to use numbers sanely if that's
> not the case.
>
> I see your point that there are a couple of sides to this argument.  I
> hope I can persuade some more people to speak up for my side :)
>
>  

Re: Enhanced Primitive Support Konrad Hinsen 6/19/10 12:24 AM
On 18.06.2010, at 15:20, Rich Hickey wrote:

> Which then begs the questions:
>
> - how will someone 'protect' themselves from libraries written using fastmath?

Using fastmath or not would be part of the library specification. Or, in general, of the documentation for each individual function. It may not matter at all for the user of some functions, but be very important for others.

> - similar looking code will change semantics when the ns switch is made - seems dangerous as it might violate the presumptions of the original authors.

That is true, but it is true for any strategy to introduce new maths semantics into Clojure. As soon as there are two, there is a risk for confusion. And even if there is only one but different from the one in the past, there is a risk for confusion as well.

Konrad.

Re: Enhanced Primitive Support Konrad Hinsen 6/19/10 12:26 AM
On 18.06.2010, at 15:31, Nicolas Oury wrote:

> In this situation, I would rather have some kind of toggle *use-bigints-by-default* (defaulted to false) used by the compiler.
> Then, if someone needs big integers, he can switch the toggle and recompile everything.
> You wouldn't have to change the libraries ns.

I don't think it's a good idea to have a compiler switch change the semantics of number handling.

> As for being predictable, if you have numbers that can be bigger than 10^19, you would probably know it in advance.
> (indices, counters and the like don't grow that big)

True, but that's on a variable-by-variable basis.

Konrad.

Re: Enhanced Primitive Support Nicolas Oury 6/19/10 1:01 AM


On Sat, Jun 19, 2010 at 3:12 AM, Rich Hickey <richh...@gmail.com> wrote:

I have to say I'm in the 'pay for what you use' camp - you need a box, you ask for one. If I don't (and neither do any of those loops), why should I have to do extra work to avoid it?

Rich


+1. 

It is often very easy to spot integers that can grow above 10^18. Again, > 95% of number variables that someone will write in his life never will attain that because they are somehow referring to a ressource, either in the computer (counter, indices) or in the real world.

So, except for scientific programming, fact, fib and ackerman, that's not a problem. (For scientific programming, I argue people should know what to do with numbers). I still would like to have a real world example were it is hard to predict and to solve.

On the other hand, having boxed by default is a very significant slowdown (10% on the strange program I tried, that had already a lot of annotations, probably 20% or more on most programs), that can never be addressed : you can't write annotations on every single line of your program.

So, we take a 10/30% speed hit, that won't be solved, mainly because we want people writing their first Fib or fact function not to have an exception thrown. I think people can handle this level of complexity.


If boxed is the default, I advocate at least a flag allowing to change that (and possibly allowing int as a default for embedded stuff).
Most problem with semantic changing flag disappear if the flags have an order.
It is close to choosing the size of your heap on the command -line, which is a semantic-changing run-time flag...



Re: Enhanced Primitive Support Mike Meyer 6/19/10 2:13 AM

"Nicolas Oury" <nicola...@gmail.com> wrote:

>On the other hand, having boxed by default is a very significant slowdown
>(10% on the strange program I tried, that had already a lot of annotations,
>probably 20% or more on most programs), that can never be addressed : you
>can't write annotations on every single line of your program.

Were those real world programs, or arithmetic benchmarks? Most real world programs I see spend so little of their time doing arithmetic that making the math an order of magnitude slower wouldn't make a noticeable difference in overall runtime.

Which puts making numbers primitives by default squarely in the realm of premature optimization.
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Re: Enhanced Primitive Support Nicolas Oury 6/19/10 2:25 AM
I tried it on my program. Very few arithmetic, most of it with annotation. 10%.
On an arithmetic benchmark, it would be in te *10/*20 range.
This 10% is orthogonal to what profiling shows. It just makes a lot of little improvements spread everywhere.
That's why it couldn't be solved with annotation.

 

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Mike Meyer 6/19/10 2:28 AM

"Nicolas Oury" <nicola...@gmail.com> wrote:

>I tried it on my program. Very few arithmetic, most of it with annotation.
>10%

Can we see the source?


--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Re: Enhanced Primitive Support Nicolas Oury 6/19/10 3:00 AM
yes, when it will be released :-b.
I can describe it though:
it applied a stochastically a set of rewriting rules on trees (represented by records of records).
To speed up,  the activity of the rules in each subtree is cached in a Java weak hash table from one step to another.
 There is a bit of arithmetic involved everywhere, and around 2-3 double operations per function in the part choosing the instance of rule to apply.

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Jules 6/19/10 3:39 AM
I know nothing about the JVM, but I do know that on x86 you can handle
fixnum -> bignum promotion fairly cheaply. You compile two versions of
the code: one for bignums and one for fixnums. After an arithmetic
instruction in the fixnum code you check if it overflowed, and if it
did you switch to the bignum code.

while(n < 10) {
  n = n + 1;
}

=>

while(n #< 10) {
   a = n #+ 1;
   if(overflow){ n = ToBignum(n); goto foo; }
   n = a;
}

while(n.lessthan(10)) {
foo:
   n = n.add(1);
}

where #< and #+ are the fixnum versions of < and +, and lessthan and
add are the bignum versions. Yeah it's slower than ignoring the
overflow, but faster than tagged 31-bit fixnums and a whole lot faster
than bignums. Can you convince the JVM to produce similar code?

Jules
This message has been hidden because it was flagged for abuse.
Re: Enhanced Primitive Support Michał Marczyk 6/19/10 4:53 AM
On 19 June 2010 03:33, Rich Hickey <richh...@gmail.com> wrote:
> Turnabout is fair play, so I've produced a version that swaps the
> defaults, still in the 'equal' branch:

The issue of which behaviour should be the default aside, I think that
I'd expect the *, + etc. ops to go with unannotated literals and the
primed variants to go with explicit hinting, whereas now this seems
not to be the case:

(defn fact [n]
  (loop [n (num n) r (num 1)]
  (if (zero? n)
  r
  (recur (dec n) (* r n)))))

This works for (fact 40), whereas the version with dec' and *' (and
the (num ...) still in place) doesn't; I'd expect this to be reversed.

In general, I'd expect the primed ops to be there for those who know
what they're doing; same goes for explicit hinting.

On 19 June 2010 04:12, Rich Hickey <richh...@gmail.com> wrote:
> I have to say I'm in the 'pay for what you use' camp - you need a box, you
> ask for one. If I don't (and neither do any of those loops), why should I
> have to do extra work to avoid it?

The question is which desirable commodity Clojure programmers can
expect to get for free and which is to be obtainable through voluntary
banter: the sturdy boxes or the Potion of Speed +5. I'm used to paying
for the PoS, but a sudden rise in the price of boxes seems unwarranted
to me, especially if it only serves to lower the price of the PoS from
a (prim ...) hint where the local is introduced to zero. I mean, this
is no longer about "the free lunch", it's more about "the free
dessert" which the power number crunchers surely do not need.

I have a hunch that if (num ...) stays in and primitive is the
default, people (I mean just about *everybody*) will only box things
in the face of the first crash or *maybe* after developing a great
intuitive feel for when that might be required. Of course there's
going to be a period of frantic learning about numeric representations
on the JVM etc. in between the crash and the boxing. And, um, whatever
for...?

Interestingly, the introductory materials on Clojure I've read so far
tout the "don't worry about it" quality of Clojure's numerics as a
benefit of the language. (The "don't worry about it" phrasing comes
from "Programming Clojure".) I just mention this because I tend to
think it reasonable to present that as a benefit. The idea of making a
U-turn on this is surprising to me... Which is partly to say that I'm
still evaluating it, though so far my intuition stays pretty firmly on
the "don't worry about it"-is-good side.

TLDR summary: I'm very excited about the latest equal; I'd definitely
want the primed arithmetic ops to go with the hinted locals, whereas
non-primed (basic) variants would do the right thing for the simplest,
unhinted code (so which ops should get the primes in my eyes depends
on whether boxing of locals is the default); my heart is still on the
side of free sturdy boxes while I'm expecting to pay for the Potion of
Speed.

Sincerely,
Michał

Re: Enhanced Primitive Support David Powell 6/19/10 5:18 AM

Personally, I have no real interest in bigints, but I'm glad that they
are there, and that arithmetic code supports them polymorphically.

I'm not sure what I think regarding non-promoting numeric operators.
They are ok, but they seem to add complexity to the language, and they
don't look very newbie friendly. I think if we had them, promoting ops
should be the primed ones, and they would mostly be used in library
functions where performance is important.

>  but
> (loop [n 1 r 1](if (zero? n) r (recur (dec n) (* r 0.2))))
> does not [work]

This seems to be particularly nasty. Clojure is statically typing r on
the assumption that just because the init value of r fits a primitive
int, that it can never be anything else.  Promoting operators do no
good, because the variable slot for the loop var has already been
assumed to be a primitive, so it isn't possible to promote it.

Is there any possibility that loop variables could be assumed to be
boxed numerics, unless type hinted to something else?

So:

  (loop [^int n 1 r 1](if (zero? n) r (recur (dec n) (* r 0.2))))


Numeric literals would still produce primitives where possible, but
would get auto-boxed in loop initialisers.

I think this option wouldn't have any nasty surprises.


--
Dave

Re: Enhanced Primitive Support Heinz N. Gies 6/19/10 5:23 AM

On Jun 19, 2010, at 14:18 , David Powell wrote:

 (loop [^int n 1 r 1](if (zero? n) r (recur (dec n) (* r 0.2))))


Numeric literals would still produce primitives where possible, but
would get auto-boxed in loop initialisers.

I think this option wouldn't have any nasty surprises.

+41 (would be 42 if it were ^long since it is the clojure type I think)

This gives the option of make it fast if you know-what-you-are-doing(TM) which is great, and I certainly would use that myself :P but it still keeps the things simple and working for the default case.

Regards,
Heinz
Re: Enhanced Primitive Support David Powell 6/19/10 5:51 AM

> I think if we had them, promoting ops
> should be the primed ones, and they would mostly be used in library

Oops, I had meant the non-primed ops should support promotion. But
tbh, I have mixed feelings about promotion.

I haven't required bitint promotion myself, but having statically
typed numeric vars, where the type of the vars is chosen by Clojure
and it isn't obvious what that type is, seems potentially confusing.

It is the issue of loop initialisers fixing the types of the loop
variable slot that I'm more bothered about though.

--
Dave

Re: Enhanced Primitive Support Rich Hickey 6/19/10 5:55 AM

Yes, that's a bit Java-ish. All we care about here is returning the  
canonic boxed representation of the number.

Rich

Re: Enhanced Primitive Support Heinz N. Gies 6/19/10 6:06 AM

On Jun 19, 2010, at 14:55 , Rich Hickey wrote:


Yes, that's a bit Java-ish. All we care about here is returning the canonic boxed representation of the number.

It was suggested before that it is simple to allow automatic promotion, by compiling multiple versions. how about doing this? One version with primitives and one version with everything object boxed. This would (of cause mean) even if only one argument has to get boxed all others will to but it would give use speed and correctness.

Would that be possible?
Re: Enhanced Primitive Support Rich Hickey 6/19/10 6:32 AM

On Jun 19, 2010, at 2:03 AM, Mark Engelberg wrote:

> I'm confused.  In the latest scheme, what is the eventual plan for
> handling a recur to a loop element that was initialized with a
> primitive?  Are boxed numbers automatically coerced to the primitive?
> Would a long be coerced to a double, or double to long?  Under what
> scenarios will you get a warning, and when will you get an error?
>
> I think what troubles me the most about this loop/recur issue is that
> the semantics of recur depend completely on whether a variable is
> assigned a primitive or boxed numeric type.  To know what the recur
> will do, you must know the type of the assigned value.  This has
> always been the case, but in the current version of Clojure, provided
> you aren't using Java interop, it's relatively straightforward to
> know.  Unless the variable or literal is explicitly cast to a
> primitive, it's not a primitive.  But now, with the advent of static
> functions which can return primitives, it becomes more likely to not
> easily be able to determine this fact.  If I say (loop [x (foo 2)]...)
> I have no way of knowing what's going to happen when I recur.  Yes, I
> realize that you can run the function and use warnings/errors to find
> out.  But I find it unsettling to have a commonly used code form with
> semantics I can't predict just by looking at it.
>

Sorry for the confusion. This is a part of the design subject to  
change. I had temporarily eased the error here in an effort to see the  
usage patterns. The current warning tells you what is going on, since  
the compiler always knows when it is using a primitive local.  
Restoring the error on recur mismatch will cover all of these cases,  
without requiring any sophisticated analysis. If you get it wrong, it  
won't compile, and will tell you quite explicitly why.

The default (auto-primitives or always Objects) will follow the  
default for +, - etc.

> It's attractive that the current proposal gives speed benefits to
> common cases with no additional annotation.  But I personally place a
> higher value on being able to understand the semantics of my code when
> reading through it.  I would prefer a version where loop requires some
> sort of explicit annotation on a variable if you want it to require a
> primitive upon recur.  (Ideally, I'd probably want the annotation on
> the variable using a syntax that mirrors static functions, rather than
> the old technique of typecasting the initializer, since the whole
> typecasting system makes less sense now that literals and certain
> return values are already primitives.  Unifying the syntax and
> semantics of using primitives in loops with the syntax and semantics
> of using primitives as inputs to static functions makes a lot of sense
> to me.)
>

Really? It seems tedious to me.  Don't you get tired of saying int i =  
42 in Java, when you know full well the compiler knows 42 is an int? I  
do, and Clojure is competing with languages that infer the right types  
without the redundancy.

Loops are unlike static function arguments, in many ways. There is no  
context when defining the signature of a static function, and you are  
explicitly creating an interface for callers. By contrast, a loop is  
not an interface for anyone, and all possible context is present.

Again, it is a matter of defaults, the non-default is going to have to  
be explicit, and you can just as easily be explicit that you want the  
loop argument to be an Object, either with (loop [^Object i 42] ...),  
(loop [i (num 42)] ...) or your (loop' [i 42] ...) idea.

Everyone who is advocating for what the other side ought to do should  
be aware that if the default doesn't go their way, the other side will  
be them.

Rich

Re: Enhanced Primitive Support Rich Hickey 6/19/10 6:34 AM

On Jun 19, 2010, at 2:50 AM, Tim Daly wrote:

> Is it possible to follow the Common Lisp convention
> of proclaim/declaim/declare in order to specify types?
>

It would be possible of course, but undesirable. We're trying to avoid  
that kind of verbosity.

Rich

Re: Enhanced Primitive Support Rich Hickey 6/19/10 6:53 AM

On Jun 19, 2010, at 6:39 AM, Jules wrote:

> I know nothing about the JVM, but I do know that on x86 you can handle
> fixnum -> bignum promotion fairly cheaply. You compile two versions of
> the code: one for bignums and one for fixnums. After an arithmetic
> instruction in the fixnum code you check if it overflowed, and if it
> did you switch to the bignum code.
>
> while(n < 10) {
>  n = n + 1;
> }
>
> =>
>
> while(n #< 10) {
>   a = n #+ 1;
>   if(overflow){ n = ToBignum(n); goto foo; }
>   n = a;
> }
>
> while(n.lessthan(10)) {
> foo:
>   n = n.add(1);
> }
>
> where #< and #+ are the fixnum versions of < and +, and lessthan and
> add are the bignum versions. Yeah it's slower than ignoring the
> overflow, but faster than tagged 31-bit fixnums and a whole lot faster
> than bignums. Can you convince the JVM to produce similar code?
>

You can do things like that in a closed world of a few operators and  
types (although it gets unwieldy as the number of variables goes up).  
And of course, inside the math ops Clojure already does things like  
that.

But that is not what we are dealing with here. This is an open system,  
where representational issues of functions, arguments and returns come  
into play. The 'operators' +, - etc are actually function calls - they  
are not special to the compiler. And your loop may also contain other  
function calls:

while(n < 10) {
  foo(n)


  n = n + 1;
}

how do I know foo can handle the bigint once I've upgraded?

How about this?

while(n < 10) {
  n = foo(n)
}

Unless you have a tagged architecture you can't have boxed/unboxed arg/
return uniformity (and even then you lose some bits of range).

Rich

Re: Enhanced Primitive Support Rich Hickey 6/19/10 6:58 AM

As I told Mark, this was, and will again be, a hard error.

I am telling you though, if you continue with these "show stopper",  
"will keep people from using Clojure" sky-is-falling histrionics, I  
will stop reading your messages.

Rich

Re: Enhanced Primitive Support cageface 6/19/10 7:26 AM
Maybe it's only because I'm coming from Ruby, in which number
promotion is automatic and everything is slow, but if I have to choose
between correctness and performance as a *default*, I'll choose
correctness every time. I think there's a good reason that GCC, for
instance, makes you push the compiler harder with compiler flags if
you want to squeeze extra performance out of a program and accept the
corresponding brittleness that it often brings. I also always thought
that the transparent promotion of arithmetic was one of the strongest
selling points of Common Lisp.

My impression has always been that performance of numerics is rarely
the bottleneck in typical code (web stuff, text processing, network
code etc), but that unexpected exceptions in such code are the source
of a lot of programmer heartache. On the other hand, I think 99% of
the cases in which I've had a number exceed a 64 bit value were also
examples of errors that might as well have been exceptions because
they indicated a flaw in the code.
Re: Enhanced Primitive Support David Nolen 6/19/10 7:43 AM
On Sat, Jun 19, 2010 at 5:13 AM, Mike Meyer <mwm-keyword-googlegroups.6209c5@mired.org> wrote:

Were those real world programs, or arithmetic benchmarks? Most real world programs I see spend so little of their time doing arithmetic that making the math an order of magnitude slower wouldn't make a noticeable difference in overall runtime.

"Most real world programs". You mean like Clojure itself?

Please look over the implementation of gvec before making statements like this:

  clojure.lang.IPersistentCollection
  (cons [this val]
     (if (< (- cnt (.tailoff this)) (int 32))
      (let [new-tail (.array am (inc (.alength am tail)))]
        (System/arraycopy tail 0 new-tail 0 (.alength am tail))
        (.aset am new-tail (.alength am tail) val)
        (new Vec am (inc cnt) shift root new-tail (meta this)))
      (let [tail-node (VecNode. (.edit root) tail)] 
        (if (> (bit-shift-right cnt (int 5)) (bit-shift-left (int 1) shift)) ;overflow root?
          (let [new-root (VecNode. (.edit root) (object-array 32))]
            (doto ^objects (.arr new-root)
              (aset 0 root)
              (aset 1 (.newPath this (.edit root) shift tail-node)))
            (new Vec am (inc cnt) (+ shift (int 5)) new-root (let [tl (.array am 1)] (.aset am  tl 0 val) tl) (meta this)))
          (new Vec am (inc cnt) shift (.pushTail this shift root tail-node) 
                 (let [tl (.array am 1)] (.aset am  tl 0 val) tl) (meta this))))))


David
Re: Enhanced Primitive Support Nicolas Oury 6/19/10 8:10 AM
I definitely would like to see more people trying their code with primitive by default and talk about unexpected bugs and performance improvements.
Re: Enhanced Primitive Support Rich Hickey 6/19/10 8:47 AM
The hard error has been restored:

http://github.com/richhickey/clojure/commit/25165a9ccd1001fa7c4725a8219c4108803ae834

Rich

On Jun 19, 2010, at 2:03 AM, Mark Engelberg wrote:

> --  
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient  
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Daniel 6/19/10 1:25 AM
Checked out the new prim branch to repeat some tests.


user=> (defn ^:static fib ^long [^long n]
  (if (<= n 1)
    1
    (+ (fib (dec n)) (fib (- n 2)))))
#'user/fib
user=> (time (fib 38))
"Elapsed time: 4383.127343 msecs"
63245986

That's certainly not what I expected....


user=> (defn fact [n] (loop [n n r 1] (if (zero? n) 1 (recur (dec' n)
(*' r n)))))
#'user/fact
user=> (fact 40)
java.lang.ClassCastException: clojure.lang.Symbol cannot be cast to
java.lang.Number (NO_SOURCE_FILE:7)


Neither is this.  According to the docs "long-and-smaller integers
promote in all cases, even when given integer primitives" with the
prime ops.

Something I did wrong?
Re: Enhanced Primitive Support Rich Hickey 6/19/10 9:45 AM

Yes, you are in the wrong branch. Get the 'equal' branch.

Rich

Re: Enhanced Primitive Support Tim Daly 6/19/10 10:05 AM

(proclaim (speed 3) (safety 0))

is verbose? Telling the compiler in one place that you care
about boxing speed, such as (declare (x unboxed)) seems to
me to be a very direct way to tell the compiler to choose
+' vs + everywhere that 'x' occurs.

This means that "it just works" code does not need additional
unboxing overhead. It also means that "make it fast" code is
marked in a single place.

And the form can be dynamically computed so you can adapt to
the platform.

It has the additional factor that common lispers will "get it".

I understand that it does not fit with your current model of
using inline typing. For a pervasive change like fast numerics
the inline typing or library-based typing is, to my mind, a bit
more cumbersome than 'proclaim' and 'declare' but that's probably
because of my common lisp background. Type hinting feels to me
like using symbol-property machinery in common lisp.

In any case, keep up the good work. I'm impressed with Clojure.

Tim Daly

Re: Enhanced Primitive Support Rich Hickey 6/19/10 10:41 AM

On Jun 19, 2010, at 1:05 PM, Tim Daly wrote:

>
> (proclaim (speed 3) (safety 0))
>
> is verbose? Telling the compiler in one place that you care
> about boxing speed, such as (declare (x unboxed)) seems to
> me to be a very direct way to tell the compiler to choose
> +' vs + everywhere that 'x' occurs.
>
> This means that "it just works" code does not need additional
> unboxing overhead. It also means that "make it fast" code is
> marked in a single place.
>
> And the form can be dynamically computed so you can adapt to
> the platform.
>
> It has the additional factor that common lispers will "get it".
>
> I understand that it does not fit with your current model of
> using inline typing. For a pervasive change like fast numerics
> the inline typing or library-based typing is, to my mind, a bit
> more cumbersome than 'proclaim' and 'declare' but that's probably
> because of my common lisp background. Type hinting feels to me
> like using symbol-property machinery in common lisp.


I don't want to get into it here, but the CL model in this area is  
quite cumbersome and opaque, IMO.

What precisely will (proclaim (speed 3) (safety 0)) do?

And of course, that is never enough, this from the CL spec:

(defun often-used-subroutine (x y)
   (declare (optimize (safety 2)))
   (error-check x y)
   (hairy-setup x)
   (do ((i 0 (+ i 1))
        (z x (cdr z)))
       ((null z))
       ;; This inner loop really needs to burn.
       (declare (optimize speed))
       (declare (fixnum i))
       ))

The (declare (fixnum i)) above being pertinent to this discussion vis-
a-vis verbosity.

Not to mention the wonderful 'the' (this also from the CL spec):

(let ((i 100)) (declare (fixnum i)) (the fixnum (1+ i))) => 101

N.B. that the declaration of the type of i does not cover the result  
of 1+, which needs its own declaration.

Never mind that trading runtime safety for speed is completely not on  
the table for Clojure. It is one of the great lies of CL that it is  
both safe and very fast. It is safe *or* very fast. Clojure code will  
remain verifiable and safe.

The CL model is not something I want to emulate in Clojure.

Point taken about the type hinting. OTOH, metadata type hints flow  
through macros, whereas the decoupled i and its declaration of the CL  
above are extremely difficult to transform together. Metadata flowing  
is really an enormous win, IMO.

Rich

Re: Enhanced Primitive Support Daniel 6/19/10 10:35 AM
Cool.  This works correctly now.  Super nice, too :)

user=> (defn ^:static fib ^long [^long n]
  (if (<= n 1)
    1
    (+ (fib (dec n)) (fib (- n 2)))))
#'user/fib
user=> (time (fib 38))
"Elapsed time: 601.563589 msecs"
63245986


And this returns the error you specified:

user=> (defn fac [n] (loop [n n r 1] (if (zero? n) 1 (recur (dec' n)
(*' r n)))))
java.lang.RuntimeException: java.lang.IllegalArgumentException:  recur
arg for primitive local: r is not matching primitive, had:
java.lang.Number, needed: long (NO_SOURCE_FILE:13)


Which is really poor behavior.  Having numeric literals default to
primitive types is great but this behavior does not match the
documentation.  To quote: "long-and-smaller integers promote in all
cases, even when given integer primitives" with the prime ops.  Any
behavior other than that is confusing and essentially means the
programmer cannot have dependable bindings (to be fair, boxing at the
start of the loop fixes it but whatever).  I fully acknowledge that
the reason I don't understand 'primitive-by-default locals' may be
because I'm so new at Clojure (and functional programming), so take
this suggestion with a grain of salt: locals should default to the
type specified in code (which means the prime ops should cause the
rebind to auto-box/upgrade like the documentation dictates) or have
sensible defaults (like literals default to primitives (which is
awesome!)).  At least that makes sense to me in a dynamic context.


> Really? It seems tedious to me.  Don't you get tired of saying int i =
> 42 in Java, when you know full well the compiler knows 42 is an int? I
> do, and Clojure is competing with languages that infer the right types
> without the redundancy.

Agree 100%, which is partly why this silly loop/recur behavior needs
to be fixed.
Re: Enhanced Primitive Support puzzler 6/19/10 12:08 PM
On Sat, Jun 19, 2010 at 6:32 AM, Rich Hickey <richh...@gmail.com> wrote:
> Really? It seems tedious to me.  Don't you get tired of saying int i = 42 in
> Java, when you know full well the compiler knows 42 is an int? I do, and
> Clojure is competing with languages that infer the right types without the
> redundancy.

In this case, the question isn't so much whether Clojure can figure
out that 42 is a primitive or boxed integer.  The question is whether
the programmer wants to insist that, upon recurrence, the
primitiveness must match the initial binding.  Right now, Clojure
makes the determination about what the recurrence needs to be based on
the initial binding, so it becomes paramount for the programmer to
understand the type of that initial binding.  But conceivably there
are other ways that Clojure could make a determination about what kind
of loop/recur behavior you want other than the type of the initial
binding.  So an alternative way to think about this is to relax the
restriction that a recur must match the primitiveness of the loop
binding, but allow the programmer to somehow control whether they want
this primitive-matching behavior or not.  Ideally, the behavior of
recur should be readily apparent from inspection of the code.
Unfortunately, the type of the initial binding is not always clear,
which suggests to me that another scheme would be preferable.

Sadly, I don't have any idea right now on how to cleanly specify the
desired loop/recur behavior other than something that looks a lot like
type annotations.  But thinking of it in this way opens up the
possibility of other solutions (such as a loop' construct), so maybe
brainstorming along these lines will be useful.  I agree with another
poster that the holy grail would be if loop/recur automagically gave
performance benefits when recurring with a primitive that matched the
initial binding, but if you recur with something else, it just
gracefully rolls over to a boxed version of the whole loop.  Given
that Java has no way of expressing that something can be a primitive
or an Object, this seems unlikely to be possible, but perhaps with
that ideal in mind, some approximation can be found.

> Again, it is a matter of defaults, the non-default is going to have to be
> explicit, and you can just as easily be explicit that you want the loop
> argument to be an Object, either with (loop [^Object i 42] ...), (loop [i
> (num 42)] ...) or your (loop' [i 42] ...) idea.

Agreed, but with a default of boxing, it is possible to write all code
without ever using an annotation -- it will just be a bit slower than
it could be.  With a default of primitives, there is ordinary code
that won't work properly unless you dig deeper into Clojure and learn
all about explicit boxing and figure out when and where to use it.  I
prefer the default where I can ignore tedious type annotations and my
code "just works".

Re: Enhanced Primitive Support Nicolas Oury 6/19/10 12:20 PM
Not "ordinary" code. 10^19 is big.
Re: Enhanced Primitive Support Jules 6/19/10 12:23 PM
There are two possibilities: either foo gets inlined and the same
optimization is applied to the inlined code. Or foo is not inlined and
then you pass the general boxed representation to foo (but inside foo
the same optimization will be applied).

Suppose that we have a boxed representation like this:

class Num { ... }
class Small {
  int val;
  Num add(int n){ // argument specialized to int for simplicity of
this example
    int sum = val + n;
    if(overflow) return val.toBig().add(n);
    else return new Small(sum);
  }
  ...
}
class Big {
  BigNum val;
  Num add(int n){ ... }
}

Now the while loop. I use functional notation because the control flow
is hairy and with while loops this leads to a lot of reorganization.
With functional notation it's easier to see that every step is simple
and mechanical.

Here's our while loop and the library function to add two numbers:

loop(n) = if p(n) then n else loop(add(n,1))
add(Small a, int b) = let sum = a.val+b in if overflow then
add(Big(a.val),b) else Small(sum)
add(Big a, int b) = bigAdd(a,b)

Inline add into the loop:

loop(Small n) = if p(n) then n else let sum = n.val+1 in if overflow
then loop(add(Big(n.val),1)) else loop(Small(sum))
loop(Big n) = if p(n) then n else loop(bigAdd(n,1))

Introduce a new name:

loop_small(n) = if p(n) then n else let sum = n.val+1 in if overflow
then loop(add(Big(n.val),1)) else loop(Small(sum))
loop(Small n) = loop_small(n)
loop(Big n) = if p(n) then n else loop(bigAdd(n,1))

Inline loop into loop_small and eliminate dead code, turning
loop(Small(sum)) into loop_small(Small(sum)):

loop_small(n) = if p(n) then n else let sum = n.val+1 in if overflow
then loop(add(Big(n.val),1)) else loop_small(Small(sum))
loop(Small n) = loop_small(n)
loop(Big n) = if p(n) then n else loop(bigAdd(n,1))

Note that we have eliminated the checks (in this case pattern
matching) from the loop_small loop! Inlining + dead code elimination
did the trick. Sure, we are still using a boxed representation in
loop_small, but it's not polymorphic anymore and standard algorithms
can turn the heap allocated Small instances into stack allocated local
int variables (essentially "inlining" the Small class' instance
variables into the local variables, in this case only val).

It's not trivial to build a compiler that does this, but it's not
impossible either. Perhaps the JVM already does some of this. You can
get best of both worlds: speed *and* correctness. The situation is not
"pick one: speed, correctness", it is "pick two: speed, correctness,
simple compiler".

Jules
Re: Enhanced Primitive Support puzzler 6/19/10 12:24 PM
On Sat, Jun 19, 2010 at 12:20 PM, Nicolas Oury <nicola...@gmail.com> wrote:
> Not "ordinary" code. 10^19 is big.

Nicolas, I think you misunderstand my point.  I'm talking about
loop/recur behavior.  If you initialize a loop with the literal 1, and
then recur with a number 2 that's pulled from a collection (and thus
boxed), it will break.  That's ordinary code.

Re: Enhanced Primitive Support Garth Sheldon-Coulson 6/19/10 12:35 PM
Do any of the proposals on the table affect floating point math?

Currently, it seems mildly inconsistent to me that Clojure's numeric operations are auto-promoting and arbitrary-precision for integers but not for floats unless you specifically request it with the M suffix. This probably matters only to people doing scientific computing, but it's something to think about.

To illustrate with a question: If we were to make "fast" the default and "correct" something you opt into with the primed operators and (num <some number>), then does the question arise whether BigDecimals (as opposed to doubles) are returned from (num <some float>) and (*' <float> <float>)?


Re: Enhanced Primitive Support Brian Goslinga 6/19/10 12:46 PM
On Jun 19, 2:24 pm, Mark Engelberg <mark.engelb...@gmail.com> wrote:
> On Sat, Jun 19, 2010 at 12:20 PM, Nicolas Oury <nicolas.o...@gmail.com> wrote:
> > Not "ordinary" code. 10^19 is big.
>
> Nicolas, I think you misunderstand my point.  I'm talking about
> loop/recur behavior.  If you initialize a loop with the literal 1, and
> then recur with a number 2 that's pulled from a collection (and thus
> boxed), it will break.  That's ordinary code.
What if loop had its current functionality, but if one tries to recur
with a different type, the affected loop variable becomes
automatically boxed (and if *warn-on-reflection* or a similar variable
is set, the compiler emits an note)?  What would be the downside of
this approach?  All of the code that currently works correctly with
the current state of the equal branch remains fast, and all of these
examples of bad behaviour should just work, and the programmer should
be able to ask to know about them.
Re: Enhanced Primitive Support Mike Meyer 6/19/10 12:53 PM
On Sat, 19 Jun 2010 11:00:39 +0100
Nicolas Oury <nicola...@gmail.com> wrote:
>  There is a bit of arithmetic involved everywhere, and around 2-3 double
> operations per function in the part choosing the instance of rule to apply.

That still sounds arithmetic-heavy to me. In looking through my code,
I find three different classes of algorithm:

1) Things that are hard-core number-crunching, which for me is usually
   either combinatorics or crypto, where bigint is either a necessity
   or a major convenience.

2) Searches of various kinds, where most of the code manipulates
   trees/graphs to be searched with no (explicit) arithmetic, and one
   function evaluates a node with maybe a half-dozen arithmetic ops,
   meaning an average of <1 arithmetic op per function.

3) Everything else (web hacks, database stuff, string processing,
   etc.) where there's basically *no* arithmetic.

I guess the real question I should be asking is:

Where can I get jar files of the various branches so I can test for
myself without having to deploy whatever infrastructure is needed to
build clojure these days? Or am I likely to be able to build them with
tools found in the BSD ports tree?

      <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Mike Meyer 6/19/10 1:05 PM
On Sat, 19 Jun 2010 10:43:36 -0400
David Nolen <dnolen...@gmail.com> wrote:

> On Sat, Jun 19, 2010 at 5:13 AM, Mike Meyer <
> mwm-keyword-googlegroups.6209c5@mired.org> wrote:
> >
> >
> > Were those real world programs, or arithmetic benchmarks? Most real world
> > programs I see spend so little of their time doing arithmetic that making
> > the math an order of magnitude slower wouldn't make a noticeable difference
> > in overall runtime.
> >
>
> "Most real world programs". You mean like Clojure itself?
>
> Please look over the implementation of gvec before making statements like
> this:

This is actually the what I'm trying to figure out. While *my* code
may have no explicit arithmetic in it, it's not clear without diving
into the guts of clojure how much arithmetic gets done in it's name by
things like gvec.

       <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Mike Meyer 6/19/10 1:24 PM
While I generally favor correct over fast (worrying about fast before
you're correct is a good sign that you're engaged in premature
optimization), I'm still trying to figure out the tradeoffs
here. Especially since most LISPs & other dynamic languages don't seem
to run into this issue - or at least the debate that we're seeing
here.

IIUC, the problem is that we've got three "kinds" (not types; which
normally refers to bit lengths) of integer:

primitive, which is what the JVM does it's math with.
boxed, which is the primitive wrapped in a Java object, and what Java
       normally uses.
bigint, which is a clojure-specific kind of integer.

Other numeric types don't have issues because they don't have all
three kinds (is that right? Or could this entire thread be rewritten
in terms of primitive vs. boxed floats?).

And the bottom line is that for some reason we can't get from
primitives (which are required for speed) to bigints (which are
required to maximize the domain for which we get correct answers)
automatically.

Assuming that that's right, the first question that occurs to me is:
Does clojure really need java's boxed types? Other than for java
interop, of course.

Again, other languages seem to get by with the platform primitives
(module tweaks to get type information) going to bigints, and to get
reasonable performance by throwing away the dynamic nature of the
language for the hot spots. Could clojure do something similar, or is
this something technical issue with the JVM, similar to the reason we
don't get TCE?

      <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Mike Meyer 6/19/10 2:15 PM
On Sat, 19 Jun 2010 20:20:48 +0100
Nicolas Oury <nicola...@gmail.com> wrote:

> Not "ordinary" code. 10^19 is big.

No, Aleph-2 is big. Any non-infinite number you can name in your
lifetime is small ;-).

Pretty much any time I really need integer speed, I also deal with
numbers that can get much larger than 10^19th, because I tend to be
doing combinatorics or cryptography.

True, this represents a small fraction of all programs. The question
I'd really like answered is how much difference does this make for
non-numeric clojure code, where only a few percent of the calls to
core functions are calls to integer ops.

     <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Nicolas Oury 6/19/10 2:27 PM


On Sat, Jun 19, 2010 at 10:15 PM, Mike Meyer <m...@mired.org> wrote:


Pretty much any time I really need integer speed, I also deal with
numbers that can get much larger than 10^19th, because I tend to be
doing combinatorics or cryptography.

But you would agree that for this kind of domain, it wouldn't be too bad to be explicit about
wanting bigint. If you write a program that work with XML, you don't ask clojure data to be 
encoded in XML.

 
True, this represents a small fraction of all programs. The question
I'd really like answered is how much difference does this make for
non-numeric clojure code, where only a few percent of the calls to
core functions are calls to integer ops.


I tried on some code with 1/2 ops per functions and already quite a dew annotations. 10% speed-up in a program that spend 80% in Object.hashCode (which does not get accelerated at all).
I think that's due to the 1/2 ops + the ops hidden in clojure library code.
I'd like to see more results like that and bugs in real programs.
It was a matter of minutes to try.
- pull the branch with git
- compile with ant
- replace your clojure.jar
- recompile.




Re: Enhanced Primitive Support Mike Meyer 6/19/10 2:51 PM
On Sat, 19 Jun 2010 22:27:05 +0100
Nicolas Oury <nicola...@gmail.com> wrote:

> On Sat, Jun 19, 2010 at 10:15 PM, Mike Meyer <m...@mired.org> wrote:
> > Pretty much any time I really need integer speed, I also deal with
> > numbers that can get much larger than 10^19th, because I tend to be
> > doing combinatorics or cryptography.
> But you would agree that for this kind of domain, it wouldn't be too bad to
> be explicit about
> wanting bigint. If you write a program that work with XML, you don't ask
> clojure data to be
> encoded in XML.

Yes, I would agree. The real issue is that when I say "gimme bigints",
it happens pretty much everywhere, so I don't have to worry about
overlooking some quantity that might overflow. I suppose this cuts
both ways - when you want primitives, you want them everywhere, so you
don't have to worry about overlooking some heavily used quantity
somewhere.

The difference is, I'm worried about getting the correct answer, but
you're worried about performance. You can profile your code to
identify where the time is going (in fact, you should have already
done that before adding the first annotation) to make sure you don't
overlook anything. The only automated tool I've got for finding
overflows is - well, to run into them. Having to fix them and repeat
the run pretty thoroughly negates any win I might have gotten out of
using primitive arithmetic where I could.

> > True, this represents a small fraction of all programs. The question
> > I'd really like answered is how much difference does this make for
> > non-numeric clojure code, where only a few percent of the calls to
> > core functions are calls to integer ops.
> I tried on some code with 1/2 ops per functions and already quite a dew
> annotations. 10% speed-up in a program that spend 80% in Object.hashCode
> (which does not get accelerated at all).

Not quite what I wanted, for at least one reason (preexisting
annotations) and possibly because you should be comparing the two
variants of the appropriate branch, instead of comparing that branch
with either 1.1 or the current master.

In any case, I don't think 10% is enough gain to justify narrowing the
range for which naive programs are correct. An order of magnitude
would be. A factor of two is a maybe.

> I think that's due to the 1/2 ops + the ops hidden in clojure library code.
> I'd like to see more results like that and bugs in real programs.
> It was a matter of minutes to try.
> - pull the branch with git
> - compile with ant
> - replace your clojure.jar
> - recompile.

Except you have to do it twice, and get correct timings. It would
really help if someone could provide precompiled jars with an
appropriate README.

       <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Michał Marczyk 6/19/10 3:48 PM
On 19 June 2010 16:43, David Nolen <dnolen...@gmail.com> wrote:
> "Most real world programs". You mean like Clojure itself?
> Please look over the implementation of gvec before making statements like
> this:

Surely Clojure's internals are a *very* special case though... I
wouldn't find it particularly worrying if they turned out to require
some hinting.

I suppose even Ruby 1.8 must have its fair share of full-speed C math
under the hood (disclaimer: I'm not a Rubyist though, so I never cared
to check), which has nothing to do with the performance of math in
user programmes, which is what I interpreted Mike's statement to be
referring to.

Sincerely,
Michał

Re: Enhanced Primitive Support Michał Marczyk 6/19/10 4:10 PM
On 19 June 2010 23:51, Mike Meyer

<mwm-keyword-googlegroups.6209c5@mired.org> wrote:
> In any case, I don't think 10% is enough gain to justify narrowing the
> range for which naive programs are correct. An order of magnitude
> would be. A factor of two is a maybe.

QFT!

Have a look at this:

(defn fact [n]
  (loop [n n r 1]
    (if (zero? n)
      r

      (recur (dec n) (* r n)))))

This doesn't compile. To make it compile, the initial value of r has
to be given as (num 1) *or* the initial binding of the "inner" n has
to be given as (long n). Changing the ops to *' and dec' rules out the
latter option, but the (num ...) hint on r is still required. Not
surprisingly having r start off with (Long. 1) solves makes this
compile too.

The summary is that I'm finding it impossible to write a standard
tail-recursive factorial function without type hints on my literals. I
suppose I can guess what the reason is by now, but I find it
unreasonably confusing. Not sure to which degree this can be ironed
out if boxing were to remain an opt-in rather than -out...?

Oh, just in case votes on this are of interest, I like the idea of
loop' for fast-by-default looping.

Sincerely,
Michał

Re: Enhanced Primitive Support puzzler 6/19/10 4:33 PM
On Sat, Jun 19, 2010 at 4:10 PM, Michał Marczyk
<michal....@gmail.com> wrote:
> (defn fact [n]
>  (loop [n n r 1]
>    (if (zero? n)
>      r
>      (recur (dec n) (* r n)))))

Thanks for posting this surprisingly simple example of something that
looks like it should work, but wouldn't under the current proposal.
Really demonstrates how even for straightforward programs you would
have to do a certain amount of "type flow" analysis to verify that it
will compile.

Re: Enhanced Primitive Support David Nolen 6/19/10 5:13 PM
On Sat, Jun 19, 2010 at 4:10 PM, Michał Marczyk
<michal....@gmail.com> wrote:
> (defn fact [n]
>  (loop [n n r 1]
>    (if (zero? n)
>      r
>      (recur (dec n) (* r n)))))

Huh? That doesn't look like it's going to work at all.

1) 1 is primitive, we know that, accept it
2) we don't know the type of n, what will (* r n) be?
3) BOOM!

My suggestion is to stop it with the contrived examples. Start showing some real code, real problems in your real programs. Using loop/recur is already the beginning of code smell for anything that is not performance sensitive.

(defn fact
  ([n] (fact n 1))
  ([n r] (if (zero? n)
          r
          (recur (dec n) (* r n)))))

Sleep soundly.
Re: Enhanced Primitive Support Peter Schuller 6/19/10 5:22 PM
> Where can I get jar files of the various branches so I can test for
> myself without having to deploy whatever infrastructure is needed to
> build clojure these days? Or am I likely to be able to build them with
> tools found in the BSD ports tree?

It still seems to build with 'ant' like it always has for me? Install
devel/apache-ant and simply run 'ant' in the clojure checkout.

If you want to get the maven artifact installed in your local
repository for use with leiningen/maven projects, you first want to
install devel/maven2. The complication is that clojure is not built by
maven (normally you would just 'mvn install'), but instead is built
with Ant with it's maven tasks. So, you need to drop in:

   http://www.apache.org/dyn/closer.cgi/maven/binaries/maven-ant-tasks-2.1.0.jar

Into ~/.ant/lib so that ant can pick that up. Afterwards, you should
be able to run:

  ant ci-build

And that should install (although it does so very silently) clojure
into your (user) local maven repository:

  ls -ltr ~/.m2/repository/org/clojure/clojure/1.2.0-master-SNAPSHOT

--
/ Peter Schuller

Re: Enhanced Primitive Support puzzler 6/19/10 5:24 PM
On Sat, Jun 19, 2010 at 5:13 PM, David Nolen <dnolen...@gmail.com> wrote:
> Huh? That doesn't look like it's going to work at all.
> 1) 1 is primitive, we know that, accept it
> 2) we don't know the type of n, what will (* r n) be?
> 3) BOOM!

Yes David, if you have a deep understanding of how loop/recur
interacts with primitives, and you understand that n, as an input is
boxed and that literals are primitives, it is of course possible to do
the analysis and see why it doesn't work.

But it does look like it *should* work -- in current Clojure it works,
and the same kind of code in just about any dynamically typed language
I can think of would work.  It might not work in a statically typed
language, but the types would be right there in your face, so it would
be totally obvious what was going on.  The problem is that the
behavior is dependent on something that is invisible from the code,
and requires considerable reasoning even for this simple example.

> My suggestion is to stop it with the contrived examples. Start showing some
> real code, real problems in your real programs. Using loop/recur is already
> the beginning of code smell for anything that is not performance sensitive.

I think the notion that loop/recur is a code smell for anything that
isn't performance sensitive is absurd.  It's basically the only
looping mechanism that Clojure offers - it's in all types of code.

> (defn fact
>   ([n] (fact n 1))
>   ([n r] (if (zero? n)
>           r
>           (recur (dec n) (* r n)))))
> Sleep soundly.

The fact that this refactoring of the fact function fixes the problem
further bolsters our argument that the newly proposed semantics are
significantly more inscrutable than they should be.  Without a fair
amount of thought, it's completely unobvious why this refactoring
should fix the problem.  (Yes, I know it's because it boxes the 1 when
it passes it to the two-arg version, but stuff like this really
shouldn't make such a significant difference in behavior).

Re: Enhanced Primitive Support Mike Meyer 6/19/10 5:24 PM
On Sat, 19 Jun 2010 20:13:07 -0400
David Nolen <dnolen...@gmail.com> wrote:

> On Sat, Jun 19, 2010 at 4:10 PM, Michał Marczyk
> <michal....@gmail.com> wrote:
> > (defn fact [n]
> >  (loop [n n r 1]
> >    (if (zero? n)
> >      r
> >      (recur (dec n) (* r n)))))
>
> Huh? That doesn't look like it's going to work at all.
>
> 1) 1 is primitive, we know that, accept it
> 2) we don't know the type of n, what will (* r n) be?
> 3) BOOM!
>
> My suggestion is to stop it with the contrived examples. Start showing some
> real code, real problems in your real programs. Using loop/recur is already
> the beginning of code smell for anything that is not performance sensitive.

Oh, come on. I didn't write that example, but it's a perfect
reasonable implementation of the factorial loop algorithm.

I'd say that you're calling "using loop/recur" a code smell is itself
a smell, indicating that loop/recur is broken.

   <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Rob Lachlan 6/19/10 5:39 PM
The main example for recur on the special forms page (http://
clojure.org/special_forms#Special%20Forms--(recur%20exprs*)) is:

(def factorial
  (fn [n]
    (loop [cnt n acc 1]
       (if (zero? cnt)
            acc
          (recur (dec cnt) (* acc cnt))))))

I may not be be clojure jedi, but I've been learning the language for
a while.  I've never come across the notion that this is a code
smell.  I thought that the loop recur form was perfectly orthodox.

Also, the fact that the form above doesn't compile in the equal branch
does make me a little uneasy.

Rob

On Jun 19, 5:13 pm, David Nolen <dnolen.li...@gmail.com> wrote:
> On Sat, Jun 19, 2010 at 4:10 PM, Michał Marczyk
>
Re: Enhanced Primitive Support David Nolen 6/19/10 5:40 PM
Mark and Mike you fail to address my deeper question. I've been using
many different Clojure libraries all day long with the latest equals
branch. Guess
what?

No loop/recur bugs.

When you guys start postIng your broken code that has this problem
then people can start believing it is an issue for working code.

Until then this is just theoretical rhetoric.

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Mike Meyer 6/19/10 6:59 PM
On Sat, 19 Jun 2010 20:40:13 -0400
David Nolen <dnolen...@gmail.com> wrote:

> Mark and Mike you fail to address my deeper question.

Maybe because you've failed to ask in your hurry to claim code is
non-idiomatic or calling our example rhetoric.

> I've been using
> many different Clojure libraries all day long with the latest equals
> branch. Guess
> what?
>
> No loop/recur bugs.

I wouldn't expect anything else, for two reasons:

1) Most clojure code doesn't deal with integers - it deals with
   symbols, strings, and sequences. Hence it won't run into this problem.
2) The libraries are generally written by experienced users who will
   have tweaked them for performance - meaning they've typed hinted
   them, and thus avoided this problem.

> When you guys start postIng your broken code that has this problem
> then people can start believing it is an issue for working code.

I guess a perfectly natural expression of a reasonable algorithm that
works under 1.1 and doesn't on the equal branch isn't "broken" because
it's the same function we've been using all along. So here's some new
examples, pulled in sequence from some of my combinatorics code:

(defn count-in [value col]
   (loop [value value col col res 0]
      (if (empty? col)
          res
          (recur value (rest col) (if (= (first col) value) (inc res) res)))))

(defn ones-n-zeros [vectors]
  (loop [vectors vectors m-zeros 0 m-ones 0]
     (if (empty? vectors)
         [m-zeros m-ones]
         (let [data (first vectors)
               zeros (count-in 0 data)
               ones (count-in 1 data)]
            (recur (rest vectors) (if (> zeros ones) (inc m-zeros) m-zeros)
                                  (if (> ones zeros) (inc m-ones) m-ones))))))

No, I haven't verified it fails - I've given up trying to check out
the branch, as hg convert failed, and git (after reinstalling over a
broken version) keeps complaining about "warning: remote HEAD refers
to nonexistent ref, unable to checkout." However, my reading of the
proposal is that the inc's here will cause the same problems as the
*'s in the perfectly reasonable factorial example.

Now go ahead and claim that that's not very good code, because an
expert would have type hinted it or written it with map & apply or
some such, or that there's a contrib library somewhere with a faster
implementation of count-in in it. That, after all, is your rhetoric.

But that's my point. I want the *obvious* code to work. I'll worry
about making it fast after I've made it right. If you want a read-only
language that requires an expert to get working code in a reasonable
amount of time, you can always write in Perl or C.

       <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Michał Marczyk 6/19/10 7:15 PM
On 20 June 2010 02:13, David Nolen <dnolen...@gmail.com> wrote:
> On Sat, Jun 19, 2010 at 4:10 PM, Michał Marczyk
> <michal....@gmail.com> wrote:
>> (defn fact [n]
>>  (loop [n n r 1]
>>    (if (zero? n)
>>      r
>>      (recur (dec n) (* r n)))))
> Huh? That doesn't look like it's going to work at all.
> 1) 1 is primitive, we know that, accept it
> 2) we don't know the type of n, what will (* r n) be?
> 3) BOOM!

Yes, I know what's going on, after reading through this rather lengthy
thread and participating in / listening in on a number of exchanges on
#clojure. I still find the notion that this code "doesn't look like
it's going to work at all" to be fairly surprising. (Also, I'd rather
not refactor a private loop so that it becomes exposed to the outside
world (presumably to be instructed to ignore the binary overload), but
that is beside the point. The (num ...) hint is a simple enough
solution, which incidentally I'd be able to live with if things were
not to go the way I prefer.)

I wanted to address the pre-BOOM points above, though:

1) With the other approach being proposed, this would read something
like "1 is primitive in primitive contexts, but will get boxed if used
as an initialiser for an unhinted local". That's rather a significant
improvement over the stylistically matching description of the
1.2-master (& 1.1) state of affairs (which I'll omit from here while
noting that apparently acceptance was pretty low on that). I'm very
happy to see Clojure moving forward w.r.t. the way in which top
numeric performance can be achieved; that leaves plenty of room for
doubting whether it should be the outright default. So, how about we
address this point...?

2) That is in fact precisely the reason why I find it odd to presume
that we know the desired type of r based on the literal alone.

The question I wanted to pose when posting that factorial example was this:

Should basic arithmetic work with no explicit hints / casts / primed
ops / whatever -- or not necessarily?

This seems like a valid question to me. If the answer is "not
necessarily", I'll probably be able to get used to the idea; if it is
"yes", or at least "hopefully", then perhaps there's something left to
tweak / change etc. Note that if there was a way to have
primitive-by-default arithmetic, yet also have naive arithmetic code
work without hinting, this would be orthogonal to the issue of the
choice of default. (In fact, it could affect my personal preference,
or at least the degree to which I care about it.)

Sincerely,
Michał

Re: Enhanced Primitive Support Rob Lachlan 6/19/10 7:19 PM
Actually, Mike, your two functions work just fine.  (Equal branch).
Mind you I checked that out over two hours ago, so this information
might be out of date.

Rob

On Jun 19, 6:59 pm, Mike Meyer <mwm-keyword-googlegroups.
620...@mired.org> wrote:
> On Sat, 19 Jun 2010 20:40:13 -0400
>
Re: Enhanced Primitive Support Michał Marczyk 6/19/10 7:20 PM
On 20 June 2010 03:59, Mike Meyer

<mwm-keyword-googlegroups.6209c5@mired.org> wrote:
> So here's some new
> examples, pulled in sequence from some of my combinatorics code:

This works fine as far as arithmetic ops are concerned, though I
suppose it might suffer from issues of = vs. == now (which is a
separate matter).

Sincerely,
Michał

Re: Enhanced Primitive Support Aaron Cohen 6/19/10 7:26 PM
I actually believe it will throw an overflow exception if col contains
more than Integer.MAX_VALUE elements. Hard to confirm without getting
bored though.

-- Aaron

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Rob Lachlan 6/19/10 7:34 PM
I don't think my computer has enough memory to test that.  But, you're
right, I might have spoken hastily.

Rob

On Jun 19, 7:26 pm, Aaron Cohen <aa...@assonance.org> wrote:
> I actually believe it will throw an overflow exception if col contains
> more than Integer.MAX_VALUE elements. Hard to confirm without getting
> bored though.
>
> -- Aaron
>
>
>
Re: Enhanced Primitive Support Mike Meyer 6/19/10 8:22 PM

"Rob Lachlan" <robert...@gmail.com> wrote:

>Actually, Mike, your two functions work just fine.  (Equal branch).
>Mind you I checked that out over two hours ago, so this information
>might be out of date.
>
>Rob
>
>On Jun 19, 6:59 pm, Mike Meyer <mwm-keyword-googlegroups.

Ok, why does this work but the fact fail? Or does the fact example still fail on that build?

The fact that this requires explanation is a pretty good argument the fact behavior.


>> (defn count-in [value col]
>>    (loop [value value col col res 0]
>>       (if (empty? col)
>>           res
>>           (recur value (rest col) (if (= (first col) value) (inc res) res)))))
>>
>> (defn ones-n-zeros [vectors]
>>   (loop [vectors vectors m-zeros 0 m-ones 0]
>>      (if (empty? vectors)
>>          [m-zeros m-ones]
>>          (let [data (first vectors)
>>                zeros (count-in 0 data)
>>                ones (count-in 1 data)]
>>             (recur (rest vectors) (if (> zeros ones) (inc m-zeros) m-zeros)
>>                                   (if (> ones zeros) (inc m-ones) m-ones))))))

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Re: Enhanced Primitive Support Rob Lachlan 6/19/10 8:41 PM
Because the compiler is upset that it doesn't know what n is.  r is a
long, but n is ???.  The following works:

(defn ^:static fact [^long n]
  (loop [n n r 1]
    (if (zero? n)
      r
      (recur (dec n) (* r n)))))

Or see Dnolen's version above.  But yeah, I wish that it still worked,
because it used to work just fine.

Rob

On Jun 19, 8:22 pm, Mike Meyer <mwm-keyword-googlegroups.
620...@mired.org> wrote:
Re: Enhanced Primitive Support Mike Meyer 6/19/10 9:01 PM

"Rob Lachlan" <robert...@gmail.com> wrote:

>Because the compiler is upset that it doesn't know what n is.  r is a
>long, but n is ???.  The following works:
>
>(defn ^:static fact [^long n]
>  (loop [n n r 1]
>    (if (zero? n)
>      r
>      (recur (dec n) (* r n)))))
>
>Or see Dnolen's version above.  But yeah, I wish that it still worked,
>because it used to work just fine.

Is there reason the compiler can't box r instead of throwing an exception?


--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Re: Enhanced Primitive Support Aaron Cohen 6/19/10 9:02 PM
On Sat, Jun 19, 2010 at 11:22 PM, Mike Meyer
<mwm-keyword-googlegroups.6209c5@mired.org> wrote:
>
> "Rob Lachlan" <robert...@gmail.com> wrote:
>
>>Actually, Mike, your two functions work just fine.  (Equal branch).
>>Mind you I checked that out over two hours ago, so this information
>>might be out of date.
>>
>>Rob
>>
>>On Jun 19, 6:59 pm, Mike Meyer <mwm-keyword-googlegroups.
>
> Ok, why does this work but the fact fail? Or does the fact example still fail on that build?
>
> The fact that this requires explanation is a pretty good argument the fact behavior.
>

>> (defn fact [n]


>>  (loop [n n r 1]
>>   (if (zero? n)
>>     r
>>     (recur (dec n) (* r n)))))


(The quoting in this thread is getting out-of-hand, everyone please
reply on the bottom)

The difference between your examples and the fact examples is that

(inc prim) -> prim

while

(* prim boxed) -> boxed

In the fact example, the type of n as an initializer in the loop
statement is unknown at compile time and thus must be boxed. The fact
that you need to know that, is very much what some people are arguing
against.

The choices are to do either:

(loop [n n r (num 1)]

or

(loop [n (long n) r 1]

Either option will make (* r n) return a primitive or boxed value of
the appropriate type. Both versions will overflow and throw an
exception at (fact 21).

To have a fact that works for large numbers you need:

(defn fact [n]
 (loop [n n r (num 1)]
   (if-not (pos? n)


     r
     (recur (dec n) (*' r n)))))

Most of the versions posted so far were broken for negative n by the way.

Re: Enhanced Primitive Support Meikel Brandmeyer (kotarak) 6/20/10 12:07 AM
Hi,

Am 20.06.2010 um 01:10 schrieb Michał Marczyk:

> (defn fact [n]
>  (loop [n n r 1]
>    (if (zero? n)
>      r
>      (recur (dec n) (* r n)))))

Maybe I'm spoiled by OCaml, but why can't things be inferred?

n - We don't know.
1 - primitive
r - primitive
zero? - We don't know because of n.
dec - We don't know because of n.
* - We don't know because of n.

So, now the programmer has to choices.

1. He annotates n as primitive and *boom* full speed.
2. He does not annotate n. Then we box things.

A flag like *warn-on-boxing* can help to identify these spots. These works for all kind of things. Not only for contrived fact and fib exampls.

(loop [n 1]
  (if (pred? n)
    (recur (foo n))
    n))

n - primitive
foo - We don't know.

As it stands box things. Annotate foo to return something primitive and things go fast.

There are probably tons of problems with this, I don't understand or even know. So ignore at will.

Sincerely
Meikel

Re: Enhanced Primitive Support Carson 6/20/10 1:04 AM
On Jun 19, 9:02 pm, Aaron Cohen <aa...@assonance.org> wrote:
> On Sat, Jun 19, 2010 at 11:22 PM, Mike Meyer
> <mwm-keyword-googlegroups.620...@mired.org> wrote:
Or this! (on the new equal branch)

user=> (defn fact [n]
 (loop [n n r 1.0]
   (if (zero? n)
     r
     (recur (dec n) (* r n)))))
#'user/fact
user=> (fact 5)
120.0
user=> (fact 5N)
120.0

:)  Not sure I understand why completely since I don't entirely
understand the interaction of the Clojure code, LispReader.java, and
Numbers.java, but it seems a double multiplied by anything returns a
double now.  A long will be returned by multiplication only if both
arguments are long, unless you ask for auto-promoting multiplication,
in which case it always returns a Number object. If one of the
arguments to multiplication is an Object and the other a long, it
returns a Number object --- I suppose because if the Object turns out
to be a Double rather than a Long, then it has to return a Double
rather than a Long object?

So it seems like it's not enough to see a literal and think it's
initializing the variable r to be a primitive, but you'd need to know
what kind of primitive it is (long or double?), and what effect it has
on the return value of the functions they are arguments to.  Eg, by
looking at Numbers.java, I figured to try making r double so to force
* to return double.

This is all in "theory," of course.  Whether you think it matters day-
to-day is something else.  On that note, my own code works fine on the
new equal branch.

It does concern me a little bit that to really understand what a
function or loop/recur will do, it seems I'd have to know the kind of
primitive type that the variables are initialized to, or that the
functions I call might return...

Carson

PS: how come everyone is calling throw-on-overflow "unsafe" now?
hmm...
Re: Enhanced Primitive Support Nicolas Oury 6/20/10 3:39 AM
It seems to me that there is mainly 2 categories in this thread:

- people that are worried about the difficulty to understand and to code with the throw-on-overflow semantic. They worry about  untested cases showing in production code and the steeper learning curve of the language. (I am not of this side, so please forgive me if I don't understand the argument well enough)

- people that are annoyed to lose some speed everywhere in programs, in a way that can't be easily profiled or solved without annotating every lines. People that also believe it is rare and predictable to overflow a long.

I think both arguments are valid, and that most people won't change their mind on this subject. 
In consequence, I would like to advocate a proposal along the lines of: 
- A set of operations per return size, labelled by their size. +-long, +-boxed, +-float,+-double ... (I am not good with name, so I am sure other people can come with better names) . They are overloaded on their arguments, but indicate return size (I think that's what the last branch does, but then again I might be totally wrong).
- A set of operation without annotations : +, - ,...

By default, these operators map to +-boxed, --boxed,...
This solves the worries about the steep learning curve, because for beginners the story stops here.

There is a flag allowing to change the default value, (or a namespace trickery, but it make it harder to apply to library code).

That should come with some coding convention:
- use +-boxed when you write a function that creates int exponentially bigger that its argument. (fact is in this category)
- use +-long when you write code abound integers bound to a resource  (indices, counters and the like...)
- use the untagged operators in the other situations, +-boxed when in doubt.
 
I would also advocate that there should be a +-small family that is either +-long or +-int depending on the platform. (+-long on modern computers, but open the way to +-int on phones, or +-double on Javascript platform where there is no integer, if I am no wrong.)

For those worrying about a semantic-changing flag, I would like to note that it is not semantic-changing, it is only resource-size changing.
You answer the question : what is the size of an integer? It is not different than choosing your heap size.

If you don't have enough resource there is an exception (not a silent failure, so this is safe) and you can add more resources. 

I like this because:
- it gives a clear story for beginners
- it gives an easy way for people having very important production code (like a server), that does not need very high performance
- it gives an easy solution for people having some code that need higher performance
- it paves the way for other platforms
- it is a pay-as-you-go approach, you don't have to understand any of that before you need it.

What do you think?

Re: Enhanced Primitive Support Heinz N. Gies 6/20/10 6:34 AM

On Jun 19, 2010, at 15:58 , Rich Hickey wrote:

> I am telling you though, if you continue with these "show stopper", "will keep people from using Clojure" sky-is-falling histrionics, I will stop reading your messages.

My apologies here, simple reason is I love clojure and it is one of the best (IT wise) things that had happened to me in a long time (actually since I bought my first Apple but that is a different story). I personally for my own code am very much fine with everything and will hardly leave for something that will hardly ever affect me (sorry to disappoint people here). I simply want the best for clojure since I hope for many more people to feel the joy and fun I had when I first discovered clojure, that is why I argue strongly against changes that, in my eyes will take this from those people simply in the fear that when I explain one of my friends how great clojure is (and it certainly is) they will look oddly at me and tell me 'wait this isn't great this is confusing and frustrating.' I can't guarantee this will happen and of cause this is just an argument but this is what I fear. I actually wanted to explain a friend who is very new to clojure (but a quite versed programmer) why I'm so eager in this matter and I seriously had a problem to explain him what is so confusing because the mix of static and dynamic typing and why loop acts like this was very hard to explain without starting from a very low level and explain clojures inner workings with numbers, loops, recurs and types. But I'll try to keep things a little bit less enthusiastic I promise :).

That said I think we're currently arguing three topics intermixed and this might lead to some confusion, at least I see the two big topics here which are not entirely mixed but interact in some areas.

1) The behavior of auto promoting of operators.

2) The default type of literals (primitives vs boxed).

3) The behavior (and typing) of loop.

To 2)
I think at least 2 is quite separated from the other two issues. When I'm not wrong we can have primitive literals without loosing anything at all, just being faster until they need to be boxed (unless I overlook something).

To 1)
At 1 I wonder could we have auto promoting but only if it becomes required? As in (+ 1 1) returns (long 2) but (+ 100000000000 1000000000000) returns a boxed value? This would I think give us fast arithmetics as long as we use small enough values but graceful behavior when we need big math? Or will this again slow things down a lot? If this would be a compromise for a semi fast but still non breaking (as in throws) way how about this as default and +' as the version that enforces fast math and won't ever promote but throw where it really really matters?

To 3)
To the third thing and in my eyes this is the most challenging and problematic issue. loop in current clojure is not statically typing (of cause it is but it tosses objects in any field so you can pass whatever you want and it still works). The current qual branch introduces silent (implicit) static typing if you use primitives this in my eye can lead to a lot of confusion and requires the user to know a lot about clojures under the hood to safely use recur.

Some people complained 'there are no /real/ breaking examples' and that is true, there aren't because the use of loop itself is very rear I think most people either use the list functions or laziness instead of recur. Also people who use loop recur in a mathy way (where problems might arise due to lot of number handling) are either beginners who write try out code as the fact example commonly shot down for being 'not real' or people who very much know-what-they-do<tm> and will most likely already have type hinted their code to squeeze the speed out of it so neither will it break nor will they experience a speed improvement.

My dream would be that loop would 'just work' and 'just work fast' too, of cause this isn't easy. Generally two ideas spring to mind and were already mentioned. One being compiling the loop twice (I write it in pseudo code here since I am not sure how clojures compiler works, I tried to read the loop part but I got confused :P):

(loop [n n r 1]
        (code ..)
        (recur (f n) (g r)))

#loop label <object n object r>
... loop code ...
recur new-n new-r

#loop label <object n long r>
... loop code ...
recur new-n new-r (overloading the loop label r could either be long which would jump to the fast version or not long which would jump to the object version).

I am entirely not sure if this is possible or not, but if it were heck this would be a nice way to go ^^. To solve the problem with too many variables I'd not go with one overload for every possible combination but one overload with all objects (old way) and one overload with everything nice fast primitive types - meaning that when one primitive fails all of them have loose their primitiveness.

Another option would be as someone above I'm not entirely sure who any more, suggested that instead of the compiler warning about not returning a primitive the loop just would box that primitive and toss a warning if warn on reflection is set to true - this would be fast where the compiler can decide it is correct, never wrong, and quite easy findable and optimizable due to the reflection flag.

Regards,
Heinz

Re: Enhanced Primitive Support Steven E. Harris 6/20/10 7:28 AM
David Nolen <dnolen...@gmail.com> writes:

> Using loop/recur is already the beginning of code smell for anything
> that is not performance sensitive.

[...]

In your arity-overloaded example, is there any runtime cost to figure
out which of the two overloads to choose each time `recur' evaluates?

--
Steven E. Harris

Re: Enhanced Primitive Support Nicolas Oury 6/20/10 8:57 AM
On Sun, Jun 20, 2010 at 2:34 PM, Heinz N. Gies <he...@licenser.net> wrote:

To 3)
To the third thing and in my eyes this is the most challenging and problematic issue. loop in current clojure is not statically typing (of cause it is but it tosses objects in any field so you can pass whatever you want and it still works). The current qual branch introduces silent (implicit) static typing if you use primitives this in my eye can lead to a lot of confusion and requires the user to know a lot about clojures under the hood to safely use recur.


With what I think is 1.2-MASTER, a few weeks old (I start to be a bit lost with many clojure.jar):

user=> (loop [i (long 1)] (recur :a))
#<CompilerException java.lang.RuntimeException: java.lang.IllegalArgumentException: recur arg for primitive local: i must be matching primitive (NO_SOURCE_FILE:2)>

There is static typing for primitive already. (And I think it has been like that since a long time). It is what allows clojure to java-like fast on hot loops.
The new problem, as far as I understand it, is that, if literals are primitives, my snippset is equivalent to 
 (loop [i 1] (recur :a))

It might be easier to have the same default for literals and return values of operators (even if it is, as I would like, a parameterizable default).
Or to have a warning and a cast, for non primitive recur in a primitive loop?
(Might help if the next iteration comes as a result of a non-static function)
Re: Enhanced Primitive Support Heinz N. Gies 6/20/10 9:04 AM

On Jun 20, 2010, at 17:57 , Nicolas Oury wrote:
> With what I think is 1.2-MASTER, a few weeks old (I start to be a bit lost with many clojure.jar):
>
> user=> (loop [i (long 1)] (recur :a))
> #<CompilerException java.lang.RuntimeException: java.lang.IllegalArgumentException: recur arg for primitive local: i must be matching primitive (NO_SOURCE_FILE:2)>
>
> There is static typing for primitive already. (And I think it has been like that since a long time). It is what allows clojure to java-like fast on hot loops.
> The new problem, as far as I understand it, is that, if literals are primitives, my snippset is equivalent to
>  (loop [i 1] (recur :a))

Yes the different being in my eyes is the change from explicit (as in type hints or direct casts) to implicit (as in just since there is 1) typing.

Regards,
Heinz

Re: Enhanced Primitive Support MarkSwanson 6/20/10 9:16 AM
> A flag like *warn-on-boxing* can help to identify these spots. These works for all kind of things. Not only for contrived fact and fib exampls.

+1.
I think something like *warn-on-boxing* would be helpful. I'd use it.

Re: Enhanced Primitive Support Luke VanderHart 6/20/10 9:57 AM
I've been reading this thread, and there's good arguments being made
both ways - I've just been reading with interest. But after seeing the
factorial function that won't compile without hints/casts, I feel I
have to speak up.

I wrote a book on Clojure. It's a short book. That's because Clojure
is simple. It's dynamically typed, and everything "just works." In the
current version of Clojure, you can use numbers all day long without
worrying about types. Type hinting and casting is an advanced topic -
I think it should stay that way.

It seems evident that some sort of compromise will be made on this
question. That's fine. I genuinely could go either way. But whatever
combination of features makes it into the final decision, they should
absolutely, vehemently not *require* knowing how to type hint or cast
just to get something basic to compile (as in the factorial example).
That is antithetical to the whole concept of dynamic typing. Having
exceptions thrown on overflow is ok with me - that's something
programmers are used to putting up with, I think, as long as the rules
are clear. I don't care what the rules are, as long as they're simple
to understand and based on a few axioms. If integers are bounded,
that's fine, if they automatically promote, that's fine, as long as
the rules are clear. If there's a separate set of math functions which
do or don't promote, that's fine.   But don't, please don't, do
anything that would mean I'd have to explain the intricacies of
primitives, boxing, hinting and casting in an "Intro to Clojure"
course. As much as humanely possible, that should be reserved for the
"Performance coding in Clojure" sequel.

Thanks,
-Luke
Re: Enhanced Primitive Support David Nolen 6/20/10 10:27 AM
On Sun, Jun 20, 2010 at 12:57 PM, Luke VanderHart <luke.va...@gmail.com> wrote:
anything that would mean I'd have to explain the intricacies of
primitives, boxing, hinting and casting in an "Intro to Clojure"
course. As much as humanely possible, that should be reserved for the
"Performance coding in Clojure" sequel.

Thanks,
-Luke

This begs the question: Is loop/recur an advanced / performance coding topic?

(defn fact [n]
  (reduce *' (range 1 n)))
Re: Enhanced Primitive Support Laurent PETIT 6/20/10 10:42 AM
2010/6/20 David Nolen <dnolen...@gmail.com>:

> This begs the question: Is loop/recur an advanced / performance coding
> topic?
> (defn fact [n]
>   (reduce *' (range 1 n)))

no

Re: Enhanced Primitive Support Garth Sheldon-Coulson 6/20/10 10:52 AM
Like Luke, I have been reading this thread with interest. For what it's worth, I'm in basic agreement with him.

I might take it a step further. I don't think anyone should have to think about boxing and primitives when writing standard idiomatic code -- code that uses +, *, loop, and recur. The joy of Clojure is, in my opinion, that it makes functional programming easy. How numbers are stored is an implementation detail that I would really prefer not to have to consider when doing something "easy." For the vast majority of applications, computers are fast enough these days to handle a little boxing and unboxing without anyone ever noticing.

Many people younger than 25, especially casual programmers, have never programmed in a statically typed language. Many of my friends were introduced to programming in Python or Ruby or Mathematica or what have you, and have never used Java or C, and will never have any good reason to. Clojure is a good enough language in so many respects, in my opinion, that it will increasingly be used by casual programmers (perhaps mathematicians, economists, scientists, financial analysts) who need a powerful, simple, concurrency-oriented language to program their multicore machines. I think it would be a shame if the using the standard operations correctly required an understanding of concepts that are otherwise foreign to a dynamically typed language. Yes, people might get it right by accident most of the time because longs rarely overflow in normal use -- but getting it right by accident is not getting it right. Getting it right involves proper understanding.

I am completely in favor of a parallel universe that makes it easy to optimize without filling one's code with type hints. If that parallel universe involves functions called +', *', recur', loop', that's great. In Mathematica, there's a contagious operator called N that you can wrap numbers in to make them machine numbers rather than arbitrary precision numbers (sort of an inverse of the num function in the latest proposal). I don't think the Mathematica system would work well here, but it's based on a principle I like: make all numbers auto-promoting (integers) and arbitrary-precision (floats) by default until the user specifically requests otherwise, and make it easy to request otherwise. I think transients fit the more general version of this principle perfectly, and it would seem strange to me to go in a different direction for numbers.

These views are of course colored by my own needs (mathematical programming and statistics, where one sometimes deals with very large or small numbers if one is too lazy to work in logs). I know others have other needs. I'm sure a good compromise will be reached. It wouldn't be the end of the world if I had to use primed operators in my mathy code, but I would probably wince a little each time.

Garth

P.S. I don't consider loop-recur an advanced or performance-coding topic. It's the only way to get constant-stack recursion in Clojure. Many would consider recursion the most foundational of programming topics.

On Sun, Jun 20, 2010 at 12:57 PM, Luke VanderHart <luke.va...@gmail.com> wrote:
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Luke VanderHart 6/20/10 11:07 AM
As Rich has mentioned a couple times, though, let's be careful not to
conflate "bounded integers" with "unsafe integers." In the proposed
system, an overflowing number would throw an exception. Presumably the
documentation would make it very clear what the behavior of numbers
is. That's not unsafe or unexpected in any way, even for a dynamic
language, it's just different.

The only thing that sends shivers down my spine is *requiring*
knowledge of casting and type hinting. I don't think that's
appropriate to demand of someone new to Clojure. Let them learn to
enjoy Clojure first, before showing them the ugly underbelly of what's
sometimes necessary for performance.

-Luke
Re: Enhanced Primitive Support Garth Sheldon-Coulson 6/20/10 11:25 AM
Yes, you're right. I wasn't suggesting that someone without an understanding of static types in general or Java types in particular would be liable to write *unsafe* code. I was saying that he or she might be prone to writing code that produces runtime exceptions, and that these exceptions might not appear right away but rather only with certain runtime inputs. To me, correct code is code that isn't liable to produce unexpected exceptions.

So, my own preference, for myself and also for those to whom I am likely to introduce Clojure, is always to have a fully auto-promoting and arbitrary-precision numerical stack by default, with optimizing operations as an option. I would be perfectly happy, however, if the decision went the other way. (What's a prime symbol among friends?)


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Nicolas Oury 6/20/10 11:30 AM
Every code is prone to produce runtime exceptions on some input (mainly StackOverflow or OutOfMemory). 
That's just one more reason to produce.

I agree that it is one harder to grasp for a non computer scientist, but it is quite a simple one to manage for the more advanced user.
There is a difficult question beginner vs more knowledgeable people.

I don't believe overflow exception is hard for a beginner. (They know how an integer is stored in a computer), but they should not need to write any 
annotations. 


On Sun, Jun 20, 2010 at 7:25 PM, Garth Sheldon-Coulson <ga...@mit.edu> wrote:
Yes, you're right. I wasn't suggesting that someone without an understanding of static types in general or Java types in particular would be liable to write *unsafe* code. I was saying that he or she might be prone to writing code that produces runtime exceptions, and that these exceptions might not appear right away but rather only with certain runtime inputs. To me, correct code is code that isn't liable to produce unexpected exceptions.

Re: Enhanced Primitive Support Luc 6/20/10 12:03 PM
I find these compromises quite
acceptable. Tuning code has always
been a last step in most dev. projects
I worked on except a few and I worked
on many resource constraint
platforms.

All languages I encountered had a
"default" integer/float implementation
and getting away from it rarely
occured but was doable if needed
either through some coding
trickery or compilation flags.

The mass majority need a simple
approach/implementation and this
proposal meets this requirement.
Having to care about the number sizes
is not the first design concern.
 
As for callers using the default implementation, some API trickery could
allow them to call a library using a
different implementation and get
return values in a compatible way
if it makes sense.

Specialized libraries could still force
callers to use the same implementation if you really need the same performance or 
precision in you own code.

For specialized arithmetic implementations,
I find the namespace approach
interesting. It may prove valuable
since at a glance the implemtation
chosen by any piece of code
 would be pretty obvious.

Changing from one implementation
to another one could possibly be
easier to do with this approach.
Swapping the name space or adding
one to overload the default one
is quite practical.

The line by line annotation is a
painful approach to such changes and
should be only required in rare cases
(mixed operand computations ?)


Luc P.

Sent from my iPod
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
Re: Enhanced Primitive Support Sean Corfield 6/20/10 12:04 PM
On Sun, Jun 20, 2010 at 11:30 AM, Nicolas Oury <nicola...@gmail.com> wrote:
> I don't believe overflow exception is hard for a beginner. (They know how an
> integer is stored in a computer), but they should not need to write any
> annotations.

I agree with Nicolas. An overflow says "you did math that doesn't work
with basic numbers" and pretty much everybody coming to computers via
any path may encounter that from time to time. It's basic stuff.

What's more puzzling to beginners is why simple code executes very
slowly (compared to Java or whatever). I see this crop up in every
single language community that I've been part of. A language that does
dynamic magic with numbers causes more complaints about poor
performance than any optimized language does about overflow
exceptions, in my experience.

I would vote strongly against a system where a simple, 'obvious'
program failed to compiled / run (for reasonable input) without
advanced stuff like type hinting or additional language constructs.

If you know you're dealing with big numbers and need to avoid overflow
exceptions at the cost of performance, that's a decision you can be
informed to make and adjust your code accordingly.

I have sympathy with both sides of the argument in this thread and
I've gone back and forth over the last couple of days. After seeing
all the pros and cons (and now we're seeing a lot of things repeated
which suggests the argument may be running dry), I am now firmly in
the camp that would prefer performance by default and fewer surprises
for folks new to Clojure.

That's probably colored by my background where most of the languages I
grew up with that shaped my experience gave you overflow exceptions
from fact( someLargeNumber ) but they calculated fact(
reasonableNumber ) very fast :)
--
Sean A Corfield -- (904) 302-SEAN
Railo Technologies, Inc. -- http://getrailo.com/
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

Re: Enhanced Primitive Support Richard Newman 6/20/10 12:46 PM
> I agree with Nicolas. An overflow says "you did math that doesn't work
> with basic numbers" and pretty much everybody coming to computers via
> any path may encounter that from time to time. It's basic stuff.

There's a distinction between knowing that something can occur, and  
expecting to have to handle it in user code. (Particularly without  
something like checked exceptions to tell you about it.)

If you're coming from Python, or Common Lisp, or any one of a hundred  
other languages, you know that numeric overflow can occur... and you  
expect your programming language to automatically promote on your  
behalf.

 >>> math.factorial(50) + 1
30414093201713378043612608166064768844377641568960512000000000001L

We similarly expect our PL and OS stack to handle virtual memory  
paging, garbage collection, and so on, even though that's "basic  
stuff" that everyone knows.

The question here is whether Clojure is a "primitive" language, where  
numbers are machine numbers by default, or a "modern" language, where  
getting primitive math might require input from the programmer, but by  
default all operations are promoting and conceal the limitations of  
the hardware. (The big trick is that there are limitations imposed by  
the JVM.)

My personal feeling is that one should be able to write non-hinted  
code and have it work without exceptions for all input ranges, and  
that the compiler should (a) do static analysis to optimize code when  
enough info is available, and (b) provide rich compiler messages to  
allow hints when it's not.

> I am now firmly in
> the camp that would prefer performance by default and fewer surprises
> for folks new to Clojure.

This is interesting.

One set of folks considers non-machine-number performance to be  
surprising, and wants to avoid it.

Another set of folks considers the possibility of arithmetic  
exceptions for certain combinations of input to be surprising, and  
wants to avoid them.

Nobody likes surprises, but discounts the other folks' surprise as  
less important :)


> That's probably colored by my background where most of the languages I
> grew up with that shaped my experience gave you overflow exceptions
> from fact( someLargeNumber ) but they calculated fact(
> reasonableNumber ) very fast :)

I'm used to languages where the compiler gives you both, and if it  
can't it can tell you why...

http://www.franz.com/support/documentation/8.2/doc/compiler-explanations.htm#explain-1

;Tgen1:Examined a call to +_2op with arguments:
;Targ2:  symeval x type in fixnum range (integer 0 64)
;Tinf1:     variable-information: lexical: ((type (integer 0 64)))
;Targ3:  constant 1 type in fixnum range (integer 1 1)
;Tres1:   which returns a value in fixnum range of type (integer 0 64)

versus

;Call1:Generated a non-inline call to /_2op:
;Tgen1:Examined a call to /_2op with arguments:
;Targ2:  symeval sum type in fixnum range (integer -536870912 536870911)
;Tinf1:     variable-information: lexical: ((type
                                              (integer -536870912  
536870911)))
;Targ3:  constant 4096 type in fixnum range (integer 4096 4096)
;Tres1:   which returns a value of type (rational * *)

Re: Enhanced Primitive Support Nicolas Oury 6/20/10 12:58 PM
Actually, that's worst. Most language we grew up with, and most languages I taught, had fact( someLargeNumber ) = 0 very fastly.
(Which is harder to explain, but explainable)
At least, throw on overflow is safe! 


On Sun, Jun 20, 2010 at 8:04 PM, Sean Corfield <seanco...@gmail.com> wrote:
That's probably colored by my background where most of the languages I
grew up with that shaped my experience gave you overflow exceptions
from fact( someLargeNumber ) but they calculated fact(
reasonableNumber ) very fast :)

Re: Enhanced Primitive Support Nicolas Oury 6/20/10 1:03 PM
On Sun, Jun 20, 2010 at 8:46 PM, Richard Newman <holy...@gmail.com> wrote:
The question here is whether Clojure is a "primitive" language, where numbers are machine numbers by default, or a "modern" language, where getting primitive math might require input from the programmer, but by default all operations are promoting and conceal the limitations of the hardware. (The big trick is that there are limitations imposed by the JVM.)

primitive vs modern might not be the right terminology.

Dynamic vs static might be closer to the partition prim arith vs dynamic boxed arith.
There are modern languages that are static...

That's a natural choice when you don't want to talk about type, but a quite expensive one in term of performances.
If something better can be found, that would be great.

Re: Enhanced Primitive Support Heinz N. Gies 6/20/10 1:10 PM

On Jun 20, 2010, at 22:03 , Nicolas Oury wrote:

> primitive vs modern might not be the right terminology.

The main issue I see is not that primitives vs boxed I think when it is done right both will be accepted by the community, the main issue I see is the consequences primitives cause when it comes to things like (or actually only I think) loop.

That it makes loop that were initialized with primitives only work with primitives even if the functions return other stuff. so a loop that was initialized with (loop [n 1] ...) will only work with an recur that returns a long, if no long is returnd it acts oddly doing (* n 0.5) which would give you 0.5 every where else in clojure will (since it is type casted to a long) 1 there which is a very odd behavior even if it is documented. This adds to the fact that some simple looking functions that don't at all look wrong (like fact provided earlyer) don't work any more even so from the pure logic (without looking into the special rules for loop) it looks right.

Regards,
Heinz

Re: Enhanced Primitive Support Sean Corfield 6/20/10 1:52 PM
> This is interesting.
>
> One set of folks considers non-machine-number performance to be surprising,
> and wants to avoid it.
>
> Another set of folks considers the possibility of arithmetic exceptions for
> certain combinations of input to be surprising, and wants to avoid them.
>
> Nobody likes surprises, but discounts the other folks' surprise as less
> important :)

Like I said, our expectations are colored by our experiences and we
all have different experiences.

Whatever decision is ultimately made, it wouldn't affect my decision
to use Clojure or not overall, although it may incline me to use
Clojure for *different* parts of a problem solution (I'm more of a
mixed language solution guy than one size fits all anyway).


--
Sean A Corfield -- (904) 302-SEAN
Railo Technologies, Inc. -- http://getrailo.com/
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

Re: Enhanced Primitive Support Paul Moore 6/20/10 5:35 AM
On 19 June 2010 15:26, cageface <mile...@gmail.com> wrote:
> Maybe it's only because I'm coming from Ruby, in which number
> promotion is automatic and everything is slow, but if I have to choose
> between correctness and performance as a *default*, I'll choose
> correctness every time. I think there's a good reason that GCC, for
> instance, makes you push the compiler harder with compiler flags if
> you want to squeeze extra performance out of a program and accept the
> corresponding brittleness that it often brings. I also always thought
> that the transparent promotion of arithmetic was one of the strongest
> selling points of Common Lisp.
>
> My impression has always been that performance of numerics is rarely
> the bottleneck in typical code (web stuff, text processing, network
> code etc), but that unexpected exceptions in such code are the source
> of a lot of programmer heartache. On the other hand, I think 99% of
> the cases in which I've had a number exceed a 64 bit value were also
> examples of errors that might as well have been exceptions because
> they indicated a flaw in the code.

s/Ruby/Python/ and this is pretty much precisely my view.

I haven't yet hit this situation in Clojure, but a common type of
program I write is to collect and summarise database stats for a range
of systems we support. Here, tablespace sizes range from a few
hundreds of MB, to the half-terabyte mark. To handle these types of
figure, big integers are a necessity - but common test cases don't
need big integers. I understand that, by judiciously scattering
annotations around in my code, I can ensure that I don't hit primitive
integer size limits. I'm not sure that I'll have enough understanding
of numeric limits and promotion behaviour to know *where* to put
annotations, so I'll likely just scatter them round until I stop
hitting bugs (I would assume here that people for whom numeric
performance is crucial will be a lot more expert in numeric behaviour
and so will be better able to annotate precisely, but I accept that's
nothing more than an assumption).

But suppose I want to use some library code - for example, something
like clojure.contrib.accumulators, to summarise my data. It's not
clear to me from the things I've read whether I can expect it to work
with the big numbers I'm giving it. The documentation says nothing, as
it was written before this change. So do I have to read and understand
the code? If library code *won't* "just work" by adapting to the types
given to it, will we end up with a split between bignum and primitive
libraries? It seems to me that many of these libraries would be just
as good for people wanting fast math as for people like me slinging
about "big" numbers like disk file sizes.

The "it's only about which is default" argument is fine (I contend
that people wanting performance are likely to be the more numerically
expert and hence the better capable to enable non-default behaviour,
but that's just my view). But if libraries get split into
bignum-supporting vs fastnum-supporting, that seems to me to be a far
more significant issue.

Paul.

Paul

Re: Enhanced Primitive Support William Wadsworth 6/20/10 5:47 AM
I have been using Clojure for just under a week, but ...

I have been using CL for almost three years now and have lately been
looking
at and actually using Clojure. I think that it should be easy to write
correct code then put it a bit of work later to make it fast.

On Jun 18, 12:44 pm, Christophe Grand <christo...@cgrand.net> wrote:
> On Fri, Jun 18, 2010 at 8:58 AM, Richard Newman <holyg...@gmail.com> wrote:
> > It only takes one library to screw up, and the whole stack falls down.
> > Screwing up can occur because of omission (fact being written with a "1",
> > not "1N", and not being tested with large inputs), or by design.
>
> It can already happen: a library can screw you by design.
>
> (defn fact [n] (loop [r (int 1) n (int n)] (recur (* r n) (dec n))))
> user=> (fact 40)
> java.lang.ArithmeticException: integer overflow (NO_SOURCE_FILE:0)
>
> Actually people who want numerical perfs have to do very invasive type
> hinting and other interfaces workaround because primitives are...
> primitives (not objects).
>
> With contagious bigints (let's nick name them "safeints" and assume
> they are not BigInteger but something à la kawa) a single N on
> literals or having all your inputs going through a safeint conversion
> would trigger safe computations (unless you try to call a static
> primitive-hinted fn but the compiler would then bark).
>
> Of course a rogue fn can still internally convert to primitives and
> overflow but you can't do anything to protect you from that.
>
> As for mixing several numebr representations in a single collection, I
> think it's a problem of normalizing data but most of the time your are
> either in safeland or in fastland so your representations will be
> homogeneous (safehints or (boxed) primitives).
>
> A middleground may be to keep the semantics proposed by Rich but:
> * integer literals default to safeint (unless suffixed by L for long)
> * scientific notation and decimal literals default to double (unless
> suffixed by M or N)
> * implement a better bigint.
>

This, I feel, would be a win/win.

+1

> Christophe
>
> --
> European Clojure Training Session: Brussels, 23-25/6http://conj-labs.eu/
> Professional:http://cgrand.net/(fr)
> On Clojure:http://clj-me.cgrand.net/(en)

---
Martin.
Factorial (Was: Enhanced Primitive Support) Mike Meyer 6/20/10 3:10 PM
On Sun, 20 Jun 2010 13:27:01 -0400
David Nolen <dnolen...@gmail.com> wrote:

> On Sun, Jun 20, 2010 at 12:57 PM, Luke VanderHart <luke.va...@gmail.com
> > wrote:
>
> > anything that would mean I'd have to explain the intricacies of
> > primitives, boxing, hinting and casting in an "Intro to Clojure"
> > course. As much as humanely possible, that should be reserved for the
> > "Performance coding in Clojure" sequel.
> This begs the question: Is loop/recur an advanced / performance coding
> topic?
>
> (defn fact [n]
>   (reduce *' (range 1 n)))

In this case it isn't, because none of these solutions presented
qualify as a "fast" implementation of factorial. As in most cases, you
don't get real speed by trivial tweaks like using primitives instead
of boxed numbers; you real get speed by changing the algorithm to
something fast - and, in clojure, something that will leverage
multi-core hardware.

         <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support puzzler 6/20/10 5:19 PM
The arguments seem to be winding down, so I thought this would be a
good time to summarize my thoughts.

My favorite option of those proposed is:
+, *, -, inc, dec auto-promote.
loop/recur is changed so that primitive bindings are boxed.
+',*',-',inc',dec' give error upon overflow.
A new construct, loop', is introduced that doesn't box primitives and
enforces recur with primitive upon penalty of error.

Why I like this:

*  Ordinary code continues to work almost exactly as before (no new
surprise errors from overflowing longs or incorrect loop/recur usage),
and with similar performance characteristics (and I already consider
Clojure to be quite fast among dynamic languages).  [Note that
changing loop to auto-boxing behavior shouldn't hurt performance
because currently, almost all numbers are boxed unless otherwise
specified with an explicit primitive typecast, so we're already seeing
what "boxed speed" looks like].

* The recipe for optimization is simple.  Declare your function to be
static and type hint the primitive inputs and output.  Inside your
static function, change all +,-,*,inc,dec,loop to
+',-',*',inc',dec',loop' and you'll get great speed improvements with
much less effort than before.  [Maybe through some clever macrology,
you could create something called defn-optimize which combines the
static function definition with a rebinding of +,-,*,inc,dec,loop to
their "prime" counterparts within the body of the static function,
since these constructs are likely to be all used in conjunction with
one another and are the most natural choice within a static function.]

* I believe that it makes the most sense to make the auto-promoting
ops the default because the non-promoting ops only provide benefit
when all the inputs are primitive longs.  Since numbers become boxed
whenever they are stored in a collection or passed to or returned from
a non-static function, it seems likely to me that except in certain
local contexts (for example, within a static function whose inputs are
declared to be primitive, or within a loop), most numbers will
eventually become "contaminated" by boxing anyway.  So I think it will
be easier to annotate those special instances where it is clear from
the local context that you're dealing with all primitives, rather than
the other way around.


I have this preference even though I am not using bigints in any of my
production code.  Except when I'm playing around with Project Euler
problems, most of my code is not particularly numeric in nature.  But
I still feel that the auto-promoting/auto-boxing defaults will provide
me with much less cognitive burden.  I don't want to worry about what
types of numbers are flowing into and out of the various calls and
loops to make sure that something will work the way I expect.  Types
are mostly "hidden" in Clojure, so it's difficult enough to analyze
this flow in one's own code, and it would certainly be even harder to
figure out what's going on when reading someone else's code.

The #1 reason I code in Clojure is that I find it easy to express
complex concepts in ways that are easy to understand and easy to
verify.  For me, keeping that clarity is paramount.

Re: Enhanced Primitive Support Luc 6/20/10 5:47 PM
I would expect a library to internaly
work with whatever is the best
implementation and provide an API
to allow callers using another implementation to use it.

I also expect a library to complain
about a potential overflow
and maybe some precision loss.

That's not different from what you can
see on many platforms over the last 20
years. Multiple float implementations
existed on the same platforms for
years and you had to choose one
considering performance and precision.

Of course this requires some discipline
by the library provider but the
performance requirements may dictate
that choice depending on its purpose.

Luc P.

Sent from my iPod

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient  
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support David Nolen 6/20/10 6:03 PM
On Sun, Jun 20, 2010 at 8:19 PM, Mark Engelberg <mark.en...@gmail.com> wrote:
My favorite option of those proposed is:
+, *, -, inc, dec auto-promote.
loop/recur is changed so that primitive bindings are boxed.
+',*',-',inc',dec' give error upon overflow.
A new construct, loop', is introduced that doesn't box primitives and
enforces recur with primitive upon penalty of error.

Until the discussion on loop/recur I was ok with fast math taking '. But now that I see that people are proposing to change the behavior of loop/recur (in way I that think is inconsistent with the behavior of literals under the equal branch). I'm now firmly on the side that people who want auto-promotion should use '.
Re: Enhanced Primitive Support Mike Meyer 6/20/10 8:11 PM

"David Nolen" <dnolen...@gmail.com> wrote:

The ' (prime) names are mnemonic for "primitive type" math. That doesn't work if you reverse the meanings. In that case the names would be better with something like do + - * / ++ -- (or maybe 1+ 1- or +1 -1 though that requires tweaking the definition of symbol) for primitive math and loop add sub mul div inc dec for boxed math.
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Re: Enhanced Primitive Support Heinz N. Gies 6/20/10 10:59 PM

On Jun 21, 2010, at 2:19 , Mark Engelberg wrote:

> The arguments seem to be winding down, so I thought this would be a
> good time to summarize my thoughts.
>
> My favorite option of those proposed is:
> +, *, -, inc, dec auto-promote.
> loop/recur is changed so that primitive bindings are boxed.
> +',*',-',inc',dec' give error upon overflow.
> A new construct, loop', is introduced that doesn't box primitives and
> enforces recur with primitive upon penalty of error.
+42

I find this is a very good summery and I about totally agree with it. I'd just like to add some 'next steps' or 'next goals' that would be good. it would be good if the auto promoting +, -, * ... would in the long run leave a long a long as long as it can (what I mean is that  (+ long long) is again long as if it does not overflow, same for the other ops especially inc and dec (since used for counters. And at one point of things go on it'd be great if loop would become 'smart enough' to decide when boxing is necessary and when it can keep stuff in primitives for speedy things. If we put this as our goals we mostly don't need forced ' to switch between promoting and non promoting since it would be just working and would be just fast :)

Regards,
Heinz

Re: Enhanced Primitive Support Nicolas Oury 6/21/10 12:32 AM
That's not true. primitive loop is one of the key of the java-like performance of clojure, once there are a few annotations.
Re: Enhanced Primitive Support Nicolas Oury 6/21/10 1:06 AM
On Mon, Jun 21, 2010 at 1:19 AM, Mark Engelberg <mark.en...@gmail.com> wrote:
The arguments seem to be winding down, so I thought this would be a
good time to summarize my thoughts.

I will do that too.
The two sides of the arguments have both good points and won't change their position. 
So, my proposal:
- a different set of labeled operators +', +#  per family of return value. (boxed, long, maybe more to come...)
- the default for the non-labeled operators +,-... can be choose either by a flag or by a ns trickery. ( I have to say I advocate the flag)
- the flag also change the default for literals.
- Default for the the non-labeled operator is boxed. It removes the problem for loop/recur, and gives back the problems are they are now.
- People that don't know use the default operators
- people that explicitly knows their code has some interesting properties (always small int, 
can produce a big int with very small inputs) specifies it by using explicitly the specialised operators. 

The flag *use-primitive-as-default* is a big help for optimization. Most of the performance hits of boxing do not show in profiling.
(It slows down a lot of functions, by memory access and - more importantly - filling the cache with garbage,  that do not seem to be arithmetic-heavy)

The do-not-worry default allows an easy learning curve.

The specialized operators in both directions allows to explicit some semantic differences in the meaning of the operations. It is good to give a way to people to 
specify, if they want to, what they know about their program.



Re: Enhanced Primitive Support Nicolas Oury 6/21/10 1:07 AM
On Mon, Jun 21, 2010 at 9:06 AM, Nicolas Oury <nicola...@gmail.com> wrote:
- Default for the the non-labeled operator is boxed. It removes the problem for loop/recur, and gives back the problems are they are now.


oups. Not really awake. Monday morning :( 
I meant:

- Default for the the non-labeled operator is boxed. It removes the problem for loop/recur without annotations, and gives back the problems as they are now with annotations.
Re: Enhanced Primitive Support puzzler 6/21/10 3:59 AM
On Mon, Jun 21, 2010 at 12:32 AM, Nicolas Oury <nicola...@gmail.com> wrote:
> That's not true. primitive loop is one of the key of the java-like
> performance of clojure, once there are a few annotations.

Right, the key words here are "once there are a few annotations".
Those who aren't making those annotations will see the same
performance they saw before.  Those who have been making annotations
will simply use loop', which would trigger the primitive bindings and
give the java-like performance with much less annotation than before.

Re: Enhanced Primitive Support Tim Daly 6/21/10 4:40 AM
This debate happened many years ago for common lisp.

> The flag *use-primitive-as-default* is a big help for optimization.
> Most of the performance hits of boxing do not show in profiling.
> (It slows down a lot of functions, by memory access and - more
> importantly - filling the cache with garbage,  that do not seem to be
> arithmetic-heavy)
Except for renaming this appears to be the common lisp
(proclaim (safety 0) (speed 3))

>
> The do-not-worry default allows an easy learning curve.
>
Except for renaming this appears to be the common lisp
(proclaim (safety 3) (speed 0))

Rich explicitly said he did not want this solution.

Tim Daly

Re: Enhanced Primitive Support Heinz N. Gies 6/21/10 5:23 AM

On Jun 21, 2010, at 13:40 , Tim Daly wrote:

This debate happened many years ago for common lisp.
...

Rich explicitly said he did not want this solution.

Tim Daly

it would be more or less trivial to make a macro that switches between non promoting + primitive loop and promoting and boxed loop:

(defmacro fast [body]
  `(let [+# +, *# *, -# -, loop# loop ...]
     (binding [+ +', * *', - -', loop loop' ...
                      +' +#, *' *#,-' +#, loop' loop# ...]
      ~@body)))
;; NOT TESTED! Just a brainstorming

And of cause it would just work the same when the defaults were the other way round only would need to rename fast to safe or just-working or whatever is fitting.

It does not has to be part of c.core or c.contrib if Rich does not like it but it might very well be in every code if someone likes it for their little realm as I would.

Regards,
Heinz
Re: Enhanced Primitive Support Konrad Hinsen 6/21/10 5:37 AM
On 21.06.2010, at 02:19, Mark Engelberg wrote:

> The arguments seem to be winding down, so I thought this would be a
> good time to summarize my thoughts.
>
> My favorite option of those proposed is:
> +, *, -, inc, dec auto-promote.
> loop/recur is changed so that primitive bindings are boxed.
> +',*',-',inc',dec' give error upon overflow.
> A new construct, loop', is introduced that doesn't box primitives and
> enforces recur with primitive upon penalty of error.

My preference is the same, for pretty much the same reasons.

Konrad.

Re: Enhanced Primitive Support Nicolas Oury 6/21/10 6:43 AM
I though his concern was mainly readability, but maybe I misread. Anyway 
(set! *default-integers* :boxed)

or

(set! *default-integers* :long) seems easier to understand.

 
Re: Enhanced Primitive Support Lee 6/21/10 7:04 AM

On Jun 21, 2010, at 8:23 AM, Heinz N. Gies wrote:
> it would be more or less trivial to make a macro that switches between non promoting + primitive loop and promoting and boxed loop:
>
> (defmacro fast [body]
>   `(let [+# +, *# *, -# -, loop# loop ...]
>      (binding [+ +', * *', - -', loop loop' ...
>                       +' +#, *' *#,-' +#, loop' loop# ...]
>       ~@body)))
> ;; NOT TESTED! Just a brainstorming


If the body does anything in new threads, e.g. in a pmap, then they won't be fast (or correct, if you reverse the defaults).

I guess this a somewhat unrelated newbie question: aren't there often cases like this where one wants to temporarily rebind something, but across all children threads? Is there some straightforward way to do this that I've missed?

 -Lee

--
Lee Spector, Professor of Computer Science
School of Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspe...@hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438

Check out Genetic Programming and Evolvable Machines:
http://www.springer.com/10710 - http://gpemjournal.blogspot.com/

Re: Enhanced Primitive Support Meikel Brandmeyer (kotarak) 6/21/10 7:21 AM
Hi,

On Jun 21, 4:04 pm, Lee Spector <lspec...@hampshire.edu> wrote:

> I guess this a somewhat unrelated newbie question: aren't there often cases like this where one wants to temporarily rebind something, but across all children threads? Is there some straightforward way to do this that I've missed?

See: http://richhickey.github.com/clojure/clojure.core-api.html#clojure.core/bound-fn

Sincerely
Meikel
Re: Enhanced Primitive Support Rich Hickey 6/21/10 7:52 AM

On Jun 20, 2010, at 12:57 PM, Luke VanderHart wrote:

> I've been reading this thread, and there's good arguments being made
> both ways - I've just been reading with interest. But after seeing the
> factorial function that won't compile without hints/casts, I feel I
> have to speak up.
>
> I wrote a book on Clojure. It's a short book. That's because Clojure
> is simple. It's dynamically typed, and everything "just works." In the
> current version of Clojure, you can use numbers all day long without
> worrying about types. Type hinting and casting is an advanced topic -
> I think it should stay that way.
>
> It seems evident that some sort of compromise will be made on this
> question. That's fine. I genuinely could go either way. But whatever
> combination of features makes it into the final decision, they should
> absolutely, vehemently not *require* knowing how to type hint or cast
> just to get something basic to compile (as in the factorial example).
> That is antithetical to the whole concept of dynamic typing. Having
> exceptions thrown on overflow is ok with me - that's something
> programmers are used to putting up with, I think, as long as the rules
> are clear. I don't care what the rules are, as long as they're simple
> to understand and based on a few axioms. If integers are bounded,
> that's fine, if they automatically promote, that's fine, as long as
> the rules are clear. If there's a separate set of math functions which
> do or don't promote, that's fine.   But don't, please don't, do


> anything that would mean I'd have to explain the intricacies of
> primitives, boxing, hinting and casting in an "Intro to Clojure"
> course. As much as humanely possible, that should be reserved for the
> "Performance coding in Clojure" sequel.
>

I've added the speculative analysis required to detect when recur  
arguments fail to match the type of primitive loop locals, and  
recompile the loop with those loop args boxed. When *warn-on-
reflection* is true it will issue a report that this is happening and  
why.

This code is in commit 0df995dc6d31a9f4d0fe199bc63c4abfac7c86b1 on the  
equal branch:

http://github.com/richhickey/clojure/commits/equal

Rich

Re: Enhanced Primitive Support Heinz N. Gies 6/21/10 7:55 AM

On Jun 21, 2010, at 16:52 , Rich Hickey wrote:

> I've added the speculative analysis required to detect when recur arguments fail to match the type of primitive loop locals, and recompile the loop with those loop args boxed. When *warn-on-reflection* is true it will issue a report that this is happening and why.


>
> This code is in commit 0df995dc6d31a9f4d0fe199bc63c4abfac7c86b1 on the equal branch:
>
> http://github.com/richhickey/clojure/commits/equal
>
> Rich
Awsome!!!! :)

Regards,
Heinz

Re: Enhanced Primitive Support ka 6/21/10 8:04 AM
> My favorite option of those proposed is:
> +, *, -, inc, dec auto-promote.
> loop/recur is changed so that primitive bindings are boxed.
> +',*',-',inc',dec' give error upon overflow.
> A new construct, loop', is introduced that doesn't box primitives and
> enforces recur with primitive upon penalty of error.

+ 1.  This seems the least of all evils.  Think 15 years down the line
when we have 1024 procs to program on.

Is it time for a vote? :)
Re: Enhanced Primitive Support David Nolen 6/21/10 9:32 AM
On Mon, Jun 21, 2010 at 10:52 AM, Rich Hickey <richh...@gmail.com> wrote:
I've added the speculative analysis required to detect when recur arguments fail to match the type of primitive loop locals, and recompile the loop with those loop args boxed. When *warn-on-reflection* is true it will issue a report that this is happening and why.


This code is in commit 0df995dc6d31a9f4d0fe199bc63c4abfac7c86b1 on the equal branch:

http://github.com/richhickey/clojure/commits/equal

Rich

Brilliant! 
Re: Enhanced Primitive Support Michał Marczyk 6/21/10 10:41 AM
On 21 June 2010 16:52, Rich Hickey <richh...@gmail.com> wrote:
> I've added the speculative analysis required to detect when recur arguments
> fail to match the type of primitive loop locals, and recompile the loop with
> those loop args boxed. When *warn-on-reflection* is true it will issue a

> report that this is happening and why.

Ah, this is absolutely fantastic!

I find my general stance on the issue matches Mark's summary pretty
much exactly and to be totally blissfully happy, I'd prefer the primes
to be moved over to the non-promoting ops (which I realise would cause
autoboxing to appear wherever any of the non-primed ops appear -- but
now there's the warning and thus no inexplicable slowdowns). In fact,
my understanding is that any locals participating in arithmetic which
directly or indirectly (i.e. because of interaction with other locals)
involves an unhinted parameter to the function the local is declared
in will now be autoboxed; does using the non-promoting operators by
default have any intrinsic benefits in this situation?

Note that after this commit I think I'm guaranteed to be extremely
happy about Clojure's new arithmetic. Thanks!

Sincerely,
Michał

Re: Enhanced Primitive Support Daniel 6/21/10 11:16 AM
If this is half the genius it sounds like then bravo! :D
Re: Enhanced Primitive Support Daniel 6/21/10 11:24 AM
user=> (set! *warn-on-reflection* true)
true
user=> (defn fac [n] (loop [n n r 1] (if (= n 1) r (recur (dec' n) (*'
r n)))))
NO_SOURCE_FILE:5 recur arg for primitive local: r is not matching
primitive, had: Object, needed: long
Auto-boxing loop arg: r
#'user/fac
user=> (fac 40)
815915283247897734345611269596115894272000000000N


Yes!!  Genius!  :D
Re: Enhanced Primitive Support Richard Newman 6/21/10 9:22 PM
> user=> (defn fac [n] (loop [n n r 1] (if (= n 1) r (recur (dec' n) (*'
> r n)))))
> NO_SOURCE_FILE:5 recur arg for primitive local: r is not matching
> primitive, had: Object, needed: long
> Auto-boxing loop arg: r
> #'user/fac
> user=> (fac 40)
> 815915283247897734345611269596115894272000000000N

Might I suggest that this message be augmented to include the  
recommended course of action, or at least a pointer to docs?

I can easily imagine a stream of confused people a year or so from now  
wondering how they can either (a) get the recur arg to match the  
primitive local, or (b) get a non-primitive local.

Re: Enhanced Primitive Support puzzler 6/21/10 9:44 PM
The new uber-loop is fantastic.

So I guess the main point still to be finalized is whether the default
arithmetic ops will auto-promote or error when long addition
overflows.

Playing around with the latest equals branch:

user=> (def n 9223372036854775810)
#'user/n

user=> (* (/ n 3) 3)
9223372036854775810N

user=> (* (/ n 2) 2)
java.lang.ArithmeticException: integer overflow

user=> (def x (/ n 4))
#'user/x

user=> (+ x x x x)
9223372036854775810N

user=> (+ (+ x x) (+ x x))
java.lang.ArithmeticException: integer overflow

user=> (range (- n 2) n)
(9223372036854775808N 9223372036854775809N)

user=> (range (- n 3) n)
java.lang.ArithmeticException: integer overflow


I understand exactly why some of these work and some of these don't.
My main point here is to illustrate that without the full numeric
tower supported by the default ops, there can certainly be some
surprises.  There is a "pathway" with the standard ops from longs to
rational numbers to bigints, but you can't cross directly from longs
to bigints.  Similarly, you can roundtrip from bigints to rationals
and back, but can't roundtrip from bigints to longs and back.  So the
results of computations depends on the path you follow.  Maybe we can
live with these idiosyncrasies for the speed benefits, but this is
worth being aware of.

The range example is something that is easily enough fixed.  Probably
if this branch becomes the standard, range should be modified so that
if the *upper bound* is a bigint, inc' is used rather than inc to
generate the range.  But I think this illustrates the kinds of issues
we're headed towards, regardless of which default is chosen -- people
who write libraries will be choosing between overflow and auto-boxing
primitives, and it might not always be clear from documentation what
the consequences are.  In the current implementation of range, it
works perfectly fine with longs, and it works perfectly fine with
bigints, but it breaks when your lower and upper bounds cross the
boundary.  This is exactly the kind of thing that might not be thought
of when making test cases, so errors like this could lurk for quite a
while without being spotted.

The people on the side of overflow-error-as-default feel that these
sorts of runtime errors are no more problematic than the many other
sorts of runtime errors that can result in Clojure, such as an
out-of-bounds exception when accessing a vector.  But I see these
types of errors as very different.  An out-of-bounds exception is easy
enough to prevent -- there is a simple test you can include in your
code to make sure your index is in bounds before you access your
vector.  But I think it's much harder to determine in advance whether
a sequence of computations will "cross the long boundary" for all the
possible inputs.

This is probably the main reason I continue to advocate for
auto-promoting ops as the default.  Error-upon-overflow adds an
element of run-time risk, and requires careful thought and additional
testing to achieve the same level of reliability.  I *want*
error-upon-overflow operations to be slightly harder to use so that
library writers will use them judiciously and consciously, and be very
aware of the extra effort they need to go to test their functions for
all numbers, clearly documenting any restrictions on the kinds of
numbers that are permitted.

Like I said before, Clojure's built-in range can easily be adjusted to
work well for speed *and* handle both longs and bigints gracefully.
But it serves as a good example of how the two defaults will affect
library writers.  If auto-promotion is the default, most library
writers will just use the +,*,-,inc,dec operators and it would work
for all numbers right out of the box.  A library writer who wants to
optimize for speed would have to go to a bit of extra effort to add
the apostrophes, and would hopefully at that point give some careful
thought as to what the consequences will be, catching the fact that
this will break ranges that span from longs to bigints, and adjusting
the code accordingly.  On the other hand, if overflow-on-error is the
default, this is what most people will use, and we'll end up with a
lot of code that breaks when crossing the long boundary.

I don't use any bigints, or anything even close to overflowing a long,
in the kind of code that I write for work.  If error-upon-overflow
wins as the default, I'll gain performance benefits with no immediate
downside.  But ultimately, I feel that anything that helps me reason
about and trust my code, and helps me trust the robustness of code
written by others that I rely upon, is a principle worth fighting for.

Re: Enhanced Primitive Support David Nolen 6/21/10 10:47 PM
If you're going to give example of what the current branch is like, at least write the code how a primitive Clojurian would write it:

(let [n (long 9223372036854775810)]
  (* (/ n 3) 3))
; Value out of range for long: 9223372036854775810

(let [n (long 9223372036854775810)]
  (* (/ n 2) 2))
; Value out of range for long: 9223372036854775810

(let [n 9223372036854775810
      x (long (/ n 4))]
  (-> x (+ x) (+ x) (+ x)))
; integer overflow

No need to continue, a primitive Clojurian would have known that number is too large to fit into a long anyway.

You also fail to show that in the equals branch if you are an auto-promoting Clojurian using the right operators everything is just dandy:

(def n 9223372036854775810)
(*' (/ n 3) 3)
(*' (/ n 2) 2)
(def x (/ n 4))
(+' x x x x)
(+' (+' x x) (+' x x))

No errors, except for the case of range which internally uses +. Clearly a range' is in order. This behavior should satisfy any Eulerian enthusiast.

Which brings us to your final point. Is the divide between primitive and boxed numbers something to worry about? Rich will be the final say of that.

Yet consider this, If I'm writing OpenGL code in Penumbra I will have quite a bit of code that amounts to the following:

; 630 msecs
(dotimes [_ 10]
  (time
   (dotimes [_ 1000000000]
     (+ 1 2))))

Yes. With Rich's primitive work we can get to *1 billion arithmetic operations* in < 2/3 of a second on commodity hardware.

; 10662.927 msecs
(dotimes [_ 10]
  (time
   (dotimes [_ 1000000000]
     (+ (Long. 1) (Long. 2)))))

; 46456.174 msecs
(dotimes [_ 1]
  (time
   (dotimes [_ 1000000000]
     (+' 1N 2N))))

I just don't see the two worlds interacting very much. They have different goals. I'm not going to call out into some generic library in the middle of my 999,999,999 iteration over a bunch of primitives.

David
Re: Enhanced Primitive Support Garth Sheldon-Coulson 6/21/10 10:51 PM
To me, numbers are an abstraction, just as sequences are an abstraction.

Because Clojure treats sequences as abstractions, I can say (partition 3 (set '(1 2 3 4 5 6))) without Clojure complaining that sets are not the kind of thing that can be efficiently partitioned. Sets are seqable, so they can be partitioned, even if calling seq on them takes a little bit of time. If I want more speed, I can change my algorithm and use subvec with a vector instead.

Likewise, integers are the kind of thing that can be incremented. My life is better if I can say (inc 9223372036854775807) without worrying about Clojure complaining that 9223372036854775807s are not the kind of thing that can be incremented efficiently. If I want more speed, I can put a little thought into it and use inc' and range checks instead. No biggie.

So I'm in complete agreement with Mark. The hidden hang-ups of primitive math are not made up for by a bit of extra speed, in my opinion, especially if really nice primitive operations are just an apostrophe away. Math is a place where I'd like nice abstractions to be the default.

Who knows -- maybe one day Clojure's math abstractions will get some symbolic capacity ;-).

Thanks, Mark, for a really helpful e-mail. It helped me see some of the risks concretely that were largely abstract until now. Thanks to everyone else, too. I'm learning a lot about what's important to people and why.


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Meikel Brandmeyer (kotarak) 6/22/10 12:16 AM
Hi,

On Jun 22, 7:47 am, David Nolen <dnolen.li...@gmail.com> wrote:

> Clearly a range' is in order.

Eh? Why? The following is probably completely hypothetical syntax,
because I'm not familiar with the changes in the numeric branches. But
I think, it's easy to get my point.

(defn range
  ([^long lower ^long upper] work with ints here)
  ([lower upper] work with boxed numbers here))

As long as the limits fit into a long, no overflow can happen. So we
can safely do fast primitive math inside. As soon as one of the limits
is a boxed number we do the usual stuff. So one can use range
transparently and still get fast results if provided with primitive
input.

Sincerely
Meikel
Re: Enhanced Primitive Support Heinz N. Gies 6/22/10 3:04 AM

On Jun 22, 2010, at 7:47 , David Nolen wrote:

> If you're going to give example of what the current branch is like, at least write the code how a primitive Clojurian would write it:
> ...
I feel his examples were on purpose not given as a 'primitive' clojurian or a 'boxing' one would write them but how a random one who has not thought about this would write them. And he has a good point most libraries and codes out in reality and not discussion club (not meaning I don't enjoy it a lot :P) is written by those and might very well run into this problems.

> Yes. With Rich's primitive work we can get to *1 billion arithmetic operations* in < 2/3 of a second on commodity hardware.

Which is absolutely great since I always wanted to do that :P <sarcasm/>, meaning the example is kind of far fetched even compared to fact (which is working code with useful results).

I actually think Mark makes a good point, if optimization compromises results it should be extra effort (think gcc's flag for not initialized integers for example) or unchecked math as it exists now. It might sound silly but there are some smart people out there who are smart but not experienced, examples as Mark gave above might be mistakes in their libraries someday in the feature :).

Regards,
Heinz

Re: Enhanced Primitive Support Heinz N. Gies 6/22/10 3:47 AM
I'd like to ask something, is there an (open source) project that really got a speed boost from the move to primitive +,- & Co. and is testable / benchmarkable fairly easy? I sadly don't have any of those :( to do it all on my own.

I'm not asking to mock, what I'd like to do (for myself) is look how the impact actually (in the real world) is and how much the effort would be to do it by hand. On the matter of speed I've just seen abstract claims by now so I'd love to test it myself.

I think a real test might help a lot to settle this matter (in one or the other direction) :)

What I plan to do is:

1) run code in equal branch and take time).

2) replace all +,-,*,/ ... with the ' equivalents to 'slow it down'

3) run code again and take another timing.

4) See how long it takes me (without in depth knowledge of the code so I've an disadvantage here) to figure out where the sweet spots are that need the non ' variant to get back speed that is close to the run with all primitives.

Why? Because I argue for the + be boxing and say 'it should be fairly simple to identify the sweet spots' and I want to see if I am right or wrong or rather how simple it is and I think this will be a very interesting case study for our current discussion.

What I don't need are projects/problems that are especially designed for this discussions (either the one or the other way) since that would be defeating the point.

Regards,
Heinz

Re: Enhanced Primitive Support Luc 6/22/10 4:45 AM

++1 for auto promotion, average code
is complex enough, lets not add
more incertainty with overflow
exceptions.

Data changes overtime and
discovering it in production with
an overflow exception is less than
elegant.

Degraded performance is much more
acceptable in production than a crash
over an unhandled exception ....

Again, if performance is critical for
a library or an application then it's up
to its designer to take action.

For the other folks, they will get
the best implementation according
to the number size they crunch in
their data. No need for them to pay
the price for control they do not
need yet.

Luc P

Sent from my iPod

On 2010-06-22, at 12:44 AM, Mark Engelberg <mark.en...@gmail.com>  
wrote:

> The new uber-loop is fantastic.

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient  
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Nicolas Oury 6/22/10 4:53 AM
On Tue, Jun 22, 2010 at 12:45 PM, Luc Préfontaine <lprefo...@softaddicts.ca> wrote:

Data changes overtime and
discovering it in production with
an overflow exception is less than
elegant.

Degraded performance is much more
acceptable in production than a crash
over an unhandled exception ....

Depends in which production.
That's why I defend a way to choose default.


Re: Enhanced Primitive Support Tim Daly 6/22/10 5:15 AM
Boxing and unboxing can be very confusing.
The "rules" need to be clearly stated somewhere.


It might help if we introduced an explicit syntax,
which might compile away, such as:

(box i)
(unbox i)
(with-unboxed [i 7] ...)

etc. I know Rich doesn't like "syntactic overhead"
but source languages are intended to communicate with
people, not machines, so it is vital to be clear.

Even Common Lisp struggles with this issue when people
"leave the ranch" and use ffi.
(I know most people here are not common lispers and some
do not know the "ffi" (foreign function interface) but
think of it as calling Java code from Clojure)

In the SBCL development list I just got this comment
(by Tobias Rittweiler)
======================================================

In

(declaim (inline call-with-ptr*))
(defun call-with-ptr* (vector function arg1 arg2)
  (cffi-sys:with-pointer-to-vector-data (ptr vector)
    (funcall function ptr arg1 arg2)))

(defun test* (vector i j)
  (declare (optimize speed))
  (flet ((frob (ptr i j)
           (+ (sb-sys:sap-ref-8 ptr i)
              (sb-sys:sap-ref-8 ptr j))))
    (declare (inline frob))
    (call-with-ptr* vector #'frob i j)))

(defun test** (vector i j)
  (declare (optimize speed))
  (flet ((frob (ptr i j)
           (+ (sb-sys:sap-ref-8 ptr i)
              (sb-sys:sap-ref-8 ptr j))))
    (cffi-sys:with-pointer-to-vector-data (ptr vector)
      (funcall #'frob ptr i j))))

I get a note about boxing up a SAP in TEST* but not in TEST**.

Well, why?

My actually use case involves APPLY & rest lists like:

(declaim (inline call-with-ptr))
(defun call-with-ptr (vector function &rest args)
  (cffi-sys:with-pointer-to-vector-data (ptr vector)
    (apply function ptr args)))

but it seems that sbcl does not even optimize (APPLY F (LIST X Y))
into (FUNCALL F X Y). What would it take to optimize this to the
same code as TEST** compiles down too?

==========================================================

Re: Enhanced Primitive Support David Nolen 6/22/10 5:27 AM
On Tue, Jun 22, 2010 at 6:04 AM, Heinz N. Gies <he...@licenser.net> wrote:

> Yes. With Rich's primitive work we can get to *1 billion arithmetic operations* in < 2/3 of a second on commodity hardware.

Which is absolutely great since I always wanted to do that :P <sarcasm/>, meaning the example is kind of far fetched even compared to fact (which is working code with useful results).

Perhaps you aren't interested in using Clojure for graphics or audio work.

David
Re: Enhanced Primitive Support Luc 6/22/10 5:31 AM
If the maintenance cost of 
keeping both options in the Clojure run
 time is reasonnable it's fine with me.
Rich is the best person to answer this.

Otherwise, auto promotion should be
the winner... And the default ...

The vast majority of apps these days
are not gravitating around high 
performance numeric computations.

Forcing people to deal with this in mundane code would be insane
in the long run.

Reminds me of C in the early days when we needed to keep and eye on
what was a byte, a short, an int 
or a long on the different hardware platforms...

If we want Clojure to remain a viable
option lets keep some usage simplicity
a primary goal.

When I want to scratch every bit of
performance, I expect to kick my butt
out to get it but after I have some
working code running, not while I scratch my head about getting it done.

Proposals up to know regarding 
specific operators for various
implementations provide plenty of
options to attain the nirvana of
high performance.  That's good.

Lets just make things easy for the
average guy.., 

Luc P

Sent from my iPod

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
Re: Enhanced Primitive Support Dan 6/22/10 6:00 AM
Lets just make things easy for the
average guy.., 

If we base the decision on the average guy not writing high performance numeric apps, then we should also base it on the fact that he does not need more than a long in 99% of cases either as Rich points out. And longs are a much simpler concept to grok. 
Re: Enhanced Primitive Support Nicolas Oury 6/22/10 6:34 AM


On Tue, Jun 22, 2010 at 2:00 PM, Daniel Gagnon <redal...@gmail.com> wrote:
If we base the decision on the average guy not writing high performance numeric apps, then we should also base it on the fact that he does not need more than a long in 99% of cases either as Rich points out. And longs are a much simpler concept to grok. 
 
+1. The average guy understands that "big numbers" are something special. The average guy also want to have close to native speed when he first tries a language, not * 10 or * 20 that.

Re: Enhanced Primitive Support Joost 6/22/10 6:42 AM
My work depends. I've written some audio code in different languages
and that needs to be fast. But writing fast math-heavy code is not
just a case of "no auto-promotion" or "primitives only". You'll have
to be very careful about float <-> int conversions too (so, for
instance, you want to multiply floats only or ints only wherever
possible) and you also need to be very careful about how your
algorithm is structured.

My point is, I expect all code that isn't written with performance in
mind to be slow. But for almost all non-audio work, I really don't
care all that much, as long as I can get better performance in the few
pieces of code that need it.

For that reason, I really dislike having to use the ugly *' variants
to get the "correct" math ops. To me, a 10 to 20% "slowdown" (or
rather, less accelleration compared to the current 1.1 branch) is
worth it if it guarantees that "unbounded" data can still be handled
by naive code, and it encourages use of "correct" math in code where
speed isn't that important.

Really speed-intensive code will need to be documented as using
specific input types. I.e. for audio code, you generally use floats or
doubles for all data, and any deviation from that will cause problems.
For this kind of code, there is no reason at all to support extra-wide
numbers. But for most code, I expect to be able to just cram in a
bunch of "untyped" numbers and get the correct answer even if it's a
bit slower.

I do see the argument that ''longs and doubles should be enough for
everybody'' and it's generally true. But making them the default will
mean that in the longer run you won't be able to use your bigints with
any library, since they'll all default to using the exception-throwing
math.

I may have misinterpreted the current proposals somewhere, so please
correct me if I'm wrong. Keeping up with this thread is kinda hard.

Just my 2 cnts.
Joost.
Re: Enhanced Primitive Support Luc 6/22/10 6:56 AM
I never said we should not provides ways to tune code and get higher throughput
or arbitrary sized numbers. There has to be a path to satisfies those needs.
The stuff that was proposed on this thread (specialized ops, ...) should allow
people to tune their code if they require to do so.

I said that the default approach had to be simple and should not be made
solely on the basis of numeric intensive apps.

If Rich finds auto promotion too costly to implement (and maintain) versus
using longs that's reasonable, (he's the best placed to take that decision),
then lets use longs and throw overflow exceptions.

The default will still remain simple, if your numbers don't fit in a long it
will bump. Anyway the long size as we know it will increase in size in the
future so it may become a mood point.

Auto promotion if doable can just enlarge the number of apps not having to care
about huge number limits.

Simplicity is not equivalent to a lack of sophisticated approaches when
required.

Caring about which numeric implementation to choose is not my first concern
when it comes to design, it comes later if it ever comes up as an issue.
I saw such needs in 30 years a number of times mostly in
real time systems with limited hardware resources. Not a common
need in apps in general these days.

Today numeric performance is needed by some specific and narrow
needs (heavy graphics, ,,,). If you really want max. numeric performance,
work on a bridge to submit computations to your graphic card.... there will
be greater payoffs than staying within the JVM.

Luc P.

Daniel Gagnon <redal...@gmail.com> wrote ..

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first
> post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Lee 6/22/10 7:25 AM

All of this talk about the "average guy" is ungrounded. Average over what set and how are you getting the data? It may (or may not) be true that that the average *programmer* understands that big numbers (how big?) are something special, but IMHO that (if true) would probably be because the average programmer was first exposed to bad languages.

 -Lee

On Jun 22, 2010, at 9:34 AM, Nicolas Oury wrote:

> +1. The average guy understands that "big numbers" are something special. The average guy also want to have close to native speed when he first tries a language, not * 10 or * 20 that.
>

--


Lee Spector, Professor of Computer Science
School of Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspe...@hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438

Check out Genetic Programming and Evolvable Machines:
http://www.springer.com/10710 - http://gpemjournal.blogspot.com/

Re: Enhanced Primitive Support Rich Hickey 6/22/10 7:46 AM

It is precisely for these reasons that the first iteration was based  
upon BigInteger contagion. Everyone has to understand this is not a  
black and white, preference based decision. It is a multi-dimensional  
problem involving promotion, reduction, contagion, loop types,  
literals, boxing types, interop, equality and more.

For instance, you can't keep the benefits of 'uber loop' and move the  
default to auto-promoting ops, because auto-promoting ops cannot  
return primitives. There is no free lunch in this.

It would be easy, and is likely, to have bigint literals require the N  
suffix, in that way there is no hiding the use of bigints as you have.

The issues with contagion are as you mentioned previously -  
performance with smaller numbers and equality. The former can be  
addressed with a better bigint, the latter only partially addressed by  
returning to equivalence based equality. Equivalence-based equality is  
still in play, and impacts contagion, as then (= 42 42N) can be true.  
More subtly, it impacts the choice of box type for longs-and-smaller.  
Right now they are 'packed', using Integers when they fit, but this  
will probably hurt us when escape analysis is in full swing, as it  
puts a branch in the boxing process and bifurcates the resulting types  
(Integer and Long). However, always boxing to Long (including on  
interop boundaries), while yielding highly-consistent box types, sans  
bigints, opens up a possible box type mismatch when someone in Java  
says:

someClojureFn.invoke(42); //autobox is Integer

With equivalence based equality, that is less of a problem. But there  
is still a tradeoff regarding keys, the algorithm of the host  
requiring .equals() for map keys and set members. There are also  
hashCode divergences between Longs and BigIntegers of the equivalent  
value.

BigInteger contagion would make all but the range case above work just  
fine.

-----------
The claim that this primitive stuff is just for numeric-intensive  
applications is outrageous and false, and ignores the implementation  
of Clojure itself to an embarrassing degree. I've worked my tail off  
to reduce the number of allocations inside things like sequences etc  
to the absolute minimum. Now down to 1 per step, and with chunks 1/32  
per step. Moving from 1 to 2 or 3 per step would result in a 2x to 3x  
slowdown for every consumer of these fns.

Everyone has to realize the math you are advocating for the default,  
on non-tagged architectures like the JVM and CLR, *must* do an  
allocation on every +/-/* etc operation. And such ops are littered  
throughout non-numeric data structure code, for indexes, offsets,  
bounds etc. Allocating on every math op in something like the  
persistent vector would make it completely unusably slow.

The languages being pointed to (Ruby, Python, Mathematica) write their  
hard bits in C, or, for the J versions, Java.

Well, guess what, all of the things I've written in my career have  
been hard bits. Those languages were unusable for any of my production  
work, for performance reasons. I wrote Clojure so I could stop writing  
Java and C#, not Ruby, because I couldn't have used Ruby in the first  
place.

And, I think some people are considering moving from Ruby to Clojure,  
for the hard bits of their systems, in part *because* Clojure has a  
better performance profile, their alternatives being Java or Scala,  
not Python or Groovy.

Now you're all sitting on the end of the food chain, eating gazelles  
and saying, 'who needs photosynthesis to be easy'?

;-)

I do. And so do you if you appreciate fast gazelles. You wouldn't be  
able to use a Clojure written in the default Clojure you are advocating.

Rich

Re: Enhanced Primitive Support Luc 6/22/10 8:25 AM
As the implementor Rich, it's your
call as I said :))
If auto promotion brings in chaos
then lets drop it entirely.

If people want to eat elephants they can still choose to use directly  
big number
implementations.

Luc P

Sent from my iPod

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient  
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Mike Meyer 6/22/10 8:37 AM

Yes, but we've as yet to see that having auto-promotion as the default
buys the average app *10 or *20. The best we've seen is 10-20%, which
isn't really enough to justify the reduces range for what "just
works".

      <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Chas Emerick 6/22/10 8:37 AM
There are places where "regular" folks can be helped (e.g. perhaps  
supporting contagious bigs for the normal/fast ops so that the  
potentially foreign op-prime notation can be ignored if that's  
desirable), but handicapping capability or semantics in any direction  
based on vague appeals to "average" or "regular" programmers can't  
happen.

Clojure has a core feature called, of all things, 'reify', so I think  
the cat is somewhat out of the bag in terms of placating morts.  
There's a virtuous cycle at work here: Clojure demands that you raise  
your game, and pays you back for doing so.

- Chas

Re: Enhanced Primitive Support Fogus 6/22/10 8:41 AM
One of our reviewers made a fantastic comment on the book and I
thought I would share it for public consumption:

"I know that you Lisp guys are incredibly proud of arbitrary-precision
arithmetic, but I can't help but think that it is entirely missing the
point: infinite precision is infinitely slow; no underlying math
library is going to support it (so you are stuck with the four basic
math operations); and lastly, infinite precision is almost always the
wrong approach. If you really need to evaluate the binomial
coefficient of 117-take-31 (you probably don't, to begin with), then
the right way to do this is to work with logarithms and Stirling's
approximation, not with infinite precision."

Agree with him or not, this is the view of a non-Clojure programmer
and (I think) adds a thoughtful perspective.
:f
Re: Enhanced Primitive Support Garth Sheldon-Coulson 6/22/10 8:50 AM

Everyone has to realize the math you are advocating for the default, on non-tagged architectures like the JVM and CLR, *must* do an allocation on every +/-/* etc operation. And such ops are littered throughout non-numeric data structure code, for indexes, offsets, bounds etc. Allocating on every math op in something like the persistent vector would make it completely unusably slow.

I think my understanding of the trade-offs is stuck on this question: If the auto-promoting ops were chosen as the default, why couldn't Clojure-in-Clojure use the fast ops rather than the default ops? Couldn't the implementation of PersistentVector use primitives on the inside without anyone on the outside ever knowing? (I guess the answer is no, but I'm not understanding why. Sorry for being behind the curve.)

A second question: I see why the uber-loop requires ops that return primitives, but what would be the problem with simply having a loop and a loop'? Is the problem that then we go down the road of bifurcating all functions into fast and slow?

I like easy photosynthesis and I'm all for raising my game, but I'd like a better understanding of why I can't have my gazelle in this case and eat it too.
Re: Enhanced Primitive Support Heinz N. Gies 6/22/10 8:59 AM
We always talk about auto promoting ops, is there a way, and I guess the answer will be no, to have auto demoting ops? as in they get (+ (num 1) (num 1)) and return (long 2) if the calling side desires so? As in the function promises to return a Number or better.

I know this might die on the fact that we need to head with the JVM but technically this should be possible right?

it is like caller of + says: I'd like one of those: long, double, Integer, Number (in this order) and the the functions satisfies it to it's best capabilities, here it would be long.

Not sure if that would make any sense or is even possible within the JVM but it sounds kind of cool doesn't it?

Regards,
Heinz

Re: Enhanced Primitive Support Mike Meyer 6/22/10 9:02 AM
On Mon, 21 Jun 2010 21:44:04 -0700
Mark Engelberg <mark.en...@gmail.com> wrote:

> The new uber-loop is fantastic.
>
> So I guess the main point still to be finalized is whether the default
> arithmetic ops will auto-promote or error when long addition
> overflows.

I think calling them "default" gives the wrong idea here. We have two
behaviors for the math operators, so we need two sets of names. When
the existing set auto-promoted and the "other" set worked with
primitives, then calling them <op>' was a mnemonic (PRIMe implying
PRIMitive math).

But if it's going to be the other way 'round, then the "new" names
don't work as well. I propose (again, since nobody commented on it the
first time) that the two sets of ops be:

non-autopromoting math: +, -, *, /, ++, -- (if you want to muck with
the rules for symbols, the last two could be drawn from [+-]?1[-+]?
instead).

autopromoting math: add, sub, mul, div, inc, dec

No mucking about with namespaces, no settings to toggle behaviors -
you use the set you want. When teaching the language, you can use the
words and not have to worry about explaining about primitive types,
boxing, or anything else.

While I'm at it, do we still need the postfix N on bignums? Since both
the reader and humans do the right thing even if it's not there, we
shouldn't need it for output any more, even there might be times when
it's handy for input.

     <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support puzzler 6/22/10 9:15 AM
On Tue, Jun 22, 2010 at 7:46 AM, Rich Hickey <richh...@gmail.com> wrote:
> It is precisely for these reasons that the first iteration was based upon
> BigInteger contagion.

Yeah, I still don't like BigInteger contagion, and I believe I would
vastly prefer the current equals branch, even with its idiosyncrasies,
versus BigInteger contagion.  I just think it makes things so much
simpler when there is one standard type representation for each
number.  Perhaps there are sides to this I haven't yet considered, but
based on my current understanding of the tradeoffs, that's how I feel.

> It would be easy, and is likely, to have bigint literals require the N
> suffix, in that way there is no hiding the use of bigints as you have.

One of my points was that you can create a "hidden" bigint through a
chain of non-auto-promoting operations that begin with longs, divide
to get fractions, and multiply the fractions to get a bigint.  So it
is possible to get a bigint without ever using an "N" and without ever
using one of the primed operators.

I can't believe I'm saying this, but if you're going to try to orient
Clojure around primitive math, I almost wonder if it would be cleaner
to rip out *more* pieces of the numeric tower.  Make Clojure math only
work on longs and doubles.  Make it so the default ops error if they
receive anything other than longs or doubles as inputs, or would
produce anything other than longs or doubles as the output.  Make it
so only the "primed" operators will do math with bigints, rationals,
or bigdecimals.

In my own code, numbers are generally generated by range, and then end
up getting stored in and retrieved from collections.  I assume that
under all the scenarios on the table, numbers in collections will
still have to be boxed?  Will they be automatically converted to
primitives as math is done to them so that further computations will
reap the benefits, or is boxing contagious?

Re: Enhanced Primitive Support Tim Daly 6/22/10 10:14 AM
We could always take the FORTRAN approach and make
identifiers that start with I through N default to
contagious behavior :-)
Re: Enhanced Primitive Support Mike Meyer 6/22/10 10:43 AM
On Tue, 22 Jun 2010 10:46:05 -0400
Rich Hickey <richh...@gmail.com> wrote:
> -----------
> The claim that this primitive stuff is just for numeric-intensive  
> applications is outrageous and false, and ignores the implementation  
> of Clojure itself to an embarrassing degree. I've worked my tail off  
> to reduce the number of allocations inside things like sequences etc  
> to the absolute minimum. Now down to 1 per step, and with chunks 1/32  
> per step. Moving from 1 to 2 or 3 per step would result in a 2x to 3x  
> slowdown for every consumer of these fns.

Actually, I've acknowledged that for a while. What I've been trying to
get (and that git clone from github fails for me has made hard to
measure myself) is some measurements of how much this costs real - but
non-numeric - applications. The one measurement posted was - IIRC -
10-20%.

> Everyone has to realize the math you are advocating for the default,  
> on non-tagged architectures like the JVM and CLR, *must* do an  
> allocation on every +/-/* etc operation. And such ops are littered  
> throughout non-numeric data structure code, for indexes, offsets,  
> bounds etc. Allocating on every math op in something like the  
> persistent vector would make it completely unusably slow.

I disagree with that first statement, and will point to python as the
counterexample. Like Clojure, it has immutable integers. It avoids the
issue of having to allocate for those ops some of the time by keeping
a table of the first N (tunable at interpreter build time) integers
that *all* instances of small integers point to. So 5 + 5 doesn't
allocate a new object, it returns a pointer to the pre-allocated 10 in
that table.

> Well, guess what, all of the things I've written in my career have  
> been hard bits. Those languages were unusable for any of my production  
> work, for performance reasons. I wrote Clojure so I could stop writing  
> Java and C#, not Ruby, because I couldn't have used Ruby in the first  
> place.

Which merely means that the language you're writing may not be the
language I want. No problem. Personally, given that the default is 64
bits, it's not nearly the issue it was when Python changed (when most
of the world used 32 bit ints). I just don't want to get stuck having
to use obviously second class names (or having to put up with errors
about "rebinding core functions") to get the math I want.

> Now you're all sitting on the end of the food chain, eating gazelles  
> and saying, 'who needs photosynthesis to be easy'?
>
> ;-)
>
> I do. And so do you if you appreciate fast gazelles. You wouldn't be  
> able to use a Clojure written in the default Clojure you are advocating.

Would it be Clojure if it didn't run on the JVM? Personally, I could
live without the JVM. And the more I learn about the JVM, the more I
could live without it!

On the other hand, the PyPy folks don't seem to be making much noise
either.

      <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Nicolas Oury 6/22/10 10:56 AM


On Tue, Jun 22, 2010 at 6:43 PM, Mike Meyer <mwm-keyword-googlegroups.6209c5@mired.org> wrote:

> Everyone has to realize the math you are advocating for the default,
> on non-tagged architectures like the JVM and CLR, *must* do an
> allocation on every +/-/* etc operation. And such ops are littered
> throughout non-numeric data structure code, for indexes, offsets,
> bounds etc. Allocating on every math op in something like the
> persistent vector would make it completely unusably slow.

I disagree with that first statement, and will point to python as the
counterexample. Like Clojure, it has immutable integers. It avoids the
issue of having to allocate for those ops some of the time by keeping
a table of the first N (tunable at interpreter build time) integers
that *all* instances of small integers point to. So 5 + 5 doesn't
allocate a new object, it returns a pointer to the pre-allocated 10 in
that table.

The JVM can do that for you, if you want. It may have issues with concurrency though, depending of the way it is implemented.
However, it does not solve the problem of performance of boxed nums, though.
1. It is only helps for small numbers
2.It only help GC. And GC is not the main issue, imho. 
(pointer dereference, including cache misses, and filling the cache with garbage is more annoying than GCs, imho. But I am not an expert).
Re: Enhanced Primitive Support Mike Meyer 6/22/10 11:06 AM
On Tue, 22 Jun 2010 18:56:30 +0100
Nicolas Oury <nicola...@gmail.com> wrote:

> On Tue, Jun 22, 2010 at 6:43 PM, Mike Meyer <
> mwm-keyword-googlegroups.6209c5@mired.org> wrote:
>
> >
> > > Everyone has to realize the math you are advocating for the default,
> > > on non-tagged architectures like the JVM and CLR, *must* do an
> > > allocation on every +/-/* etc operation. And such ops are littered
> > > throughout non-numeric data structure code, for indexes, offsets,
> > > bounds etc. Allocating on every math op in something like the
> > > persistent vector would make it completely unusably slow.
> >
> > I disagree with that first statement, and will point to python as the
> > counterexample. Like Clojure, it has immutable integers. It avoids the
> > issue of having to allocate for those ops some of the time by keeping
> > a table of the first N (tunable at interpreter build time) integers
> > that *all* instances of small integers point to. So 5 + 5 doesn't
> > allocate a new object, it returns a pointer to the pre-allocated 10 in
> > that table.
>
> The JVM can do that for you, if you want. It may have issues with
> concurrency though, depending of the way it is implemented.
> However, it does not solve the problem of performance of boxed nums, though.
> 1. It is only helps for small numbers

Don't I recall someone saying that small numbers are the most common
ones? *Especially* if you're doing things like vector arithmetic.

> 2.It only help GC. And GC is not the main issue, imho.
> (pointer dereference, including cache misses, and filling the cache with
> garbage is more annoying than GCs, imho. But I am not an expert).

Um, it helps avoid some GC by *not doing an allocation*. According to
what Rich said, doing the allocation is the performance problem - at
least in the internals of clojure. Personally, I believe what Rich
wrote. Not only does he have more experience than anyone else with the
insides of clojure, but this matches my experience with adding GO-FAST
to LISP systems.

   <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Sean Corfield 6/22/10 11:39 AM
On Tue, Jun 22, 2010 at 10:43 AM, Mike Meyer
<mwm-keyword-googlegroups.6209c5@mired.org> wrote:
> Would it be Clojure if it didn't run on the JVM? Personally, I could
> live without the JVM. And the more I learn about the JVM, the more I
> could live without it!

For my work, languages are only useful if they run on the JVM *and*
can interoperate so I can mix libraries written in multiple languages
and choose the "best language" for each part of the system. It's why
I've been mixing Groovy and Scala into projects over the last 2-3
years and it's why I want to be able to mix Clojure in as well. It's
why people bothered to create JRuby and Jython etc. It's why I didn't
pursue Erlang (although I'm keeping an eye on Erjang...).

We're all different and have different needs in languages.
--
Sean A Corfield -- (904) 302-SEAN
Railo Technologies, Inc. -- http://getrailo.com/
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

Re: Enhanced Primitive Support Mac 6/22/10 11:56 AM
On Tue, Jun 22, 2010 at 1:43 PM, Mike Meyer
<mwm-keyword-googlegroups.6209c5@mired.org> wrote:
> Would it be Clojure if it didn't run on the JVM? Personally, I could
> live without the JVM. And the more I learn about the JVM, the more I
> could live without it!

I just want to add my two cents here. I don't think we would've ever used
clojure if it wasn't on the JVM. Also, I think JVM plays a bit part
of the current good press clojure is getting.

--
Omnem crede diem tibi diluxisse supremum.

Re: Enhanced Primitive Support Mikera 6/22/10 10:13 AM
Most of my code does fairly intensive graphics work. I'd love to be
able to write
it all in Clojure but currently (sadly) it is often easier for me to
write the processing
code in Java and use on the Java interop to call it.

Also if anyone is interested in real-time performance (games,
animation etc.) then
lots of small boxing allocations can be unhelpful in terms of
increasing the number
and severity of GC pauses - to some extent that's even worse than the
overall
performance hit.

Aside from my personal utility and preferences (which are compelling
for me!)
I'm in the "primitive by default" camp for a few reasons:

1. The "pay for what you use" argument is very convincing
2. It's more conceptually close to what Java/C# developers are used to
so it helps
 with the learning curve
3. Primitive by default is going to work much better for non-
mathematical uses
of numbers (array indexes, simple loops, counts of collection sizes
etc. ) which
I believe is the common case rather than heavy-duty mathematics with
huge integer ranges
4. It minimises the risk of a "Clojure is slow" reputation developing.
Which you are
likely to get if people start comparing micro-benchmarks of non-hinted
code against
languages with static / primitive support.
Re: Enhanced Primitive Support Garth Sheldon-Coulson 6/22/10 6:07 PM
I'm just one voice on the side that's been advocating for auto-promotion by default. For what it's worth, I completely see where the other side is coming from, and if Rich sees primitive math by default as important to Clojure's future, then so be it. It sounds like he's given it a lot of though and has a pretty strongly held opinion at this point. That is, unless he's just playing devil's advocate.

I would, however, like to throw some support behind Mike Meyer's suggestion that the arbitrary precision numeric tower use names like add, sub, mul, div, inc, dec, while the "default" ops keep the symbolic names. I think it was Luke who suggested, and I agree, that it would be unfortunate to have to teach new Clojurians about machine math just to allow them to understand Clojure math. For new users, and also for people like me, it would be nice to having names for the arbitrary precision numeric tower that don't sound subordinated or second-class.

I also think Mark's idea of removing support for bigints and bigdecimals from the default ops is worth discussing. It seems cleaner to me than overloading the default ops and thereby making people complacent.

Garth


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Enhanced Primitive Support Mike Meyer 6/22/10 7:20 PM
On Tue, 22 Jun 2010 21:07:39 -0400
Garth Sheldon-Coulson <ga...@MIT.EDU> wrote:

> I would, however, like to throw some support behind Mike Meyer's suggestion
> that the arbitrary precision numeric tower use names like add, sub, mul,
> div, inc, dec, while the "default" ops keep the symbolic names. I think it
> was Luke who suggested, and I agree, that it would be unfortunate to have to
> teach new Clojurians about machine math just to allow them to understand
> Clojure math. For new users, and also for people like me, it would be nice
> to having names for the arbitrary precision numeric tower that don't sound
> subordinated or second-class.

That's the point.

> I also think Mark's idea of removing support for bigints and bigdecimals
> from the default ops is worth discussing. It seems cleaner to me than
> overloading the default ops and thereby making people complacent.

Did you leave ratios out of the list of things it should support?

I'm not so sure about this one. With things as they are, if I pass a
bigint to a library routine that does arithmetic on it, it will just
work. Yes, if I start with a boxed int and it overflows, I still lose,
but we can reduce the lossage a little here. We might even be able to
tweak things to make it even better.

       <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Re: Enhanced Primitive Support Heinz N. Gies 6/23/10 2:20 AM
Other math functions as max, min and so on should behave the same as +, - I think it is kind of confusing if you've a loop with a max and have to cast it to long by hand :)

Regards, heinz.

Re: Enhanced Primitive Support Heinz N. Gies 6/23/10 4:33 AM
Hmm yesterday Rick said that the new changes give clojure a chance to put down the 'scala is faster then clojure' argument. So I figured I give it a try and implemented some of the shootouts: fannkucken-redux and the regexdna example. I found some interesting things especially once I got carried away with hoeck's question how it looks when you benchmark ideomatic clojure 1.2-master vs 1.2-equal.

As a foreword:
1) I most likely suck at optimizing so I very much welcome any suggestions and changes.
1.1) As a result of that please don't take the results I got as abseluts so they still are interesting
2) The benchmarks I did are flawed and unfair since I did not do warmup runs nor did I remove the JVM startup time, BUT they are unfair for everyone in the same manner so I hope that evens things out a bit.
3) The idiomatic code is NOT optimized at all, it is the nicest way I found to write the code, or at least the one that seemed most natural to me.


My findings:

A) Comparison to Scala
So first of all it seems clojure is still slower then scale (at least in this particular benchmark and with my very limited skills in optimization), the optimization was done but there are still things to improve and I know that but we're coming closer :)

B) The process of optimization
During the process of optimization I sadly did not feel big relieve brought by the equal branch, optimizing code feels still horrible and painful. And the code that I got looks more like java with different parenthethes then clojure  to me - I personally feel ashamed to have written it :(

C) ideomatic clojure master vs. equal
I know again this is just a one piece observeration but I am torn. Over all execution time for the fannkuchen example has degraded nearly 20% with the equal branch (using same code) yet in profiling it looked faster. I am not certain why it seems that equal adds some overhead, hoeck had the same finding when not benchmakring JVM execution speed but using criterium to benchmark and even in code benchmarking (using time) indiceates equal to be slower in this case. I'm not sure how to explain this to be frank, my guess is that some optimization hurt the list speed or that the profiler makes odd things.


Code can be found here:


screenshots of some profiling results:

http://grab.by/55po (euqual branch)

http://grab.by/55pp (master branch)

Regards,
Heinz 
Re: Enhanced Primitive Support John R. Williams 6/23/10 9:24 AM
On Sat, Jun 19, 2010 at 5:41 AM, Heinz N. Gies <he...@licenser.net> wrote:
Lets get away from the argument already cast and bring a new one. It is an impossible situation that the literal 1 in one place has a different meaning and behavior then in another place. This is in fact a show stopper, I'm serious, this would be worst then java.

(* 1 0.2) -> works fine

but

(loop [n 1 r 1](if (zero? n) r (recur (dec n) (* r 0.2))))

does not because the 1 there is not the 1 in the upper example, that is so utterly wrong and horrible please think about it and think about how you'd sell someone he has to know where a literal means what, I can guarantee you this will keep many people from this wonderful language :(.

I think you're misinterpreting what's going on here. There's no trickery with the type of 1 or r, and the expression (* r 0.2) will correctly return 0.2. The problem is that when 0.2 is passed into recur, it's coerced to a long, so it becomes 0.

I do agree that the end result is unfortunate. I would be much happier if types of loop variables were based not just on their initializers, but also on the types of the arguments to recur. This would certainly result in some needless boxing, but the compiler would still be able to use primitives in a lot of cases. In particular, it could use primitives whenever the loop variable's new value is computed using only arithmetic operators or other :static functions that return primitives. The only possible harm would come from a very particular set of circumstances: (1) the loop variable is initialized with a primitive, (2) the performance of the loop depends in the variable being primitive, (3) a the compiler can't prove that the variable is re-bound with a primitive in a recur form, and (4) the actual value used to re-bind the variable is a boxed primitive of same type as the variable's initializer. If (1) is not true, the variable would not be primitive anyway; if (2) is not true, there may be some unnecessary boxing, but it doesn't hurt anything; if (3) is not true, the compiler can use primitives just as it does now; and if (4) is not true, the compiler is correctly forced to use a boxed type for the variable, whereas today it incorrectly tries to unbox the value, resulting in either a ClassCastException or a numeric conversion that silently loses precision.
Re: Enhanced Primitive Support Heinz N. Gies 6/24/10 6:33 AM
Another thing I noticed, clojures array functions are using int's as index, so to get best performance from them you currently need to cast every counter you use to a int by hand so you have (loop [i (int 0)] ... since otherwise there will be a lot of type casting when accessing arrays. So a good course of action, in regards of this changes might be to change aget and aset to use long indices.

Regards,
Heinz

Re: Enhanced Primitive Support Mike Meyer 6/24/10 7:18 PM
[Not really about enhanced primitive support - more about optimization
 on the jvm.]

On Tue, 22 Jun 2010 01:47:26 -0400
David Nolen <dnolen...@gmail.com> wrote:
> Yet consider this, If I'm writing OpenGL code in Penumbra I will have quite
> a bit of code that amounts to the following:
>
> ; 630 msecs
> (dotimes [_ 10]
>   (time
>    (dotimes [_ 1000000000]
>      (+ 1 2))))


>
> Yes. With Rich's primitive work we can get to *1 billion arithmetic
> operations* in < 2/3 of a second on commodity hardware.

Which begs the question - why are you mucking about with SISD
algorithms if you need that kind of thing to be fast? IIUC, recent
versions of the JVM can access the SIMD instructions that have been
available on most commodity hardware for the last couple of years. Why
aren't you trying to get to those to get a serious performance boost?
Better yet, given that you're using OpenGL and presumably have real
graphics hardware available, why aren't you trying to push this stuff
out to the vector processor on the GPU where it will really fly?

    <mike
--
Mike Meyer <m...@mired.org>                http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

More topics »