The latest revision of primitive support is in the equiv branch. It
takes the approach of the num branch, with no auto-promotion and
BigInt contagion, and adds several things to better support contagion.
I think contagion is the best way to continue to support polymorphic
numeric code, something I consider important.
Contagion has several issues. First and foremost, it means that it
will be possible for 42 and 42N to be produced in the course of normal
operations, so strict type-specific equality is not a good match. The
equiv branch brings back equivalence-based =, with a slightly tighter
notion of equivalence that supports only similar categories of
numbers, so (= 42 42.0) => false, but (= 42 42N) => true. == is still
available.
The second problem is the use of numbers (and collections of numbers)
as keys in hash maps and members of hash sets. We already had an issue
here, as there wasn't completely uniform boxing, and there will always
be the possibility of numbers from the outside. The equal branch tried
to use consistent boxing and type-specific =, but it was still open to
mismatch with numbers from outside. The equiv branch extends = to keys
and set members when used from Clojure, i.e. Clojure get and contains
will use = logic, while the Java get and containsKey will
use .equals(). That is, we will still satisfy Java's semantics when
used through the Java APIs, but nothing says the Clojure API must
match those semantics. When combined with the new BigInt class (see
below), this will allow you to look up (and find!) the integer 42 with
either 42 or 42N.
The equiv branch also has a new BigInt type. This is strictly
necessary for this scheme, as Java's BigIntegers and Longs can produce
different hashCodes for the same values. In addition to matching
hashCode support, the BigInt class opens the door to better
performance when used with numbers long-or-smaller. These performance
enhancements have not yet been made.
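For illustration (this example is mine, not from the post): on the JVM, java.lang.Long and java.math.BigInteger can disagree on hashCode for the same numeric value, which is exactly what breaks hash-map lookup without a unified BigInt type. A minimal demonstration:

```java
import java.math.BigInteger;

public class HashMismatch {
    public static void main(String[] args) {
        long v = 1L << 32; // a value wider than 32 bits
        int longHash = Long.valueOf(v).hashCode();       // defined as (int)(v ^ (v >>> 32))
        int bigHash  = BigInteger.valueOf(v).hashCode(); // computed from magnitude words
        // The two hashCodes differ, so the "same" number under the two
        // representations lands in different hash buckets.
        System.out.println(longHash + " vs " + bigHash);
        System.out.println(longHash == bigHash);
    }
}
```

A BigInt class under Clojure's control can define hashCode to agree with Long for values that fit in a long, which is what makes the 42 / 42N lookup story above possible.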
More details have been added to the top here:
https://www.assembla.com/wiki/show/b4-TTcvBSr3RAZeJe5aVNr/Enhanced_Primitive_Support
Code is here:
http://github.com/richhickey/clojure/commits/equiv
Feedback welcome,
Rich
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
On Fri, Jun 25, 2010 at 1:18 PM, Garth Sheldon-Coulson <ga...@mit.edu> wrote:
> 1) If there is going to be BigInt contagion, why not BigDecimal contagion?
Doubles contaminate bigdecimals, not the other way around. That's how
it should be.
> 2) Will the same reasoning that produced BigInt give us a BigDec with a
> better hashCode() than that of BigDecimal?
You don't want the hashCode of bigdecimal to match the corresponding
double. They aren't equal, so they shouldn't have the same hashcode.
They aren't equal because one is exact and one is inexact.
Now, one interesting question is whether the hashCode of 40.0M should
match the hashCode for 40N, because they are two exact numbers that
represent the same value.
On Fri, Jun 25, 2010 at 1:35 PM, Mark Engelberg
<mark.en...@gmail.com> wrote:
> You don't want the hashCode of bigdecimal to match the corresponding
> double. They aren't equal, so they shouldn't have the same hashcode.
> They aren't equal because one is exact and one is inexact.
I should have said "they shouldn't be equal" based on Rich Hickey's
explanation that from now on (= 1 1.0) will return false. I think by
this logic (= 1.0M 1.0) should also be false. I have no idea what the
current branch actually does though -- haven't tried it yet.
>
> Now, one interesting question is whether the hashCode of 40.0M should
> match the hashCode for 40N, because they are two exact numbers that
> represent the same value.
>
The more I think about it, the more I think that big decimals are sort
of their own universe and we really wouldn't want the hashCodes of an
integral bigdecimal to match the integer hashcode. I mean, if 40.0M
and 40.00M are considered different by Java, it seems fruitless to try
to unify these with their integer counterparts.
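Mark's point about 40.0M and 40.00M can be checked directly from Java: BigDecimal's equals (and hashCode) take the scale into account, so two representations of the same numeric value are distinct objects in hashed collections (a small demonstration of mine, not from the thread):

```java
import java.math.BigDecimal;

public class ScaleMatters {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("40.0");  // unscaled value 400, scale 1
        BigDecimal b = new BigDecimal("40.00"); // unscaled value 4000, scale 2
        System.out.println(a.equals(b));         // false: equals compares scale too
        System.out.println(a.compareTo(b) == 0); // true: numerically the same value
        // Since equals is false, the two may (and in practice do) hash differently,
        // so they occupy separate slots in a HashMap or HashSet.
    }
}
```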
> I should have said "they shouldn't be equal" based on Rich Hickey's
> explanation that from now on (= 1 1.0) will return false. I think by
> this logic (= 1.0M 1.0) should also be false. I have no idea what the
> current branch actually does though -- haven't tried it yet.
Yeah, if it's technically feasible, this definitely makes the most
mathematical sense.
Are the bit-bashing operators (bit-*) going to get the same treatment
as the arithmetic operators? I would expect that many of the fields
that benefit from the latter being fast would also benefit from the
former being fast, and I know that fast algorithms in combinatorics
and crypto tend to make use of them.
<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
Has it already been decided that some form of this new numeric tower
will find its way into Clojure?
Personally, I think this change will open a can of worms and make
programming in Clojure more difficult and error prone.
I don't think there is a nice and clean solution to the numeric
problem (Clojure is not the first language tackling it) but there are
two possibilities that could produce an acceptable trade off:
- using boxed math everywhere and optimizing performance by storing
preallocated Integers in a cache,
- using both boxed and primitive math but keeping the boundary between
them as explicit as possible (different operators, no automatic
conversion etc.). Existing operators should default to boxed math (for
runtime safety and compatibility).
Andrzej
This was suggested on the previous thread on this topic as well, but
I don't think it was pointed out that *Java already does this*.
See the inner class IntegerCache at line 608:
http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/3956cdee6712/src/share/classes/java/lang/Integer.java
And for Long as well (line 543):
http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/3956cdee6712/src/share/classes/java/lang/Long.java
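The effect of that cache is visible from plain Java (my example, not from the linked source): Integer.valueOf hands out shared boxes for small values and freshly allocated ones otherwise; the default cache range is -128..127:

```java
public class BoxCache {
    public static void main(String[] args) {
        // valueOf consults the IntegerCache; values in -128..127 return shared boxes
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100)); // true: same cached box
        System.out.println(Integer.valueOf(200) == Integer.valueOf(200)); // false: two fresh boxes
    }
}
```

(The upper bound of the cache is tunable via -XX:AutoBoxCacheMax, but the default shows why caching alone doesn't help general numeric code.)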
> - using both boxed and primitive math but keeping the boundary between
> them as explicit as possible (different operators, no automatic
> conversion etc.). Existing operators should default to boxed math (for
> runtime safety and compatibility).
>
> Andrzej
>
"Nicolas Oury" <nicola...@gmail.com> wrote:
>On Sat, Jun 26, 2010 at 7:59 AM, B Smith-Mannschott
><bsmit...@gmail.com>wrote:
>
>> This was suggested on the previous thread on this topic as well, but
>> I don't think it was pointed out that *Java already does this*.
Doesn't matter if Clojure is boxing them as well. Yes, all the boxes will point to the cache, but you still have different boxes, and allocating those is the problem.
>I think I pointed it out, and I reiterate it will probably not improve
>performance a lot (Except if you use always the 5 same numbers).
Reiteration won't make it true.
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
Thank you. I wasn't aware of it.
It is not exactly what I meant, though. The above Java code
preallocates some low Integers that are likely to be encountered in
actual programs. It obviously doesn't help at all when you happen to
use other numbers.
What I'd rather like to have is an array of N preallocated objects
waiting to be assigned values and used. This way an allocation cycle
could be triggered every N Integer constructor calls and all boxes
used in a single procedure would be gathered in one place.
Andrzej
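A rough sketch of what I read Andrzej as proposing (all names here are hypothetical, and note that java.lang.Integer itself is immutable, so this only works with a custom mutable box type):

```java
public class BoxPool {
    // A hypothetical mutable box; real Integers could not be reused this way.
    static final class MutableLong {
        long value;
    }

    private final MutableLong[] pool;
    private int next;

    BoxPool(int n) {
        pool = new MutableLong[n];
        // One allocation cycle up front, every N boxings thereafter reuse these.
        for (int i = 0; i < n; i++) pool[i] = new MutableLong();
    }

    // Hand out preallocated boxes round-robin instead of allocating per call.
    MutableLong box(long v) {
        MutableLong b = pool[next];
        next = (next + 1) % pool.length;
        b.value = v;
        return b;
    }

    public static void main(String[] args) {
        BoxPool p = new BoxPool(4);
        MutableLong first = p.box(1);
        p.box(2); p.box(3); p.box(4);
        System.out.println(first == p.box(5)); // true: the pool wrapped and reused the box
    }
}
```

Of course a box can only be reused safely once no live reference to its previous value remains, which is the hard part of any such scheme.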
Re: caching boxed ints:
>>I think I pointed it out, and I reiterate it will probably not improve
>>performance a lot (Except if you use always the 5 same numbers).
> Reiteration won't make it true.
At about 10m - 12m into this video, Cliff Click suggests that Java's
caching of Integer objects might do some harm to performance because
it prevents the JIT from being able to do inlining and escape
analysis.
http://www.infoq.com/presentations/click-fast-bytecodes-funny-languages
--
Dave
Q1: Why does pmap use the number of available processors + 2? I would
have thought it would just use the number of avail processors...
Q2: Could someone clear up my misunderstanding of pmap w/ respect to the
code snippets below? Pmap doesn't seem to be limiting the number of
threads to the number of processors + 2...
I've created an anon func that doesn't return, so I wouldn't expect
the pmap step function to advance beyond 4 (I have 2 processors):
#1: Limit pmap to # of processors
----------------------------------
user=> (pmap #(while true (do (println "Thread: " (.getId
(Thread/currentThread)) "item: " %)(Thread/sleep 500))) (range
(.availableProcessors (Runtime/getRuntime))))
Thread: 12 item: 0
(Thread: 13 item: 1
Thread: 12 item: 0
Thread: 13 item: 1
Thread: 12 item: 0
Thread: 13 item: 1
Thread: 12 item: 0
Thread: 13 item: 1
Thread: 12 item: 0
Thread: 13 item: 1
Thread: 12 item: 0
Thread: 13 item: 1
Thread: 12 item: 0
Thread: 13 item: 1
Thread: 12 item: 0
Thread: 13 item: 1
Thread: 12 item: 0
--> just two threads running, as expected
#2: Limit pmap to # of processors * 10
--------------------------------------
user=> (pmap #(while true (do (println "Thread: " (.getId
(Thread/currentThread)) "item: " %)(Thread/sleep 500))) (range (* 10
(.availableProcessors (Runtime/getRuntime)))))
Thread: Thread: 12 item: 0
(Thread: 25 item: 13
Thread: 24 item: 12
Thread: 23 item: 11
Thread: 22 item: 10
Thread: 21 item: 9
Thread: 20 item: 8
Thread: 19 item: 7
Thread: 18 item: 6
Thread: 17 item: 5
Thread: 16 item: 4
Thread: 15 item: 3
--> In this short snippet, you can see > 4 threads running...expected?
toddg=> (source pmap)
(defn pmap
  "Like map, except f is applied in parallel. Semi-lazy in that the
  parallel computation stays ahead of the consumption, but doesn't
  realize the entire result unless required. Only useful for
  computationally intensive functions where the time of f dominates
  the coordination overhead."
  {:added "1.0"}
  ([f coll]
   (let [n (+ 2 (.. Runtime getRuntime availableProcessors))
         rets (map #(future (f %)) coll)
         step (fn step [[x & xs :as vs] fs]
                (lazy-seq
                 (if-let [s (seq fs)]
                   (cons (deref x) (step xs (rest s)))
                   (map deref vs))))]
     (step rets (drop n rets))))
<snip>
-Todd