Just a question about the consistency of the API:
When one passes a "strange" (ie, wrong type) object to contains?, say
(contains? 'blab 'a)
the result is false.
But if one passes the wrong type to, e.g., even?, like
(even? 'a)
The result is:
java.lang.ClassCastException: clojure.lang.Symbol cannot be cast to
java.lang.Number (NO_SOURCE_FILE:0)
This seems a bit inconsistent, in the sense that one would expect the
same behavior for the same kind of error.
So the question is: is there anything less obvious here (at least for
me), or is this a real issue with the API?
Thanks,
Tiago
Both of them honor their documentation, no doubt. That is not my
point. My point is that the two functions behave differently for the
same kind of issue:
What is the rationale for even? and contains? having different
behaviors for the exact same error (ie, one throws the other works
fine and just returns false on a type error)? From a design
perspective this seems to increase the cognitive load to programmers
without any (apparent) reason.
One would imagine that both functions should have the same behavior
for the same kind of error...
I imagine the rationale is efficiency. Every core function could
conceivably do a number of runtime checks to make sure that each input
is the right kind of type, and then Clojure might feel more sluggish.
So instead, the core functions just worry about what to do for the
appropriate inputs. If you pass a bogus input, the consequence
depends entirely on how that particular function was coded. It might
return a spurious result, or it might error. There are numerous
examples of this in the Clojure API.
It wouldn't surprise me if there are a number of programmers who would
be turned off by the ease with which one can shoot oneself in the
foot without getting any kind of error message, but in practice I
haven't been bitten by this yet.
Of course they don't pick functions at random, but people do make
mistakes. Although I don't mind Clojure's approach, it is reasonable
for people to want clear error messages when they use a function
improperly.
Why is functional programming better than imperative programming?
One common answer to this question is that functional programs are
easier to debug. Why? Because in an imperative program, if one part
has an error, the error doesn't necessarily manifest in the portion of
code that has the error. Instead, a little piece of memory or state
can be corrupted in one spot, and then much, much later, an error
happens as a result of this inconsistency. Thus the need for
sophisticated debuggers and steppers to identify where the corruption
happened (since the manifestation of the error doesn't really help you
know the source of the problem).
However, in a functional program, you have these well-defined pieces
that reliably return the same output for a given set of inputs, making
it easier to test individual components, and when something produces
an error, it's pretty easy to pinpoint the function where things went
wrong.
Unfortunately, when core functions don't produce errors for invalid
inputs, then you have a similar problem as with imperative languages.
It becomes rather easy to write a program where the consequence of an
error is far removed from the source.
I have been told that Erlang programmers have developed a culture
where errors are generally not caught or disguised in any way.
There's sort of a "crash early and crash hard" philosophy, to increase
the likelihood that a crash will happen in the block of code that is
causing the problem and not later.
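
As an illustration of the "crash early" idea in Clojure terms, here is a minimal sketch using a :pre condition. checked-even? is a hypothetical name made up for this example, not something in clojure.core, and it assumes a Clojure version with :pre support and assertions left enabled (the default).

```clojure
;; Hypothetical wrapper: rejects non-integers at the call site
;; instead of relying on whatever even? happens to do with them.
(defn checked-even? [n]
  {:pre [(integer? n)]}   ; failing this throws java.lang.AssertionError
  (even? n))

(checked-even? 4)    ; true
(checked-even? 3)    ; false
;; (checked-even? 'a) fails the precondition and throws AssertionError
```

The cost is one extra check per call, which is the efficiency trade-off mentioned earlier in the thread.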
I understand this, but it creates another kind of problem: when
programmers make mistakes (and I, not being a perfect human being,
make lots of mistakes) the system behaves inconsistently in a place
where one would expect consistency (type checking): sometimes it blows
up in your face (like even?), sometimes it fails silently (like
contains?).
From a software engineering perspective it seems a bit dangerous.
Unless, of course, programmers are perfect and never make mistakes; to
programmers of that kind, this discussion might seem ridiculous.
I, for one, prefer the blow-up-in-your-face design approach (ie, like
even?) to the silent one. But above all, consistency would be nice
to have.
But I understand the rationale for efficiency. Though I wonder whether
efficiency is really the rationale, or whether this issue was simply
never thought about (considering that Clojure is such a fast-moving
target, that would be normal).
Peace,
Tiago
> Just a question about the consistency of the API:
> When one passes a "strange" (ie, wrong type) object to contains?, say
> (contains? 'blab 'a)
> the result is a false.
> But if one passes the wrong type to, e.g., even?, like
> (even? 'a)
> The result is:
> java.lang.ClassCastException: clojure.lang.Symbol cannot be cast to
> java.lang.Number (NO_SOURCE_FILE:0)
>
> This seems a bit inconsistent, in the sense that one would expect the
> same behavior for the same kind of error.
Is it really the same kind of error? I am not sure. In fact, I'd say
this depends on the precise definitions of the two predicates.
The behaviour of contains? makes sense if you define "a contains b" as
"a is a collection AND b is an element of a". This definition implies a
return value of false whenever a is not a collection.
In the same spirit, one could define "x is even" as "x is an integer AND
x mod 2 = 0", concluding that (even? 'a) should return false because 'a
is not an integer. But that definition has its weak points as well, e.g.
that even? and odd? are not complementary.
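
To make that weak point concrete, here is a small sketch of the "false for non-integers" definition. lenient-even? and lenient-odd? are hypothetical names used only for this illustration.

```clojure
;; Hypothetical predicates following the definition
;; "x is an integer AND x mod 2 = 0" (and its odd counterpart).
(defn lenient-even? [x] (and (integer? x) (even? x)))
(defn lenient-odd?  [x] (and (integer? x) (odd? x)))

(lenient-even? 4)    ; true
(lenient-even? 'a)   ; false
(lenient-odd? 'a)    ; false as well

;; For integers the two are complementary, for anything else they are not:
(= (lenient-odd? 3)  (not (lenient-even? 3)))    ; true
(= (lenient-odd? 'a) (not (lenient-even? 'a)))   ; false
```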
I don't think it is possible to define a perfectly coherent set of
definitions of everything in the Clojure API that satisfies
simultaneously all conditions that one might expect to hold. In the end
it's a matter of priorities, and Rich's choices for Clojure are mostly
in the "pragmatic" category: a compromise between principles,
simplicity, and efficiency.
Konrad.
> I imagine the rationale is efficiency.
Here's the function from clojure/lang/RT.java:
    static public Object contains(Object coll, Object key){
        if(coll == null)
            return F;
        else if(coll instanceof Associative)
            return ((Associative) coll).containsKey(key) ? T : F;
        else if(coll instanceof IPersistentSet)
            return ((IPersistentSet) coll).contains(key) ? T : F;
        else if(coll instanceof Map) {
            Map m = (Map) coll;
            return m.containsKey(key) ? T : F;
        }
        else if(key instanceof Number && (coll instanceof String ||
                coll.getClass().isArray())) {
            int n = ((Number) key).intValue();
            return n >= 0 && n < count(coll);
        }
        return F;
    }
That last return could be changed to a throw, and it wouldn't make things
any slower (in the non-error case). Note that 'get' behaves the same
way, so I guess it's probably intentional for associative lookups,
though I can't see why.
(get 3 3)
=> nil
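
To make the "throw instead of return F" idea concrete without touching RT.java, here is a minimal Clojure sketch of a wrapper. strict-contains? is a hypothetical name, not part of clojure.core, and the accepted types simply mirror the branches of the Java code above.

```clojure
;; Hypothetical wrapper: delegates to contains? for the types the Java
;; code handles explicitly, and throws for everything else.
(defn strict-contains? [coll key]
  (cond
    (nil? coll) false
    (or (associative? coll)
        (set? coll)
        (instance? java.util.Map coll)
        (string? coll)
        (.isArray (class coll)))
    (contains? coll key)
    :else
    (throw (IllegalArgumentException.
            (str "contains? not supported on type: "
                 (.getName (class coll)))))))

(strict-contains? {:a 1} :a)   ; true
(strict-contains? #{1 2} 3)    ; false
;; (strict-contains? 'blab 'a) throws IllegalArgumentException
```

The extra type tests only run before delegating, so the interesting design question is the one raised above: whether the fall-through case should be an error at all.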
I did not want to make this a discussion about contains? per se, but
about the general design philosophy for dealing with type errors
(which I think is a bigger issue, IMHO).
But it is a bit difficult to avoid noticing that contains? is a bit
unintuitive, so to say.
Even with vectors
(contains? [1 5] 5) being false sounds somewhat strange.
I understand the initial intent, like this:
(contains? {'a 5} 5) being false
But the end result with strings and vectors is a tad unintuitive...
Tiago
Right, strings and vectors can be thought of as either collections, or
as associative mappings from integers to characters/objects.
contains? treats them as associative mappings. Yes, it's unintuitive,
but it has a certain degree of internal consistency.
This certainly encourages you to use sets whenever you want, um, set-
like behavior...
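
A short sketch of the difference, using only core functions. The variable names are made up for the example.

```clojure
(def v [1 5])     ; vector: keys are the indices 0 and 1
(def s #{1 5})    ; set: keys are the elements themselves

(contains? v 5)   ; false, there is no index 5
(contains? v 1)   ; true, index 1 exists (it holds the value 5)
(contains? s 5)   ; true, 5 is a member of the set

;; To ask "is 5 one of the values in the vector?", a common idiom is
;; to use a set as the predicate:
(some #{5} v)     ; 5, which is truthy
(some #{7} v)     ; nil
```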
Mark Engelberg wrote:
> 2009/11/9 Tiago Antão <tiago...@gmail.com>:
>> What is the rationale for even? and contains? having different
>> behaviors for the exact same error (ie, one throws the other works
>> fine and just returns false on a type error)?
> I imagine the rationale is efficiency.

Here's the function from clojure/lang/RT.java:

    static public Object contains(Object coll, Object key){
        if(coll == null)
            return F;
        else if(coll instanceof Associative)
            return ((Associative) coll).containsKey(key) ? T : F;
        else if(coll instanceof IPersistentSet)
            return ((IPersistentSet) coll).contains(key) ? T : F;
        else if(coll instanceof Map) {
            Map m = (Map) coll;
            return m.containsKey(key) ? T : F;
        }
        else if(key instanceof Number && (coll instanceof String ||
                coll.getClass().isArray())) {
            int n = ((Number) key).intValue();
            return n >= 0 && n < count(coll);
        }
        return F;
    }
Why not:

    static public Object contains(Object coll, Object key){
        if(coll == null)
            return F;
        else if(coll instanceof Map)
            return ((Map) coll).containsKey(key) ? T : F;
        else if(coll instanceof IPersistentSet)
            return ((IPersistentSet) coll).contains(key) ? T : F;
        else if(key instanceof Number && (coll instanceof String ||
                coll.getClass().isArray())) {
            int n = ((Number) key).intValue();
            return n >= 0 && n < count(coll);
        }
        return F;
    }

instead?
What's wrong with clojure.lang.PersistentQueue?
>In the meantime, the main thing still missing from Clojure is a convenient
>queue. Lists and vectors both add and remove efficiently only at one end,
>and at the same end for add and remove in both cases. Doubly-linked lists
>can't be made persistent without massive problems, but laziness has its own
>issues:
Perhaps the GHC Data.Sequence library could be ported. It's based on
2-3 finger trees, and allows efficient adding and removal from either
end of the sequence.
Depending on usage, you can also make a decent lazy queue just
out of two lists, where you reverse and append whenever the source side
fills up.
David
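
For readers who haven't seen the two-list trick, here is a rough sketch of one common variant: push onto a rear list, pop from a front list, and reverse the rear list into the front whenever the front runs out. All names (empty-q, q-push, q-peek, q-pop) are hypothetical, and this is only one way to read the description above, not a reference implementation.

```clojure
;; Invariant kept by normalize: if :front is empty, the queue is empty.
(def empty-q {:front '() :rear '()})

(defn- normalize [{:keys [front rear] :as q}]
  (if (empty? front)
    {:front (reverse rear) :rear '()}   ; one O(n) reversal, amortized over pops
    q))

(defn q-push [q x] (normalize (update-in q [:rear] conj x)))
(defn q-peek [q]   (first (:front q)))
(defn q-pop  [q]   (normalize (update-in q [:front] rest)))

(def q3 (-> empty-q (q-push 1) (q-push 2) (q-push 3)))
(q-peek q3)                   ; 1
(q-peek (q-pop q3))           ; 2
(q-peek (q-pop (q-pop q3)))   ; 3
```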
The only clojure constructor I could find for this is in
clojure.contrib.accumulators. But, once you have the empty one
'clojure.lang.PersistentQueue/EMPTY' you can use it pretty well with
the rest of the language.
Perhaps 'empty-queue' should be moved into core?
David
>Depending on usage, you can also make a decent lazy queue just
>out of two lists, where you reverse and append whenever the source side
>fills up.
Ok, this is what PersistentQueue is, except without the reverse and
append, so it actually performs well.
David
I've tried porting finger trees to Scheme before, and although it is
efficient in an algorithmic sense, I found the constants involved to
be so high, performance was very poor. Perhaps it was a fault with my
port, but I've become quite skeptical of the value of finger trees.
Anyway, I've been satisfied with clojure.lang.PersistentQueue for queues.
I often want a priority queue, but John Harrop addressed that need not
long ago. Before that, I would sometimes just use Clojure's sorted
set.
Sometimes I still want a persistent deque, but I haven't needed it
quite badly enough to warrant coding one myself.
In the meantime, the main thing still missing from Clojure is a
convenient queue. Lists and vectors both add and remove efficiently
only at one end, and at the same end for add and remove in both cases.
Doubly-linked lists can't be made persistent without massive problems,
but laziness has its own issues:

(defn queue-peek [q] (first q))
(defn queue-pop [q] (rest q))
(defn queue-push [q obj] (concat q [obj]))

(let [q (reduce queue-push nil (range 1000000))]
  (reduce (fn [_ q] (queue-pop q)) nil q))
#<CompilerException java.lang.StackOverflowError (NO_SOURCE_FILE:0)>
Yes, it's in Clojure 1.0, it just doesn't have a convenient name.
So give it a convenient name like this:
(def empty-queue clojure.lang.PersistentQueue/EMPTY)
and then you're ready to go.
conj, peek, pop, into and all the other sequence-based functions work
the way you'd expect.
The implementation cleverly uses a list for the first part of the
queue, and a vector for the second part of the queue, and is efficient
and persistent.
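
A quick usage sketch, assuming only the empty-queue def from this message (q and big are names made up for the example):

```clojure
(def empty-queue clojure.lang.PersistentQueue/EMPTY)

(def q (conj empty-queue 1 2 3))   ; conj adds at the rear

(peek q)         ; 1, the front element
(seq (pop q))    ; (2 3), pop removes from the front
(seq q)          ; (1 2 3), q itself is unchanged (persistent)

;; Building a large queue with conj is not a problem:
(def big (into empty-queue (range 1000000)))
(count big)      ; 1000000
(peek big)       ; 0
```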
Well, the latest code is here:
http://github.com/richhickey/clojure/blob/master/src/jvm/clojure/lang/PersistentQueue.java
I don't know whether it has changed since 1.0.
Just look in the src/jvm/clojure/lang directory of your clojure distribution.
But basically, when the front list is exhausted, the front part
becomes the seq of the vector, and the rear part becomes an empty
vector ready to accept more elements.
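
Here is a simplified Clojure model of that behaviour, just to illustrate the front/rear hand-off. It is not the actual Java implementation, and all the names (empty-model, model-conj, model-peek, model-pop) are hypothetical.

```clojure
;; :front is a seq (or nil when exhausted), :rear is a vector.
(def empty-model {:front nil :rear []})

(defn model-conj [{:keys [front rear]} x]
  (if (nil? front)
    {:front (list x) :rear rear}          ; empty queue: x becomes the front
    {:front front :rear (conj rear x)}))  ; otherwise append to the rear vector

(defn model-peek [q] (first (:front q)))

(defn model-pop [{:keys [front rear]}]
  (let [remaining (next front)]
    (if (nil? remaining)
      {:front (seq rear) :rear []}        ; front exhausted: rear's seq takes over
      {:front remaining :rear rear})))

(def m (reduce model-conj empty-model [1 2 3]))
m                          ; {:front (1), :rear [2 3]}
(model-pop m)              ; {:front (2 3), :rear []}
(model-peek (model-pop m)) ; 2
```

No reversal is needed because a vector's seq already walks the elements in insertion order.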