> I'll play around with the lazy branch this week, and this is just a
> name suggestion: what do you think of first/tail/rest where (rest s)
> == (seq (tail s))? tail is already used in other functional languages
> such as Haskell and OCaml to represent all-but-the-first elements, so
> it wouldn't be completely foreign.
I like that choice as well.
Otherwise, I'll join the crowd who thinks that it's better to get
everything in order now without making compromises for backwards
compatibility.
Konrad.
> I am looking for feedback from people willing to read and understand
> the linked-to documentation and the fully lazy model, and especially
> from those trying the lazy branch code and porting some of your own.
I'm trying svn rev 1282 with the following test (which depends on
javadb (derby)):
user=> (use 'clojure.contrib.sql.test)
nil
user=> (db-write)
It hangs there. This works on the trunk.
I looked for uses of rest in the following libs and didn't find any:
clojure.contrib.sql
clojure.contrib.sql.internal
clojure.contrib.sql.test
I tried using Chouser's "-Dclojure.assert-if-lazy-seq=please"
facility. While I was able to trigger an exception from it using
sample code, it wasn't triggered during the hang.
I'd like to figure this out.
- Has anyone gotten past this already?
- Does anyone see the problem by inspecting the lib code?
- This seems like an opportunity for me to use a Java debugger with
Clojure for the first time. Has anyone written about using JSwat or
another debugger with Clojure?
I would appreciate hearing any tips for getting to the cause of this.
--Steve
Here's an example of what I think will be the worst kind of breakage
resulting from changing the meaning of rest from
seq-on-the-next-item-if-any-else-nil to
possibly-empty-collection-of-the-remaining-items:
(defn my-interpose [x & coll]
  (loop [v [x] coll coll]
    (if coll
      (recur (-> v (conj (first coll)) (conj x)) (rest coll))
      v)))
This is a bit like the builtin interpose, except it takes multiple
args instead of a collection, and it returns a vector with the
interposed value surrounding all the others:
(my-interpose 'x 'a 'b 'c)
-> [x a x b x c x]
At least that's what it does in svn 1282 trunk. In 1282 lazy branch,
it's an infinite loop. Can you spot the problem?
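
For anyone who wants to see the cause without hanging a REPL, here is a small sketch. It assumes a Clojure build with the lazy-branch meaning of rest (a possibly empty sequence, never nil). walk-with-if is a made-up name, and the iteration limit exists only so the loop stops.

```clojure
;; Sketch only: assumes the lazy-branch meaning of rest.
(defn walk-with-if
  "Counts how many times the (if coll ...) test succeeds, giving up
  after `limit` iterations so the REPL does not hang."
  [limit & coll]
  (loop [n 0 coll coll]
    (if (and coll (< n limit))
      (recur (inc n) (rest coll))
      n)))

(walk-with-if 10 'a 'b 'c) ; => 10, the test on coll never fails
(rest '(c))                ; => (), which is truthy in `if`
(next '(c))                ; => nil, which is falsey
```

With the old rest the same call would have stopped after three steps, because the last rest returned nil.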
When discussing this yesterday in IRC, I was pretty firmly against
Rich's preferred names, for exactly this reason. And worse than
trying to fix my own code would be the potential confusion over which
versions of examples, libs, etc. work with which versions of Clojure.
...but my position has softened, as I tried to construct an example
for this post that actually broke in a bad way. My first several
attempts produced code that worked in both versions.
For example, my-interpose above takes multiple args so that I could
safely assume that 'coll' is a seq. My first (unposted) version took
a collection as a second argument, but in that case a simple
"(if coll" is probably already an error, in case a user passed in an
empty vector, or some other collection. The solution would be to test
the seq of the coll:
(defn my-interpose [x coll]
  (loop [v [x] coll coll]
    (if (seq coll) ; Don't assume coll is a seq-or-nil
      (recur (-> v (conj (first coll)) (conj x)) (rest coll))
      v)))
That also happens to solve the lazy-branch infinite-loop problem --
what's more correct in trunk is more correct in lazy, in this case.
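
A quick usage sketch of the (seq coll) version above, assuming the new rest semantics. The function is renamed my-interpose-coll here (a hypothetical name) only so it can sit next to the varargs version in one session.

```clojure
;; Same body as the (seq coll) version, under a different name.
(defn my-interpose-coll [x coll]
  (loop [v [x] coll coll]
    (if (seq coll) ; works for vectors, empty collections and nil
      (recur (-> v (conj (first coll)) (conj x)) (rest coll))
      v)))

(my-interpose-coll 'x '[a b c]) ; => [x a x b x c x]
(my-interpose-coll 'x [])       ; => [x]
(my-interpose-coll 'x nil)      ; => [x]
```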
So I kept refining my-interpose, trying to get a version that was
correct in trunk but caused a non-exception error in lazy. After
several iterations I finally got the one at the top of this post.
...but even that one can be caught easily by turning on
clojure.assert-if-lazy-seq, in which case you get an exception
pointing directly to the line that needs to be changed:
java.lang.Exception: LazySeq used in 'if'
So the same changes that will already have to be made to nil puns for
the other seq functions would now have to be made for uses of the new
'rest' function.
Sorry if this has been a bit long-winded, but I wanted to explain why
I've changed my mind a bit -- changing the meaning of 'rest' may not
be as bad as I had been thinking.
--Chouser
I just tried this on 1282 lazy branch with assert-if-lazy-seq, and I
get no exception and no hang:
user=> (time (db-write))
"Elapsed time: 802.020886 msecs"
I wonder what's different?
I seem to have version 1.6.0_07 of javadb, java, and javac, running on
Ubuntu.
--Chouser
Are you burning cycles while hung, or just blocked?
> I've been working on this for a few months, in lieu of more
> interesting things, because I knew it would be a breaking change and
> we're trying to get the biggest of those behind us. I appreciate any
> effort you spend in trying to provide informed input.
For those who want to play with this without keeping two versions of
their source code files, I have added a new macro,
lazy-and-standard-branch, to clojure.contrib.macros. Here is an example
of how to use it:
(lazy-and-standard-branch
  (defn value-seq [f seed]
    (lazy-seq
      (let [[value next] (f seed)]
        (cons value (value-seq f next)))))
  (defn value-seq [f seed]
    (let [[value next] (f seed)]
      (lazy-cons value (value-seq f next))))
)
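
To show what the lazy-branch half of that example produces, here is a minimal sketch that leaves the macro out and keeps only the lazy-seq form. The step function passed in is hypothetical, and the local is renamed next-seed so it does not shadow clojure.core/next.

```clojure
;; Lazy-branch form only; assumes lazy-seq is available.
(defn value-seq [f seed]
  (lazy-seq
    (let [[value next-seed] (f seed)]
      (cons value (value-seq f next-seed)))))

;; Hypothetical step function: emit n, then continue from n + 1.
(take 5 (value-seq (fn [n] [n (inc n)]) 0)) ; => (0 1 2 3 4)
```

Because the whole body sits inside lazy-seq, f is not called until someone asks for the first element.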
Konrad.
You know, there is an empty? predicate. Why not write it as:
(defn my-interpose [x coll]
  (loop [v [x] coll coll]
    (if (empty? coll) v ; Don't assume coll is a seq-or-nil
      (recur (-> v (conj (first coll)) (conj x)) (rest coll)))))
I know that your first version is viewed as more idiomatic in Clojure,
but I've never understood why Rich and others prefer that style. It
assumes that converting something to a seq is guaranteed to be a
computationally cheap operation, and I see no reason to assume that
will always be the case. I can certainly imagine seq-able
collections that take some time to seq-ify, so converting to a seq to
test for empty, and then just throwing it away so that it has to be
recomputed in rest, doesn't seem as future-proof as just using empty?.
> I just tried this on 1282 lazy branch with assert-if-lazy-seq, and I
> get no exception and no hang:
>
> user=> (time (db-write))
> "Elapsed time: 802.020886 msecs"
>
> I wonder what's different?
Based on it working for you, the current theory I'm working to verify
is that this was caused by a clojure-contrib.jar compiled with trunk
interacting with a clojure.jar from lazy 1282.
Should we branch contrib and do the fixups on a lazy branch? Chouser,
have you already fixed it enough to compile with clojure contrib's
build.xml?
--Steve
> For those who want to play with this without keeping two versions of
> their source code files, I have added a new macro,
> lazy-and-standard-branch, to clojure.contrib.macros. Here is an example
> of how to use it:
BTW, my library modules in clojure.contrib (accumulators, monads,
probabilities) now work with the lazy branch as well as with the
standard one. The changes were minor and quick to do. The nil-punning
compiler flag was quite helpful.
Konrad.
> Based on it working for you, the current theory I'm working to
> verify is that this was caused by a clojure-contrib.jar compiled
> with trunk interacting with a clojure.jar from lazy 1282.
I've confirmed this. Thanks for the help. The test I wrote about is
now working for me with lazy 1282.
--Steve
>
> Here's an example of what I think will be the worst kind of breakage
> resulting from changing the meaning of rest from
> seq-on-the-next-item-if-any-else-nil to
> possibly-empty-collection-of-the-remaining-items:
>
> (defn my-interpose [x & coll]
>   (loop [v [x] coll coll]
>     (if coll
>       (recur (-> v (conj (first coll)) (conj x)) (rest coll))
>       v)))
>
> This is a bit like the builtin interpose, except it takes multiple
> args instead of a collection, and it returns a vector with the
> interposed value surrounding all the others:
>
> (my-interpose 'x 'a 'b 'c)
> -> [x a x b x c x]
>
> At least that's what it does in svn 1282 trunk. In 1282 lazy branch,
> it's an infinite loop. Can you spot the problem?
>
> When discussing this yesterday in IRC, I was pretty firmly against
> Rich's preferred names, for exactly this reason. And worse than
> trying to fix my own code would be the potential confusion over which
> versions of examples, libs, etc. work with which versions of Clojure.
While not knowing if sample code has been ported will still be an
issue, anyone following the porting recipe:
http://clojure.org/lazier#toc7
will avoid this one as well, as the call will be to next, not rest.
I would just clarify that to say that the best route is *not* to
structurally change code that uses rest, just have it call next
instead (unless you are writing a lazy-seq body). Using next is going
to let you preserve your code structure and yields the simplest idioms
- since next (still) nil puns!
core.clj e.g. is full of code that presumes it is walking a seq chain,
and so contains lots of next calls:
There's nothing wrong with that idiom. I do not recommend that people
leave their rest calls and 'fix' the nil puns - instead, change your
rest calls to next, then deal with your own lazy-cons calls (possibly
restoring some rest calls in lazy-seq bodies), then try the
clojure.assert-if-lazy-seq flag to find any conditional use of lazy
sequences.
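
Applied to the my-interpose example from earlier in the thread, the first step of that recipe might look like the sketch below: the structure is untouched and only the rest call becomes next. The name my-interpose-next is made up for illustration.

```clojure
;; Same structure as the original; only rest -> next has changed.
(defn my-interpose-next [x & coll]
  (loop [v [x] coll coll]
    (if coll ; still a valid nil pun, because next returns nil at the end
      (recur (-> v (conj (first coll)) (conj x)) (next coll))
      v)))

(my-interpose-next 'x 'a 'b 'c) ; => [x a x b x c x]
(my-interpose-next 'x)          ; => [x]
```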
Rich
When walking a chain of seqs, empty? made no sense as there is no such
thing as an empty seq. Now that rest returns a collection, this makes
more sense (although still not my preference), but to each his own.
Let's please not get bogged down in a style discussion now. You should
be quite happy for this:
(rest [1])
-> () ;an empty sequence, note - not a canonic/sentinel value!
Also note empty? is still defined like this:
(not (seq coll))
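
A small check of that equivalence, as a sketch: my-empty? is a hypothetical helper that mirrors the definition quoted above, and the sample values are arbitrary. Later releases may implement empty? differently internally, so this only compares results.

```clojure
;; Hypothetical helper mirroring the quoted definition.
(defn my-empty? [coll]
  (not (seq coll)))

(def samples [[] nil "" {} (rest [1]) [1] "a" {:k 1}])

;; The two predicates agree on every sample.
(every? #(= (empty? %) (my-empty? %)) samples) ; => true

(empty? (rest [1])) ; => true
(seq (rest [1]))    ; => nil
```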
Rich
I don't ever compile clojure-contrib, I just put its src dir in my
classpath. I've fixed a couple functions here and there using a macro
something like Konrad's, but I think that's going to make the code
cluttered pretty quickly. A branch of clojure-contrib is probably
quite sensible at this point.
--Chouser
Great! Thanks for the report.
Rich
(I would say "seq-on-remainder-of-collection")
I really like the first/rest decomposition concept. first (if exists) is
an item, and rest is the remainder-of-whatever following the first.
To me next connotes another item like the first, and that may be
misleading. So I do not think that next is a good name.
Please allow me as an inexpert, relatively uninvolved reader to raise an
emperor's new clothes type question: why is there a need for next
anyway? Are there that many idioms or code internals that justify a
shortcut for (seq rest)?
Regards,
..jim
1. It always troubled me that filter, when written in the most
natural way, had a "hang on to the head" problem when skipping over
large numbers of items. I think this is something worth solving, and
I'm glad that while developing the lazier branch, you came up with a
compiler enhancement to address this. In my mind, this may be the
most valuable aspect of the new changes. The new version of filter is
definitely more complex than the old version, but it's not too bad.
If the compiler enhancement could be backported to make the old (most
natural) version work as well, I think that would be even better.
2. I definitely prefer that you go with the best names, rather than
worrying about backward compatibility at this point. So I like the
idea of changing the meaning of rest. I'm not particularly keen on
the name "next", but I don't care that much. I feel fairly certain
that my own personal programming style will be to stick with
first/rest, and I doubt I'll use "next", so to me the name choice only
matters to the extent that it uses up a name that might be natural in
another context.
3. As I've noted here previously, I never cared much for nil punning,
and I always try to write my own code in a way that doesn't rely on
it. So I don't care if it goes away.
4. The new model is definitely more complicated to understand than
the previous model. There was already a certain degree of mental
overlap between collections and the seq interface. Now, there is also
the subtle distinction between a seq and a sequence. rest and next
are very similar, but one can return something empty, and one can
return nil. Making the right choice, and interfacing with other code
is now a bit more complicated (although people can always call seq to
convert it into the seq/nil paradigm with certainty, which is not much
different than before). I think the additional complexity is worth it
to solve things like the filter problem, but I think it's definitely
more confusing than before.
5. At first glance, it seems like sticking with the original
lazy-cons model, but removing nil punning and adding an empty sequence
sentinel, along with your compiler enhancement, would accomplish
everything the new "lazier" branch accomplishes with much less mental
complexity and subtle overlap, and resulting in the most intuitive
version of filter working as expected. I know you like the seq/nil
model, and the nil punning, but since you're already moving in the
direction of reducing reliance on this approach, I hope you've
considered going "all the way", to see if it would solve the problem
more elegantly. If you have considered this already, I'd be curious
to know whether it didn't solve the problem, or whether it just
resulted in too much breakage with existing code, or whether it's just
a style you don't like as a matter of taste...
>
> My thoughts so far:
>
>
> 4. The new model is definitely more complicated to understand than
> the previous model. There was already a certain degree of mental
> overlap between collections and the seq interface. Now, there is also
> the subtle distinction between a seq and a sequence.
There will need to be good descriptions of these, but the similarity
is more in names than anything else - seqs are what they always were -
cursors, and sequences are just collections.
> rest and next
> are very similar, but one can return something empty, and one can
> return nil. Making the right choice, and interfacing with other code
> is now a bit more complicated (although people can always call seq to
> convert it into the seq/nil paradigm with certainty, which is not much
> different than before).
Code that returns sequences should use rest. next is just a
convenience function for terminal/consumer code. If you look through
the ported core code, most recurs use next, most conses inside
lazy-seqs use rest. But as you noted, if you want to ignore next, that's
fine.
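
As a rough illustration of that split, here is a sketch with one producer and one consumer. Both function names (squares, sum) are made up for this example and are not taken from core.clj.

```clojure
;; Producer: returns a sequence, so it uses rest inside lazy-seq.
(defn squares [coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (* (first s) (first s)) (squares (rest s))))))

;; Consumer: walks a seq chain to the end, so it uses next and nil-puns.
(defn sum [coll]
  (loop [total 0 s (seq coll)]
    (if s
      (recur (+ total (first s)) (next s))
      total)))

(sum (squares [1 2 3])) ; => 14
(sum (squares []))      ; => 0
```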
> I think the additional complexity is worth it
> to solve things like the filter problem, but I think it's definitely
> more confusing than before.
>
> 5. At first glance, it seems like sticking with the original
> lazy-cons model, but removing nil punning and adding an empty sequence
> sentinel, along with your compiler enhancement, would accomplish
> everything the new "lazier" branch accomplishes with much less mental
> complexity and subtle overlap, and resulting in the most intuitive
> version of filter working as expected.
I realize you are focused on filter, but the point of the fully lazy
branch is full laziness, which would not fall out of what you
describe. lazy-cons requires the lazy sequence function do all the
work that precedes the call to lazy-cons, making functions like drop
and filter not fully lazy.
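
A sketch of what "fully lazy" means in practice, assuming lazy-seq as in the lazy branch. counting-filter and the calls atom are hypothetical and exist only to make the deferred work visible; this is not the core definition of filter.

```clojure
(def calls (atom 0))

;; A filter-like function whose whole body sits inside lazy-seq.
(defn counting-filter [pred coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (swap! calls inc) ; record each element actually examined
      (if (pred (first s))
        (cons (first s) (counting-filter pred (rest s)))
        (counting-filter pred (rest s))))))

(def evens (counting-filter even? [1 3 5 6 7]))
@calls        ; => 0, building the sequence examined nothing
(first evens) ; => 6
@calls        ; => 4, the skipping happened only when first was requested
```

With lazy-cons the skipping over 1, 3 and 5 would have to happen before the first cell could be constructed, which is the work Rich describes as preceding the call.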
> I know you like the seq/nil
> model, and the nil punning, but since you're already moving in the
> direction of reducing reliance on this approach, I hope you've
> considered going "all the way", to see if it would solve the problem
> more elegantly. If you have considered this already, I'd be curious
> to know whether it didn't solve the problem, or whether it just
> resulted in too much breakage with existing code, or whether it's just
> a style you don't like as a matter of taste...
I'm not sure what you mean by "all the way". If you mean removing seqs
entirely, here are some issues:
(seq x) acts as a single interface to the sequence system, if it were
to return a seq[uence] collection instead of seq/nil, then all
sequence function bodies would have to return () explicitly, adding
another branch to all implementations that could otherwise use when:
(defn map [f coll]
  (lazy-seq
    (if (empty? coll)
      ()
      (cons (f (first coll)) (map f (rest coll))))))
instead of:
(defn map [f coll]
  (lazy-seq
    (when (seq coll)
      (cons (f (first coll)) (map f (rest coll))))))
But the actual definition in the lazy branch pulls the seq out of the
lazy sequence like so:
(defn map [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (f (first s)) (map f (rest s))))))
Doing so yields a significant (> 60%) speed improvement. Without it,
all lazy calls (empty?/first/rest) have a double indirection, and no
way to get rid of it.
Rich
I was thinking that with an empty sentinel, lazy-cons does not have to
do work ahead of time to see whether you're at the end. If you have a
lazy-cons, you know you're not empty without evaluating the first or
rest, and if you have the empty object, you're empty. So I think you
could make lazy-cons fully lazy. I haven't fully thought this through
though, so maybe I'm missing something.
> Doing so yields a significant (> 60%) speed improvement, without it,
> all lazy calls (empty?/first/rest) have a double indirection, and no
> way to get rid of it.
The 60% speed improvement is compelling, but I was thinking that
(especially with an empty sentinel) empty? could be implemented more
efficiently than (not (seq coll)), so the empty?/first/rest style
wouldn't have such a performance hit.
> There will need to be good descriptions of these, but the similarity
> is more in names than anything else - seqs are what they always were -
> cursors, and sequences are just collections.
That distinction is quite clear, the problem is indeed just in the
names, in my opinion. What's the difference between a sequence and
what the rest of the Lisp world calls a list? Would it be reasonable
to call sequences lists?
Konrad.
Rich Hickey wrote:
> I am looking for feedback from people willing to read and understand
> the linked-to documentation and the fully lazy model, and especially
> from those trying the lazy branch code and porting some of your own.
>
I just ported Enlive
(http://github.com/cgrand/enlive/commit/3245678e6ae0a82152dbf4a6fb8916d2514b60dd):
* found/replaced rest by next and rrest by nnext,
* no broken nil punnings (!) but several calls to seq? that could be
rewritten in a less brittle way,
* no metadata on sequence, is this an oversight? or is this related to
the lack of metadata on closures? (I'm willing to work on this.)
I'll quickly get over my imperative interpretation of 'next and I trust
in your naming skills: if first/next/rest is the best option, go with it.
Christophe
--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)
butlast, doall, dorun, doseq, dosync, dotimes, doto, fnseq, gensym,
macroexpand, macroexpand-1, mapcat, nthrest
If we want to keep these names as-is then why do we have hyphens in so
many of the other multi-word function names?
--
R. Mark Volkmann
Object Computing, Inc.
> butlast, doall, dorun, doseq, dosync, dotimes, doto, fnseq, gensym,
> macroexpand, macroexpand-1, mapcat, nthrest
Changing these names is not on the table.
Rich
> New docs here:
>
> http://clojure.org/lazy
In the html doc:
rest... "returns a possibly empty seq, never nil"
then later
"never returns nil
- currently not enforced on 3rd party seqs"
In "(doc rest)"
"may return nil"
What's the cleaned up version of all that? Is it worth guaranteeing
that rest never returns nil or should "never returns nil" be removed
from the html docs?
--Steve
I've fixed the doc string. I'm not going to add code for 3rd party
seqs, enforcing this contract is up to them. If they derive from ASeq
and define next, ASeq will do the right thing for them.
Rich
Since count realizes the whole list, this seems like a bad way to test
for empty on a lazy sequence.
Fixed in (lazy) 1286 - thanks for the report.
Rich
> It seems the Sequence/ISeq dichotomy was a sticking point for many.
> After some tweaking, I've been able to get rid of Sequence entirely,
> SVN 1284+ in lazy branch. This is source compatible with 1282 (first/
> rest/next), except that sequence? no longer exists - go back to seq?.
>
> New docs here:
>
> http://clojure.org/lazy
>
> Let me know if that is simpler.
I'd say yes.
The remaining weird feature is the seq function and its use. The name
suggests that it converts to a seq, which is in fact what it used to
do. Now it converts to a seq unless the resulting seq would be empty.
For an empty seq, it actually converts a seq to a non-seq!
Would it be possible to make an empty seq test as false? One could
then do away with the conversion to seq in tests completely, and seq
could always return a seq, including an empty one. Of course, this
would imply that a logical test on a seq evaluates its first element,
but that doesn't look unreasonable to me.
Konrad.
>
> On Feb 16, 2009, at 20:23, Rich Hickey wrote:
>
>> It seems the Sequence/ISeq dichotomy was a sticking point for many.
>> After some tweaking, I've been able to get rid of Sequence entirely,
>> SVN 1284+ in lazy branch. This is source compatible with 1282 (first/
>> rest/next), except that sequence? no longer exists - go back to seq?.
>>
>> New docs here:
>>
>> http://clojure.org/lazy
>>
>> Let me know if that is simpler.
>
> I'd say yes.
>
> The remaining weird feature is the seq function and its use. The name
> suggests that it converts to a seq, which is in fact what it used to
> do. Now it converts to a seq unless the resulting seq would be empty.
> For an empty seq, it actually converts a seq to a non-seq!
>
There will always be a tension between treating the first node in a
list as a node vs as the entire list. The seq function is firmly in
the former camp, essentially returning the node containing the first
item. seq also has an important role regarding lazy seqs - when given
one it forces it and returns the inner seq. This is the big reason why
empty? is not a replacement for seq. One way to look at (seq x) is as
a version of (not (empty? x)), where the truth value is more useful
than 'true'.
I will be adding a sequence function that will act as a constructor/
coercion, so:
(seq []) -> nil
(sequence []) -> ()
In addition, sequence, when given a seq, will not force it, if it is
lazy.
It will work like this:
(defn sequence [x]
  (if (seq? x)
    x
    (or (seq x) ())))
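
Assuming a release where sequence ended up behaving as sketched above, the difference from seq can be checked like this. The var name pending is made up, and the println is there only so that an accidental forcing would be noticed.

```clojure
(seq [])       ; => nil
(sequence [])  ; => ()
(sequence nil) ; => ()

;; A lazy seq whose body announces itself if it is ever forced.
(def pending (lazy-seq (println "forced!") [1 2 3]))

(identical? pending (sequence pending)) ; => true, handed back unchanged
(realized? pending)                     ; => false, nothing was forced
```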
People that don't like nil punning need never use seq/next.
> Would it be possible to make an empty seq test as false?
No - then you could no longer distinguish between an empty collection
and nothing. Additionally, that would be a big performance hit for 'if'.
Rich
I am fully on board with this idea. 'next' seems unfit to me for
exactly the reason a few people have pointed out: it connotes an item
rather than the rest of a seq.
That said, I really have no grounds to be confident either way that
rest will almost exclusively appear in the context of constructing
lazy-seqs. Can any other more experienced folks offer some
thoughts on this conjecture?
Someone suggested 'next-seq' earlier. I would also suggest rest-seq,
which is almost a literal translation of the invariant:
(rest-seq x) === (seq (rest x))
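
Spelled out as code, the suggestion would be something like the sketch below. rest-seq is only the proposed name, not an existing function, and the comparison with next assumes the lazy-branch semantics.

```clojure
;; Hypothetical: the proposed name, defined directly from the invariant.
(defn rest-seq [x]
  (seq (rest x)))

(rest-seq [1 2 3]) ; => (2 3)
(rest-seq [1])     ; => nil
(= (rest-seq [1 2 3]) (next [1 2 3])) ; => true
```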
I like this, but as you point out Perry, there is a typing penalty to
be paid here; however, I think it is a small one and completely
justified.
Jim asked above why seq-on-the-next-item-if-any-else-nil is needed at
all, and clearly as Rich stated, there is a large number of uses of
rest right now proving its usefulness.
For me, this highlights the fact that in the new fully lazy seq model,
the function 'seq-on-the-next-item-if-any-else-nil' is not core to the
abstraction. As a result, having a "second class" hyphenated name such
as rest-seq (or next-seq) almost conveys the "exists for convenience"
nature of this function that we are trying to name.
/mike.
For example, if I were going to give a name to one empty sequence to
reuse within my code, would one of these be preferable?:
(def empty '())
(def empty (sequence []))
or some other variation?