Why cannot "last" be fast on vector?

3,406 views
Skip to first unread message

Warren Lynn

unread,
Jun 28, 2012, 7:32:03 PM6/28/12
to clo...@googlegroups.com
This is an off-shoot subject from my last post "General subsequence function".

I found people had similar questions before (one year ago):
http://groups.google.com/group/clojure/browse_thread/thread/712711f049507c63/aea7cf438aa22922

As of Clojure 1.4, seems nothing changed, as "source" show here:

user> (source last)
(def
 ^{:arglists '([coll])
   :doc "Return the last item in coll, in linear time"
   :added "1.0"
   :static true}
 last (fn ^:static last [s]
        (if (next s)
          (recur (next s))
          (first s))))

Any reason for that? Thanks.

David Nolen

unread,
Jun 28, 2012, 7:36:33 PM6/28/12
to clo...@googlegroups.com
Don't hold your breath. Assume that the language was designed after much consideration. last is a sequence operation, not a collection operation. If the distinction doesn't make sense, I suggest you explore the design decision by writing some non-trivial Clojure code so you can arrive at your own satisfying answer why this was done. Otherwise you'll just listen to people repeat the same answer without hearing what is being said.

David 

Tamreen Khan

unread,
Jun 28, 2012, 9:59:04 PM6/28/12
to clo...@googlegroups.com
Here's a somewhat old but still generally useful article on how Clojure vectors are implemented:  http://blog.higher-order.net/2009/02/01/understanding-clojures-persistentvector-implementation/ 

Vectors are optimized for random access, whereas lists are optimized for going through from the beginning to the end, which is exactly what the last function does.

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Mark Engelberg

unread,
Jun 28, 2012, 11:08:35 PM6/28/12
to clo...@googlegroups.com
On Thu, Jun 28, 2012 at 6:59 PM, Tamreen Khan <hist...@gmail.com> wrote:
Here's a somewhat old but still generally useful article on how Clojure vectors are implemented:  http://blog.higher-order.net/2009/02/01/understanding-clojures-persistentvector-implementation/ 

Vectors are optimized for random access, whereas lists are optimized for going through from the beginning to the end, which is exactly what the last function does.



This doesn't really address the underlying question.

nth is a good example of a function that is fast for vectors, strings, and arrays, and only for lists uses the slower sequential implementation.

There's no technical reason that last couldn't do the same, dispatching on the underlying type to have a more efficient implementation for data types that support it.

The real reason is this: With the glaring exception of nth, Clojure tends to eschew functions with a fast path and a slow path, because programmers shouldn't have to do deep semantic analysis of their code to figure out which will get chosen.  In this case, however, we're talking about a function that is currently always slow.  Surely it would be a win to make it have a faster path in a large number of cases, right?  Well, the argument against that is eventually people would start to rely on it being fast for so many collection types, and then get burned when somehow their collection accidentally gets turned into a seq before last is called on it.  The theory is that it's better to force people to use a different function call, namely peek, to retrieve the last element quickly from a vector.

Honestly though, despite David Nolen's claim that this design decision will be obvious if you've coded enough Clojure, I've been programming in it for around 3-1/2 years, and I think the argument behind this design decision is a relatively weak one, so you are not making an unreasonable request.  In fact, the more I code in Clojure, the more I see that there are a number of core functions that require significant semantic analysis of the surrounding context to know how they will behave.  Either Clojure trusts that programmers can make determinations about the types of their collections or it doesn't.  Which is it?  Well, I'd argue there's a large body of evidence that under Clojure's current design, it's clear that you better know what you're doing.  If that's the case, why not go ahead and make last fast when it can be fast? 

Nevertheless, I hope I've clarified the reasoning behind the current design.  Early Clojure design decisions have a lot of inertia, so as David pointed out, this is very unlikely to change.

Timothy Baldridge

unread,
Jun 28, 2012, 11:26:07 PM6/28/12
to clo...@googlegroups.com
> Nevertheless, I hope I've clarified the reasoning behind the current
> design.  Early Clojure design decisions have a lot of inertia, so as David
> pointed out, this is very unlikely to change.

Well put.

I would argue as well, that many functional languages I've come across
take this stance: the language is opensource, you have namespaces, so
feel free to make a namespace called mycore, that includes all you
special versions of functions that you need, and then use that library
for your projects. It doesn't take very long to define these special
case functions and if they work for you, then just use them. For
instance, one such function I wrote tonight is called every-other
(returns '(1 3 5) if you hand it (1 2 3 4 5))

Does this function exist in clojure.core...not that I saw off-hand.
Does it really matter though? Not really. I'll copy and paste these
functions when I need them and go happily on my way. One of the
biggest strengths of functional programming is that we can build up
snippets of functions like this, and when we use then in our programs
we can't tell that they weren't included by default.

As Gerald Sussman said at the last Strange Loop conference: be a
Libertarian when it comes to programming. Don't restrict others to
your view of programming, liberty to you and to me and to everyone.
You can code in your style, I'll code in mine, and Clojure (or Lisp in
the case of the quote), shouldn't constrain me to its view of the
world.

Timothy

David Nolen

unread,
Jun 29, 2012, 1:42:42 AM6/29/12
to clo...@googlegroups.com
So you recommend that "last" should be only function that accepts IPersistentStack *and* ISeqable and does the appopiate thing? Or are there others?
--

Meikel Brandmeyer (kotarak)

unread,
Jun 29, 2012, 1:52:45 AM6/29/12
to clo...@googlegroups.com
Hi,


Am Freitag, 29. Juni 2012 05:26:07 UTC+2 schrieb tbc++:

For instance, one such function I wrote tonight is called every-other
(returns '(1 3 5) if you hand it (1 2 3 4 5))

user=> (take-nth 2 [1 2 3 4 5])
(1 3 5)

Kind regards
Meikel
 

Warren Lynn

unread,
Jun 29, 2012, 10:51:44 AM6/29/12
to clo...@googlegroups.com

Tamreen:

Thank you. Your posts really explained why it is so. I understand the reasons now, but I can hardly agree those reasons are good ones (not arguing with you, as you pointed out, the reasons are weak for this case).

As I pointed out before in my other post ("General subsequence function"), mixing abstraction with speed path is really a bad idea, and might be a classical example of what Rich called "complecting". In practice, it did not achieve what it tries to achieve either (one of the big complaints I read from the Web about Clojure is it is really hard to know the speed performance). For speed, the classical solution is:

1. Put good documentations on the functions, and the programmer needs to have some idea what data structure is fast/slow for what use. If the programmer does not have a clue, why would making "last" artificially slow on vectors help? Plus, whether one data structure is slower than the other for certain operations can also be an implementation detail that may change.
2. Use a profiler, so you can keep optimizing on the current hot spots.

The above solution has no interference with abstraction.

David Nolen

unread,
Jun 29, 2012, 10:55:13 AM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 10:51 AM, Warren Lynn <wrn....@gmail.com> wrote:
> 1. Put good documentations on the functions, and the programmer needs to
> have some idea what data structure is fast/slow for what use. If the
> programmer does not have a clue, why would making "last" artificially slow
> on vectors help? Plus, whether one data structure is slower than the other
> for certain operations can also be an implementation detail that may change.

The design choice has nothing to do with speed, it has nothing to do
with concrete types like lists and vectors either, no matter what
might have been said before by others.

David

Warren Lynn

unread,
Jun 29, 2012, 10:55:23 AM6/29/12
to clo...@googlegroups.com
Sorry, I should have addressed my last post to "puzzler".

Warren Lynn

unread,
Jun 29, 2012, 11:05:23 AM6/29/12
to clo...@googlegroups.com

Surely nobody can restrict/enforce anything on anybody, and I can always have my own "mycore" ns. In theory, I could even create my own language without the need to persuade anybody (or just fork from Clojure and have my own private copy). That will be the ultimate "Libertarian", but some people (including myself) may call me "savage" in doing so, unless you can attract enough people to form a new society around you, like Rich has done. :-)

Warren Lynn

unread,
Jun 29, 2012, 12:13:25 PM6/29/12
to clo...@googlegroups.com

The design choice has nothing to do with speed, it has nothing to do
with concrete types like lists and vectors either, no matter what
might have been said before by others.

David

If the design choice has nothing to do with speed path, Could you let us know why we cannot get free speed improvements on vectors?
 

David Nolen

unread,
Jun 29, 2012, 12:19:40 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 12:13 PM, Warren Lynn <wrn....@gmail.com> wrote:
> If the design choice has nothing to do with speed path, Could you let us
> know why we cannot get free speed improvements on vectors?

I already have.

Weber, Martin S

unread,
Jun 29, 2012, 12:30:04 PM6/29/12
to clo...@googlegroups.com
I'm sorry to say, but IMHO you failed to communicate the critical point to
your audience. If your audience keeps failing to grasp the point, and
communicates this failure back by asking the same question..

I do understand the distinction between a collection and a sequence and
something being a collection or a sequence operation. That doesn't stop
certain sequences from being capable of doing certain things in a better
way. I simply do not understand the reason of clojure/rich/whoever
throwing away a potential performance increase because "this operation
does not operate on the correct level to be performant. If you expect this
to be performant, rewrite it on a lower level." That stance seems
surprisingly patronizing.

Also, it seems of course clojure itself is inconsistent with regard to
that design choice. At least looking at Timothy Baldridge's reply to
"possible bug in reducer code" makes me think so. I mean come on, a high
level operation being optimized on certain types? Heresy!

Regards,
-Martin


David Nolen

unread,
Jun 29, 2012, 1:32:16 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 12:30 PM, Weber, Martin S <martin...@nist.gov> wrote:
> I'm sorry to say, but IMHO you failed to communicate the critical point to
> your audience. If your audience keeps failing to grasp the point, and
> communicates this failure back by asking the same question..

Perhaps we differ on the matter of audience participation? :)

The functions found in the standard library fall out of Clojure's well
considered protocol / interface partitions. Don't take my word for it
as I already said - examine the protocol / interface partitions
yourself.

In the end you'll be left with a discussion about *names* (peek vs.
last) - not performance, concrete types or anything else. At which
point we'll probably agree that it's water under the bridge and it's
not going to change.

David

Nicolas

unread,
Jun 29, 2012, 2:45:53 PM6/29/12
to Clojure
I would say we can have different ways of designing things.

A way is to design abstractions and provide services on top on theses
abstractions. The abstraction here is ISeq. That is sequences. Last
is not part of the ISeq abstraction and just work on top of it. There
is no way to access last element directly from something that is just
a an ISeq. So last can't use it.

If last was part of ISeq abstraction, you would not need a separate
last function at all. But this would also mean that all ISeq
implementations would need to implement last. This would be sure
conveniant, but this is not the same abstraction. It is also heavier
(more code, more maintenance...). And for some implementation like
networks streams, this would be anoying than anything as it would
provide no added value.

Seq abstraction is exactly that: a very basic (yet powerfull)
abstraction for accessing streams, linked list and other specialized
structures that are sequential in essence.

Clojure team could have designed last to work on any possible type
where it make sense and so have better performance for each possible
type. An efficient implementation would use a protocol to do so.
Making last its own abstraction. This is indeed possible, but was not
the choice here.

Fact is others abstractions already allow you to have the last element
directly like Indexed if this important to you.

So yes we could have a better last, promoting it to a full
abstraction. Like clojure could be many more things. Is this REALLY a
priority for the future of clojure? Not at least for clojure core
team. I tend to share their views. You have the right to disagree.

On my side, I'am far more interrested to see progress in
clojurescript, IDEs, tooling, maybe even grid computing.

You care about last. This is your right... And well why not implement
your own last in contrib or private lib and use if it is important?
Like other members of the community implemented code matching their
own interrest.

It is logical you ask for it, but there is no need to insist or maybe
be a little offensive if others don't share your views.

Regards,

Nicolas Bousquet.

On 29 juin, 01:32, Warren Lynn <wrn.l...@gmail.com> wrote:
> This is an off-shoot subject from my last post "General subsequence
> function".
>
> I found people had similar questions before (one year ago):http://groups.google.com/group/clojure/browse_thread/thread/712711f04...

Sean Corfield

unread,
Jun 29, 2012, 3:27:57 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 7:51 AM, Warren Lynn <wrn....@gmail.com> wrote:
> 1. Put good documentations on the functions, and the programmer needs to
> have some idea what data structure is fast/slow for what use.

At the risk of continuing what is quickly becoming a rather fruitless
thread, I figured I'd quote the docstrings from last and peek:

last: Return the last item in coll, in linear time

peek: For a list or queue, same as first, for a vector, same as, but
much more efficient than, last. If the collection is empty, returns
nil.

That seems like pretty good documentation to me. (last my-vector) is
documented to be a linear operation. (peek my-vector) is documented to
be "much more efficient". last is not dependent on the type of its
argument, peek is.
--
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)

Warren Lynn

unread,
Jun 29, 2012, 4:25:02 PM6/29/12
to clo...@googlegroups.com
Thanks for clarifying more on the rationale behind the design. Also a note on the tone: I never thought insisting on a view is offensive. Insisting our views are the essence of a debate. But we need to insist based on reasons and acknowledge when we are wrong. Also, I realize that sometimes debating on a forum is different from face-to-face debating and offense is easier to make. I hope nobody took any offense.

Here actually I am not only trying to ask for a faster "last". I am trying to understand some of the designs here (and along the way insisting on my own "right" one), because I have this regretful feeling that Clojure has some really nice LISP features but failed on some basic stuff, and I am seriously doubting whether I should invest more on it (possibly on a production system). I will be more than glad to be convinced that the design is sound, but I will need enlightenment.

My understanding here is "ISeq" is an INTERNAL, implementation level interface/abstraction, not the user/language level abstraction (which in this case should be "ordered collection", as somebody called), so whether "last" is part of "ISeq" is irrelevant. In my view, at the user/language level, last should work on all ordered collections as fast as it can without EXPOSING whatever internal implementation is on.

I know there is a lot of exciting projects going on with Clojure. But a good review of the design does not conflict with that, and may be beneficial for future exciting projects.

I am still insisting on my view, but I hope its based on reasons.

Warren Lynn

unread,
Jun 29, 2012, 4:34:04 PM6/29/12
to clo...@googlegroups.com
Even not a single action is taken because of this thread, I still would not consider the thread fruitless. It helped me (and maybe others) understand the issue better.

My point was: you need a clear documentation on a coherent, consistent abstraction, and let the programmer to understand. Just clear documentation is not enough. You can document a very messy system in clear documentation (maybe the US tax code?).

Here, we are having both "peek" and "last", which is not coherent to me. consider the documentation on an alternative design:

last: get the last element from an ordered collection. for queues and linked lists, it takes linear time. for vectors, it takes constant time.

and get rid of "peek" (we already have "first" for linked list and queues, right?)

Which one is cleaner?

David Nolen

unread,
Jun 29, 2012, 4:41:59 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 4:25 PM, Warren Lynn <wrn....@gmail.com> wrote:
> My understanding here is "ISeq" is an INTERNAL, implementation level
> interface/abstraction, not the user/language level abstraction (which in
> this case should be "ordered collection", as somebody called)

It is not internal. It is a user/language level abstraction.

David

Warren Lynn

unread,
Jun 29, 2012, 5:09:07 PM6/29/12
to clo...@googlegroups.com
You are right, clojure.lang.ISeq is public and I can see it from "user" namespace.

But that is not what I meant, the "ordered collection" is a language level abstraction, clojure.lang.ISeq is an exposed interface. With regard to "last", clojure.lang.ISeq is still implementation level, even if it is exposed.

In other words, the "ordered collection" abstraction is like (or maybe the same as) the "clojure.lang.Seqable". And vectors belongs to it:
(isa? clojure.lang.PersistentVector clojure.lang.Seqable) => true

Mark Engelberg

unread,
Jun 29, 2012, 5:17:53 PM6/29/12
to clo...@googlegroups.com
It is clear that some collections *could* support a more efficient last.  Anything with random access.  Anything that supports rseq (e.g., sorted collections). 

There are multiple ways to accomplish this; I presume a protocol would do the trick.

Perhaps the original design decision is easily justified in terms of the original way the collections were factored into various interfaces, but now that it is so easy to make functions polymorphic over different types via protocols, I can't imagine this would be a difficult change if anyone cared to do it.

I don't think anyone is arguing that the current semantics aren't well documented; the claim is that it violates the "Principle of Least Surprise" in the sense that most people expect a built-in core function to be implemented efficiently for the cases when it can be implemented efficiently.  Everyone knows that a vector, for example, is designed to provide very fast access to the last element, so it is counterintuitive that a function called "last" ignores this capability and just treats vector as a generic sequence, searching through the items one-by-one to get to the last element.

BTW, I disagree with Warren's comment about how a better last would eliminate the need for peek.  Even if you have last, peek is a very useful way of guaranteeing consistent stack semantics for a wide variety of collection types.

Warren Lynn

unread,
Jun 29, 2012, 5:34:46 PM6/29/12
to clo...@googlegroups.com
Although I have not yet looked at any of Clojure's internals yet, I suspect the change won't be too difficult (for the right person). So I hope/wish someone with enough knowledge, skills and influence will agree with me and advocate a review of the design ("last" may not be the only one with issues) and fix some of those things before things get too messy down the road.

Looking "peek" from your point of view (actually from another abstraction point of view), it seems useful. Thank you.

David Nolen

unread,
Jun 29, 2012, 6:05:37 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 5:17 PM, Mark Engelberg
<mark.en...@gmail.com> wrote:
> It is clear that some collections *could* support a more efficient last.
> Anything with random access.  Anything that supports rseq (e.g., sorted
> collections).

And what does overloading last in this way mean for drop-last,
take-last, but-last? All sequence functions.

David

Warren Lynn

unread,
Jun 29, 2012, 6:49:04 PM6/29/12
to clo...@googlegroups.com
The same? If internally it can be faster, be faster. If not, don't change.

Sam Ritchie

unread,
Jun 29, 2012, 7:28:08 PM6/29/12
to clo...@googlegroups.com
Perhaps place them inside a protocol, where core supplies implementations for ISeq only? This would make it easier to extend efficient behavior to other types without placing a big burden on core.

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en



--
Sam Ritchie, Twitter Inc
@sritchie09

(Too brief? Here's why! http://emailcharter.org)

David Nolen

unread,
Jun 29, 2012, 7:42:14 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 6:49 PM, Warren Lynn <wrn....@gmail.com> wrote:
> The same? If internally it can be faster, be faster. If not, don't change.

For which types do you think they can be made faster?

David

David Nolen

unread,
Jun 29, 2012, 7:50:56 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 7:28 PM, Sam Ritchie <sritc...@gmail.com> wrote:
> Perhaps place them inside a protocol, where core supplies implementations
> for ISeq only? This would make it easier to extend efficient behavior to
> other types without placing a big burden on core.

ISeq *is* an interface on Clojure JVM. But ideally it would be
protocol as in ClojureScript. But then all ISeq implementing types
must also implement this new protocol you are suggesting to get these
basic *generic* sequence operations we enjoy today.

David

Mark Engelberg

unread,
Jun 29, 2012, 7:57:12 PM6/29/12
to clo...@googlegroups.com
vectors and sorted collections, for sure.

Warren Lynn

unread,
Jun 29, 2012, 8:02:28 PM6/29/12
to clo...@googlegroups.com
I really don't know, as I know very little about how these things are implemented. My point is, we maintain a coherent abstraction and get the best speed we can. To recap, what I don't like about current "last" is it makes writing generic code difficult. Currently, I need to use "peek" for vector and "last" for sequence. That defeats the very purpose of abstraction.

Mark Engelberg

unread,
Jun 29, 2012, 8:02:32 PM6/29/12
to clo...@googlegroups.com

I think the suggestion is to create a protocol for each function that could potentially gain speed improvements for specialized types.  So for example, ILast could be a protocol.  extend ILast to have an implementation for ISeq (using the current code).  Then, there's no burden on implementers of new data types that implement ISeq to do any additional work, but those who want to create custom implementations of last for other data types are free to do so.  Win-win.

Warren Lynn

unread,
Jun 29, 2012, 8:06:05 PM6/29/12
to clo...@googlegroups.com
Again, I don't know the internal details. If you are saying because of the current implementation, the change is difficult, then we will be talking about the implementation, not about the abstraction design. I have very little to say about that.

David Nolen

unread,
Jun 29, 2012, 8:17:40 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 8:02 PM, Mark Engelberg
<mark.en...@gmail.com> wrote:
> I think the suggestion is to create a protocol for each function that could
> potentially gain speed improvements for specialized types.  So for example,
> ILast could be a protocol.  extend ILast to have an implementation for ISeq
> (using the current code).

As I said, if ISeq and ILast are both protocols that won't work. No
protocol inheritance.

At this point I think some of the basic challenges are clear. Would
love to see someone actually build these ideas out purely on top of
protocols instead of making this thread any longer.

A good Clojure implementation via ClojureScript would make the
arguments more convincing ;)

David

Bobby Eickhoff

unread,
Jun 29, 2012, 9:23:09 PM6/29/12
to clo...@googlegroups.com
Warren, you're on the right track with your alternative design: Intuitively and ideally, "last" should return the last element of a finite, ordered collection.  But Clojure's "last" operates on sequences, not collections.  This is problematic because sequences can be (effectively) infinite.  Calling "last" on an arbitrary sequence is, therefore, dubious at best.  It's what Doug Crockford might call an attractive nuisance: sometimes useful, but dangerous.  So "last" isn't a function I would spend alot of time trying to fix.

There is historical precedent for Clojure's "last" function.  For example, see Haskell's "last" function in Data.List.  Historical precedent doesn't justify the design, but it helps explain how we got here.

Bobby

Mark Engelberg

unread,
Jun 29, 2012, 11:04:02 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 5:17 PM, David Nolen <dnolen...@gmail.com> wrote:
As I said, if ISeq and ILast are both protocols that won't work. No
protocol inheritance.


I don't see how inheritance factors into this.  This works just fine in Clojure 1.3.  What am I missing?:

(defprotocol Last
  (better-last [s]))

(extend-protocol Last
  nil
  (better-last [s] (last s))
  Object
  (better-last [s] (last s))
  clojure.lang.ISeq
  (better-last [s] (last s))
  clojure.lang.Reversible
  (better-last [s] (first (rseq s)))
  java.lang.String
  (better-last [s] (nth s (dec (count s))))
  clojure.lang.IPersistentVector
  (better-last [s] (peek s)))

Warren Lynn

unread,
Jun 29, 2012, 11:39:47 PM6/29/12
to clo...@googlegroups.com
Bobby:

Thanks for confirming my sanity here. But I have a different opinion on the "attractive nuisance" label on "last" function. Sure if you call "last" on an infinite sequence that will not work (the program will get stuck?), but it is no more dangerous than other dynamic part of the language. Think about a higher order function, a function that takes another function as an argument, isn't that much more dangerous? I mean, you don't even know what kind of function people may throw into it, and the damage could be bigger (for example, it may even wipe out you hard drive). As a dynamic language gives high level of freedom and abstraction power, the programmer needs to take certain responsibilities too, because the compiler will catch much less errors than a statically typed language.

So I still think "last" is a very useful function and worth fixing. Ideally, if a sequence can be checked if it is infinite or not so "last" can throw an exception when not applicable, but that may not be feasible with lazy sequences. And again I feel the responsibility should be more on the programmer's side.

David Nolen

unread,
Jun 29, 2012, 11:55:02 PM6/29/12
to clo...@googlegroups.com
ISeq is a interface on Clojure JVM. So that will work. In ClojureScript it won't as ISeq is a protocol.
--

Mark Engelberg

unread,
Jun 29, 2012, 11:55:19 PM6/29/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 4:50 PM, David Nolen <dnolen...@gmail.com> wrote:
ISeq *is* an interface on Clojure JVM. But ideally it would be
protocol as in ClojureScript. But then all ISeq implementing types
must also implement this new protocol you are suggesting to get these
basic *generic* sequence operations we enjoy today.


I see, you're not saying it can't be done in Clojure, you're saying it wouldn't work on ClojureScript.  It seems to me that's a limitation of either ClojureScript, or protocols, or both.  My guess is that it's a limitation of ClojureScript, because my understanding is that in Clojure, every protocol generates a Java interface, so I can't think of any reason why you couldn't list that generated interface as a "type" in another protocol (although I haven't tried it).

If that doesn't work, then this merits more discussion, I think, because it points to a real issue about building large systems with protocols.  The issue of "thin" vs "fat" interfaces is a tricky one in many languages, and one of the most reasonable solutions is to give the fat interface an implementation in terms of the thin one, so that implementers are only required to implement the thin interface, but can override the default implementation of the fat interface if need be.  If this can't be done in Clojure, it would be good to figure out an alternative that will work.

Mark Engelberg

unread,
Jun 30, 2012, 12:59:27 AM6/30/12
to clo...@googlegroups.com
On Fri, Jun 29, 2012 at 8:55 PM, Mark Engelberg <mark.en...@gmail.com> wrote:
my understanding is that in Clojure, every protocol generates a Java interface, so I can't think of any reason why you couldn't list that generated interface as a "type" in another protocol (although I haven't tried it).


Hmmm, thinking more about this, I wonder:  is the problem that unless the protocol is implemented inline in the defrecord/deftype, it doesn't officially "implement" the corresponding the interface?
 

Craig Brozefsky

unread,
Jun 30, 2012, 3:47:58 AM6/30/12
to clo...@googlegroups.com
Warren Lynn <wrn....@gmail.com> writes:

> Although I have not yet looked at any of Clojure's internals yet, I
> suspect the change won't be too difficult (for the right person). So I
> hope/wish someone with enough knowledge, skills and influence will
> agree with me and advocate a review of the design ("last" may not be
> the only one with issues) and fix some of those things before things
> get too messy down the road.

Err, I don't think this is an issue of "design" let alone something
worthy of getting a bit weepy about Clojure failing to be worthy of use
in your production system as implied by your previous posts.

Here is a version of last for clojure.core that will use peek on
vectors.

(def
^{:arglists '([coll])
:doc "Return the last item in coll."
:added "1.0"
:static true}
last (fn ^:static last [s]
(if (vector? s)
(peek s)
(if (next s)
(recur (next s))
(first s)))))

If the intent of specifying "in linear time" in the docstring was to
guarantee the user that we would be looping over a sequence using
first/next, then it should explicitely say so and not just imply it by
mentioning linear time. Otherwise, I see the phrase as a warning, and
not as a guarantee.

David Nolen asked about drop-last and take-last and but-last.

drop-last is defined as returning lazy sequences. A
change to them that returned a vector with it's tail peeked off would be
a change to the language. I am not so keen on that.

take-last returns a seq, and it's not obvious after a whiskey just how
to rewrite it to be more efficient for vectors and not just end up
making it linear scaling on n for vectors.

However, butlast is documented as returning a seq, so changing pop on a
vector might have some performance advantage, and not change the
outcomes of the function. Here is what that would look like.

(def
^{:arglists '([coll])
:doc "Return a seq of all but the last item in coll, in linear time"
:added "1.0"
:static true}
butlast (fn ^:static butlast [s]
(if (and (not (empty? s))
(vector? s))
(pop s)
(loop [ret [] s s]
(if (next s)
(recur (conj ret (first s)) (next s))
(seq ret))))))

These changes don't really bring any new entanglement between
clojure.core and the clojure.lang java objects -- because the language
defines explicitely how vector conj/peek. However, going thru
clojure.core and optimizing it based on knowledge of implementation in
clojure.lang would be complecting - thus punishable by exile to the
bitcoin mines.


--
Craig Brozefsky <cr...@red-bean.com>
Premature reification is the root of all evil

Craig Brozefsky

unread,
Jun 30, 2012, 4:18:46 AM6/30/12
to clo...@googlegroups.com
Sam Ritchie <sritc...@gmail.com> writes:

> Perhaps place them inside a protocol, where core supplies

I don't think a protocol is needed to identify a performance
characteristic, which as far as I can tell really is limited to vectors.

The definition of vector? itself is sufficient.

Also, consider that last is defined as part of the very base of core,
before protocols get loaded, and even before defn is available. Using a
protocol to optimize at that level is gonna get dirty dirty.

See my reply to Warren my proposed change to core. Those defs can be
dropped into clojurescript too, btw.

Stuart Halloway

unread,
Jun 30, 2012, 9:51:56 AM6/30/12
to clo...@googlegroups.com
Having separate "peek" and "last", with documented performance characteristics, makes it straightforward to reason about how code is likely to perform, a point that Mark made earlier.

It is a fundamental design principle of Clojure that, when given two approaches, A and B, make primitive the one that could be used to build the other.  Clojure's "peek" and "last" might be useful to somebody building the "last" that Warren wants.  Warren's "last" is *useless* for building the "peek" and "last" that Clojure already has.

Not arguing against having Warren's "last".  Just saying that c.c/last ain't broken, and is simpler.

Stu

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Stuart Halloway
Clojure/core
http://clojure.com

David Nolen

unread,
Jun 30, 2012, 10:07:48 AM6/30/12
to clo...@googlegroups.com
On Friday, June 29, 2012, Mark Engelberg wrote:
On Fri, Jun 29, 2012 at 4:50 PM, David Nolen <dnolen...@gmail.com> wrote:
ISeq *is* an interface on Clojure JVM. But ideally it would be
protocol as in ClojureScript. But then all ISeq implementing types
must also implement this new protocol you are suggesting to get these
basic *generic* sequence operations we enjoy today.


I see, you're not saying it can't be done in Clojure, you're saying it wouldn't work on ClojureScript.  It seems to me that's a limitation of either ClojureScript, or protocols, or both.  My guess is that it's a limitation of ClojureScript, because my understanding is that in Clojure, every protocol generates a Java interface, so I can't think of any reason why you couldn't list that generated interface as a "type" in another protocol (although I haven't tried it).

It is a host detail one can take advantage of in Clojure. But it is not a feature of protocols. In fact using protocols this way will fail. You have to go under the hood to the generated interface. And even then doesn't work for extend-typed things.

As far as whether this is a problem for large software I don't share you concerns. ClojureScript is getting towards 7000 lines of standard library. Not problems yet. core.logic is approaching 40 protocols and 3000 lines of code. No problems encountered and none foreseen when the full functionality is ported to ClojureScript.

Why does this seem to scale? Because Clojure's interfaces were already written in a protocol style. Inheritance doesn't matter much, and they only define 2-3 methods on average.

Warren Lynn

unread,
Jun 30, 2012, 10:09:11 AM6/30/12
to clo...@googlegroups.com

As I mentioned before, the issue here is not just a fast "last". The issue here is I find the design of Clojure "wrong", or confusing. That really shakes my confidence in the whole system.

I am also not talking about implementations, which is a separate issue that I don't know much yet

According to the book "Clojure Programming" by Chas, one of the Clojure's principles is to "build on abstractions" and to separate abstractions from concrete types. Now I have a little deeper contact with Clojure and my impression is that is not the reality.

Warren Lynn

unread,
Jun 30, 2012, 10:50:03 AM6/30/12
to clo...@googlegroups.com

On Saturday, June 30, 2012 9:51:56 AM UTC-4, stuart....@gmail.com wrote:
Having separate "peek" and "last", with documented performance characteristics, makes it straightforward to reason about how code is likely to perform, a point that Mark made earlier.


As I said before, I strongly feel mixing abstraction and speed path is the wrong direction to take. Let me elaborate a little bit more:

1. What's the purpose of high level language and dynamic typing? To abstract so we can be more efficient in writing code (Note: code itself my not run fast)  and the code is easier to read and maintain. Statically typed language has less abstraction power (although it still tries hard with things like C++ templates), but faster. Dynamic language has higher abstraction power but slower performance. So it is wrong to me to throw away the strength of abstraction of a dynamic language in pursuit of performance. Performance is of course important, but not at the expense of broken abstraction. Maybe that can be done on rare cases that may have a huge impact of performance on the whole system, but it seems wrong to do it as a "design choice".

2. The argument that if a function takes its pre-designated speed characteristics, even when its possible to be faster on a particular type, will encourage better code performance or performance analysis, is a very convoluted one. If you say: by making "last" slow on vectors, it will force people to use "peek", then why not make it even slower on vectors? (how about putting a "sleep" there for vectors?). If you say "no, no, sometimes people may still use "last" on vectors and we still want the best performance in that case", then why not make it as fast as we can?

3. Maybe we don't have one now, but we will have a profiler in the future if the language will be put into serious use. With a profiler you can see where the performance bottleneck is, and a programmer can reason "Ah, this part is slow because I used "last" on a large list", give we already documented that "last" will take linear time on lists. So he can choose the right data type instead of switching functions. If the abstraction is done right, switching a data type should not be difficult.


It is a fundamental design principle of Clojure that, when given two approaches, A and B, make primitive the one that could be used to build the other.  Clojure's "peek" and "last" might be useful to somebody building the "last" that Warren wants.  Warren's "last" is *useless* for building the "peek" and "last" that Clojure already has.

 
Not arguing against having Warren's "last".  Just saying that c.c/last ain't broken, and is simpler.


I don't understand the argument that because a higher level function cannot be used to build a lower level function, so it is not worth working on or fixing. We can build Clojure with Java but Clojure cannot be used to build Java, so we don't need to work on Clojure?

Keeping both type specific low level functions and abstraction level function is less of a problem to me, but I would avoid it if possible.

Stu


Craig Brozefsky

unread,
Jun 30, 2012, 12:17:39 PM6/30/12
to clo...@googlegroups.com
Warren Lynn <wrn....@gmail.com> writes:

> As I mentioned before, the issue here is not just a fast "last". The
> issue here is I find the design of Clojure "wrong", or confusing. That
> really shakes my confidence in the whole system.

To say stuff like this, then be demure about digging into the code and
understanding how things work, followed by asking for others to be do
the coding and be and advocate for your rather vague feeling of unease,
strikes me as passive-aggressive attention seeking. To do such while
top posting, well, it's just too much for me. 8^)

Warren Lynn

unread,
Jun 30, 2012, 12:37:17 PM6/30/12
to clo...@googlegroups.com
Craig:

If the dominant community attitude is "before you know everything about Clojure, you have no right to comment", then that itself will be the reason not to use Clojure. But I think that is just you, not the community.

Although I am glad some people paid attention to this post, I have far more important things to do than seeking some attention from strangers. I hope I stimulated some thinking here. I am a lisp lover, and I feel Clojure has a good potential of being a great and practical language, but it has its broken parts too and I wish they get fixed, so I will have a better experience using it. That is my goal (obvious I hope).

László Török

unread,
Jun 30, 2012, 1:03:04 PM6/30/12
to clo...@googlegroups.com
Warren,

I think the issue is this:

You claim there is sg. broken in clojure while admitting that you know little about how the design decision was made.

People that know clojure's implementation and the history of design decisions inside-out offered advice why they think it is not broken, they took time to explain the rationale for decision and even offerred advice how to "fix it" for yourself should you insist on your view of the matter.

It seems to me you 
a) need to reread these arguments to perhaps get a better grasp
b) have chosen to ignore them.

While you have right to do either of them, if it's b) not even the "clojure gods" can really help you unless you actually spend some time with the internals of clojure. ;-).

Las

2012/6/30 Warren Lynn <wrn....@gmail.com>
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en



--
László Török

Warren Lynn

unread,
Jun 30, 2012, 1:24:47 PM6/30/12
to clo...@googlegroups.com


On Saturday, June 30, 2012 1:03:04 PM UTC-4, Las wrote:
Warren,

I think the issue is this:

You claim there is sg. broken in clojure while admitting that you know little about how the design decision was made.

People that know clojure's implementation and the history of design decisions inside-out offered advice why they think it is not broken, they took time to explain the rationale for decision and even offerred advice how to "fix it" for yourself should you insist on your view of the matter.

It seems to me you 
a) need to reread these arguments to perhaps get a better grasp
b) have chosen to ignore them.

While you have right to do either of them, if it's b) not even the "clojure gods" can really help you unless you actually spend some time with the internals of clojure. ;-).

Las



First, this will be my last post on this thread, so I will be absolved from "attention seeking". But really, as you have pointed out, things are getting in a kind of circle and we may just need to sit back and think for a while.

I think some people agree with me something is broken here (puzzler, for example. Please correct me is I am wrong as I don't want to hijack other people's opinion). For the other people who don't agree with me, I am not really ignoring their argument (I tried hard to reply to each of them), it is just, I am not really convinced. No I don't know the language inside-out, and I don't know much about the implementation. But I was trying to keep my discussion on the high-level abstraction. If somebody told me "Warren, I agree with you that the abstractions and design principles need some fixing, but man it is very hard to do now, take my word for it", even nothing changed I will feel better using Clojure because at least people admit there is a problem and there is a chance it will get fixed in the future (on on another host language). So far I have not get that kind of feedback.

We all choose what we think is right for us, so there is no imposing anything on anybody. I appreciate the fact at least there is the language Clojure we can talk about.

Thanks you all for the participation.

And I will much more comfortalbe Of course in the end


Sean Corfield

unread,
Jun 30, 2012, 2:08:37 PM6/30/12
to clo...@googlegroups.com
On Sat, Jun 30, 2012 at 7:09 AM, Warren Lynn <wrn....@gmail.com> wrote:
> As I mentioned before, the issue here is not just a fast "last". The issue
> here is I find the design of Clojure "wrong", or confusing. That really
> shakes my confidence in the whole system.

Some of Clojure's design decisions take a while to really internalize.
Several folks here have explained them but you're still not satisfied.
I don't think there's much more they can do at this point.

> According to the book "Clojure Programming" by Chas, one of the Clojure's
> principles is to "build on abstractions" and to separate abstractions from
> concrete types. Now I have a little deeper contact with Clojure and my
> impression is that is not the reality.

But it is the abstractions specifically that lead to the situation you
are in, with last operating the way it does, as several folks have
explained.

Some have suggested you read through and understand the implementation
of Clojure. That may help you understand the abstractions better (it
may not). But, as has been stated (repeatedly) in this thread, Clojure
is this way for a (good) reason. I actually think Stu's post was one
of the clearest explanations...

Mark Engelberg

unread,
Jun 30, 2012, 8:05:06 PM6/30/12
to clo...@googlegroups.com
On Sat, Jun 30, 2012 at 10:24 AM, Warren Lynn <wrn....@gmail.com> wrote:
I think some people agree with me something is broken here (puzzler, for example. Please correct me is I am wrong as I don't want to hijack other people's opinion).

One really nice thing about the Clojure community is that from the very beginning, Rich instilled a value that we should generally avoid talking in terms of a "sky is falling" mentality, and avoid using terms like "broken".  Generally speaking, the community has been trained to avoid engaging with people who make such exaggerated claims.  This has a  couple of beneficial effects: first, it helps keep criticisms grounded and constructive, second, it tends to limit the destructive potential of people who are just trolling.  As a result of this community ethic, you'll find that if you keep calling Clojure broken, people will just tune you out.

So no, I don't agree that last is "broken".  I claimed that last *could* be made polymorphic, and that if, it were up to me, I *would* make it polymorphic because I think there's little downside and it's generally better to make functions behave in an intuitive way; I believe that the name last implies fast access for data structures which offer fast access to the last element.

However, I, like most of the others here, don't regard this as a big deal.  The first time I read through the docs, I noted that last wasn't a particularly useful function because of its linear time bound.  So I just don't use it.  If I need fast access to the last element, I use a vector, and I use the peek function.  It's not the way I would design things, but it's not a big deal, either.  There are a number of aspects of Clojure I feel that way about, and it doesn't stop me from wanting to use it.

Now I *do* find this discussion interesting, particularly because of things that David has said about protocols and ClojureScript, and the limitations that this may place on design.  Ever since protocols were introduced, I have been very intrigued to see how they would play out in practice.  I would like to discuss this issue further, but I will start a new thread to do so...

Vinzent

unread,
Jul 1, 2012, 6:33:23 AM7/1/12
to clo...@googlegroups.com
I fully support Warren's point of view. I'm also unhappy with current behaviour of sequence functions (more accurate: I think it can be made even better). 

In my mind, protocols (interfaces?) based, polymorphic core functions is just what we need. Moreover, similar request has been done already (e.g. see CLJ-803) for reasons completely unrelated to the topic of this thread.

I think some parts of clojure can be considered "broken", quoting from doc-strings for deftype, defrecord, extend: "Alpha - subject to change", "It is TBD how
  to specify which impl to use".

I believe Warren has provided some constructive criticism of the language and rised important problems. It's a good idea to create corresponding wiki page on JIRA.

This thread was definitely useful for me, and I find the discussion interesting and worthy too.

Rich Hickey

unread,
Jul 1, 2012, 2:17:07 PM7/1/12
to clo...@googlegroups.com
Wow, this thread was exhausting :) This reply is not to anyone in particular, just tagging on at the end (hint).

It is quite easy to come up to some part of Clojure, with some need, and some current perspective, and find it unsatisfying. It certainly is not perfect. But a good presumption is that there might be more context, more constraints, or more issues in play than one might recognize at first glance.

Mark was most correct in identifying that the algorithmic complexity dominates this decision. As far as the functions being defined in terms of abstractions: while true, the abstractions are still something that could be refactored. They are a means, not an end.

History: 'last' is a Lisp function. In Common Lisp it operates on lists only. It is a function that, there, advertises slowness by its mere presence. Code using it has a presumption it will only be used on short lists, and in fact it generally is; in macros and things that process code. People reading that code can presume the same.

I do think that, in general, it is bad to have a single polymorphic function that, depending on the target arguments, transitions between different algorithmic complexity categories (N.B. that is not equivalent to using polymorphism to get better performance by leveraging some implementation specifics, it is a specific subset). 'nth' for seqs is a counterexample, was requested by the community, and I still think is somewhat of a mistake. At least, it would be good to also have an 'elt' with specific non-linear complexity (for the reasons I give below).

Usually, there is a 'fast' function and someone wants it to work on something that would be in a slower category. This case is the opposite, 'last' is slow and people want it to be faster where possible. The end result is the same.

Why not just be as polymorphic as possible? In Common Lisp, there is a set of functions that lets you use lists as associative maps. The performance is not good as things get larger (i.e the sky falls). While this made sense at one point in time (when there were only lists), is it still a good idea now? Should we have functions that expect maps accept lists?

In my experience, having everything work on/with lists made it really easy to get a system together that worked on small data sets, only to have it crash and burn on larger ones. And then, making it perform well was extremely challenging. Some of that was because of the lack of interface-conforming implementations of e.g. hashmaps to swap in. but another category of problem was not being able to see what calls mattered for performance, when lists would still be ok etc. Thus Clojure 'forces' you to make some of these decisions earlier, and make them explicit. Even so, it is still highly polymorphic.

At the moment, I'm not sure it would be bad to have 'last' behave better for vectors et al while still promising in documentation to be not necessarily better than linear. But I do know this - it is seriously unimportant. We all have much better things to do, I hope, than argue over things like this.

There is a perfectly adequate and well documented way to get the last element of a vector quickly, and code for which that matters can (and should, *even if last were made faster*) use it. Code for which it doesn't matter can use 'last' on vectors because - ipso facto *it doesn't matter*! People reading either code will know what is and isn't important, and that seems to matter most.

Rich

Warren Lynn

unread,
Jul 1, 2012, 4:35:41 PM7/1/12
to clo...@googlegroups.com

I promised I won't post more on this thread. But Rich is here and I think I can grant myself an excuse to post just one more. End of it, I promise. :-)

First, although Rich does not think so, I myself feel this topic is very important as it is not just about "last", It touches some fundamental design principles.

Then, please see my comments embedded in Rich's original message. And my comments are not long as I already said most of what I want to say before. Thank you.


On Sunday, July 1, 2012 2:17:07 PM UTC-4, Rich Hickey wrote:
Wow, this thread was exhausting :) This reply is not to anyone in particular, just tagging on at the end (hint).

It is quite easy to come up to some part of Clojure, with some need, and some current perspective, and find it unsatisfying. It certainly is not perfect. But a good presumption is that there might be more context, more constraints, or more issues in play than one might recognize at first glance.


True, but this applies both ways. So it is also possible to be satisfied with our current design with our current context and find out it does not fit later.

 
Mark was most correct in identifying that the algorithmic complexity dominates this decision. As far as the functions being defined in terms of abstractions: while true, the abstractions are still something that could be refactored. They are a means, not an end.

History: 'last' is a Lisp function. In Common Lisp it operates on lists only. It is a function that, there, advertises slowness by its mere presence. Code using it has a presumption it will only be used on short lists, and in fact it generally is; in macros and things that process code. People reading that code can presume the same.

I do think that, in general, it is bad to have a single polymorphic function that, depending on the target arguments, transitions between different algorithmic complexity categories (N.B. that is not equivalent to using polymorphism to get better performance by leveraging some implementation specifics, it is a specific subset). 'nth' for seqs is a counterexample, was requested by the community, and I still think is somewhat of a mistake. At least, it would be good to also have an 'elt' with specific non-linear complexity (for the reasons I give below).

Usually, there is a 'fast' function and someone wants it to work on something that would be in a slower category. This case is the opposite, 'last' is slow and people want it to be faster where possible. The end result is the same.

Why not just be as polymorphic as possible? In Common Lisp, there is a set of functions that lets you use lists as associative maps. The performance is not good as things get larger (i.e the sky falls). While this made sense at one point in time (when there were only lists), is it still a good idea now? Should we have functions that expect maps accept lists?
 
In my experience, having everything work on/with lists made it really easy to get a system together that worked on small data sets, only to have it crash and burn on larger ones. And then, making it perform well was extremely challenging. Some of that was because of the lack of interface-conforming implementations of e.g. hashmaps to swap in. but another category of problem was not being able to see what calls mattered for performance, when lists would still be ok etc. Thus Clojure 'forces' you to make some of these decisions earlier, and make them explicit. Even so, it is still highly polymorphic.

 
In "list as map" example of Common Lisp, that in my view is not the fault of polymorphic design, that is because as you said a lack of it. (... because of the lack of interface-conforming implementations ...). If the polymorphic design were done right, then we should be able to swap in an efficient map implementation. I think we should learn the *right* lesson here.
 
At the moment, I'm not sure it would be bad to have 'last' behave better for vectors et al while still promising in documentation to be not necessarily better than linear. But I do know this - it is seriously unimportant. We all have much better things to do, I hope, than argue over things like this.

There is a perfectly adequate and well documented way to get the last element of a vector quickly, and code for which that matters can (and should, *even if last were made faster*) use it. Code for which it doesn't matter can use 'last' on vectors because - ipso facto *it doesn't matter*! People reading either code will know what is and isn't important, and that seems to matter most.

Rich


As a middle ground, maybe we can keep both type specific and polymorphic functions? For people who absolutely need to squeeze out the last drop of performance, the type specific function will save them some dispatching cost. But I myself would rather to have the most generic functions and tune my performance by using the right data structures.

Again, thank you.


 
Reply all
Reply to author
Forward
0 new messages