Why take-last of empty collection is nil?

Petr Gladkikh

unread,

Feb 3, 2011, 2:04:51 AM2/3/11

to clo...@googlegroups.com

Should not it be empty colection instead?
It seems odd to me since it is inconsistent and forces to consider one
more case (nil or collection).

And another question. I have written this function
(defn index-by
"Make map (f x) -> x"
[f coll]
(reduce #(assoc %1 (f %2) %2) {} coll))

I wonder, is there already such function somewhere in Clojure libraries?

--
Petr Gladkikh

Meikel Brandmeyer

unread,

Feb 3, 2011, 2:31:34 AM2/3/11

to Clojure

Hi,

On 3 Feb., 08:04, Petr Gladkikh <petrg...@gmail.com> wrote:

> Should not it be empty colection instead?
> It seems odd to me since it is inconsistent and forces to consider one
> more case (nil or collection).

It is consistent. There is a difference between () and nil. () is the
empty list. However there is no "empty sequence." Either there is
something or there is nothing. Why would you have to check for nil?
You can pass nil to any of the sequence library functions without fear
of harm. When you write such a function yourself, there is usually a
single check in the beginning when realising the sequence. Something
like (when-let [s (seq coll)] ...).

I never encountered any problems with this. Do you have a concrete
example where this causes trouble for you?

> And another question. I have written this function
> (defn index-by
> "Make map (f x) -> x"
> [f coll]
> (reduce #(assoc %1 (f %2) %2) {} coll))
>
> I wonder, is there already such function somewhere in Clojure libraries?

You are probably looking for group-by, although it's slightly
different in that it collects all pre-images of a value (f x) in case
your function is not injective.

Sincerely
Meikel

Ken Wesson

unread,

Feb 3, 2011, 2:48:35 AM2/3/11

to clo...@googlegroups.com

On Thu, Feb 3, 2011 at 2:31 AM, Meikel Brandmeyer <m...@kotka.de> wrote:
> Hi,
>
> On 3 Feb., 08:04, Petr Gladkikh <petrg...@gmail.com> wrote:
>
>> Should not it be empty colection instead?
>> It seems odd to me since it is inconsistent and forces to consider one
>> more case (nil or collection).
>
> It is consistent. There is a difference between () and nil. () is the
> empty list. However there is no "empty sequence."

According to Clojure 1.2 itself, there is:

user=> (seq? ())
true

The empty list is both, it seems. :)

However,

user=> (seq ())
nil

so (if-let [s (seq x)] ...) still works to separate empty lists
(indeed, all empty colls) from non-empty ones.

Also of interest is

user=> (seq? nil)
false

The sequence functions accept it nonetheless, and most accept vectors,
sets, and maps as well. In particular:

user=> (nth nil 42)
nil
user=> (nth nil 42 "not found")
"not found"

Petr Gladkikh

unread,

Feb 3, 2011, 3:11:11 AM2/3/11

to clo...@googlegroups.com

On Thu, Feb 3, 2011 at 1:31 PM, Meikel Brandmeyer <m...@kotka.de> wrote:
> Hi,
>
> On 3 Feb., 08:04, Petr Gladkikh <petrg...@gmail.com> wrote:
>
>> Should not it be empty colection instead?
>> It seems odd to me since it is inconsistent and forces to consider one
>> more case (nil or collection).
>
> It is consistent. There is a difference between () and nil. () is the
> empty list. However there is no "empty sequence." Either there is
> something or there is nothing. Why would you have to check for nil?
> You can pass nil to any of the sequence library functions without fear
> of harm. When you write such a function yourself, there is usually a
> single check in the beginning when realising the sequence. Something
> like (when-let [s (seq coll)] ...).
>
> I never encountered any problems with this. Do you have a concrete
> example where this causes trouble for you?

I have a vector that holds some history. I conj new items to it and to
save space I'd like to retain not more than n last items.
To do that I used (take-last n history). So: [] -> (take-last n []) ->
nil -> (conj nil newItem) -> '(newItem)

But list conj's at the beginning not at end of sequence as I would
like to. Of course I could use () from the beginning (with account for
reverse order).
But with [] I should do little more.

--
Petr Gladkikh

Meikel Brandmeyer

unread,

Feb 3, 2011, 4:23:56 AM2/3/11

to Clojure

Hi,

On 3 Feb., 09:11, Petr Gladkikh <petrg...@gmail.com> wrote:

> I have a vector that holds some history. I conj new items to it and to
> save space I'd like to retain not more than n last items.
> To do that I used (take-last n history). So: [] -> (take-last n []) ->
> nil -> (conj nil newItem) -> '(newItem)
>
> But list conj's at the beginning not at end of sequence as I would
> like to. Of course I could use () from the beginning (with account for
> reverse order).
> But with [] I should do little more.

There is a misunderstanding here: the sequence functions return - well
- sequences. If you want a vector back, you have to tell Clojure so.

[] -> (take-last n []) -> nil -> (vec nil) -> []
[1 2 3] -> (take-last n [1 2 3]) -> (3) -> (vec (3)) -> [3].

Or use into:

(into (empty input) (take-last n input))

Sincerely
Meikel

Bill James

unread,

Feb 3, 2011, 5:24:45 AM2/3/11

to Clojure

On Feb 3, 1:04 am, Petr Gladkikh <petrg...@gmail.com> wrote:

>
> And another question. I have written this function
> (defn index-by
> "Make map (f x) -> x"
> [f coll]
> (reduce #(assoc %1 (f %2) %2) {} coll))
>
> I wonder, is there already such function somewhere in Clojure libraries?

Somewhat shorter:

(defn index-by [f coll]
(zipmap (map f coll) coll))

Ken Wesson

unread,

Feb 3, 2011, 1:22:09 PM2/3/11

to clo...@googlegroups.com

Depending on your "little more", you might want to wrap the queue
implementation I posted a while ago with something like

(defn push-max [q item max]
(let [q (conj q item)]
(if (> (count q) max)
(pop q)
q)))

This will be efficient (in particular, the count function should be
efficient on my queue) without any full traversals (unlike take-last).
The least recent item can be viewed with peek. If you want to view the
most recent item you could use my deque implementation (amortized O(1)
operations at both ends, and O(1) count) which lets you peek both
ends.

But neither will satisfy you if you want to random access in the
middle. If you want to partially traverse from only one end (either
youngest or oldest items) my deque will do: it can be partially
traversed from the left end efficiently with seq, so you'd add new
items at that end if you wanted to partially traverse youngest items,
and at the other if you wanted to partially traverse oldest items.

Of course, if the max length is short it won't matter much; you can
just use a vector and (vec (take-last ...)) and it won't be very
inefficient, nor would using reverse on a queue or deque to traverse
from the other end and last to peek the "difficult" end of a queue.

Alan

unread,

Feb 3, 2011, 2:03:24 PM2/3/11

to Clojure

Or use a fixed-size vector as a ring, with start and end "pointers",
and a couple functions to abstract over the handling thereof. But
unless you really really care about performance, this is likely to be
overkill, and Ken's suggestion to just use a vector or a seq and
accept the O(n) behavior.

On Feb 3, 10:22 am, Ken Wesson <kwess...@gmail.com> wrote:

Ken Wesson

unread,

Feb 3, 2011, 3:57:29 PM2/3/11

to clo...@googlegroups.com

On Thu, Feb 3, 2011 at 2:03 PM, Alan <al...@malloys.org> wrote:
> Or use a fixed-size vector as a ring, with start and end "pointers",
> and a couple functions to abstract over the handling thereof. But
> unless you really really care about performance, this is likely to be
> overkill, and Ken's suggestion to just use a vector or a seq and
> accept the O(n) behavior.

A ring buffer. I considered using one to implement a deque, but
resizing it presented severe technical difficulties so I shelved that
approach. But for a fixed-size one it's perfect.

Below is a persistent immutable ring buffer implementation using
deftype. It behaves as a vector that accepts lpeek, lpop, and lconj
acting on the left as well as peek, pop, and conj acting on the right
and has a maximum capacity. Adding at either end when it is at
capacity causes it to drop an element at the other end. The assoc,
nth, and so forth functions work, but the index of a given element
changes if elements are dropped or added at the left end; assoc
accepts index -1 which adds at the left end, as well as index count
which adds at the right, an extension of how assoc can be used to grow
a vector at the right.

Create one with (ringbuffer capacity item item item ...), or just
(ringbuffer capacity) to get an empty one. The empty function produces
an empty ringbuffer with the same capacity as the argument ringbuffer,
so the capacity of (empty (ringbuffer 5 1 2 3 4 5)) will be 5.

I've tested it somewhat and it seems to work correctly. Let me know if
you use it, and especially if you run into any bugs.

Implementation note: internally there is a vector with a size one
greater than capacity (otherwise, start=end wouldn't distinguish an
empty from a full one!) and cap is one greater than capacity (it's the
modulus used for all the pointer arithmetic). The various "mutating"
methods ensure that removed objects are replaced with nil in the
vector (in particular, the "gap" between the last and first element is
always nil) so that things lost from the ring buffer may be promptly
eligible for garbage collection if nowhere else referenced.

(defprotocol LPop
(lpeek [this])
(lpop [this])
(lconj [this object]))

(defn +mod [a b m]
(mod (+ a b) m))

(defn incmod [a m]
(+mod a 1 m))

(defn -mod [a b m]
(mod (- a b) m))

(defn decmod [a m]
(-mod a 1 m))

(deftype RingBuffer [content cap start end]
clojure.lang.IObj
(withMeta [this m] (RingBuffer. (with-meta content m) cap start end))
(meta [this] (meta content))
clojure.lang.IPersistentVector
(cons [this object]
(let [e (incmod end cap)
s (if (== start e) (incmod start cap) start)]
(RingBuffer.
(assoc (assoc content end object) e nil)
cap s e)))
(count [this] (-mod (+ end cap) start cap))
(empty [this] (RingBuffer. (vec (repeat cap nil)) cap 0 0))
(equiv [this other] (= (seq this) other))
(seq [this]
(when-not (== end start)
(seq
(if (> end start)
(subvec content start end)
(lazy-cat (subvec content start) (subvec content 0 end))))))
(peek [this]
(when-not (== start end)
(nth content (decmod end cap))))
(pop [this]
(if (== start end)
(throw (IllegalStateException. "Can't pop empty ringbuffer"))
(let [e (decmod end cap)]
(RingBuffer.
(assoc content e nil)
cap start e))))
(containsKey [this object]
(and
(integer? object)
(>= object 0)
(< object (count this))))
(assoc [this key val]
(if (integer? key)
(if (= key -1)
(lconj this val)
(if (or (< key 0) (> key (count this)))
(throw (IndexOutOfBoundsException.))
(let [n (+mod key start cap)]
(if (= n end)
(conj this val)
(RingBuffer. (assoc content n val) cap start end)))))
(throw (IllegalArgumentException. "key must be an ingeger"))))
(entryAt [this key]
(if (.containsKey this key)
(first {key (content (+mod key start cap))})))
(valAt [this key]
(.valAt this key nil))
(valAt [this key not-found]
(if (.containsKey this key)
(content (+mod key start cap))
not-found))
(nth [this key]
(if (.containsKey this key)
(.valAt this key)
(throw (IndexOutOfBoundsException.))))
(nth [this key not-found]
(.valAt this key not-found))
(rseq [this]
(if (> end start)
(rseq (subvec content start end))
(lazy-cat (rseq (subvec content 0 end)) (rseq (subvec content start)))))
LPop
(lpeek [this]
(when-not (== start end)
(nth content start)))
(lpop [this]
(if (== start end)
(throw (IllegalStateException. "Can't pop empty ringbuffer"))
(RingBuffer.
(assoc content start nil)
cap (incmod start cap) end)))
(lconj [this object]
(let [s (decmod start cap)
ss (decmod s cap)
e (if (== s end) ss end)]
(RingBuffer.
(assoc (assoc content ss nil) s object)
cap s e)))
clojure.lang.IFn
(invoke [this]
(throw (IllegalArgumentException.
"Wrong number of args (0) passed to: RingBuffer")))
(invoke [this key]
(.nth this key))
(invoke [this key not-found]
(.nth this key not-found))
(invoke [this _ _ _]
(throw (IllegalArgumentException.
"Wrong number of args (3) passed to: RingBuffer")))
(invoke [this _ _ _ _]
(throw (IllegalArgumentException.
"Wrong number of args (4) passed to: RingBuffer")))
(invoke [this _ _ _ _ _]
(throw (IllegalArgumentException.
"Wrong number of args (5) passed to: RingBuffer"))))

(defn ringbuffer [capacity & things]
(assert (> capacity 0))
(let [content (vec (take capacity things))
end (count content)
cap (inc capacity)
nils (repeat (- cap end) nil)]
(RingBuffer. (apply conj content nils) cap 0 end)))

Alan

unread,

Feb 3, 2011, 5:22:43 PM2/3/11

to Clojure

You don't need an extra element in the vector to distinguish between
empty and full. You can store (start, size) instead of (start, end).

On Feb 3, 12:57 pm, Ken Wesson <kwess...@gmail.com> wrote:

Ken Wesson

unread,

Feb 3, 2011, 5:43:31 PM2/3/11

to clo...@googlegroups.com

On Thu, Feb 3, 2011 at 5:22 PM, Alan <al...@malloys.org> wrote:
> You don't need an extra element in the vector to distinguish between
> empty and full. You can store (start, size) instead of (start, end).

I considered that, but it made the arithmetic much messier in various
places. Count gets simpler, obviously, but things like conj now need
to use (start + size + 1) mod capacity and some things get awkward
elsewhere.

My implementation probably saves a few cycles, at a cost of a few
bytes more storage per ring buffer. I suppose either tradeoff is an
acceptable choice for nearly all cases though.

Reply all

Reply to author

Forward