String-friendly first/rest?

Surgo

Dec 8, 2010, 2:43:46 PM
to Clojure
To help myself learn Clojure, I figured I would write a pattern
matching / destructuring macro to better look like languages I'm more
familiar with; i.e., destructuring by [first|second|rest] instead of
[first second & rest]. To do this I'm turning the aforementioned
vector into a string (via str) and looking for / replacing the |
character. However, this led to the following issue...

(def test "abc")
(first test)
> \a
(rest test)
> (\b \c)
(string? (rest test))
> false

It would be really helpful if first/rest returned strings (or a
character in the case of first), not lists, when given string input.
Is there a design reason for the current behaviour and, if so, are
there equivalent built-in functions that do the right thing for
strings?

Laurent PETIT

Dec 8, 2010, 5:00:48 PM
to clo...@googlegroups.com
2010/12/8 Surgo <morgon...@gmail.com>

(first "abc") gives you a character.

(rest anything) returns a seq, by definition. It's not about Strings, it's the contract of rest. A String is not a seq, but it's viewable as a seq, in which case each element of the seq will be a character of the String.

Note that this is not particular to String; it applies to almost any Clojure data structure:

(rest [1 2 3])  doesn't return a vector either, but a seq: (2 3)

etc.

You seem to want to not use seq abstractions, but String manipulation abstractions, here.
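
A minimal REPL sketch of the contract described above, assuming a plain Clojure session (the name `s` is only illustrative): `first` hands back an element, while `rest` always hands back a seq, whatever the input type was.

```clojure
(def s "abc")

(seq s)                   ; a seq view of the String: (\a \b \c)
(first s)                 ; \a, a Character
(rest s)                  ; (\b \c), a seq of Characters, not a String
(seq? (rest s))           ; true
(string? (rest s))        ; false

;; The same holds for vectors: the result is a seq, not a vector.
(seq? (rest [1 2 3]))     ; true
(vector? (rest [1 2 3]))  ; false
```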

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Surgo

Dec 8, 2010, 5:05:59 PM
to Clojure
> (rest anything) returns a seq, by definition. It's not about Strings, it's
> the contract of rest. A String is not a seq, but it's viewable as a seq, in
> which case each element of the seq will be a character of the String.
>
> Note that this is not particular to String; it applies to almost any Clojure
> data structure:
>
> (rest [1 2 3])  doesn't return a vector either, but a seq: (2 3)
>
> etc.
>
> You seem to want to not use seq abstractions, but String manipulation
> abstractions, here.

That's a fair criticism. I suppose that I'm not necessarily looking
for specifically String manipulation abstractions (I can just do a
(.substring "abc" 1) to get "bc" as a String after all), but rather
looking for an abstraction that takes something that's addressable as
a sequence and returns it in the same format or type instead of a seq.

Tim Robinson

Dec 8, 2010, 5:11:07 PM
to Clojure
Laurent is right.

Best to use substring:

> (.substring test 1 (count test))
"bc"

Miki

Dec 8, 2010, 5:16:15 PM
to clo...@googlegroups.com
> (.substring test 1 (count test))
"bc"
FYI: Clojure has "subs" -> (subs test 1 (count test))

Meikel Brandmeyer

Dec 8, 2010, 5:18:08 PM
to clo...@googlegroups.com
Hi,

On 08.12.2010 at 23:05, Surgo wrote:

> That's a fair criticism. I suppose that I'm not necessarily looking
> for specifically String manipulation abstractions (I can just do a
> (.substring "abc" 1) to get "bc" as a String after all), but rather
> looking for an abstraction that takes something that's addressable as
> a sequence and returns it in the same format or type instead of a seq.

Namespaces to the rescue:

(ns your.name.space
  (:refer-clojure :exclude (first rest)))

(defprotocol MySeq
  (first [this])
  (rest [this]))

(extend-protocol MySeq
  String
  (first [this] (.charAt this 0))
  (rest [this] (subs this 1))
  Object
  (first [this] (clojure.core/first this))
  (rest [this] (clojure.core/rest this)))

Now use first and rest as normal. Here are some examples:

your.name.space=> (first "abc")
\a
your.name.space=> (rest "abc")
"bc"
your.name.space=> (first [1 2 3])
1
your.name.space=> (rest [1 2 3])
(2 3)

Sincerely
Meikel

Benny Tsai

Dec 8, 2010, 5:22:44 PM
to Clojure
(subs test 1) will work as well; the default behavior is to go to the
end if no end position is specified.
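
A short sketch of the `subs` arities mentioned in this part of the thread, assuming a plain REPL (`s` is just an illustrative name). It also shows one way `subs` differs from `rest` at the edges.

```clojure
(def s "abc")

(subs s 1)              ; "bc": from index 1 to the end
(subs s 1 (count s))    ; "bc": same thing with an explicit end
(subs s 0 1)            ; "a": a one-character String, not the Character \a

;; Edge case: rest tolerates empty input, subs does not.
(rest "")               ; ()
;; (subs "" 1)          ; throws StringIndexOutOfBoundsException
```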

Laurent PETIT

Dec 8, 2010, 5:26:31 PM
to clo...@googlegroups.com
2010/12/8 Surgo <morgon...@gmail.com>

No such builtin that I'm aware of.

There are some "functor" things in clojure contrib, but I don't know them well.

Michael Gardner

Dec 8, 2010, 5:41:28 PM
to clo...@googlegroups.com
On Dec 8, 2010, at 4:05 PM, Surgo wrote:

> That's a fair criticism. I suppose that I'm not necessarily looking
> for specifically String manipulation abstractions (I can just do a
> (.substring "abc" 1) to get "bc" as a String after all), but rather
> looking for an abstraction that takes something that's addressable as
> a sequence and returns it in the same format or type instead of a seq.

So something like an inverse to (seq)? You could write such a thing, though it would have to know about each type (seq) knows about. More importantly, it would have to somehow know what type the thing originally was, since there's no difference between e.g. (seq "abc") and (seq [\a \b \c]). You'd either have to store the type when calling (seq) and manually pass it when calling the inverse function, or else I suppose you could write a wrapper for (seq) that adds metadata about what type the thing originally was.
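
A rough sketch of the first option (passing the original along by hand), not a built-in: `like` is a hypothetical helper name, and it only aims to cover strings and collections for which `empty` plus `into` preserves order, such as vectors.

```clojure
(defn like
  "Hypothetical helper: rebuilds the elements of seq `s` into something
  shaped like `orig`. Strings need a special case because (empty \"abc\")
  is nil; everything else falls back to `into` on an empty copy of `orig`."
  [orig s]
  (if (string? orig)
    (apply str s)
    (into (empty orig) s)))

(like "abc" (rest "abc"))      ; "bc"
(like [1 2 3] (rest [1 2 3]))  ; [2 3]

;; Caveat: for lists, `into` conjoins at the front, so the order is reversed.
(like '(1 2 3) (rest '(1 2 3))) ; (3 2)
```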

Incidentally, while testing this last idea, I was surprised to find that :type metadata is treated specially:

=> (with-meta '() {:type (type [])})
[]

I assume this means :type is used internally by Clojure somehow. I notice clojure.org says that metadata "is used to convey information to the compiler about types", but ought there be a list of "reserved" metadata?
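
For context, a small sketch of how `clojure.core/type` treats this key: it returns the `:type` metadata when present and the class otherwise. Whether printing also consults `:type` (as the output above suggests) may depend on the Clojure version, so that part is left as an assumption; `::tagged` is just an illustrative keyword.

```clojure
(type [])                                ; clojure.lang.PersistentVector
(type (with-meta [] {:type ::tagged}))   ; the keyword ::tagged
(class (with-meta [] {:type ::tagged}))  ; still clojure.lang.PersistentVector
```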

Laurent PETIT

Dec 8, 2010, 5:53:08 PM
to clo...@googlegroups.com
2010/12/8 Michael Gardner <gard...@gmail.com>

> On Dec 8, 2010, at 4:05 PM, Surgo wrote:
>
>> That's a fair criticism. I suppose that I'm not necessarily looking
>> for specifically String manipulation abstractions (I can just do a
>> (.substring "abc" 1) to get "bc" as a String after all), but rather
>> looking for an abstraction that takes something that's addressable as
>> a sequence and returns it in the same format or type instead of a seq.
>
> So something like an inverse to (seq)? You could write such a thing, though it would have to know about each type (seq) knows about. More importantly, it would have to somehow know what type the thing originally was, since there's no difference between e.g. (seq "abc") and (seq [\a \b \c]). You'd either have to store the type when calling (seq) and manually pass it when calling the inverse function, or else I suppose you could write a wrapper for (seq) that adds metadata about what type the thing originally was.


Well, on the contrary, I think it would be a different abstraction from seq. The seq abstraction provides a "view" over things that are seqable but not necessarily data structures: they can be streams, etc. And some data structures know how to present a seq "view" of themselves without attaching any importance to the sequential aspect (the ordering of the elements of the seq).

So this "new" abstraction would cover fewer inputs than seq: only those for which ordering carries information (Strings, vectors, etc.).

Meikel showed the way, though it's different enough in semantics to deserve its own protocol and not override (in fact replace, in his example) existing concepts.

Now I don't (really, I don't) know if there's interest in providing this at this level of generality.
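
As an illustration of what a separately named, type-preserving protocol could look like, here is a minimal sketch. The names `Sliceable`, `head`, and `tail` are hypothetical and chosen so that nothing in clojure.core is shadowed; only Strings and vectors are covered.

```clojure
(defprotocol Sliceable
  "Hypothetical protocol for ordered values that can give back
  their remainder in their own type."
  (head [this] "First element; throws on empty input.")
  (tail [this] "Everything after the first element, same type as the input."))

(extend-protocol Sliceable
  String
  (head [this] (.charAt this 0))
  (tail [this] (subs this 1))
  clojure.lang.IPersistentVector
  (head [this] (nth this 0))
  (tail [this] (subvec this 1)))

(head "abc")     ; \a
(tail "abc")     ; "bc"
(head [1 2 3])   ; 1
(tail [1 2 3])   ; [2 3]
```

Unlike `rest`, both functions throw on empty input in this sketch; a real design would need to decide what an empty remainder means for each type.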
 
> Incidentally, while testing this last idea, I was surprised to find that :type metadata is treated specially:
>
> => (with-meta '() {:type (type [])})
> []
>
> I assume this means :type is used internally by Clojure somehow. I notice clojure.org says that metadata "is used to convey information to the compiler about types", but ought there be a list of "reserved" metadata?

Meikel Brandmeyer

Dec 8, 2010, 6:14:17 PM
to clo...@googlegroups.com
Hi,

On 08.12.2010 at 23:53, Laurent PETIT wrote:

> Meikel showed the way, though it's different enough in semantics to deserve its own protocol and not override (in fact replace, in his example) existing concepts.

Well, this showed up for the second time in two days, so I thought I'd write it up in an email. However, I strongly discourage doing such things. I would strike a project that works like that internally from my dependency list. (Luckily, the effects of such a protocol are limited to opt-in namespaces.)

Listen to Laurent! He is an experienced Clojurian. This is a different thing. Name it differently! Handle it differently!

As for the way Clojure works:

Listen to Rich! He has probably thought more about this than anyone else ever will. If something is not the way you expect it to be, or something is missing, then there is almost surely a reason for it. If you still think that something should be changed, lobby for the change on the mailing list.

Sincerely
Meikel

Phil Hagelberg

Dec 8, 2010, 6:27:47 PM
to clo...@googlegroups.com

On Wed, Dec 8, 2010 at 2:00 PM, Laurent PETIT <lauren...@gmail.com> wrote:
>> (def test "abc")
>> (first test)
>> > \a
>> (rest test)
>> > (\b \c)
>> (string? (rest test))
>> > false
>>
>> It would be really helpful if first/rest returned strings (or a
>> character in the case of first), not lists, when given string input.
>> Is there a design reason for the current behaviour and, if so, are
>> there equivalent built-in functions that do the right thing for
>> strings?
>
> (first "abc") gives you a character.
>
> (rest anything) returns a seq, by definition. It's not about Strings, it's
> the contract of rest. A String is not a seq, but it's viewable as a seq, in
> which case each element of the seq will be a character of the String.

This behaviour would be a lot easier to deal with if into worked with strings.

-Phil
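
For reference, a small sketch of the usual workarounds for turning a seq of characters back into a String, since `into` cannot conj onto one. It assumes clojure.string is available (Clojure 1.2 or later); the alias `string` is arbitrary.

```clojure
(require '[clojure.string :as string])

(apply str (rest "abc"))      ; "bc"
(string/join (rest "abc"))    ; "bc"

;; into relies on conj, and a String is not a persistent collection:
;; (into "" (rest "abc"))     ; throws ClassCastException
```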

Alan

Dec 9, 2010, 12:52:26 PM
to Clojure
clojure.contrib.string has take and drop, which do what you want
(though you have to ask for exactly one character to emulate first/
rest). However, my understanding is that c.c.string is going away in
1.3, and many of its features will be removed rather than moved, so I
don't think you're supposed to use it anymore.

Stuart Sierra

Dec 9, 2010, 4:50:23 PM
to Clojure

On Dec 9, 12:52 pm, Alan <a...@malloys.org> wrote:
> rest). However, my understanding is that c.c.string is going away in
> 1.3, and many of its features will be removed rather than moved, so I

Yes, it is replaced by clojure.string. c.c.string is deprecated in
1.2 and removed in 1.3

-S