toVector in 2.10?

227 views
Skip to first unread message

Bill Venners

unread,
Jun 13, 2012, 12:36:51 PM6/13/12
to scala-i...@googlegroups.com
Hi All,

What's the feeling about adding a toVector in 2.10? You can usually
get one by writing toIndexedSeq, but that's a harder to type, visually
more cluttered, and doesn't for sure get you a Vector.

Bill
----
Bill Venners
Artima, Inc.
http://www.artima.com

Stefan Zeiger

unread,
Jun 13, 2012, 1:05:32 PM6/13/12
to scala-i...@googlegroups.com
On 2012-06-13 18:36, Bill Venners wrote:
> Hi All,
>
> What's the feeling about adding a toVector in 2.10? You can usually
> get one by writing toIndexedSeq, but that's a harder to type, visually
> more cluttered, and doesn't for sure get you a Vector.

How about a generic "to" that can build any collection? I added this to
ScalaQuery a while ago:
https://github.com/slick/slick/blob/d5d9bbe1d20321d3f4d4f28f95bc365b9e8f4eda/src/main/scala/scala/slick/jdbc/Invoker.scala#L72

-sz

√iktor Ҡlang

unread,
Jun 13, 2012, 2:26:29 PM6/13/12
to scala-i...@googlegroups.com
.to[Vector] would be badass
--
Viktor Klang

Akka Tech Lead
Typesafe - The software stack for applications that scale

Twitter: @viktorklang

Chris Hodapp

unread,
Jun 13, 2012, 4:47:29 PM6/13/12
to scala-i...@googlegroups.com
+2. This would also fix the long-standing annoyance in dealing with mixed mutable/nonmutable collections. List(1,2,3).to[mutable.Set] looks WAY better than mutable.Set() ++ List(1,2,3)

Josh Suereth

unread,
Jun 13, 2012, 5:01:10 PM6/13/12
to scala-i...@googlegroups.com

Seems easy enough.  I'll try to make a pull request tomorrow.

Seth Tisue

unread,
Jun 13, 2012, 5:15:10 PM6/13/12
to scala-i...@googlegroups.com
On Wed, Jun 13, 2012 at 5:01 PM, Josh Suereth <joshua....@gmail.com> wrote:
> Seems easy enough

I wonder if Paul has anything to say about that :-)

Even if we didn't get .to[X] this time around, I'd still do cartwheels
in the streets if we just got .toVector.

--
Seth Tisue | Northwestern University | http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/

Matthew Pocock

unread,
Jun 13, 2012, 5:22:19 PM6/13/12
to scala-i...@googlegroups.com
+1 as it drastically reduces the API footprint, is arbitrarily extensible, and with a little bit of thought allows optimized strategies to be provided for pairs of collection types that know each other intimately. I am guessing that this would be mediated by an implicit argument that encapsulates the copying, and that instances of this can be implicitly built if there is a builder in scope for the target type. Pairs of types with a better strategy for copying would provide an implicit instance directly.

Matthew

--
Dr Matthew Pocock
Integrative Bioinformatics Group, School of Computing Science, Newcastle University
skype: matthew.pocock
tel: (0191) 2566550

Josh Suereth

unread,
Jun 13, 2012, 5:24:50 PM6/13/12
to scala-i...@googlegroups.com

Probably won't call it `to` as I'm a bit worried about conflicts.  But I figure I'll make the pull request and see what happens.

Josh Suereth

unread,
Jun 13, 2012, 5:30:21 PM6/13/12
to scala-i...@googlegroups.com

Note:

def to[Col[_]](implicit cbf: CanBuildFrom[Repr, A, Col[A]]) = {
  val b = cbf(repr)
  b ++= this.seq
  b.result
}

Or some such.  Builders are already there for everything.

Not sure to is the best name though as I'm worried about breaking other code.  Anyone know of conflicts?

Stefan Zeiger

unread,
Jun 13, 2012, 5:44:03 PM6/13/12
to scala-i...@googlegroups.com
On 2012-06-13 23:30, Josh Suereth wrote:
>
> Note:
>
> def to[Col[_]](implicit cbf: CanBuildFrom[Repr, A, Col[A]]) = {
> val b = cbf(repr)
> b ++= this.seq
> b.result
> }
>
> Or some such. Builders are already there for everything.
>

Good luck with that. And say hi to my old friend Covariance :)

I really wanted to make it work that way in ScalaQuery but failed. Miles
also had some good suggestions when I was working on it. But in the end
to[Coll] just doesn't work for covariant collections (unless we get
partial inference of multiple type parameters). to[Coll]() on the other
hand (note the parens) does work.

-sz

Matthew Pocock

unread,
Jun 13, 2012, 5:46:49 PM6/13/12
to scala-i...@googlegroups.com
What about the case of e.g. ordered collections where you'd like the ordering to propagate over? I think for that you need a specialised copy implicit.

trait CanCopyBetween[Repr, A, Col] { ... }

implicit def copyWithBuilder(implicit cbf: CanBuildFrom[...

Then for copying between e.g. sorted collections, you can provide a CanCopyBetween instance that propagates that extra info.

Matthew

Josh Suereth

unread,
Jun 13, 2012, 5:52:47 PM6/13/12
to scala-i...@googlegroups.com

Right...  unchecked variance might be in order.... hmm ...

Bill Venners

unread,
Jun 13, 2012, 6:34:09 PM6/13/12
to scala-i...@googlegroups.com
Hi Josh,

Are you thinking of the more general to-like method in addition to toVector? I'd like to see toVector even if we get something like Stefan's to.

Thanks.

Bill

Stefan Zeiger

unread,
Jun 13, 2012, 6:40:43 PM6/13/12
to scala-i...@googlegroups.com
On 2012-06-13 23:52, Josh Suereth wrote:

Right...  unchecked variance might be in order.... hmm ...


I hadn't considered cheating before but it looks promising:

C:\Users\szeiger\code\scala>build\quick\bin\scala
Welcome to Scala version 2.10.0-20120613-112903-1c2d466804 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_02).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val l = List("Kobayashi", "Maru")
l: List[String] = List(Kobayashi, Maru)

scala> l.to[Vector]
res0: Vector[String] = Vector(Kobayashi, Maru)

Code is here: https://github.com/szeiger/scala/tree/topic/traversable-to

-sz

Josh Suereth

unread,
Jun 13, 2012, 6:54:03 PM6/13/12
to scala-i...@googlegroups.com
Right, I just wrote this:

import annotation.unchecked.{ uncheckedVariance => uV }

trait TraversableOnce.... 

  def to[Col[_]](implicit cbf: CanBuildFrom[Repr, A, Col[A @uV]]): Col[A @uV] = {
    val b = cbf(repr)
    b ++= this
    b.result
  }

However, Bill has a point about adding a toVector method.   It can delegate to to[Vector] when needed, but should be more efficient if the collection is already a vector.

Since "to" really does build a new collection, maybe calling it "buildAs[CollectionType]"  would be better?

Josh Suereth

unread,
Jun 13, 2012, 6:55:22 PM6/13/12
to scala-i...@googlegroups.com
Also, you should use the version I have (I believe).   It should actually be faster in the general case.

Daniel Sobral

unread,
Jun 13, 2012, 7:39:38 PM6/13/12
to scala-i...@googlegroups.com
Yes. Vector is the optimal collection in many cases, and having
".toVector" on a Seq do nothing (because it already is a Vector or
ParVector) is very desirable from a performance point of view, imho.
--
Daniel C. Sobral

I travel to the future all the time.

Stefan Zeiger

unread,
Jun 14, 2012, 5:59:37 AM6/14/12
to scala-i...@googlegroups.com
On 2012-06-14 0:54, Josh Suereth wrote:
> Right, I just wrote this:
>
> import annotation.unchecked.{ uncheckedVariance => uV }
>
> trait TraversableOnce....
>
> def to[Col[_]](implicit cbf: CanBuildFrom[Repr, A, Col[A @uV]]):
> Col[A @uV] = {
> val b = cbf(repr)
> b ++= this
> b.result
> }

I'm not so sure about the performance of ++= vs foreach/+= without doing
any profiling. Someone has to do the iteration, so the difference boils
down to making either iterating over the source monomorphic, or
appending to the builder, so hotspot can probably inline one of those
but not the other.

We should also use a sizeHint on the builder (as in the version you sent
to me privately) but that assumes a TraversableLike, not just a
TraversableOnce, so "to" needs to be overridden in TraversableLike.
Judging from the doc comment, it should be safe to call sizeHint from
any collection, not just indexed ones. Or we could override it in
IndexedSeq(Like?) because it's useless for other collections anyway.

Operations similar to "to" are actually defined on GenTraversableOnce,
so we may want to do the same. That begs the question how to implement
it for parallel collections. Can we do anything more useful there than
the current implementation or should this simply be moved down into
GenTraversableOnce?

Even if it's in TraversableOnce, we don't get a Repr type, so we can
only request a CanBuildFrom from Nothing. And we can't just override it
in TraversableLike and change the signature in that way. Does specifying
a Repr buy us anything? I haven't done a comprehensive search yet but so
far I could not see any benefit over taking a CanBuildFrom from Nothing.

> However, Bill has a point about adding a toVector method. It can
> delegate to to[Vector] when needed, but should be more efficient if
> the collection is already a vector.

That seems wrong. toIndexedSeq returns the caller if that's already an
immutable.IndexedSeq, so I think to[IndexedSeq] should do the same. This
could be done with a runtime subtype check on the erasure of the source
and target collection type (at least for the standard collections). We
know the source class but not the target class but we could request an
additional ClassTag for it. However, it may be better to integrate this
functionality into CanBuildFrom, e.g. as

/** Check if the given object is a valid instance of type `To` */
def isToHint(o: Any): Boolean = false

All standard implementations of CanBuildFrom for immutable collections
should override this method with an isInstanceOf check. Custom
implementations could do more elaborate checks if needed (e.g. when
collection types should not be considered assignable due to path
dependence even though the erasure indicates that they are).

-sz

Aleksandar Prokopec

unread,
Jun 14, 2012, 6:08:31 AM6/14/12
to scala-i...@googlegroups.com, Stefan Zeiger
On 6/14/12 11:59 AM, Stefan Zeiger wrote:
> On 2012-06-14 0:54, Josh Suereth wrote:
>> Right, I just wrote this:
>>
>> import annotation.unchecked.{ uncheckedVariance => uV }
>>
>> trait TraversableOnce....
>>
>> def to[Col[_]](implicit cbf: CanBuildFrom[Repr, A, Col[A @uV]]):
>> Col[A @uV] = {
>> val b = cbf(repr)
>> b ++= this
>> b.result
>> }
>
> I'm not so sure about the performance of ++= vs foreach/+= without
> doing any profiling. Someone has to do the iteration, so the
> difference boils down to making either iterating over the source
> monomorphic, or appending to the builder, so hotspot can probably
> inline one of those but not the other.
>

In my experience such differences are usually not noticeable.
But maybe we should double-check.

>
> Operations similar to "to" are actually defined on GenTraversableOnce,
> so we may want to do the same. That begs the question how to implement
> it for parallel collections. Can we do anything more useful there than
> the current implementation or should this simply be moved down into
> GenTraversableOnce?
>

In parallel collections it is possible to parallelize `to`, but only
given that the `cbf` produces a combiner (i.e. a builder for a parallel
collection).
So yes, it's possible to override it there. I already special-case this
in a bunch of places.

> Even if it's in TraversableOnce, we don't get a Repr type, so we can
> only request a CanBuildFrom from Nothing. And we can't just override
> it in TraversableLike and change the signature in that way. Does
> specifying a Repr buy us anything? I haven't done a comprehensive
> search yet but so far I could not see any benefit over taking a
> CanBuildFrom from Nothing.
>

Maybe `to` could be defined in GenTravLike, and only be a decorater
added through an implicit in Once classes.

Josh Suereth

unread,
Jun 14, 2012, 7:05:07 AM6/14/12
to scala-i...@googlegroups.com

A lot of collection methods are optimised with class pattern matches.  IIRC ++= and sizeHint are both done in this way.

Given how often both are used in methods, they're more likely to have hotspots optimisations, but of course we need to profile to be sure.

Also, even in Traversable you need to use Nothing as base type in CanBuildFrom, since effectively you're doing a breakOut.

Rex Kerr

unread,
Jun 14, 2012, 9:02:13 PM6/14/12
to scala-i...@googlegroups.com
I really like short method names.  But since you have asked for conflicts:

(1) I have used "to" in my code in a way that would probably be broken by this (not 100% sure).  I'm happy to change my code--just pointing out that "to" is a common word, and so we should be a bit cautious when claiming it.

(2) What does this do:
  x to y
?  If you assume standard library only, it's an inclusive range of integers.  With to[Coll], it's also possibly an inferred-type call with y as the CBF.  While there should be no conflicts in correct code, this could produce some mildly confusing error messages in incorrect code.

(3) Not sure this is really desirable:
  1 to 3 to [List]()
as you have to re-interpret what "to" means in quick succession.  Such mental acrobatics are possible but usually a bit taxing.

--Rex

Josh Suereth

unread,
Jun 18, 2012, 11:58:42 AM6/18/12
to scala-i...@googlegroups.com
https://github.com/scala/scala/pull/739

I went with "copyInto" and "toVector".   I like "copyInto" since you have to understand that's what it *does*.    The toVector method is exactly copyInto, unless you have a Vector or ParVector.

Please comment on the pull request.  Jason, as his usual awesome self, has already done so.

Rex Kerr

unread,
Jun 18, 2012, 1:18:55 PM6/18/12
to scala-i...@googlegroups.com
That sounds reasonable, except why "Into"?  We don't intoString, intoVector, etc., but it's exactly the same process.

Is copyTo not okay for some reason?

  --Rex

Matthew Pocock

unread,
Jun 18, 2012, 1:52:43 PM6/18/12
to scala-i...@googlegroups.com
On 18 June 2012 18:18, Rex Kerr <ich...@gmail.com> wrote:
That sounds reasonable, except why "Into"?  We don't intoString, intoVector, etc., but it's exactly the same process.

Is copyTo not okay for some reason?

At the danger of sounding very pedantic, copyInto sounds to me as if it copies the elements into a pre-existing collection, where as copyTo doesn't, so I prefer that one.

names.copyInto[Set] // sounds a bit as if you should be supplying the set to copy names into
names.copyTo[Set] // this is better

Matthew

Josh Suereth

unread,
Jun 18, 2012, 1:57:12 PM6/18/12
to scala-i...@googlegroups.com
I moved to copyTo.  Got some complaints, we're now on "build" as the name for the method.  Any objections?

Jeff Olson

unread,
Jun 18, 2012, 2:00:40 PM6/18/12
to scala-i...@googlegroups.com
I also like copyTo better than copyInto. Unfortunately we already have the copyToArray and copyToBuffer methods which are used to copy _into_ existing collections. So now copyToArray and copyTo[Array] would mean different things. Ugh.

What about 'convertTo'?


  --Rex



Matthew

skype: matthew.pocock
tel: (0191) 2566550




--
Dr Matthew Pocock
Integrative Bioinformatics Group, School of Computing Science, Newcastle University
skype: matthew.pocock
tel: (0191) 2566550



  --Rex



Matthew

skype: matthew.pocock
tel: (0191) 2566550




--
Dr Matthew Pocock
Integrative Bioinformatics Group, School of Computing Science, Newcastle University

Kevin Wright

unread,
Jun 18, 2012, 2:06:00 PM6/18/12
to scala-i...@googlegroups.com

How about copyTo, but with an argument that defaults to the appropriate empty collection type - kill two birds with one stone.

Seth Tisue

unread,
Jun 18, 2012, 4:50:28 PM6/18/12
to scala-i...@googlegroups.com
"build" is quite nice...!

(I wasn't happy with "copyTo" for multiple reasons, but I guess that's
a dead horse.)

Heiko Seeberger

unread,
Jun 19, 2012, 10:51:18 AM6/19/12
to scala-i...@googlegroups.com
On Jun 18, 2012, at 10:50 PM, Seth Tisue wrote:

> "build" is quite nice…!

+1

> (I wasn't happy with "copyTo" for multiple reasons,

+1000

Heiko

Josh Suereth

unread,
Jun 19, 2012, 11:57:58 AM6/19/12
to scala-i...@googlegroups.com
We got vetoed.  It's "convertTo" and it's in trunk.   

Simon Ochsenreither

unread,
Jun 19, 2012, 12:00:41 PM6/19/12
to scala-i...@googlegroups.com

> I moved to copyTo.  Got some complaints, we're now on "build" as the name for the method.  

The commit message says "copyInto" and the actual patch says "convertTo".

 
Any objections?

 Yes, frankly I don't think it carries its own weight.

Adding methods has a cost, both for those maintaining it in the future and those who actually have to use it.

This is yet-another method on a collection API with close to 100 methods, with a non-intuitive, non-established name.

I'm quite a bit concerned about the style this got added. Especially Paul (and a lot of others) did a lot of work to keep the API as clean as possible despite its heavy-weight. We should not drop that.

"to" would be great, because it has the same cost on the maintenance side as every other method name, but almost zero cost on the user side, because the pattern is already established.

A user will type "to" and will wait for auto-completion to kick in. Auto-completion will offer toList, toMap, toArray, toStream ... and probably to[DecideYourself]. It is completely obvious what "to" is supposed to do.

Apart from that "to" is already known from and established in third-party libraries for exactly this use-case.

Imho either call it "to" or remove it. Adding yet-another thing to learn is not worth the additional weight.

Bye,

Simon

PS: Sorry if this reads a bit harsh ... I'm just seriously concerned about API quality. Let's not drop the balls here.

Josh Suereth

unread,
Jun 19, 2012, 12:11:56 PM6/19/12
to scala-i...@googlegroups.com
On Tue, Jun 19, 2012 at 12:00 PM, Simon Ochsenreither <simon.och...@googlemail.com> wrote:

> I moved to copyTo.  Got some complaints, we're now on "build" as the name for the method.  

The commit message says "copyInto" and the actual patch says "convertTo".

 
Any objections?

 Yes, frankly I don't think it carries its own weight.

Adding methods has a cost, both for those maintaining it in the future and those who actually have to use it.


yes.  We managed to remove a few custom implementations of building collections patterns so they all rely on a single convertTo method.   I don't see that as a bad thing.

 
This is yet-another method on a collection API with close to 100 methods, with a non-intuitive, non-established name.


I *am* rather concerned about that.  But I have alternate ideas on how to remove the old methods.
 
I'm quite a bit concerned about the style this got added. Especially Paul (and a lot of others) did a lot of work to keep the API as clean as possible despite its heavy-weight. We should not drop that.


I don't think we are.   
 
"to" would be great, because it has the same cost on the maintenance side as every other method name, but almost zero cost on the user side, because the pattern is already established.

 
I don't disagree with you about the convention and "to".  However, I think the names toVector, toBuilder, etc. are somewhat poor.  Specifically, they don't denote the copying that happens in the general case.

A user will type "to" and will wait for auto-completion to kick in. Auto-completion will offer toList, toMap, toArray, toStream ... and probably to[DecideYourself]. It is completely obvious what "to" is supposed to do.

Apart from that "to" is already known from and established in third-party libraries for exactly this use-case.

Imho either call it "to" or remove it. Adding yet-another thing to learn is not worth the additional weight.


IMHO removing all the other "to" methods would be wise.   We don't need them with one good convertTo.   

I'd be fine naming the method "to" and deprecating all the other "toXYZ" methods myself.     They're extra baggage on the API since we know one method could do the work.

I still prefer copyTo/convertTo over just plain old "to" personally, because the name denotes a bit more meaning.

However, the point is that this new method makes the old ones unnecessary.   I think that's a win.   What's missing is deprecations.   What do people think about deprecating the "toXYZ" methods in favor of the generic convertTo?

(Note:  Deprecating all those methods will take a bit of work, but something I'd like to do.   They can all be implemented as convertTo, but that will require some refactoring of other collection methods to ensure we don't have stack overflows and such.)


- Josh




Jason Zaugg

unread,
Jun 19, 2012, 12:29:20 PM6/19/12
to scala-i...@googlegroups.com
On Tue, Jun 19, 2012 at 6:11 PM, Josh Suereth <joshua....@gmail.com> wrote:
> However, the point is that this new method makes the old ones unnecessary.
> I think that's a win.   What's missing is deprecations.   What do people
> think about deprecating the "toXYZ" methods in favor of the generic
> convertTo?

No thanks. I like that these ones need not copy if they already
satisfy the target collection type.

-jason

Chris Hodapp

unread,
Jun 19, 2012, 12:30:25 PM6/19/12
to scala-i...@googlegroups.com
I do note that convertTo and especially copyTo don't have the correct meanings, since in some cases no conversion or copying will take place.

Also, the notion of toX (rather than convertToX) is already established in Java through toString and even more established in Scala through toInt, toDouble, etc.

On Tuesday, June 19, 2012 11:11:56 AM UTC-5, Josh Suereth wrote:

Chris Hodapp

unread,
Jun 19, 2012, 12:34:19 PM6/19/12
to scala-i...@googlegroups.com
Disregard what I just said, it looks like what went into trunk did not do what Stefan suggested.

Paul Phillips

unread,
Jun 19, 2012, 12:39:44 PM6/19/12
to scala-i...@googlegroups.com
Something to consider is that, after I fixed two crashers in extensionmethods, methods f1 and f2 in the following generated bytecode which (under -optimise) is almost identical.  (As a bonus, it doesn't require the use of @uncheckedVariance.) There are pluses and minuses to avoiding inheritance, but for the record, the performance argument against extension methods is becoming a very weak one.

import scala.collection.{ mutable, immutable, generic, GenTraversableOnce }

package object foo {
  @inline implicit class TravOps[A, CC[A] <: GenTraversableOnce[A]](val coll: CC[A]) extends AnyVal {
    def build[CC2[X]](implicit cbf: generic.CanBuildFrom[Nothing, A, CC2[A]]): CC2[A] = {
      cbf() ++= coll.toIterator result
    }
  }
}

package foo {
  object Test {
    def f1[T](xs: Traversable[T]) = xs.convertTo[immutable.Vector]
    def f2[T](xs: Traversable[T]) = xs.build[immutable.Vector]
  }
}


// f1
public <T extends java/lang/Object> scala.collection.immutable.Vector<T> f1(scala.collection.Traversable<T>);
       0: aload_1       
       1: getstatic     #20                 // Field scala/collection/immutable/Vector$.MODULE$:Lscala/collection/immutable/Vector$;
       4: invokevirtual #26                 // Method scala/collection/generic/GenTraversableFactory.ReusableCBF:()Lscala/collection/generic/GenTraversableFactory$GenericCanBuildFrom;
       7: invokeinterface #32,  2           // InterfaceMethod scala/collection/Traversable.convertTo:(Lscala/collection/generic/CanBuildFrom;)Ljava/lang/Object;
      12: checkcast     #34                 // class scala/collection/immutable/Vector
      15: areturn       

// f2
public <T extends java/lang/Object> scala.collection.immutable.Vector<T> f2(scala.collection.Traversable<T>);
       0: getstatic     #46                 // Field foo/package$TravOps$.MODULE$:Lfoo/package$TravOps$;
       3: aload_1       
       4: getstatic     #20                 // Field scala/collection/immutable/Vector$.MODULE$:Lscala/collection/immutable/Vector$;
       7: invokevirtual #26                 // Method scala/collection/generic/GenTraversableFactory.ReusableCBF:()Lscala/collection/generic/GenTraversableFactory$GenericCanBuildFrom;
      10: invokevirtual #50                 // Method foo/package$TravOps$.extension$build:(Lscala/collection/GenTraversableOnce;Lscala/collection/generic/CanBuildFrom;)Ljava/lang/Object;
      13: checkcast     #34                 // class scala/collection/immutable/Vector
      16: areturn       

Simon Ochsenreither

unread,
Jun 19, 2012, 12:51:59 PM6/19/12
to scala-i...@googlegroups.com


 
"to" would be great, because it has the same cost on the maintenance side as every other method name, but almost zero cost on the user side, because the pattern is already established.

 
I don't disagree with you about the convention and "to".  However, I think the names toVector, toBuilder, etc. are somewhat poor.  Specifically, they don't denote the copying that happens in the general case.
 

A user will type "to" and will wait for auto-completion to kick in. Auto-completion will offer toList, toMap, toArray, toStream ... and probably to[DecideYourself]. It is completely obvious what "to" is supposed to do.

Apart from that "to" is already known from and established in third-party libraries for exactly this use-case.

Imho either call it "to" or remove it. Adding yet-another thing to learn is not worth the additional weight.


IMHO removing all the other "to" methods would be wise.   We don't need them with one good convertTo.   

I'd be fine naming the method "to" and deprecating all the other "toXYZ" methods myself.     They're extra baggage on the API since we know one method could do the work.

I still prefer copyTo/convertTo over just plain old "to" personally, because the name denotes a bit more meaning.

However, the point is that this new method makes the old ones unnecessary.   I think that's a win.   What's missing is deprecations.   What do people think about deprecating the "toXYZ" methods in favor of the generic convertTo?

(Note:  Deprecating all those methods will take a bit of work, but something I'd like to do.   They can all be implemented as convertTo, but that will require some refactoring of other collection methods to ensure we don't have stack overflows and such.)


Sorry, this sounds even considerably worse than I first assumed.

  • toX is concise and to the point.
  • It is established and one of the most used names of the API. "to" has the right level of preciseness to don't create any false assumptions about its space consumption.
  • If there were people totally confused whether “to” is copying/sharing the data, we would certainly heard from them until now.
  • Changing/removing it will be a nightmare to everyone targeting more than one Scala version. This will cause issues of the size of the 2.7-2.8 migration, imho.
  • I like the idea of having a few predefined toX methods available to guide new users. I also like the thought of having a smaller API. Not breaking existing stuff wins here, imho. -> Keep it.

I really don't see a single reason why we need to break working code and need to have an inconsistent API. This really feels like some second-system fallacy.

Btw: Afaik toVector wasn't added because there were concerns about creating huge dependencies (for Android). It would make sense to check whether the addition drags an tho whole parallel framework. Currently a proguarded HellowWorld is 170 kb, we should ckeck if this is still the case when adding toVector.

 

Matthew Pocock

unread,
Jun 19, 2012, 1:49:00 PM6/19/12
to scala-i...@googlegroups.com
On 19 June 2012 17:51, Simon Ochsenreither <simon.och...@googlemail.com> wrote:
Btw: Afaik toVector wasn't added because there were concerns about creating huge dependencies (for Android). It would make sense to check whether the addition drags an tho whole parallel framework. Currently a proguarded HellowWorld is 170 kb, we should ckeck if this is still the case when adding toVector.

I would expect that to[] with absent toFoo would result in typically smaller footprints, as each particular collection type would not explicitly mention any other collection type in its signature. Your HelloWorld would not need any collections that it didn't use. Perhaps it is possible for the android code minimisation step to elide @bridge methods that are not called. If so, we could place all the toFoo methods on a pimp with the appropriate @inline and @deprecated annotations to make the source-level transition work for one major release, provide @bridge defs for binary backward-compatibility, and get the best of both worlds.

I still think that .to[Col] should be accepting an implicit CanTo[Rep, A, Col[_]] argument, with two implementation. One is implicit from a builder. The other is implicit from Rep <:< Col[A]. This would remove the logic entirely from the collection type itself, while allowing specific collection pairs to provide optimized implementations without needing to over-ride any collection defs or introduce any explicit dependencies between collections classes that are costly in terms of minimal jar size.

Matthew

--
Dr Matthew Pocock
Integrative Bioinformatics Group, School of Computing Science, Newcastle University

Josh Suereth

unread,
Jun 19, 2012, 1:49:41 PM6/19/12
to scala-i...@googlegroups.com
So.... I am working on an adaption to remove all the "to" methods from collections and put them in type classes.  Perhaps that would solve your concerns? 

If done correctly, the toXYZ methods become extension methods.   That should hopefully reduce bytecode significantly, but I need to test it out.

As far as calling the method "to" I can see what you mean about convention.   It's not too late to change the name.   However, the name "to" still feels short to me.   Perhaps we just do it and see how much breaks in the next milestone/nightly?


- Josh

Stefan Zeiger

unread,
Jun 19, 2012, 2:06:10 PM6/19/12
to scala-i...@googlegroups.com
I can only reiterate my previous statement that I think none of those
methods should copy immutable collections unnecessarily. My vote goes to
calling the method "to", ensuring that it does not copy, then possibly
deprecating the other methods (but only if the new one is really called
"to" and not something much longer)

Jori Jovanovich

unread,
Jun 19, 2012, 2:08:32 PM6/19/12
to scala-i...@googlegroups.com

+1 for "to".

martin odersky

unread,
Jun 19, 2012, 2:12:55 PM6/19/12
to scala-i...@googlegroups.com


On Tue, Jun 19, 2012 at 8:08 PM, Jori Jovanovich <jo...@dimensiology.com> wrote:

+1 for "to".

I think "to" would work as well. I was a bit concerned with overlap with Range#to but it might be not so bad in the end. copyTo is definitely wrong because sometimes things are not copied. convertTo might work better, assuming that you can convert to yourself. But, like Simon says, "to" has the advantage that it fits with existing uses of toSeq, toList, toMap, which will not go away.

Cheers

 - Martin
 



--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

Jeff Olson

unread,
Jun 19, 2012, 2:13:38 PM6/19/12
to scala-i...@googlegroups.com
For the record I much prefer "to" over the alternatives. It is short, sweet, and is already established terminology. If "to" is vetoed I would cast a vote for "convertTo". -Jeff


On Tuesday, June 19, 2012 12:49:41 PM UTC-5, Josh Suereth wrote:
So.... I am working on an adaption to remove all the "to" methods from collections and put them in type classes.  Perhaps that would solve your concerns? 

If done correctly, the toXYZ methods become extension methods.   That should hopefully reduce bytecode significantly, but I need to test it out.

As far as calling the method "to" I can see what you mean about convention.   It's not too late to change the name.   However, the name "to" still feels short to me.   Perhaps we just do it and see how much breaks in the next milestone/nightly?


- Josh

Paul Phillips

unread,
Jun 19, 2012, 2:16:29 PM6/19/12
to scala-i...@googlegroups.com
On Tue, Jun 19, 2012 at 11:06 AM, Stefan Zeiger <sze...@novocode.com> wrote:
I can only reiterate my previous statement that I think none of those methods should copy immutable collections unnecessarily.

If you really want to deal with that, you should try to deal with the below, because one must define "unnecessarily".  As I have mentioned before, I do not think mutable and immutable collections should share implementation.  What possible code can go here which is anything but a latent bug to some subset of subclasses.

// from collection.LinearSeqOptimized

  override /*TraversableLike*/
  def drop(n: Int): Repr = {
    var these: Repr = repr
    var count = n
    while (!these.isEmpty && count > 0) {
      these = these.tail
      count -= 1
    }
    // !!! This line should actually be something like:
    //   newBuilder ++= these result
    // since we are in collection.*, not immutable.*.
    // However making that change will pessimize all the
    // immutable linear seqs (like list) which surely expect
    // drop to share.  (Or at least it would penalize List if
    // it didn't override drop.  It would be a lot better if
    // the leaf collections didn't override so many methods.)
    //
    // Upshot: MutableList is broken and passes part of the
    // original list as the result of drop.
    these
  }
 

Erik Rozendaal

unread,
Jun 19, 2012, 2:24:35 PM6/19/12
to scala-i...@googlegroups.com
Just some random thoughts...

"to" sounds fine to me. However, type inference can make this (and other proposals) look a bit surprising:

val xs: Set[Int] = Vector(1,2,3).to

or

f(Vector(1, 2, 3).to)

Surely not recommended usage, but something to keep in mind if we need to break ties between various name proposals.

And just another data point. Clojure seems to call this "into" (http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/into), as does the proposed JDK8 lambda library extensions: http://cr.openjdk.java.net/~briangoetz/lambda/collections-overview.html. The main difference is that you provide an actual collection, instead of just a type and (implicit) builder. So the closes Scala equivalent currently is "++".

And what about Vector(1, 2, 3).as[List] ? Maybe doesn't clearly imply that copying is a possibility, but may be less used than "to"…

Regards,
Erik

Josh Suereth

unread,
Jun 19, 2012, 2:28:48 PM6/19/12
to scala-i...@googlegroups.com
I don't think copying is that big a deal.

I'm also almost done with the extension method version of the patch.  I had to rejigger Miles collection extension method patch so that it works for Iterator.   I also found a crasher in extension methods.

As an aside, if this works out, We *could* move all the "toXYZ" methods to extensions to save the bytecode.   I'm also less hesitant about an extension method "to" because you can hide it!

Details to follow shortly.


- Josh

Stefan Zeiger

unread,
Jun 19, 2012, 2:31:16 PM6/19/12
to scala-i...@googlegroups.com
On 2012-06-19 20:16, Paul Phillips wrote:

On Tue, Jun 19, 2012 at 11:06 AM, Stefan Zeiger <sze...@novocode.com> wrote:
I can only reiterate my previous statement that I think none of those methods should copy immutable collections unnecessarily.

If you really want to deal with that, you should try to deal with the below, because one must define "unnecessarily".

I'd define it as "source is immutable and target is a subtype of source". Not copying mutable collections is just asking for trouble. (AFAICT there is no toXXX method right now which creates a mutable collection, so the issue does not arise without a generic "to").


 As I have mentioned before, I do not think mutable and immutable collections should share implementation.  What possible code can go here which is anything but a latent bug to some subset of subclasses.

The CanBuildFrom extension I outlined previously should be able to solve it in a concise and robust way. At least I think so, in theory. I have yet to try it in practice (if that's still relevant in a few weeks -- I suppose SLICK will keep me busy for a while)

-sz

Josh Suereth

unread,
Jun 19, 2012, 2:35:15 PM6/19/12
to scala-i...@googlegroups.com
toBuffer creates mutable buffers....

Simon Ochsenreither

unread,
Jun 19, 2012, 2:49:08 PM6/19/12
to scala-i...@googlegroups.com
+1 for “to”. (Kotlin uses “to”, too.) I'm still a bit scared about moving toList, toStream, toArray around, but I will comment on it as soon as Josh has some code to show.

Thanks for all the hard work and effort, I really appreciate it, Josh!

Paul Phillips

unread,
Jun 19, 2012, 2:50:42 PM6/19/12
to scala-i...@googlegroups.com


On Tue, Jun 19, 2012 at 11:31 AM, Stefan Zeiger <sze...@novocode.com> wrote:
I'd define it as "source is immutable and target is a subtype of source". Not copying mutable collections is just asking for trouble. (AFAICT there is no toXXX method right now which creates a mutable collection, so the issue does not arise without a generic "to").

I wrote some response to this, which I eventually realized is not super relevant because


The CanBuildFrom extension I outlined previously should be able to solve it in a concise and robust way.

Yes, that has the necessary quality of not depending on the statically known type of the source to determine the necessity of copying.

Daniel Sobral

unread,
Jun 19, 2012, 3:43:01 PM6/19/12
to scala-i...@googlegroups.com
On Tue, Jun 19, 2012 at 1:11 PM, Josh Suereth <joshua....@gmail.com> wrote:
>
> However, the point is that this new method makes the old ones unnecessary.
> I think that's a win.   What's missing is deprecations.   What do people
> think about deprecating the "toXYZ" methods in favor of the generic
> convertTo?

Does convertTo forsake copying if the actual type is assignable to the
target type and they are immutable?

--
Daniel C. Sobral

I travel to the future all the time.

Daniel Sobral

unread,
Jun 19, 2012, 3:46:23 PM6/19/12
to scala-i...@googlegroups.com
On Tue, Jun 19, 2012 at 2:49 PM, Josh Suereth <joshua....@gmail.com> wrote:
> So.... I am working on an adaption to remove all the "to" methods from
> collections and put them in type classes.  Perhaps that would solve your
> concerns?
>
> If done correctly, the toXYZ methods become extension methods.   That should
> hopefully reduce bytecode significantly, but I need to test it out.
>
> As far as calling the method "to" I can see what you mean about convention.
>   It's not too late to change the name.   However, the name "to" still feels
> short to me.   Perhaps we just do it and see how much breaks in the next
> milestone/nightly?

"to[T]" seems perfect to me. I thought it was discarded on the grounds
of compatibility with existing libraries?

Josh Suereth

unread,
Jun 19, 2012, 4:00:12 PM6/19/12
to scala-i...@googlegroups.com
Ok, so I'm going to see if I can enlist any help here.  What I'm trying to do is *move* all the "to" methods into extension methods.


Action Items
------------------
1. [ done ] Refactoring of Miles' generic collection extension method work for Iterators.
2. [ done ] Initial cut at collection conversions
3. [ TODO ] Stefan's idea to adapt CanBuildFrom so with a new method that lets us convert collections without copying.
4. [ TODO ] Remove all to* methods from GenTraversableOnce and make sure we really can reduce the API.
5. [ TODO ] Fix typer bug with AnyVal classes and type parameters.
6. [ TODO ] Polish + Documentation

If you want to hit up an item, let me know and I'll work with you.

I'm trying out Item #4 and running into some really fun issues with implicit lookup of CanBuildFrom in generic locations....
We may need to keep one of the "to" methods on GenTraversableOnce, but I feel confident we can remove most of the bytecode bloat.

#3 is very important.  If we can't continue to have efficient `toXYZ` methods, then it's probably not worth the extension method route.


In any case, I feel this could be a good experiment to see how viable moving to an extension method route is for us in other areas.   There's a balance here, but if we can isolate ideal behavior to remove from bloating our bytecode, that would be ideal.  Assuming this does, in fact, reduce bytecode size....

Ismael Juma

unread,
Jun 19, 2012, 4:19:53 PM6/19/12
to scala-i...@googlegroups.com

On Jun 19, 2012 9:00 PM, "Josh Suereth" <joshua....@gmail.com> wrote:
> Ok, so I'm going to see if I can enlist any help here.  What I'm trying to do is *move* all the "to" methods into extension methods.

I'm interested what effect this change has on compilation time (if any).

Best,
Ismael

Josh Suereth

unread,
Jun 19, 2012, 4:26:19 PM6/19/12
to scala-i...@googlegroups.com

Yeah. Won't know until there is a working patch

Josh Suereth

unread,
Jun 19, 2012, 5:18:38 PM6/19/12
to scala-i...@googlegroups.com

Just to clarify:  I think this is the perfect use case to test using real extension methods for both performance (compile/run time) and versatility.

If it all works out, it could drastically help us in maintaining an ABI and still fixing issues...

Josh Suereth

unread,
Jun 20, 2012, 9:08:27 AM6/20/12
to scala-i...@googlegroups.com
Ok, so thinking back, I think there are rather compelling reasons to name the method "to".  So can we all agree to try "to"?    I can change what's in trunk to use to then, and we'll see if it breaks any third-party code during the next milestone/RC whatever happens first.

Also, note:  This generic "to" method always copies unless we do some kind of work with CanBuildFrom, like Stefan suggests.  (Which is a good idea).


I'm still experimenting with having all the collection conversion methods defined as extension methods.   I think we need to get  handle on the full pros/cons of extension methods (and a corollary, using type-classes).    I know that in general they will give us easier solutions for binary compatibility, but run-time/compile-time performance are concerns.   There's also the question of @deprecation/@migration with a migration of this nature.   We could remain source compatible at the cost of people not knowing where methods are defined.


- Josh

√iktor Ҡlang

unread,
Jun 20, 2012, 9:21:11 AM6/20/12
to scala-i...@googlegroups.com
On Wed, Jun 20, 2012 at 3:08 PM, Josh Suereth <joshua....@gmail.com> wrote:
Ok, so thinking back, I think there are rather compelling reasons to name the method "to".  So can we all agree to try "to"?    I can change what's in trunk to use to then, and we'll see if it breaks any third-party code during the next milestone/RC whatever happens first.

Also, note:  This generic "to" method always copies unless we do some kind of work with CanBuildFrom, like Stefan suggests.  (Which is a good idea).

Yeah, we should be able to look in the From erasure to determine if we are indeed isAssignableFrom that and just pass ourselves over.

+1 for "to"



--
Viktor Klang

Akka Tech Lead
Typesafe - The software stack for applications that scale

Twitter: @viktorklang

Reply all
Reply to author
Forward
0 new messages