Sorted structures lose their ordering when mapped

738 views
Skip to first unread message

Tobin Yehle

unread,
Nov 28, 2016, 11:14:06 PM11/28/16
to scala-language
Not sure if I'm really posting this in the right place, but:

Calling map on a collection that requires an implicit ordering results in some surprising behavior. For example:

scala> val set = SortedSet(1,2,3)(ord=Ordering.Int.reverse)
set: scala.collection.SortedSet[Int] = TreeSet(3, 2, 1)


scala
> set.firstKey == set.map(identity).firstKey
res0
: Boolean = false

My feeling is that mapping the identity function over any collection should not change anything.

In this case it is clear what is happening. The CanBuildFrom provided by SortedSet is pulling an Ordering instance out of predef (Ordering.Int) instead of using the ordering from the origin collection.
scala> set.toList
res1
: List[Int] = List(3, 2, 1)


scala
> set.map(identity).toList
res2
: List[Int] = List(1, 2, 3)

scala> set.ordering
res3: Ordering[Int] = scala.math.Ordering$$anon$4@3ff3275b

scala> set.map(identity).ordering
res4: Ordering[Int] = scala.math.Ordering$Int$@7fa85a55

I think this can be fixed by providing a more specific CanBuildFrom instance to be used when mapping between two sorted collections that contain the same type.

i.e. Provide a CanBuildFrom[Sorted[K, _], K, Sorted[K, _]] somewhere that has precedence over the more generic usual CanBuildFrom[CC[_], A, CC[A]]

I'm not sure how it could be done, but the static overloading resolution rules make this possible, I think.

Vlad Patryshev

unread,
Nov 29, 2016, 12:34:04 AM11/29/16
to scala-l...@googlegroups.com
It was a pretty nice (counter-)example, but I wonder what is "SortedSet"? And of course the next question will be - how should map be defined on "SortedSet"?

It's not a joke, it's a serious question.

Thanks,
-Vlad

--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-language+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tobin Yehle

unread,
Nov 29, 2016, 1:24:02 AM11/29/16
to scala-l...@googlegroups.com
On the immutable side, I think the only concrete collections that inherit from SortedSet are TreeSet and BitSet. I think the problem also exists in TreeMap, and all the mutable versions of these three structures. Mutable BitSets are probably the largest use case.

Use cases for TreeSet and TreeMap are for situations where a HashSet/Map would be good, but in-order traversals are also needed. They provide Log insertions/deletions/lookups.

My thought is that sorted collections should retain their ordering whenever possible.

scala> val set = SortedSet(1,2,3)(ord=Ordering.Int.reverse)
set: scala.collection.SortedSet[Int] = TreeSet(3, 2, 1)

scala> set.map(identity)
res0: scala.collection.SortedSet[Int] = TreeSet(1, 2, 3)          should be TreeSet(3, 2, 1)?

scala> set.map(_*2)
res1: scala.collection.SortedSet[Int] = TreeSet(2, 4, 6)         should be TreeSet(6, 4, 2)?

and if possible

scala> set.map(_*2)(collection.breakOut) : BitSet
res2: scala.collection.BitSet = BitSet(2, 4, 6)                        should be BitSet(6, 4, 2)?

an impossible case is when the type of the objects in the container changes

scala> set.map("a" * _)
res3: scala.collection.SortedSet[String] = TreeSet(a, aa, aaa)        this seems like correct behavior


To unsubscribe from this group and stop receiving emails from it, send an email to scala-languag...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "scala-language" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scala-language/Q5A1TTEZN4Q/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scala-languag...@googlegroups.com.

Oliver Ruebenacker

unread,
Nov 29, 2016, 10:16:13 AM11/29/16
to scala-l...@googlegroups.com

     Hello,

  The original ordering can only be kept if the new collection has the same type of elements. That means your suggestion would split mapping into two cases - same type or not same type - with different behaviors for each case. That would certainly complicate things.

  For example, what happens when we don't know whether the type is the same due to type parameters? Would we treat them as different?

     Best, Oliver



 

To unsubscribe from this group and stop receiving emails from it, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "scala-language" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scala-language/Q5A1TTEZN4Q/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Oliver Ruebenacker
Senior Software Engineer, Diabetes Portal, Broad Institute

Tobin Yehle

unread,
Nov 29, 2016, 3:23:18 PM11/29/16
to scala-l...@googlegroups.com
I think the least surprising behavior would be to treat them as the same, but I'm not sure I follow you exactly. Won't a CanBuildFrom[Sorted[K, _], K, Sorted[K, _]] work if it is scope?

To unsubscribe from this group and stop receiving emails from it, send an email to scala-languag...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "scala-language" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scala-language/Q5A1TTEZN4Q/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scala-languag...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-languag...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Oliver Ruebenacker
Senior Software Engineer, Diabetes Portal, Broad Institute

--
You received this message because you are subscribed to a topic in the Google Groups "scala-language" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scala-language/Q5A1TTEZN4Q/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scala-languag...@googlegroups.com.

Stephen Compall

unread,
Nov 30, 2016, 8:33:45 AM11/30/16
to scala-l...@googlegroups.com, Tobin Yehle
On November 28, 2016 11:14:06 PM EST, Tobin Yehle <tobin...@gmail.com> wrote:
>Calling map on a collection that requires an implicit ordering results
>in
>some surprising behavior. For example:
>
>scala> val set = SortedSet(1,2,3)(ord=Ordering.Int.reverse)
>set: scala.collection.SortedSet[Int] = TreeSet(3, 2, 1)
>
>
>scala> set.firstKey == set.map(identity).firstKey
>res0: Boolean = false
>
>My feeling is that mapping the identity function over any collection
>should
>not change anything.
>
>In this case it is clear what is happening. The CanBuildFrom provided
>by
>SortedSet is pulling an Ordering instance out of predef (Ordering.Int)
>instead of using the ordering from the origin collection.

Doing so would introduce an identity law for your type at the expense of a composition law, namely, set.map(f).map(g) == set.map(f andThen g). e.g.

f = a => (a, ())
g = a => a._1

For the real Functor map, the type abstraction means that identity implies composition. However, this map function isn't fully polymorphic in element type, so you need more for that implication.

Global typeclass coherence is strong enough to provide the implication, and indeed, if you use "newtypes" like scalaz's Dual tag (whose ordering is reversed from the underlying type), both identity and composition just work. That gives you a SortedSet[Int @@ Dual].

However, this requires the discipline that you do not simply pass in an 'ord' argument as you do above. Nevertheless, for refactorability sanity and a fistful of such "obvious" equalities like the one you desire, I recommend keeping to this "typeclass coherence" discipline.

If you are uncomfortable with typeclass coherence, maybe you will be happy with a path-dependent types approach. http://typelevel.org/blog/2016/11/17/heaps.html

--
Stephen Compall
If anyone in the MSA is online, you should watch this flythrough.

Oliver Ruebenacker

unread,
Nov 30, 2016, 10:09:59 AM11/30/16
to scala-l...@googlegroups.com

     Hello,

  I  mean what if you have, for example

  def mapMySet[A, B](set: SortedSet[A], f: A => B): SortedSet[B] = set.map(f)

  You can't prove that A and B are the same type, so the mapped set can't just take the Ordering[A] from the original set, but you'd have to provide a fresh Ordering[B]. But then some one might call mapMySet with A and B the same.

     Best, Oliver

To unsubscribe from this group and stop receiving emails from it, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "scala-language" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scala-language/Q5A1TTEZN4Q/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Oliver Ruebenacker
Senior Software Engineer, Diabetes Portal, Broad Institute

--
You received this message because you are subscribed to a topic in the Google Groups "scala-language" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scala-language/Q5A1TTEZN4Q/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-language+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Vlad Patryshev

unread,
Nov 30, 2016, 12:00:52 PM11/30/16
to scala-l...@googlegroups.com
Even if it's the same type, how would one deal with

1::2::3::Nil map (_%2) ?

Thanks,
-Vlad

Vlad Patryshev

unread,
Nov 30, 2016, 12:03:08 PM11/30/16
to scala-l...@googlegroups.com, Tobin Yehle
I guess the issue is, which category are we in? Posets, suddenly? Linear orders? Scala is not sophisticated enough to provide suddenly a specialization into such categories, so that map preserves order, or, next time, preserves monoidal operations, or be continuous, or be linear.

Would be nice though. Probably not in this century.

Thanks,
-Vlad

--

Tobin Yehle

unread,
Nov 30, 2016, 4:30:21 PM11/30/16
to Vlad Patryshev, scala-l...@googlegroups.com
Haha, clearly copying the ordering doesn't solve any problems. Thanks for the responses everyone.

Thanks for the examples Stephen, that makes a lot of sense. If not having typeclass coherence causes such problems would it be possible to give a compiler warning if someone like me breaks it? In this case manually specifying an implicit parameter that is being used as a a typeclass? It looks like for all the examples I gave it would be better to just put an implicit reversed order in the local scope. eg:
implicit val revOrd = Ordering.Int.reverse
val set = SortedSet(1,2,3)
set.map(_*2) // TreeSet(6,4,2)
set.map(i => i -> "a"*i)(breakOut):SortedMap[Int, String] // TreeMap(3 -> aaa, 2 -> aa, 1 -> a)

Is this style considered best practice?

To unsubscribe from this group and stop receiving emails from it, send an email to scala-languag...@googlegroups.com.

Stephen Compall

unread,
Dec 15, 2016, 6:52:21 PM12/15/16
to scala-l...@googlegroups.com
On 11/30/16 1:30 PM, Tobin Yehle wrote:
> Thanks for the examples Stephen, that makes a lot of sense. If not
> having typeclass coherence causes such problems would it be possible
> to give a compiler warning if someone like me breaks it?
No, sorry, not with Scala's current design.

> It looks like for all the examples I gave it would be better to just
> put an implicit reversed order in the local scope. eg:
> implicit val revOrd = Ordering.Int.reverse
> val set = SortedSet(1,2,3)
> set.map(_*2) // TreeSet(6,4,2)
> set.map(i => i -> "a"*i)(breakOut):SortedMap[Int, String] // TreeMap(3
> -> aaa, 2 -> aa, 1 -> a)
>
> Is this style considered best practice?
Not really, because you don't get a warning if you forgot the difference.

scala> import scalaz.ISet, scalaz.Tags.Dual, scalaz.Dual._, scalaz.std.anyVal._


I can work with both forwards and backwards sets in the same scope.

scala> val s1 = ISet fromList List(4, 1, 7)
s1: scalaz.ISet[Int] = Bin(4,Bin(1,Tip(),Tip()),Bin(7,Tip(),Tip()))

scala> val s2 = ISet fromList Dual.subst(List(4, 1, 7))
s2: scalaz.ISet[scalaz.@@[Int,scalaz.Tags.Dual]] = Bin(4,Bin(7,Tip(),Tip()),Bin(1,Tip(),Tip()))

scala> s1.toAscList
res2: List[Int] = List(1, 4, 7)

scala> s2.toAscList
res3: List[scalaz.@@[Int,scalaz.Tags.Dual]] = List(7, 4, 1)


And if I forget (the equivalent to "put an implicit reversed order in
the local scope" would be "import Dual._"), I get a compiler error:

scala> import scalaz.ISet, scalaz.Tags.Dual, scalaz.std.anyVal._

scala> val s1 = ISet fromList List(4, 1, 7)
s1: scalaz.ISet[Int] = Bin(4,Bin(1,Tip(),Tip()),Bin(7,Tip(),Tip()))

scala> val s2 = ISet fromList Dual.subst(List(4, 1, 7))
<console>:16: error: could not find implicit value for parameter o: scalaz.Order[scalaz.@@[Int,scalaz.Tags.Dual]]
val s2 = ISet fromList Dual.subst(List(4, 1, 7))
^
Reply all
Reply to author
Forward
0 new messages