Join two Maps, handle collisions with logic?

4,200 views
Skip to first unread message

Troy Payne

unread,
Aug 4, 2013, 4:15:49 PM8/4/13
to scala...@googlegroups.com
Hello,

I'd like to join two maps, and add some logic on collision of keys on which to take, is this possible?

Here's an example:

Map(1 -> 0.5, 2 -> 0.0) ++ Map(1 -> 0.0, 2 -> 1.0)

(something to get these results):

Map(1 -> 0.5, 2 -> 1.0)

in this example, I take whichever max of the two values, and that is used when joining the two maps and there is a key collision


Ryan LeCompte

unread,
Aug 4, 2013, 4:28:42 PM8/4/13
to Troy Payne, scala-user

Hi Troy,

You can do something like this:

  def merge[K, V](maps: Seq[Map[K, V]])(f: (K, V, V) => V): Map[K, V] = {
    maps.foldLeft(Map.empty[K, V]) { case (merged, m) =>
      m.foldLeft(merged) { case (acc, (k, v)) =>
        acc.get(k) match {
          case Some(existing) => acc.updated(k, f(k, existing, v))
          case None => acc.updated(k, v)
        }
      }
    }
  }

Then you can use the above merge method with your example like this:

scala> val maps = Seq(Map(1 -> 0.5, 2 -> 0.0), Map(1 -> 0.0, 2 -> 1.0))
maps: Seq[scala.collection.immutable.Map[Int,Double]] = List(Map(1 -> 0.5, 2 -> 0.0), Map(1 -> 0.0, 2 -> 1.0))

scala> merge(maps) { (_, v1, v2) => v1 + v2 }
res2: Map[Int,Double] = Map(1 -> 0.5, 2 -> 1.0)




--
You received this message because you are subscribed to the Google Groups "scala-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-user+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Troy Payne

unread,
Aug 4, 2013, 4:30:30 PM8/4/13
to Ryan LeCompte, scala-user
thanks!

Ryan LeCompte

unread,
Aug 4, 2013, 4:47:03 PM8/4/13
to Troy Payne, scala-user
omg np!

Kevin Scaldeferri

unread,
Aug 4, 2013, 4:57:02 PM8/4/13
to Troy Payne, scala-user
It might be overkill just to implement this one feature, but algebird provides functionality like this though MapMonoid: http://twitter.github.io/algebird/#com.twitter.algebird.MapMonoid.  You'd need to provide max as an implementation of the semigroup operation on Doubles and then the monoid plus on Maps would do just what  you want.


On Sun, Aug 4, 2013 at 1:15 PM, Troy Payne <troy...@gmail.com> wrote:

Erik Osheim

unread,
Aug 4, 2013, 6:28:18 PM8/4/13
to Troy Payne, scala...@googlegroups.com
On Sun, Aug 04, 2013 at 01:15:49PM -0700, Troy Payne wrote:
> I'd like to join two maps, and add some logic on collision of keys on which
> to take, is this possible?

Yes. Several Scala libraries support this, including Spire [1]:

import spire.implicits._

scala> import spire.implicits._
import spire.implicits._

scala> Map(1 -> 2) + Map(3 -> 4, 1 -> 7)
res0: Map[Int,Int] = Map(3 -> 4, 1 -> 9)

This uses an instance of Rng[Map[K, V]], which Spire can automatically
provide for any V that has a Rng[V]. A Rng (pronounced "Rung") is like
a Ring but lacks a multiplicative identity (since we can't easily
produce a multiplicative identity for maps).

Rng also provides many other operators, allowing things like:

scala> Map(1 -> 10) * Map(1 -> 5, 2 -> 4)
res1: Map[Int,Int] = Map(1 -> 50)

scala> -Map(9 -> 99, 10 -> 13)
res2: Map[Int,Int] = Map(9 -> -99, 10 -> -13)

Anyway, hope this helps!

-- Erik

[1] https://github.com/non/spire

Andreas Scheinert

unread,
Aug 5, 2013, 4:31:09 PM8/5/13
to scala...@googlegroups.com
"It might be overkill just to implement this one feature, but algebird provides functionality like this though MapMonoid: http://twitter.github.io/algebird/#com.twitter.algebird.MapMonoid.  You'd need to provide max as an implementation of the semigroup operation on Doubles and then the monoid plus on Maps would do just what  you want."

scalaz already has the required monoids for you defined.

import scalaz._

import Scalaz._

Map(1 -> 0.5, 2 -> 0.0) |+| Map(1 -> 0.0, 2 -> 1.0)
gives:
Map(1 -> 0.5, 2 -> 1.0)

Kevin Scaldeferri

unread,
Aug 5, 2013, 6:09:50 PM8/5/13
to Andreas Scheinert, scala...@googlegroups.com
Well, scalaz might also be overkill for just that one feature...

Btw, you haven't clearly demonstrated the functionality specified by the OP which was taking the max, not the sum, of the values

Alec Zorab

unread,
Aug 6, 2013, 1:59:58 AM8/6/13
to Kevin Scaldeferri, Andreas Scheinert, scala...@googlegroups.com
scala> Map(1 -> 0.5, 2 -> 3.0).mapValues(Tags.MaxVal) |+| Map(1 -> 0.0, 2 -> 1.0).mapValues(Tags.MaxVal)
res3: scala.collection.immutable.Map[Int,scalaz.@@[Double,scalaz.Tags.MaxVal]] = Map(1 -> 0.5, 2 -> 3.0)

vatel

unread,
Aug 6, 2013, 5:39:04 AM8/6/13
to scala...@googlegroups.com
and going further - if we want any function (valInMap1, valInMap2) =>
newVal ?

--
Victor

On 06.08.2013 09:59, Alec Zorab wrote:
> scala> Map(1 -> 0.5, 2 -> 3.0).mapValues(Tags.MaxVal) |+| Map(1 -> 0.0,
> 2 -> 1.0).mapValues(Tags.MaxVal)
> res3:
> scala.collection.immutable.Map[Int,scalaz.@@[Double,scalaz.Tags.MaxVal]]
> = Map(1 -> 0.5, 2 -> 3.0)
>
>
> On 5 August 2013 23:09, Kevin Scaldeferri
> <ke...@scaldeferri.com

Alec Zorab

unread,
Aug 6, 2013, 6:08:31 AM8/6/13
to vatel, scala-user
then you need to be a bit more explicit in your type constaints!

What types are map1 and map2, and what type is the output of your function? What do you expect to see in the map if there is only a map1 or map2 value?


--
You received this message because you are subscribed to the Google Groups "scala-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-user+unsubscribe@googlegroups.com.

Robson Peixoto

unread,
Nov 18, 2013, 6:48:18 AM11/18/13
to scala...@googlegroups.com, Troy Payne


On Sunday, August 4, 2013 5:28:42 PM UTC-3, Ryan LeCompte wrote:

Hi Troy,

You can do something like this:

  def merge[K, V](maps: Seq[Map[K, V]])(f: (K, V, V) => V): Map[K, V] = {
    maps.foldLeft(Map.empty[K, V]) { case (merged, m) =>
      m.foldLeft(merged) { case (acc, (k, v)) =>
        acc.get(k) match {
          case Some(existing) => acc.updated(k, f(k, existing, v))
          case None => acc.updated(k, v)
        }
      }
    }
  }

Then you can use the above merge method with your example like this:

scala> val maps = Seq(Map(1 -> 0.5, 2 -> 0.0), Map(1 -> 0.0, 2 -> 1.0))
maps: Seq[scala.collection.immutable.Map[Int,Double]] = List(Map(1 -> 0.5, 2 -> 0.0), Map(1 -> 0.0, 2 -> 1.0))

scala> merge(maps) { (_, v1, v2) => v1 + v2 }
res2: Map[Int,Double] = Map(1 -> 0.5, 2 -> 1.0)


How can I use it to do something similar like:

Map1 = Map(1 -> Class1(1), 2 -> Class1(2))
Map2 = Map(2 -> Class2(1), 3 -> Class2(2))

After merged.

Merged = Map( 1 -> List(Class1(1)), 2 -> List(Class1(2), Class2(1)), 3 -> Class2(2))

Can be List, Set or any other collection who has size attribute.

Thanks.

Raphael Schweizer

unread,
Nov 18, 2013, 1:01:20 PM11/18/13
to scala...@googlegroups.com, Troy Payne
With val ms = Seq(map1, map2) you could use merge(ms.map(_.mapValues(List(_)))){(_, v1, v2) => v1 ++ v2} or if you are interested in just the sizes: merge(ms.map(_.mapValues(_ => 1))){(_, v1, v2) => v1 + v2} which in turn is the same as: (m1.toSeq ++ m2).groupBy(_._1).mapValues(_.size)

Robson Peixoto

unread,
Nov 18, 2013, 2:04:12 PM11/18/13
to scala...@googlegroups.com, Troy Payne


On Monday, November 18, 2013 3:01:20 PM UTC-3, Raphael Schweizer wrote:
With val ms = Seq(map1, map2) you could use merge(ms.map(_.mapValues(List(_)))){(_, v1, v2) => v1 ++ v2} or if you are interested in just the sizes: merge(ms.map(_.mapValues(_ => 1))){(_, v1, v2) => v1 + v2} which in turn is the same as: (m1.toSeq ++ m2).groupBy(_._1).mapValues(_.size)


Thanks !  =D

Bjorn Regnell

unread,
Nov 18, 2013, 2:52:27 PM11/18/13
to scala...@googlegroups.com, Troy Payne
A first cut on a typeclass that could do what Robson asked for: (Perhaps someone can help with making the resulting type of m3 in the example below more specific)
 
trait BagMerger[K,V] {
  type Bag[K, V] = Map[K, Traversable[V]]
  def mergeBags[K,V](m1: Bag[K, V],m2: Bag[K, V]): Bag[K, V] =
    (m1.keys++m2.keys).map(k => (k,m1.getOrElse(k,Traversable())++m2.getOrElse(k,Traversable()))).toMap
}

object BagMerger {
  implicit class RichMapToSeq[K,V](bag: Map[K,Traversable[V]]) extends BagMerger[K,V]  {
    def join[B >: V](that: Map[K,Traversable[B]]): Map[K,Traversable[B]] = mergeBags(this.bag, that)
  }
}

import BagMerger._

trait BaseClass
case class Class1(x: Int) extends BaseClass
case class Class2(x: Int) extends BaseClass
val m1 = Map(1 -> Set(Class1(1)), 2 -> List(Class1(2)))
val m2 = Map(2 -> Seq(Class2(1)), 3 -> Vector(Class2(2)))
val m3 = m1 join m2

//>m3: Map[Int,Traversable[Product with Serializable with BaseClass]] =
 Map(1 -> Set(Class1(1)), 2 -> List(Class1(2), Class2(1)), 3 -> List(Class2(2)))

/Bjorn

Sky Schulz

unread,
Mar 17, 2015, 2:19:05 AM3/17/15
to scala...@googlegroups.com, troy...@gmail.com
Thank you, Ryan. Worked a peach, for me!
Reply all
Reply to author
Forward
0 new messages