What about pattern matching on sets?

1,157 views
Skip to first unread message

Léonard Schneider

unread,
Jun 14, 2012, 11:59:19 AM6/14/12
to scala...@googlegroups.com
Hi,

Pattern matching is very powerful. I naively thought it was available on sets, but it "only" works on indexed sequences, such as lists. Unfortunately extractors can not make a workaround.

Some simple examples.

What we have now with lists:

val l = List(1, 2, 3, 4)
l match {
  case List(1, 2, 3, 4) => "The list I want"
  case List(4, 3, 2, 1) => "This is different"
}

What I would like with Sets:

val s = Set(1, 2, 3, 4)
s match {
  case Set(4, 3, 2, 1) => "Match !"
  case Set(4, _, _, _) => "Match"
  case Set(4, 3) => "No match"
}

Of course, this should also work on sub-patterns.

Anybody else interested? What do you think?

BR,


Leo

Meredith Gregory

unread,
Jun 14, 2012, 1:48:49 PM6/14/12
to Léonard Schneider, scala...@googlegroups.com
Dear Leonard,

It's certainly possible -- it's just more computationally costly, because Set's are unordered. In particular, for the compiler to issue a warning that

s match {
   case Set( 1, 2, 3, 4 ) => ...
   case Set( 2, 1, 3, 4 ) => ...
   ...
}

has overlapping patterns puts a certain price on compilation time.

Best wishes,

--greg
--
L.G. Meredith
Managing Partner
Biosimilarity LLC
7329 39th Ave SW

√iktor Ҡlang

unread,
Jun 14, 2012, 2:15:51 PM6/14/12
to Meredith Gregory, scala-user, Léonard Schneider

Why would that matter?

val x = 2
x match {
  case 2 =>
  case smth =>
}

Cheers,
V

Razvan Cojocaru

unread,
Jun 14, 2012, 4:26:19 PM6/14/12
to √iktor Ҡlang, Meredith Gregory, scala-user, Léonard Schneider

scala> val x=2

x: Int = 2

 

scala> x match {

     | case 2 => "haha"

     | case 2 => "hehe"

     | case x => x.toString

     | }

<console>:11: error: unreachable code

              case 2 => "hehe"

                        ^

 

scala>

√iktor Ҡlang

unread,
Jun 14, 2012, 4:45:43 PM6/14/12
to Razvan Cojocaru, Meredith Gregory, scala-user, Léonard Schneider
On Thu, Jun 14, 2012 at 10:26 PM, Razvan Cojocaru <p...@razie.com> wrote:

scala> val x=2

x: Int = 2

 

scala> x match {

     | case 2 => "haha"

     | case 2 => "hehe"

     | case x => x.toString

     | }

<console>:11: error: unreachable code

              case 2 => "hehe"

                        ^


Technically x would be 2.type and all other cases but the first is unreachable, this is statically verifiable. Where do you draw the line?

Cheers,



--
Viktor Klang

Akka Tech Lead
Typesafe - The software stack for applications that scale

Twitter: @viktorklang

Meredith Gregory

unread,
Jun 14, 2012, 5:14:28 PM6/14/12
to √iktor Ҡlang, Razvan Cojocaru, scala-user, Léonard Schneider
Dear Viktor,

object JustInCase {
  def caseTheJoint( joint : Int ) : Unit = {
    joint match {
      case 1234 => println( "yo!" )
      case 1234 => println( "oy!" )
      case _ => println( "oyoyo!" )
    }
  }
}

produces the error below from the compiler. This is an example that matches the essential characteristics of this code

object CaseInPoint {
  def whatsThePoint( point : Set[Int] ) : Unit = {
    point match {
      case Set( 1, 2, 3, 4 ) => println( "yo!" )
      case Set( 2, 1, 3, 4 )  => println( "oy!" )
      case _ => println( "oyoyo!" )
    }
  }
}

in that Set( 1, 2, 3, 4 ) and Set( 2, 1, 3, 4 ) not only constants, but the same constant. So, if we don't want the compiler to complain in this case, we don't want the compiler to complain in the case above. In your proposed example the second branch of the case creates a binding and so is really distinguishable from the first. That's why your example isn't an analog of the issue i was raising.

Personally, i very rarely make the error above; so, the aid from the compiler is not buying me much. However, as data types get more complex and we support more and more complex notions of equality, i could see that it might be useful to have help from the compiler. In that case, however, it would take the compiler longer and longer to deliver the analysis. So, you might be right that it doesn't actually matter all that much.

Best wishes,

--greg

[ERROR] /Users/lgm/work/src/projex/stellar/mdp4tw/Agent-Service-ATI/AgentServicesStore/src/main/scala/com/protegra/agentservicesstore/caseout.scala:13: error: unreachable code
[INFO]       case 1234 => println( "oy!" )
[INFO]                           ^
[ERROR] one error found

√iktor Ҡlang

unread,
Jun 14, 2012, 5:54:02 PM6/14/12
to Meredith Gregory, Razvan Cojocaru, scala-user, Léonard Schneider
On Thu, Jun 14, 2012 at 11:14 PM, Meredith Gregory <lgreg.m...@gmail.com> wrote:
Dear Viktor,

object JustInCase {
  def caseTheJoint( joint : Int ) : Unit = {
    joint match {
      case 1234 => println( "yo!" )
      case 1234 => println( "oy!" )
      case _ => println( "oyoyo!" )
    }
  }
}

produces the error below from the compiler. This is an example that matches the essential characteristics of this code

object CaseInPoint {
  def whatsThePoint( point : Set[Int] ) : Unit = {
    point match {
      case Set( 1, 2, 3, 4 ) => println( "yo!" )
      case Set( 2, 1, 3, 4 )  => println( "oy!" )
      case _ => println( "oyoyo!" )
    }
  }
}

in that Set( 1, 2, 3, 4 ) and Set( 2, 1, 3, 4 ) not only constants, but the same constant.

The defense presents exhibit B:

scala> object JustInCase {
     |   def caseTheJoint( joint : Int ) : Unit = {
     |     val SameSameButDifferent = 1234
     |     joint match {
     |       case 1234 => println( "yo!" )
     |       case SameSameButDifferent => println( "oy!" )
     |       case _ => println( "oyoyo!" )
     |     }
     |   }
     | }
defined module JustInCase
 
The Defense rests. ;-)

Cheers,

Lars Hupel

unread,
Jun 14, 2012, 6:06:29 PM6/14/12
to scala...@googlegroups.com
I'm not sure whether I understood the question correctly, but here are
my thoughts:

> It's certainly possible -- it's just more computationally costly, because
> Set's are unordered. In particular, for the compiler to issue a warning that
>
> s match {
> case Set( 1, 2, 3, 4 ) => ...
> case Set( 2, 1, 3, 4 ) => ...
> ...
> }

The current situation is that such an extractor method for sets would be
`unapplySeq`, which takes just the set and returns an
`Option[Seq[Int]]`. This represents *at most one* sequence. The compiler
does not reorder that sequence in any way, so this code snippet does not
actually contain overlapping patterns.

Thus, I see a far more fundamental problem: How would you change or
extend the notion of extractors to allow reordering? Perhaps with an
`unapplySet` which returns `Option[Set[A]]`. Another idea would be to
introduce `Set.unapplySeq[A: Ordering]`, although I'm not sure whether
this would be even possible.

Razvan Cojocaru

unread,
Jun 14, 2012, 8:43:42 PM6/14/12
to √iktor Ҡlang, Meredith Gregory, scala-user, Léonard Schneider
Greg - I think he's cheating... How do we flank him?

;)

Thanks,
Razvan

Meredith Gregory

unread,
Jun 14, 2012, 9:59:26 PM6/14/12
to √iktor Ҡlang, Razvan Cojocaru, scala-user, Léonard Schneider
Dear Viktor,

You're correct! The REPL doesn't complain about your code (or mine). This is another count against the REPL, imho. However, if you compile it, you will see the trace (or something equivalent) below.

Best wishes,

--greg

[ERROR] /Users/lgm/work/src/projex/stellar/mdp4tw/Agent-Service-ATI/AgentServicesStore/src/main/scala/com/protegra/agentservicesstore/caseout.scala:14: error: unreachable code
[INFO]       case SameSameButDifferent => println( "oy!" )
[INFO]                                           ^
[ERROR] one error found

Meredith Gregory

unread,
Jun 15, 2012, 12:13:59 AM6/15/12
to Razvan Cojocaru, √iktor Ҡlang, scala-user, Léonard Schneider
Dear Razie,

i think that the REPL is flanking us! ;-) 

More seriously, the divergence between the REPL experience and the compiler experience is reaching a pretty interesting width.

Best wishes,

--greg

Daniel Sobral

unread,
Jun 15, 2012, 12:51:18 PM6/15/12
to Meredith Gregory, Razvan Cojocaru, √iktor Ҡlang, scala-user, Léonard Schneider
Personally, I don't see what the problem is. A sufficiently advanced
version of REPL does all you want. ;-)

On Fri, Jun 15, 2012 at 1:13 AM, Meredith Gregory
--
Daniel C. Sobral

I travel to the future all the time.

Meredith Gregory

unread,
Jun 16, 2012, 3:14:09 PM6/16/12
to Daniel Sobral, Razvan Cojocaru, √iktor Ҡlang, scala-user, Léonard Schneider
Dear All,

PaulP noticed that the revisions between the REPL and compiler i was using were different. So, we got different behaviors between the two. Thanks for noticing, Paul! My mistake, entirely.

The desired behavior is for the compiler to complain about unreachable branches -- when it can figure it out. Currently, as i understand it, 2.9.1 and above cannot see that -- for i : Int and some Int constant, K,

i match {
  case K => ...
  case K => ...
  ...
}

the second branch is unreachable. Again, i've got no opinions about this one way or the other. The only thing i was pointing out was that for s : Set[Int], S a presentation of a constant in Set[Int] and S' another presentation of the same constant in Set[Int], we have 

s match {
   case S => ...
   case S' => ...
}

that the second case is unreachable -- provided that the extraction is based on Set-equality.

Whether the compiler can determine that or not is -- effectively -- up to the compiler. However, it should be noted that the time to check that will grow very, very fast.

Best wishes,

--greg
Reply all
Reply to author
Forward
0 new messages