(Undocumented) IllegalStateException thrown by uninitialized regex match...bug?

Andrew Phillips

Jan 28, 2014, 7:53:09 AM
to scala-l...@googlegroups.com

The findAllIn method of scala.util.matching.Regex [1] returns a MatchIterator [2] whose implementation is backed by a java.util.regex.Matcher [3, 4]. The latter is documented as throwing an IllegalStateException if queried before a match has been attempted. This behaviour carries through the MatchIterator, but is not documented:

// 2.11.0-M7 REPL
scala> val hello = "hello world"
hello: String = hello world

// declaring mi here prevents the REPL from calling toString on it
scala> println({ val mi = " ".r.findAllIn(hello); mi.start })
java.lang.IllegalStateException: No match available
  at java.util.regex.Matcher.start(Matcher.java:342)
  at scala.util.matching.Regex$MatchIterator.start(Regex.scala:661)
  ... 32 elided

scala> println({ val mi = " ".r.findAllIn(hello); mi.hasNext; mi.start })
5
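
A minimal sketch of a guard that sidesteps the exception by attempting a match with hasNext before asking for the offset. The names MatchStartSketch and firstMatchStart are hypothetical, not part of Regex, and the unguarded call is wrapped in Try because its outcome is only known for the 2.11.0-M7 session above; other versions may behave differently.

```scala
import scala.util.Try
import scala.util.matching.Regex

object MatchStartSketch {
  // Hypothetical helper: attempt the first match via hasNext,
  // and only then ask the iterator for the match offset.
  def firstMatchStart(re: Regex, text: String): Option[Int] = {
    val mi = re.findAllIn(text)
    if (mi.hasNext) Some(mi.start) else None
  }

  def main(args: Array[String]): Unit = {
    val hello = "hello world"
    // Unguarded query: an IllegalStateException on 2.11.0-M7 as reported above;
    // Try captures it so the outcome can be inspected on any version.
    println(Try(" ".r.findAllIn(hello).start))
    // Guarded queries: Some(offset) when there is a match, None otherwise.
    println(firstMatchStart(" ".r, hello))
    println(firstMatchStart("z".r, hello))
  }
}
```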

Is this a bug, incomplete documentation or Just Something You Have To Know? A cursory search didn't turn up anything in JIRA. SI-2460 is related, but doesn't address this case directly.

Thanks!

ap

Simon Schäfer

Jan 28, 2014, 9:03:11 AM
to scala-l...@googlegroups.com

On 01/28/2014 01:53 PM, Andrew Phillips wrote:

> Is this a bug, incomplete documentation or Just Something You Have To Know?
For the moment: something you have to know.
For user experience: a bug.
--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-languag...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Som Snytt

Jan 28, 2014, 1:41:08 PM
to scala-l...@googlegroups.com
In your PR to fix the docs, would you also mind just deleting the badly indented ASCII art class diagram on Regex$?

http://www.scala-lang.org/files/archive/nightly/docs-master/library/#scala.util.matching.Regex$

It's superseded by the "content hierarchy" diagram which, happily, looks exactly the same.

I remember reading the post where Daniel Sobral discovers the initializing effect in the REPL. But trying to google it on SO only gave me an ad for match.com.

I did find these immortal words (on SO, not match.com):

Alas, I have a long story of love and hate towards Scala's Regex. It gives a glimpse of paradise but fails to follow through. If I had settled on a good API, I'd have made a replacement library by now.

http://stackoverflow.com/a/4504146/1296806

Never settle!  You will find true love!

Oh, wait.

https://groups.google.com/d/msg/scala-user/7B6fPJ03Vww/Mzkch0iFviUJ

So it's worth noting that the title is "$!.*% Iterators!" and not f"$k%n Regexes!"

http://scalapuzzlers.com/index.html#pzzlr-025

This throws, too, after all:

scala> js.iterator.remove
java.lang.IllegalStateException
  at java.util.ArrayList$Itr.remove(ArrayList.java:804)
  ... 32 elided

The current docs for findAllIn call MatchIterator:

a special scala.collection.Iterator that returns the matched strings, but can also be converted into a normal iterator that returns objects of type scala.util.matching.Regex.Match that can be queried for data such as the text that precedes the match, subgroups, etc.

And the example is a for-comprehension to collect the strings. That implies: use Match to query match data.
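
A rough sketch of that reading, assuming a made-up key=value pattern and input (MatchDataSketch and its methods are hypothetical names): the for-comprehension collects the strings, while matchData turns the MatchIterator into a plain iterator of Match objects that carry the subgroups and the preceding text.

```scala
object MatchDataSketch {
  // Illustrative pattern only: a word, an equals sign, a word.
  private val kv = "(\\w+)=(\\w+)".r

  // The documented style: iterate the MatchIterator for the matched strings.
  def matchedStrings(text: String): List[String] =
    (for (s <- kv.findAllIn(text)) yield s).toList

  // matchData yields Regex.Match values; group(1) is the first subgroup.
  def keys(text: String): List[String] =
    kv.findAllIn(text).matchData.map(_.group(1)).toList

  // before is the text that precedes each match.
  def precedingText(text: String): List[String] =
    kv.findAllIn(text).matchData.map(_.before.toString).toList
}
```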

Personally, I only ever use findAllMatchIn.
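
Roughly like this, as a sketch (FindAllMatchInSketch and spans are hypothetical names): findAllMatchIn hands back an Iterator of Match values, so the offsets are read from each Match rather than from the iterator itself.

```scala
object FindAllMatchInSketch {
  // Each element is a Regex.Match; start, end and matched are read from it
  // directly, so no state on the iterator needs to be queried.
  def spans(pattern: String, text: String): List[(Int, Int, String)] =
    pattern.r.findAllMatchIn(text).map(m => (m.start, m.end, m.matched)).toList
}
```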

Possibly, we need an annotation, @onlyUseMeIfYouKnowWhatYoureDoing.  That could be a macro that checks for `import language.level.L4`.




Andrew Phillips

Jan 28, 2014, 6:14:28 PM
to scala-l...@googlegroups.com
Would a PR to modify the doc be welcomed here? Should an issue be created for that, too?

> This throws, too, after all:
>
> scala> js.iterator.remove
>   java.lang.IllegalStateException
>      at java.util.ArrayList$Itr.remove(ArrayList.java:804)
>      ... 32 elided

I'm guessing js is a java.util.ArrayList here? In that case, the behaviour is documented, at least [1]:

Throws:
IllegalStateException - if neither next nor previous have been called, or remove or add have been called after the last call to next or previous
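
A small sketch of that documented contract, under the assumption that js really is a java.util.ArrayList (IteratorRemoveSketch and its methods are hypothetical names):

```scala
import java.util.ArrayList
import scala.util.Try

object IteratorRemoveSketch {
  // remove() on a fresh iterator, before any call to next(): captured in Try.
  def removeBeforeNext(): Try[Unit] = {
    val js = new ArrayList[String]()
    js.add("a")
    Try(js.iterator.remove())
  }

  // remove() after next() deletes the element just returned; returns the new size.
  def removeAfterNext(): Int = {
    val js = new ArrayList[String]()
    js.add("a")
    val it = js.iterator
    it.next()
    it.remove()
    js.size
  }
}
```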

> So it's worth noting that the title is "$!.*% Iterators!" and not f"$k%n Regexes!"

Som Snytt

Jan 28, 2014, 7:13:21 PM
to scala-l...@googlegroups.com
"Regular Exasperations"

 Can I use that for a blog title?  I added an ess.




Andrew Phillips

Jan 28, 2014, 9:23:37 PM
to scala-l...@googlegroups.com
> Can I use that for a blog title?  I added an ess.

By all means ;-)

ap

Andrew Phillips

Jan 30, 2014, 12:22:08 AM
to scala-l...@googlegroups.com
> In your PR to fix the docs, would you also mind just deleting the badly indented ASCII art class diagram on Regex$?

Andrew Phillips

Jan 30, 2014, 10:37:03 AM
to scala-l...@googlegroups.com