java.lang.OutOfMemoryError: Java heap space with CharPredicate.All

Mark Shirley

Feb 14, 2016, 9:09:19 AM
to parboiled2.org User List
Hi,

I'm just getting familiar with parboiled2, and I encountered what I found to be interesting behavior. Given the following two parsers:

import org.parboiled2._
import org.parboiled2.CharPredicate

class FooParser(val input: ParserInput) extends Parser with StringBuilding {
  def TextNoQuote = rule( CharPredicate.Printable -- '"' )
  def DDQuote = rule( "\"\"" )

  def InputLine = rule {
    '"' ~ clearSB() ~ zeroOrMore((TextNoQuote | DDQuote) ~ appendSB()) ~ '"' ~ push(sb.toString)
  }
}

class BarParser(val input: ParserInput) extends Parser with StringBuilding {
  def TextNoQuote = rule( CharPredicate.All -- '"' )
  def DDQuote = rule( "\"\"" )

  def InputLine = rule {
    '"' ~ clearSB() ~ zeroOrMore((TextNoQuote | DDQuote) ~ appendSB()) ~ '"' ~ push(sb.toString)
  }
}

I see the following output with the input string """:

scala> (new FooParser("\"\"\"")).InputLine.run()
res11: scala.util.Try[String] = Failure(ParseError(Position(3,1,4), Position(3,1,4), <3 traces>))

scala> (new BarParser("\"\"\"")).InputLine.run()
java.lang.OutOfMemoryError: Java heap space
  at java.util.Arrays.copyOf(Arrays.java:3332)
  at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
  at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
  at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:622)
  at java.lang.StringBuilder.append(StringBuilder.java:202)
  at org.parboiled2.StringBuilding$class.appendSB(StringBuilding.scala:30)
  at BarParser.appendSB(<console>:20)
  at BarParser.rec$2(<console>:25)
  at BarParser.InputLine(<console>:24)
  at $anonfun$1.apply(<console>:22)
  at $anonfun$1.apply(<console>:22)
  at org.parboiled2.Parser.runRule$1(Parser.scala:142)
  at org.parboiled2.Parser.phase0_initialRun$1(Parser.scala:150)
  at org.parboiled2.Parser.__run(Parser.scala:203)
  ... 18 elided

It seems like the use of the All predicate (instead of Printable) is causing the OOME. My very shallow understanding of PEG parsers is that memory usage should be proportional to the input size, so what is going on here?

Thanks,
Mark

Mathias Doenitz

Feb 14, 2016, 4:36:58 PM
to parboil...@googlegroups.com
Hi Mark,

the reason that your BarParser dies with an OOME is that `CharPredicate.All` is too big a hammer for your use case.
`CharPredicate.All` really matches ALL characters, including the EOI character that pb2 uses for input positions that are beyond the limits of the underlying input content.
So your BarParser keeps reading characters beyond the end of the input for all eternity, and each iteration appends another character to the StringBuilder until the heap is exhausted.
If you exclude the EOI character from the CharPredicate things will work as expected.
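To make the difference concrete, here is a minimal sketch, assuming the `EOI` constant from the `org.parboiled2` package object is in scope via the wildcard import. The object name `EoiCheck` and the two value names are made up for illustration.

```scala
import org.parboiled2._

// Hypothetical helper object, only here to compare the two predicates.
object EoiCheck {
  // The predicate from BarParser: everything except the double quote.
  // Since All matches every character, this still matches EOI.
  val withEoi = CharPredicate.All -- '"'

  // Same predicate with the EOI sentinel excluded as well.
  val withoutEoi = CharPredicate.All -- '"' -- EOI

  def main(args: Array[String]): Unit = {
    // A CharPredicate is a Char => Boolean, so it can be applied directly.
    println(withEoi(EOI))
    println(withoutEoi(EOI))
  }
}
```

With the second predicate, `zeroOrMore` should stop at the end of the input instead of consuming EOI over and over.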

Another note:
You shouldn't construct CharPredicate instances in your parser rules as this will cause the construction (and allocation) to happen every time the rule is executed!
It's better to move the construction out of the rule body and into the companion object, for example.
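One way this could look for the BarParser from the question is sketched below. The name `TextNoQuoteChars` is an assumption chosen for illustration, and the predicate also excludes `EOI` so that the rule cannot match past the end of the input.

```scala
import org.parboiled2._

object BarParser {
  // Constructed once when the companion object is initialized,
  // not on every execution of the rule.
  // EOI is the sentinel parboiled2 reports beyond the end of the input.
  val TextNoQuoteChars = CharPredicate.All -- '"' -- EOI
}

class BarParser(val input: ParserInput) extends Parser with StringBuilding {
  import BarParser._

  // The rule only references the prebuilt predicate.
  def TextNoQuote = rule( TextNoQuoteChars )
  def DDQuote = rule( "\"\"" )

  def InputLine = rule {
    '"' ~ clearSB() ~ zeroOrMore((TextNoQuote | DDQuote) ~ appendSB()) ~ '"' ~ push(sb.toString)
  }
}
```

With this version, `new BarParser("\"\"\"").InputLine.run()` should come back as a `Failure`, as it does for FooParser, rather than exhausting the heap.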

Cheers,
Mathias

---
mat...@parboiled.org
http://www.parboiled.org

Mark Shirley

Feb 14, 2016, 5:05:07 PM
to parboiled2.org User List
Hi Mathias,

That makes sense; thanks for the explanation. Also, pointing out the CharPredicate construction penalty was useful. I'll keep that in mind.

Thanks for the help!

Mark