awesome
/ˈɔːs(ə)m/
adjective
--
You received this message because you are subscribed to the Google Groups "scala-internals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-interna...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Wow, impressive!
Now it's much easier to have a look at how much impact certain changes might have (like my favorite one, making anonymous partial function stuff less painful).
Sure, I'll have a look! I'd be happy to experiment with it in Typelevel, it's just that I'm currently extremely busy with other stuff I finally want to get done.
--
You received this message because you are subscribed to the Google Groups "Typelevel Development List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to typelevel+...@googlegroups.com.
To post to this group, send email to type...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/typelevel/d9111788-7c1c-4c28-8b47-933c4ba5f361%40googlegroups.com.
> How hard would it be to modify the parser so that it would also support parsing in quasiquote mode?
It already does:
Unit test that only passes because we're parsing string interpolations correctly:
Not that hard at all ^_^
Another day, another blessing. After a few hours' effort, I have XML support working as of diff a0c39cdb0639a64c3277c8c0b269dd4372892331, at a cost of 120 LOC. Since most projects don't use much XML, I threw https://github.com/lift/framework into the mix of parsed projects and made everything pass.
> We don't know what q"foo" means until we typecheck `q`, I don't see how we can eagerly parse the contents of the strings, unless it is some sort of "optimistic parsing" that can be rolled back later.
That's not the parser's problem. That's the responsibility of whoever's manipulating the AST later. The parser should just care about the $ and ${} and $$ escapes and treat them appropriately. And that's what it does!
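To make the "$ and ${} and $$" point concrete, here is a minimal sketch of splitting the body of an interpolated string into literal text and holes without knowing anything about the interpolator. It is not scala-parser's actual code; `InterpolationSketch`, `Lit`, `Hole` and `split` are names made up for illustration, and string literals containing braces inside `${...}` are deliberately not handled.

```scala
object InterpolationSketch {
  sealed trait Part
  final case class Lit(text: String) extends Part  // plain text between holes
  final case class Hole(code: String) extends Part // unparsed source of a $x or ${...} hole

  /** Splits the body of an interpolated string into literals and holes. */
  def split(body: String): List[Part] = {
    val parts = List.newBuilder[Part]
    val lit = new java.lang.StringBuilder
    def flush(): Unit =
      if (lit.length > 0) { parts += Lit(lit.toString); lit.setLength(0) }

    var i = 0
    while (i < body.length) {
      val next = if (i + 1 < body.length) body.charAt(i + 1) else '\u0000'
      if (body.charAt(i) != '$') {
        lit.append(body.charAt(i)); i += 1
      } else if (next == '$') {
        lit.append('$'); i += 2 // $$ is an escaped dollar sign
      } else if (next == '{') {
        // ${...}: scan to the matching brace, counting nested braces
        var depth = 1
        var j = i + 2
        while (j < body.length && depth > 0) {
          if (body.charAt(j) == '{') depth += 1
          else if (body.charAt(j) == '}') depth -= 1
          j += 1
        }
        require(depth == 0, "unclosed hole starting at offset " + i)
        flush()
        parts += Hole(body.substring(i + 2, j - 1))
        i = j
      } else {
        // $ident: a simplified identifier scan (letters, digits, underscore)
        var j = i + 1
        while (j < body.length &&
               (Character.isLetterOrDigit(body.charAt(j)) || body.charAt(j) == '_')) j += 1
        require(j > i + 1, "stray dollar sign at offset " + i)
        flush()
        parts += Hole(body.substring(i + 1, j))
        i = j
      }
    }
    flush()
    parts.result()
  }
}
```

Whatever later decides what `q` or `s` means can then parse each `Hole` as an expression; the splitting itself needs no type information.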
Couldn't we just not throw away the information that makes them different? There's no reason they should be identical ASTs. Looking at the token stream seems like a silly workaround. For example the body of the class could be an Option.
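As a rough illustration of the "body could be an Option" idea, the sketch below keeps a class with no body and a class with an empty body as different trees. The names (`ClassBodySketch`, `ClassDef`, `show`) are hypothetical and statements are represented as plain strings to keep it short; this is not the AST of scalac or scala-parser.

```scala
object ClassBodySketch {
  // body = None        -> no braces were written:  class C
  // body = Some(Nil)   -> empty braces were written: class C {}
  final case class ClassDef(name: String, body: Option[List[String]])

  /** Prints the tree back out, preserving whether braces were present. */
  def show(c: ClassDef): String = c.body match {
    case None        => "class " + c.name
    case Some(Nil)   => "class " + c.name + " {}"
    case Some(stats) => "class " + c.name + stats.mkString(" { ", "; ", " }")
  }
}
```

With this shape no token-stream lookup is needed to round-trip the two spellings.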
@denys with the range positions and the source code, you can do all the postprocessing you want to find the individual tokens in an AST node. The parser could provide them, but it doesn't seem necessary if someone else can do it in a separate phase.
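A small sketch of the postprocessing approach described here, assuming each AST node carries a start and end offset into the original source. `RangePositionSketch`, `Node`, `textOf` and `hasExplicitBody` are illustrative names only, and the brace check is a crude stand-in for re-lexing the slice.

```scala
object RangePositionSketch {
  // A node that only remembers which slice of the source it covers.
  final case class Node(label: String, start: Int, end: Int)

  /** Recovers the exact source text a node was parsed from. */
  def textOf(source: String, node: Node): String =
    source.substring(node.start, node.end)

  /** Separate-phase check: did this class definition end with a braced body? */
  def hasExplicitBody(source: String, classNode: Node): Boolean =
    textOf(source, classNode).trim.endsWith("}")
}
```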
We could also just collect this information as part of the initial AST if so desired. Remember, this parser is actually sane! This parser isn't a 7000LOC beast full of mutable state which you don't dare touch because you're scared of breaking things. We can modify it in a few hours! If we're going to be using it for something, we should stop thinking about "working around and reconstructing" and start thinking about "making it parse what we want" in the first place.
I don't fully understand your other request - parsing ".." and "$foo" - but here are my two interpretations:
- You want to hardcode support for quasiquotes q"..." and q"""...""" directly into the main scala-parser, so you can parse the whole thing in one go and not have to do two-phase parsing. Easy! But you shouldn't do it! Macros do macro things and the parser does parser things, and never shall they meet.
- You want to have a fork of the parser that has additional syntactic rules that will be used to parse the stuff inside q"..." and q"""...""" as a second phase. Also easy! This is something I can get behind.
Btw: As a developer I would very much prefer a compiler that is slower on initial parse but does speedy incremental updates. So the last x% performance that a hand written parser can give you might not even be an issue if we can get reactive compilation in exchange.
Since everyone is talking about performance, does anyone know how much of the time in the compiler and macros is spent parsing? My impression is that the vast vast vast majority of the time is spent in the typechecker, so much so that even if parsing took 2x as long it'd be no big deal anyway.
On Mon, Dec 1, 2014 at 3:41 PM, Martin Mauch <martin...@gmail.com> wrote:
> Btw: As a developer I would very much prefer a compiler that is slower on initial parse but does speedy incremental updates. So the last x% performance that a hand written parser can give you might not even be an issue if we can get reactive compilation in exchange.
> That said, the current dotty parser (hand-written, 2100 lines including error reporting, accurate positions and tree construction) achieves several hundred thousand lines a second. So parboiled still has some way to go to beat that :-)
Can you run the dotty parser against the projects I used to benchmark scala-parser? I'm only reaching 50kLOC a second so I'm wondering what you're doing that's 10x faster, or whether the subset of things you're parsing are just easier to parse =P
For reference, here's the link:
> That said, the current dotty parser (hand-written, 2100 lines including error reporting, accurate positions and tree construction) achieves several hundred thousand lines a second. So parboiled still has some way to go to beat that :-)
> Can you run the dotty parser against the projects I used to benchmark scala-parser? I'm only reaching 50kLOC a second so I'm wondering what you're doing that's 10x faster, or whether the subset of things you're parsing are just easier to parse =P
FWIW, the Scala compiler runs at about 0.5-0.05kLOC a second, so the parsing portion of the compiler is 1%-0.1% of the compilation pipeline.
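The 1%-0.1% figure follows from dividing the two throughputs, assuming the compiler's parsing phase runs at roughly the 50 kLOC/s measured for scala-parser (an assumption, since scalac's own parser was not measured here). A small sketch of that arithmetic, with `ParseShareSketch` and `parseShare` as made-up names:

```scala
object ParseShareSketch {
  /** Fraction of total compile time spent parsing.
    * Per line, parsing takes 1/parserLocPerSec seconds and the whole
    * pipeline takes 1/compilerLocPerSec seconds, so the share is their ratio. */
  def parseShare(parserLocPerSec: Double, compilerLocPerSec: Double): Double =
    compilerLocPerSec / parserLocPerSec
}
```

For a 50,000 LOC/s parser, a 500 LOC/s compiler gives a share of 500 / 50,000 = 1%, and a 50 LOC/s compiler gives 50 / 50,000 = 0.1%. By the same arithmetic, a parser twice as slow would only move those shares to about 2% and 0.2%.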
case class StringLiteral(str: String, holes: Seq[Hole], start: Int)
case class Hole(tree: Tree, start: Int, end: Int)
class QuasiquoteParser(val input: String, holes: Seq[Hole]) extends Parser {
  // Index the holes by start offset so a rule can ask whether one begins at the cursor.
  val holeMap = holes.map(h => h.start -> h).toMap
  // Fails unless a hole starts at the cursor; otherwise skips the hole's span and pushes it.
  def MaybeHole =
    if (!holeMap.contains(cursor)) rule( MISMATCH )
    else rule( (holeMap(cursor).end - holeMap(cursor).start).times(ANY) ~ push(holeMap(cursor)) )
  // ... use MaybeHole elsewhere in the grammar anywhere a hole could possibly appear ...
}
Hi Martin,
You say that your new dotty parser is very maintainable and I’m interested in exploring that comment.
How do you deal with the interaction between rules? Essentially information such as FIRST sets must be shared, which means that it’s not quite as simple as writing the grammar rules down and then coding them independently. A change to a grammar rule in one place can necessitate a change to a parser somewhere else.
As a concrete example, your grammar rule parser for simple types is:
/** SimpleType ::= SimpleType TypeArgs
 *              |  SimpleType `#' Id
 *              |  StableId
 *              |  Path `.' type
 *              |  `(' ArgTypes `)'
 *              |  Refinement
 */
def simpleType(): Tree = simpleTypeRest {
  if (in.token == LPAREN)
    atPos(in.offset) { makeTupleOrParens(inParens(argTypes())) }
  else if (in.token == LBRACE)
    atPos(in.offset) { RefinedTypeTree(EmptyTree, refinement()) }
  else path(thisOK = false, handleSingletonType) match {
    case r @ SingletonTypeTree(_) => r
    case r => convertToTypeId(r)
  }
}
First of all, the cases are not obviously derived directly from the relevant RHSes, so the “correctness” of the code is harder to see than if the correspondence was direct.
But putting that aside, there’s a non-trivial step from the grammar rule to the parsing code, mostly due to the need to identify the FIRST sets of the grammar rule RHSes. If you were to change the grammar of Refinement, for example, in a way that affected its FIRST set, you would have to remember to come back to the SimpleType parser and update the second test. Do you not find this to be an issue in practice?
Perhaps it is mostly a function of how often your grammar changes. I would imagine if you are making frequent changes to the syntax, keeping all of this up-to-date would be quite error prone. An advantage of generator-based approaches is that the tool takes care of that for you.
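To illustrate what a generator "takes care of", here is a minimal sketch of computing FIRST sets by fixed-point iteration, so that the token tests for a rule are derived from the grammar instead of being maintained by hand. The grammar encoding and every name in it (`FirstSetSketch`, `Grammar`, `firstSets`) are assumptions made for this example; the toy grammar in the usage note is loosely modelled on SimpleType and is not the real Scala or dotty grammar.

```scala
object FirstSetSketch {
  // nonterminal -> alternatives, each alternative a sequence of symbols.
  // Any symbol that is not a key of the map is treated as a terminal.
  type Grammar = Map[String, List[List[String]]]
  val Epsilon = "<eps>" // marks "can derive the empty string"

  def firstSets(g: Grammar): Map[String, Set[String]] = {
    // FIRST of a symbol sequence, given the current approximation.
    def firstOfSeq(seq: List[String], first: Map[String, Set[String]]): Set[String] =
      seq match {
        case Nil                          => Set(Epsilon)
        case sym :: _ if !g.contains(sym) => Set(sym) // terminal
        case sym :: rest =>
          val f = first(sym)
          if (f.contains(Epsilon)) (f - Epsilon) ++ firstOfSeq(rest, first) else f
      }

    var first: Map[String, Set[String]] = g.map { case (nt, _) => nt -> Set.empty[String] }
    var changed = true
    while (changed) { // iterate until no set grows any more
      val next = g.map { case (nt, alts) =>
        nt -> (first(nt) ++ alts.flatMap(alt => firstOfSeq(alt, first)))
      }
      changed = next != first
      first = next
    }
    first
  }
}
```

For a toy grammar where `SimpleType` is `( ArgTypes )`, `Refinement` or `Path`, and `Refinement` starts with `{`, the computed set for `SimpleType` contains `(`, `{` and the first token of `Path`. If `Refinement` is changed to start with a different token, rerunning `firstSets` updates the `SimpleType` entry with no edit to the `SimpleType` rule, which is the maintenance burden the hand-written `in.token == LBRACE` test has to carry manually.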
Hey All,
As some of you may know, I have recently spent a lot of time with Parboiled2.
The result of this ~50 hours of effort is scala-parser, a Parboiled2 parser for the Scala programming language syntax. This parser is still not complete (missing XML syntax and unicode escapes) but apart from those two features for which I've explicitly blacklisted files, scala-parser is able to parse every single .scala file in
- scala/scala
- scala-js/scala-js
- scalaz/scalaz
- milessabin/shapeless
- akka/akka
And it's a total of ~700 LOC, for everything, including all the undocumented features of Scala syntax that are nowhere to be found in the lexical specification (e.g. semicolon insertion & context-sensitivity, string interpolation, keyword/key-operator lookahead requirements, macros, annotation-types-can't-have-symbolic-names, ...).
Adding the missing XML syntax and unicode escapes would be about one day of work, given how much time I've spent on the parser so far (i.e. not much).
Presumably having a complete, reasonable, lightweight, modular parser (e.g. not the one in scalac) for the language would be of use to a bunch of people. scala-parser currently just recognizes the grammar and blows up if it can't parse something, but it's straightforward to make it spit out an AST or whatever else you want (range positions, etc.). I haven't actively benchmarked performance, but it walks and parses all 5 of those projects in 80 seconds, which seems plenty fast.
At 700 LOC, this is basically compact enough to be an executable specification. The grammar in the Scala spec may be shorter, but it is also wrong and incomplete (to put it mildly), and worse, there's no way forward to ensure it matches reality in any way, i.e. it's all lies and wishful thinking. This one actually works! And you can feed in reams and reams of Scala code to verify it matches reality.
The parser is still not complete, but is pretty close. It would be cool if people could try it out and see if it's useful for anything they're working on, or help find/fix edge cases in the parser.
Hopefully this is useful to someone ^_^
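For anyone who wants to try the "feed in reams of Scala code" check on their own checkout, a minimal harness could look like the sketch below. It is not scala-parser's test suite: `CorpusCheckSketch` and `failures` are made-up names, and `parses` stands in for whatever entry point the parser exposes (here just a function from source text to success or failure).

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object CorpusCheckSketch {
  /** Walks `root` and returns every .scala file that `parses` rejects. */
  def failures(root: Path, parses: String => Boolean): List[Path] = {
    val stream = Files.walk(root)
    try {
      val it = stream.iterator()
      val failed = List.newBuilder[Path]
      while (it.hasNext) {
        val p = it.next()
        if (Files.isRegularFile(p) && p.toString.endsWith(".scala")) {
          val src = new String(Files.readAllBytes(p), StandardCharsets.UTF_8)
          if (!parses(src)) failed += p // record files the parser cannot handle
        }
      }
      failed.result()
    } finally stream.close() // Files.walk holds directory handles until closed
  }
}
```

An empty result over a project means the parser accepted every file in it; any paths returned are candidate edge cases to report.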