ceylon.ast: Progress report 1


Lucas Werkmeister

Jun 17, 2014, 1:15:06 PM
to ceylo...@googlegroups.com
Hi all!

This is a progress report on my GSoC project, ceylon.ast.

Where am I right now?
Right now, the architecture is mostly decided, and I’m just busy adding all the node types. You can track the progress on the tracking issue, #9. (At the time of writing: 60 of 1405 tasks done.)

What happened?
During the “community bonding” phase – April 21 to May 19 – I was working out the architecture in lots of GitHub issues. Highlights:
  • There is no ceylon.ast module; instead, we have the separate ceylon.ast.api and ceylon.ast.redhat modules. ceylon.ast.api is a pure Ceylon, immutable, beautiful, fully ceylonic, free of Java hacks Ceylon AST; ceylon.ast.redhat contains conversion between the ceylon.ast and compiler ASTs, as well as functions to compile AST nodes from code (using the compiler). #6, #7, #2.
  • We offer a general Transformer<out Result> interface to operate on the AST, in two variants: WideningTransformer, where transformUIdentifier => transformIdentifier (Visitor is a WideningTransformer<Anything>), and NarrowingTransformer, where transformIdentifier switches on the subject and calls transformUIdentifier or transformLIdentifier respectively (Editor is a NarrowingTransformer<Node>, CeylonExpressionTransformer – creates Ceylon code that’s an expression which evaluates to a copy of a subject – is a NarrowingTransformer<String>, and RedHatTransformer is a NarrowingTransformer<JNode>). #25, #15.
    (The Transformer actually only happened a week ago – well into the “students coding” phase – which is a lot later than I’d have liked.)
  • The expression node types will form a hierarchy that makes it impossible to create nodes that would be parsed with the wrong associativity: For example, the type system would bar you from using a SumExpression as the left or right side of a ProductExpression because “1 + 2 * 3” is parsed as “1 + (2 * 3)” and the parser can never produce the meaning “(1 + 2) * 3” without parentheses. (At the bottom of this hierarchy is ParExpression, which closes the circle.) #13.
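The two Transformer variants described above can be sketched roughly as follows. This is a minimal illustration in Python rather than Ceylon; the toy node classes, the snake_case method names, and the `CountingVisitor` and `CodePrinter` classes are assumptions made for the example, not the actual ceylon.ast API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Root of a toy AST (hypothetical, not the real ceylon.ast class)."""


@dataclass(frozen=True)
class Identifier(Node):
    name: str


class LIdentifier(Identifier):
    """Lowercase identifier, such as a value name."""


class UIdentifier(Identifier):
    """Uppercase identifier, such as a type name."""


class WideningTransformer:
    """Every method defaults to the method for the node's supertype."""

    def transform_node(self, that):
        raise NotImplementedError

    def transform_identifier(self, that):
        return self.transform_node(that)

    def transform_l_identifier(self, that):
        return self.transform_identifier(that)

    def transform_u_identifier(self, that):
        return self.transform_identifier(that)


class NarrowingTransformer:
    """Every method for an abstract node type switches on the concrete subtype."""

    def transform_node(self, that):
        if isinstance(that, Identifier):
            return self.transform_identifier(that)
        raise TypeError(f"unknown node type: {type(that).__name__}")

    def transform_identifier(self, that):
        if isinstance(that, LIdentifier):
            return self.transform_l_identifier(that)
        if isinstance(that, UIdentifier):
            return self.transform_u_identifier(that)
        raise TypeError(f"unknown identifier type: {type(that).__name__}")

    def transform_l_identifier(self, that):
        raise NotImplementedError

    def transform_u_identifier(self, that):
        raise NotImplementedError


class CountingVisitor(WideningTransformer):
    """Visitor style: overriding only the widest method sees every node."""

    def __init__(self):
        self.count = 0

    def transform_node(self, that):
        self.count += 1


class CodePrinter(NarrowingTransformer):
    """Produces code that would rebuild the node; only leaf methods are written."""

    def transform_l_identifier(self, that):
        return f'LIdentifier("{that.name}")'

    def transform_u_identifier(self, that):
        return f'UIdentifier("{that.name}")'
```

In this sketch a widening transformer needs only one override to handle everything, while a narrowing transformer can be handed a node of an abstract type and will still reach the most specific method.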
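To illustrate the expression hierarchy idea, here is a small Python sketch. Python has no static enforcement, so constructor checks stand in for what the Ceylon type system would reject at compile time. The class names `AddingExpression`, `MultiplyingExpression`, `Primary` and `IntegerLiteral`, the operand types chosen for left associativity, and the `render` helper are assumptions for this example; only `SumExpression`, `ProductExpression` and `ParExpression` are named in the post.

```python
from dataclasses import dataclass


class Expression:
    """Any expression at all."""


class AddingExpression(Expression):
    """Precedence level of '+': a sum, or anything that binds tighter."""


class MultiplyingExpression(AddingExpression):
    """Precedence level of '*': a product, or anything that binds tighter."""


class Primary(MultiplyingExpression):
    """Tightest level: literals and parenthesized expressions."""


def _require(value, expected, role):
    # Runtime stand-in for a statically typed parameter.
    if not isinstance(value, expected):
        raise TypeError(
            f"{role} must be a {expected.__name__}, not {type(value).__name__}")


@dataclass(frozen=True)
class IntegerLiteral(Primary):
    value: int


@dataclass(frozen=True)
class ParExpression(Primary):
    inner: Expression  # anything may be parenthesized, which closes the circle


@dataclass(frozen=True)
class SumExpression(AddingExpression):
    left: AddingExpression
    right: MultiplyingExpression

    def __post_init__(self):
        _require(self.left, AddingExpression, "left operand of +")
        _require(self.right, MultiplyingExpression, "right operand of +")


@dataclass(frozen=True)
class ProductExpression(MultiplyingExpression):
    left: MultiplyingExpression
    right: Primary

    def __post_init__(self):
        _require(self.left, MultiplyingExpression, "left operand of *")
        _require(self.right, Primary, "right operand of *")


def render(e):
    """Print an expression; no precedence logic is needed to get it right."""
    if isinstance(e, IntegerLiteral):
        return str(e.value)
    if isinstance(e, ParExpression):
        return f"({render(e.inner)})"
    if isinstance(e, SumExpression):
        return f"{render(e.left)} + {render(e.right)}"
    if isinstance(e, ProductExpression):
        return f"{render(e.left)} * {render(e.right)}"
    raise TypeError(f"unknown expression type: {type(e).__name__}")
```

With these definitions, `ProductExpression(SumExpression(one, two), three)` is rejected, and the only way to express that meaning is to wrap the sum in a `ParExpression` first.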

What happens next?
I continue adding all the node types, and eventually merge 72469c4 back into master (at the moment, it exposes ceylon-compiler#1672).

After I’m done – either during GSoC if I’m fast enough, or afterwards otherwise – I change the ceylon.formatter to work on ceylon.ast ASTs. (The long-term goal is that someone – possibly me – writes a Ceylon parser in Ceylon, thus creating a completely Ceylon Ceylon formatter. The even longer-term dream is then obviously a complete compiler in pure Ceylon :) )

How can you help me?
I’m so glad you ask ;-) On every issue labeled “discussion” (list), feedback is very much welcome. Most of these are already done, but there are still some open ones where I’d like to get feedback. (I’d like to highlight #17, because adding that requires a change in every node type, and the later we decide to do it, the more work it is.)

If you think you want to use ceylon.ast somewhere or somehow, please contact me – the more I know about how it’s used, the better I can make certain decisions. For example, how important is #19 to you?

And of course, feel free to contribute code to ceylon.ast if you want to! Adding the node types is mostly straightforward. (Please read CONTRIBUTING.md first, though. I won’t be overly rigorous with those rules, but it would still be nice if you complied with them :) )

Any questions?
You can contact me

  • right here, in replies to this e-mail
  • in the #ceylonlang IRC channel on freenode
  • on gitter (nothing has happened there yet, but I can hope, right?)
  • via e-mail (my Git author e-mail)
  • via Skype, if you already have my Skype handle or guess correctly

Best regards, and have a nice evening!
Lucas Werkmeister

PS: Note to self: Don’t write the next progress report in Google Groups. Write it on GitHub, in Markdown (screw WYSIWYG), and then copy the result into the Google Groups editor. Switching the font every time, when I’m used to simply typing a backtick, was massively annoying.

Gavin King

Jun 17, 2014, 2:39:37 PM
to ceylo...@googlegroups.com
Two questions:

1. can you transform a typechecker AST to a ceylon.ast AST with a function call?
2. can you transform a ceylon.ast AST to a typechecker AST with a function call?

If you can do both those things then it's really a very small step to
a useful macro facility. (I'm talking about the kind of macros that
operate only at the local syntactic level, not at the "model"/global
level.)

Indeed, this kind of macro facility is likely enough to be able to
help us solve a couple of very interesting problems:

- LINQ-style queries
- F#-style type "providers"
- annotation-based generation of equals()/hash/string
> --
> You received this message because you are subscribed to the Google Groups
> "ceylon-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to ceylon-dev+...@googlegroups.com.
> To post to this group, send email to ceylo...@googlegroups.com.
> Visit this group at http://groups.google.com/group/ceylon-dev.
> For more options, visit https://groups.google.com/d/optout.



--
Gavin King
ga...@ceylon-lang.org
http://profiles.google.com/gavin.king
http://ceylon-lang.org
http://hibernate.org
http://seamframework.org

Lucas Werkmeister

Jun 17, 2014, 2:49:02 PM
to ceylo...@googlegroups.com
CompilationUnit ceylonAst = nothing;
JCompilationUnit ceylonAstAsRedhatAst = RedHatTransformer(TokenFactoryImpl()).transformCompilationUnit(ceylonAst);

JCompilationUnit redhatAst = nothing;
CompilationUnit redhatAstAsCeylonAst = compilationUnitFromCeylon(redhatAst);

This is how it will most likely look. At the moment, however:
  • RedHatTransformer doesn’t refine transformCompilationUnit, so it still has the generic return type JNode (and is unimplemented)
  • TokenFactoryImpl does not exist – there’s an unshared SimpleTokenFactory in the test package, but it’s not enough for “real” use (does not count token index, start position)
  • compilationUnitFromCeylon does not exist.
Of course, if you really want one function call, I can create a “boilerplate” method for the first conversion.

Gavin King

Jun 17, 2014, 2:59:19 PM
to ceylo...@googlegroups.com
So currently you can produce a typechecker AST from a Ceylon AST but not vice-versa?

Sent from my iPhone

Lucas Werkmeister

Jun 17, 2014, 3:02:33 PM
to ceylo...@googlegroups.com
No, it’s not like that. CompilationUnit doesn’t really exist either (it’s just an empty class without any logic or content).

However, the nodes that are already implemented – literals, self references, and basic types – can also already be transformed back and forth. Example test
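A back-and-forth test of this kind might look roughly like the following Python sketch. Everything in it is a stand-in: `CompilerNode`, the kind tags, and the `to_compiler` / `from_compiler` functions are hypothetical and much simpler than the real conversion, which also has to deal with tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegerLiteral:
    """Immutable literal node, compared by value."""
    text: str


@dataclass(frozen=True)
class StringLiteral:
    """Immutable literal node, compared by value."""
    text: str


class CompilerNode:
    """Mutable stand-in for a compiler AST node: a kind tag plus token text."""

    def __init__(self, kind, text):
        self.kind = kind
        self.text = text


# Placeholder kind tags, chosen for this example only.
_KINDS = {IntegerLiteral: "NATURAL_LITERAL", StringLiteral: "STRING_LITERAL"}
_TYPES = {kind: cls for cls, kind in _KINDS.items()}


def to_compiler(node):
    """Convert an immutable node into its compiler-style counterpart."""
    return CompilerNode(_KINDS[type(node)], node.text)


def from_compiler(jnode):
    """Convert a compiler-style node back into an immutable node."""
    return _TYPES[jnode.kind](jnode.text)


def round_trips(node):
    """True if converting there and back yields an equal node."""
    return from_compiler(to_compiler(node)) == node
```

Because the immutable nodes compare by value, a round-trip check is a one-line equality assertion per node type.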

Stephane Epardaud

Jun 19, 2014, 11:12:47 AM
to ceylon-dev
Well, this sounds great, except "60 of 1405". That sounds like a lot of work. Is this really that hard? The AST in the typechecker is generated from a rather tractable text file, couldn't we automate some of that?
Stéphane Épardaud

Lucas Werkmeister

Jun 19, 2014, 11:24:17 AM
to ceylo...@googlegroups.com
First of all: I changed the tracking issue, and by the new count, I’m at 19 of 281, which is marginally better :) (The old tracking issue had subtasks, but I’ve decided that they’re useless to track, since the typechecker forces me to do most of them anyway.)

If you’re talking about auto-generating the source code, then I’m really hesitant about that. Part of what I really don’t like about the existing compiler is that there is virtually no documentation on the AST nodes.
For example: quick, tell me the difference between Type and StaticType, and Term and Expression. Those are fundamental nodes, and the difference between them is vital – but the only documentation on them is “An expression” on Expression.
I also don’t like the feeling of lost flexibility that this gives me.

However, if you’re talking about generating the source code once, just to save me some time, and then I would adapt it as I want and commit it when it’s done… that would be a great idea, and I could definitely do that. There’s definitely a lot of repetitive work that I could automate.

Stephane Epardaud

Jun 19, 2014, 11:28:31 AM
to ceylon-dev
The second one, if you can't do the first kind and keep it automated. But generating the code does not mean that it cannot have documentation ;)






--
Stéphane Épardaud

Lucas Werkmeister

Jun 19, 2014, 11:38:49 AM
to ceylo...@googlegroups.com
True, but you have 117 doc strings in Ceylon.nodes, with 851 nodes total. Every single one of those doc strings is only a single sentence, and most are utterly trivial (“A package descriptor” for PACKAGE_DESCRIPTOR, “A declaration” for DECLARATION, “An assertion” for ASSERTION, “A list of named arguments” for NAMED_ARGUMENT_LIST etc.). That doesn’t suggest to me that generated code encourages documentation efforts :)

Also, the Ceylon AST gets a bit more complicated because, for instance, I have to define how I pass the children on to the superclass’ initializer. In some cases, I need to spread something. In other cases, I even need an auxiliary function to deal with optional children.

I’d much rather have a simple source-gen for the common cases and hand-adapt the generated code than make the source-gen complicated enough to work in all cases.
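A "simple source-gen for the common cases" could be as small as the Python sketch below. It handles only nodes whose children are all required and passed straight through. The emitted class layout is a guess made for illustration, not the layout ceylon.ast actually uses, and `generate_node_class` is a hypothetical name.

```python
def generate_node_class(name, superclass, children):
    """Emit a Ceylon class skeleton for a node with only required children.

    `children` is a list of (child_name, child_type) pairs. Optional or
    spread children are deliberately unsupported: those cases are meant
    to be adapted by hand after generation.
    """
    params = ", ".join(child for child, _ in children)
    lines = [
        # Placeholder doc string, so missing documentation is easy to find.
        f'"TODO: document {name}"',
        f"shared class {name}({params})",
        f"        extends {superclass}([{params}]) {{",
    ]
    for child, child_type in children:
        lines.append(f"    shared {child_type} {child};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_node_class(
        "ProductExpression",
        "Expression",
        [("leftOperand", "Expression"), ("rightOperand", "Expression")],
    ))
```

The generated text is only a starting point: the doc string placeholder and any unusual child handling still need a human pass before committing.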

Stephane Epardaud

Jun 19, 2014, 12:01:42 PM
to ceylon-dev
Hey, whatever makes it easier for you to complete it, and keep it up-to-date ;)






--
Stéphane Épardaud

Lucas Werkmeister

Jun 19, 2014, 12:12:28 PM
to ceylo...@googlegroups.com
Ideas for the source-gen in https://gist.github.com/lucaswerkmeister/a4da0fa5d9d5b14cc3e9. I’ll probably write it in Ceylon too. (A bit too complicated for a shell script because I insert into alphabetically sorted lists.)
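The "insert into alphabetically sorted lists" step is the part that is awkward in a shell script and short in most other languages. A minimal Python sketch, assuming the list is already sorted case-insensitively and that entries are plain strings (`insert_sorted` is a hypothetical helper, not taken from the gist):

```python
import bisect


def insert_sorted(lines, new_line):
    """Return a copy of `lines` with `new_line` added in alphabetical order.

    Ordering is case-insensitive. `lines` must already be sorted that way.
    A line that is already present is not added a second time.
    """
    lines = list(lines)
    if new_line in lines:
        return lines
    keys = [line.lower() for line in lines]
    index = bisect.bisect_left(keys, new_line.lower())
    return lines[:index] + [new_line] + lines[index:]


if __name__ == "__main__":
    methods = ["transformIdentifier", "transformLIdentifier", "transformUIdentifier"]
    for method in insert_sorted(methods, "transformIntegerLiteral"):
        print(method)
```

Returning a new list rather than mutating the input keeps the helper easy to test; a real generator would then write the lines back into the surrounding source file.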