Request for information about custom lexer


Martin

Sep 26, 2014, 11:19:09 AM
to aldor...@googlegroups.com
Hi,

I think I am making progress with my Xtext-based IDE for Aldor (although progress is very slow):
https://github.com/martinbaker/euclideanspace

Can anyone tell me where the custom lexer for Aldor is located? I've looked in scan.c and token.c, but I think there is more going on than I can find there.
https://github.com/pippijn/aldor/blob/master/aldor/aldor/src/scan.c
https://github.com/pippijn/aldor/blob/master/aldor/aldor/src/token.c

Example of the sort of problem I'm finding:

Implicit Semicolons
-------------------
According to the Aldor User Guide: "An implicit semicolon is assumed, if possible, after a closing brace. This is determined by whether the following token may start a new expression."

Does anyone know where this is implemented?

Is it implemented in the grammar? If it is, then it has been lost in my Xtext translation:
https://github.com/martinbaker/euclideanspace/blob/master/com.euclideanspace.aldor/src/com/euclideanspace/aldor/Editor.xtext
Using this Xtext parser, the following won't parse:

double(n: Integer): Integer == {n*2}
triple(n: Integer): Integer == {n*3}
quad(n: Integer): Integer == {n*4}

but this works fine:

double(n: Integer): Integer == {n*2};
triple(n: Integer): Integer == {n*3};
quad(n: Integer): Integer == {n*4}

So, unless I've missed something, the semicolons must be inserted by the custom scanner/lexer. I've had a quick glance through files like scan.c and token.c, but I have not found anywhere that it could be done.

Any thoughts?

Custom Lexer
------------
I have worked out how to customise the lexer in Xtext by implementing the following:
https://github.com/martinbaker/euclideanspace/blob/master/com.euclideanspace.aldor/src/com/euclideanspace/aldor/CustomLexer.java

This lets me override nextToken(), which gives a lot of scope for customisation (provided it does not need too much lookahead).

I have already implemented this phantom semicolon insertion after closing curly brackets, but not the exceptions (the cases where the following token cannot start a new expression).
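
To illustrate the idea, here is a minimal sketch of semicolon insertion with one token of lookahead. It works on plain strings rather than real Xtext/ANTLR tokens, and the class name ImplicitSemicolons, the method names, and the CANNOT_START set are all hypothetical. The set is only a guess at which tokens cannot start a new expression; the real Aldor rules would have to be taken from the compiler source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch only: inserts an implicit ";" after "}" when the next token
// may start a new expression. Tokens are plain strings for simplicity.
final class ImplicitSemicolons {

    // ASSUMPTION: an illustrative list of tokens that cannot start a new
    // expression. This is not taken from the Aldor sources.
    private static final Set<String> CANNOT_START =
            Set.of(";", ",", ")", "]", "}", "else", "then");

    static boolean mayStartExpression(String token) {
        return !CANNOT_START.contains(token);
    }

    // Returns a copy of the token list with implicit semicolons added.
    static List<String> insert(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            out.add(token);
            boolean hasNext = i + 1 < tokens.size();
            // One token of lookahead decides whether a ";" is needed.
            if (token.equals("}") && hasNext
                    && mayStartExpression(tokens.get(i + 1))) {
                out.add(";");
            }
        }
        return out;
    }
}
```

In a real lexer the same check would sit inside nextToken(), buffering one token so that the phantom semicolon can be emitted before it.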

It would therefore be useful if I could identify all the ways that the Aldor lexer is customised so that I can attempt to replicate that.

Any hints on where to look would help.

Martin

Peter Broadbery

Sep 26, 2014, 1:27:43 PM
to Martin, aldor...@googlegroups.com

I've just taken a quick look, and linear.c looks to be the answer. This is a pass after tokenisation and before parsing. The idea is that it deals with pile mode, but as a bonus it adds semicolons. This happens around line 1060.

Peter.


Martin Baker

Sep 27, 2014, 3:52:48 AM
to aldor...@googlegroups.com
On 26/09/14 18:27, Peter Broadbery wrote:
> Just taken a quick look, and linear.c looks to be the answer. This is a
> pass after tokenisation and before parsing. The idea is that it deals with
> pile mode, but as a bonus adds semicolons. This happens around line 1060.

Thank you, Peter. It would have been very difficult to find that by
myself.

It looks like it uses tokInfoTable to look up flags like isFollower,
isOpener, isCloser and so on for each type of token, so I will implement
this table in Java.
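
One possible shape for such a table in Java is an enum whose constants carry a set of flags. This is only a sketch: the flag names follow isFollower, isOpener and isCloser from linear.c, but the enum names Tok and TokFlag, the token list, the flag assigned to each token, and the reading of "follower" as a token that continues the preceding expression are all assumptions, not a copy of tokInfoTable.

```java
import java.util.EnumSet;
import java.util.Set;

// Flags named after the ones seen in linear.c.
enum TokFlag { OPENER, CLOSER, FOLLOWER }

// ASSUMPTION: a small illustrative token set with guessed flags;
// the real entries would come from tokInfoTable in the Aldor sources.
enum Tok {
    LBRACE("{", EnumSet.of(TokFlag.OPENER)),
    RBRACE("}", EnumSet.of(TokFlag.CLOSER)),
    LPAREN("(", EnumSet.of(TokFlag.OPENER)),
    RPAREN(")", EnumSet.of(TokFlag.CLOSER)),
    THEN("then", EnumSet.of(TokFlag.FOLLOWER)),
    ELSE("else", EnumSet.of(TokFlag.FOLLOWER)),
    ID("<identifier>", EnumSet.noneOf(TokFlag.class));

    private final String text;
    private final Set<TokFlag> flags;

    Tok(String text, Set<TokFlag> flags) {
        this.text = text;
        this.flags = flags;
    }

    String text() { return text; }

    boolean isOpener()   { return flags.contains(TokFlag.OPENER); }
    boolean isCloser()   { return flags.contains(TokFlag.CLOSER); }
    boolean isFollower() { return flags.contains(TokFlag.FOLLOWER); }
}
```

With a table like this, a check such as Tok.ELSE.isFollower() can replace hard-coded string comparisons in the custom lexer.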

Thanks,

Martin

