Problem adapting eiffel parser


Jimmy Johnson

Feb 12, 2015, 7:21:16 PM
to eiffelst...@googlegroups.com
I tried to use the parser from the 15.01_dev release by separating it from the complete compiler.

my_parse_feature
  local
    p: EIFFEL_PARSER
    f: KL_BINARY_INPUT_FILE
  do
    create p.make
    create f.make ("application.e")
    f.open_read
    p.parse (f)
  end

With Project-Settings/Debug turned on for "gelex" it appears that the parser/lexer only reads to the colon after the word "description" in:

note
    description :  "this is my application root class"
    date           :  "$Date$"
...

How do I make the parser digest the whole file?

jjj

Emmanuel Stapf

Feb 16, 2015, 9:17:28 AM
to eiffelst...@googlegroups.com

One issue with the Eiffel syntax is that there are several versions of it. A while back the `note' keyword did not exist; it was `indexing'. As a result, we made the Eiffel parser configurable so that it can handle all of them.

 

So when parsing, you have to choose which syntax you are going to target. You do that by configuring the parser object as in:

 

parser.set_syntax_version ({EIFFEL_SCANNER}.provisional_syntax)

 

Note that EIFFEL_SCANNER defines the following syntax versions:

    ecma_syntax: NATURAL_8 = 0x00
            -- Syntax strictly follows the ECMA specification

    obsolete_syntax: NATURAL_8 = 0x01
            -- Allows pre-ECMA keywords and ignore new ECMA keywords such as `note', `attribute', `attached' and `detachable'

    transitional_syntax: NATURAL_8 = 0x2
            -- Allows both pre and ECMA keywords

    provisional_syntax: NATURAL_8 = 0x3
            -- ECMA syntax + possible future extensions

 

This should solve your issue.
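
For reference, here is a minimal sketch of the original feature with the syntax version set before parsing. It only combines calls already shown in this thread (`make', `set_syntax_version', `parse'); the `is_open_read' check and the `close' call are additions for tidiness and assume the usual Gobo file interface. The feature is meant to be dropped into an existing class, compiled as discussed in this thread.

```eiffel
my_parse_feature
		-- Parse "application.e" after selecting the syntax version.
	local
		p: EIFFEL_PARSER
		f: KL_BINARY_INPUT_FILE
	do
		create p.make
			-- Choose the syntax before parsing; `provisional_syntax'
			-- accepts ECMA keywords such as `note'.
		p.set_syntax_version ({EIFFEL_SCANNER}.provisional_syntax)
		create f.make ("application.e")
		f.open_read
			-- Assumption: standard Gobo file queries, not quoted in the thread.
		if f.is_open_read then
			p.parse (f)
			f.close
		end
	end
```

If the parser still stops early, trying `transitional_syntax' instead is a reasonable next step, since it accepts both the old and the new keywords.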

 

Manu

 

--
For more messaging options, visit this group at http://forum.eiffel.com.
Information on the Eiffelstudio project: http://dev.eiffel.com.

Jimmy Johnson

Feb 16, 2015, 2:29:37 PM
to eiffelst...@googlegroups.com, ma...@eiffel.com
Manu, after doing as you said the parser does a little better.  I now get a precondition violation on "has_default" in EIFFEL_LIST.make_filled.  Here are the last few lines of the output from debug with "geyacc" set to true:

Entering state 550

Reading a token

Next token is 285

Reducing via rule #35

Executing parser user-code from file 'eiffel.y' at line 511


Removing the indexing clause just pushes the same violation down to the parent_list in the inheritance section.

Emmanuel Stapf

Feb 16, 2015, 3:09:17 PM
to Jimmy Johnson, eiffelst...@googlegroups.com

It might be that you are doing it in void-safe mode. We currently only use the Eiffel parser in non-void-safe mode, like the rest of the EiffelStudio code. Can you try in non-void-safe mode and see if it works better?

 

Thanks,

Manu

Jimmy Johnson

Feb 16, 2015, 4:06:44 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com
Void Safety:  No;  Syntax: Standard Syntax (I tried provisional and standard); attached by default True;

Emmanuel Stapf

Feb 16, 2015, 4:15:52 PM
to Jimmy Johnson, eiffelst...@googlegroups.com

You must be using void-safe EiffelBase otherwise you would not get the check violation:

 

Get precondition violation on "has_default" in EIFFEL_LIST.make_filled.

 

I should have been more precise: you need to use the non-void-safe versions of the libraries as well.

Jimmy Johnson

Feb 17, 2015, 5:03:30 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com
Yes, thank you.  Now I can parse a single file.  I am not sure what I have when done, though.  Is there an AST?  If so, how do I visit the nodes?  How complete is the AST?  Are create statements, assignment statements, locals, and invariant nodes in the tree?
Given this AST, how can I reproduce an equivalent text (i.e. a .e file) from it?
jjj



 

Emmanuel Stapf

Feb 18, 2015, 2:44:27 AM
to Jimmy Johnson, eiffelst...@googlegroups.com

Hi,

 

Once the file is parsed, you can query whether the parsing was successful. If it was, you have access to `root_node', which contains the top-level node of the AST.

To regenerate the text from it, you have to create the parser instance with the AST_ROUNDTRIP_FACTORY, which keeps all the whitespace, and then use the following visitor to output the text: AST_ROUNDTRIP_PRINTER_VISITOR.

 

Regards,

Manu

 


Jimmy Johnson

Feb 18, 2015, 2:31:08 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com
 

Once the file is parsed, you can query if the parsing was successful or not, and if it is then you have access to `root_node’ which contains the top level node of the AST.

To regenerate the text from it, then you have to create the parser instance with the AST_ROUNDTRIP_FACTORY which keeps all the whitespaces, and then use the following visitor to output the text back: AST_ROUNDTRIP_PRINTER_VISITOR.

I tried:
  local
    f: KL_BINARY_INPUT_FILE
    p: EIFFEL_PARSER
    v: AST_ROUNDTRIP_PRINTER_VISITOR
  do
     create p.make_with_factory (create {AST_ROUNDTRIP_FACTORY})
     create f.make ("application.e")  -- in working directory
     p.set_syntax_version ({EIFFEL_SCANNER}.provisional_syntax)
     p.parse (f)
     create v.make_with_default_context
     v.process_class_as (p.root_node)
  end

I get a precondition violation in `process_class_as': "is_valid_visitor: is_valid".  In `is_valid' both `parsed_class' and `internal_match_list' are Void.  The intent here was to output new code after a particular node (say, an assignment) was visited, but this would require keeping track of the new line numbers and passing them to the visiting features.
Might it be better to insert code as the file is parsed?  Say, in the .y file, after an assignment statement is processed, insert new code into the file being parsed at the point of the next token.  (I assume the parser knows the line number in the original file.)
jjj


Emmanuel Stapf

Feb 18, 2015, 4:10:28 PM
to Jimmy Johnson, eiffelst...@googlegroups.com

You need to set up the context for `v' as in:

v.setup (p.root_node, p.match_list, True, True)

and then perform the visit:

v.process_ast_node (v.parsed_class)
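
Putting the pieces from this thread together, a minimal sketch of the whole roundtrip could look like the feature below. It uses only calls quoted in this thread, plus two assumptions that are labeled in the comments: the creation procedure is spelled `make_with_factory', and the file is checked and closed through the usual Gobo file interface. How to retrieve the regenerated text from the visitor is not covered here, so the sketch stops after the visit.

```eiffel
roundtrip_application
		-- Parse "application.e" and walk its AST with the roundtrip printer.
	local
		f: KL_BINARY_INPUT_FILE
		p: EIFFEL_PARSER
		v: AST_ROUNDTRIP_PRINTER_VISITOR
	do
			-- Assumption: the creation procedure is named `make_with_factory'.
		create p.make_with_factory (create {AST_ROUNDTRIP_FACTORY})
		p.set_syntax_version ({EIFFEL_SCANNER}.provisional_syntax)
		create f.make ("application.e")
		f.open_read
			-- Assumption: standard Gobo file queries, not quoted in the thread.
		if f.is_open_read then
			p.parse (f)
			f.close
		end
			-- Only visit when parsing produced both an AST and a match list.
		if p.root_node /= Void and p.match_list /= Void then
			create v.make_with_default_context
				-- Give the visitor its class and match list; without this
				-- the `is_valid' precondition fails.
			v.setup (p.root_node, p.match_list, True, True)
			v.process_ast_node (v.parsed_class)
		end
	end
```

The Void checks replace a proper "was parsing successful" query, whose exact name is not given in this thread.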

 

As for inserting the new code, I think it is best to add it as I mentioned earlier if you want the compiler to generate the code. Modifying the AST is more complicated than adding a new instruction in the generated code.

 

Manu

 


Jimmy Johnson

Feb 18, 2015, 4:51:14 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com
Thanks I will try that.


As for inserting the new code, I think it is best to add it as I mentioned earlier if you want the compiler to generate the code. Modifying the AST is more complicated than adding a new instruction in the generated code.

I assumed the parser's scanner returns tokens sequentially.  Since I need to insert code, such as a feature call after an assignment statement, I was thinking I could write the feature call text just after the current location in the original Eiffel file, so that the next sequence of tokens analyzed by the parser is my new feature call.  Would that work?

Emmanuel Stapf

Feb 19, 2015, 1:24:33 PM
to Jimmy Johnson, eiffelst...@googlegroups.com

Although a scheme like the one you propose is doable, it will be quite complicated, since you will need to fiddle with the input buffer of the parser.

Another suggestion is to create a different type of AST node when you encounter an assignment. This requires less modification: only the eiffel.y file needs to change.

 

But my suggestion is to only change {AST_FEATURE_CHECKER_GENERATOR}.process_assign_as to have the compiler generate exactly what you want.

 

Regards,

Manu

 


Jimmy Johnson

Feb 19, 2015, 5:55:13 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com


On Thursday, February 19, 2015 at 1:24:33 PM UTC-5, Emmanuel Stapf wrote:

Although a scheme like you propose is doable it will be quite complicated since you will need to fiddle with the input buffer of the parser and this can be quite complicated.

Another suggestion is to create a different type of AST node when you encounter an assignment. This requires less modification, just the Eiffel.y file.

But my suggestion is to only change {AST_FEATURE_CHECKER_GENERATOR}.process_assign_as to have the compiler generate exactly what you want.

 In another message you said to add a feature call by creating an instance of FEATURE_B in `process_assign_as'.  How do I do this?  Where do I get the arguments for FEATURE_B.make?  How do I create a FEATURE_I, for example, when all I know is the class and feature name of the procedure I want to call?

Emmanuel Stapf

Feb 19, 2015, 6:05:01 PM
to Jimmy Johnson, eiffelst...@googlegroups.com

I’m just going to assume you are performing an unqualified call to a routine of the current class. In this case you get the FEATURE_I from the current class representation, CLASS_C, by querying feature_named.

If the call is qualified, it is more complicated, since you need a target object and I do not know where it would come from.

 

Manu

 


 

Jimmy Johnson

Feb 19, 2015, 9:43:40 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com


On Thursday, February 19, 2015 at 6:05:01 PM UTC-5, Emmanuel Stapf wrote:

I’m just going to assume you are performing an unqualified call to a routine of the current class. In this case you get the FEATURE_I from the current class representation CLASS_C and querying feature_named to get the FEATURE_I instance.

The problem is that the class does not have the feature yet.  That is what I am trying to inject into the program--a new feature and a call to it.
 

If the call is qualified, it is more complicated since you need an object and I do not know from where it is coming from.

 What if the feature was encapsulated in a once object inherited from [a new] ANY class [where I have modified ANY]?
Would that be doable?

Alexander Kogtenkov

Feb 20, 2015, 2:52:44 AM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, Emmanuel Stapf
It might be easier to help (or at least to be more specific) if you can express the transformation of an original program into a modified version, e.g. every call

   x := y.foobar (z)

becomes

   x := z.barfoo (y)

i.e. to describe what the modified version should look like for a given original version, in terms of Eiffel.

Regards,

Alexander Kogtenkov

Jimmy Johnson

Feb 20, 2015, 4:41:17 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com

I hope this shows it:


1)

     x := y.foobar (z) 

becomes

    x := y.foobar (z)

    if property_holds (Current) and is_attribute (x) then

        do_procedure_1 (Current, x)

    end


2)

a qualified feature call:

    a.f

becomes

    a.f

    do_procedure_2 (a) 


3)

    new_keyword a_reference

becomes

    do_procedure_3 (a_reference)

if `a_reference’ is non-Void and a non-basic type; otherwise report syntax, type, or validity error.


I assume I can change ANY so that it has a once `Procedure_manager’ from which `property_holds’ and the `do_procedure_x’ features can be accessed, as in:

    a.f

    Procedure_manager.do_procedure_2 (a)


Ideally, I only want this code added if some class in the system contains “new_keyword”; if not then compile as it does now.  Also, the calls to `do_procedure_2’ and `do_procedure_3’ can be (would be best if) delayed until the end of the enclosing feature.


thanks,

jjj

Jimmy Johnson

Feb 22, 2015, 12:11:00 AM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com
So, is the point I should modify in AST_FEATURE_CHECKER_GENERATOR.process_assign_as just after the comment that says, "Only create BYTE code if there is no error"?


It seems I have to:
create byte_list.make (10)
create feature_b.make ( ?????? )
byte_list.extend (l_assign) -- the previously generated ASSIGN_B
byte_list.extend (feature_b)
create instr_list_b.make (byte_list)
last_byte_node := instr_list_b

I don't know what you mean by "get the FEATURE_I from the current class representation". What is the current class representation? Also, how do I fill the other parameters to create a FEATURE_B? (i.e. the t: TYPE_A and the p_type: TYPE_A)?

Am I even close?

Also, I wonder why my messages on this group are double spaced.



 

Alexander Kogtenkov

Feb 24, 2015, 7:05:18 AM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, Emmanuel Stapf

> 1)

>    x := y.foobar (z) 

> becomes

>    x := y.foobar (z)

>    if property_holds (Current) and is_attribute (x) then

>        do_procedure_1 (Current, x)

>    end


If "property_holds (Current) and is_attribute (x)" is computed at compile time, you just need to add a call to do_procedure_1. It would be like what you wrote:

   create byte_list.make (10)
   create feature_b.make ( ?????? )
   byte_list.extend (l_assign) -- the previously generated ASSIGN_B
   byte_list.extend (feature_b)
   create instr_list_b.make (byte_list)
   last_byte_node := instr_list_b

except that "create feature_b.make ( ?????? )" would be

   feature_b := do_procedure_1_feature_i.access_for_feature (none_type, Void, False, False)

where "do_procedure_1_feature_i" is the FEATURE_I for "do_procedure_1", which you can get from CLASS_C using "feature_of_name_id", "feature_named" or, better, "feature_of_rout_id". The latter is better because it also works for inherited/renamed/selected features, but you then need to find the feature in its origin class by name, look up its routine id, and use this id to find the inherited version. You may then need to wrap it as in "process_instr_call_as" instead of using FEATURE_B directly. Also, do not forget to set the arguments of the feature by calling "set_parameters" on the ACCESS_B. The parameters need to be properly initialized; "process_call" shows how to do it.


> 2)

> a qualified feature call:

>     a.f

> becomes

>     a.f

>     do_procedure_2 (a) 

This is essentially the same as in the previous case.


> 3)

>     new_keyword a_reference

> becomes

>    do_procedure_3 (a_reference)

> if `a_reference’ is non-Void and a non-basic type; otherwise report syntax, type, or validity error.

For this you need a new type of AST node. Then a_reference can be processed as a normal expression, and after that "last_type" can be checked to see whether it is an attached reference type. If not, report an error; if so, proceed as in the previous cases.

> I assume I can change ANY so that it has a once `Procedure_manager’ from which `property_holds’ and the `do_procedure_x’ features can be accessed, as in:

>     a.f

>     Procedure_manager.do_procedure_2 (a)

Making qualified calls is a bit more complicated, as you need to generate two feature calls instead of one. They are combined using NESTED_B. "process_instr_call_as" and "process_nested_as" can be used as an example to do it.

> Ideally, I only want this code added if some class in the system contains “new_keyword”; if not then compile as it does now.

This is possible only when the whole system is recompiled from scratch. If you want your project to be compiled incrementally, there is no other way than to generate the additional code all the time. An alternative would be to use a project option, but this seems to be overkill at this stage.

>  Also, the calls to `do_procedure_2’ and `do_procedure_3’ can be (would be best if) delayed until the end of the enclosing feature.

Instead of generating the code immediately, you can leave it unchanged, but record elsewhere what has to be generated. Then in "process_do_as" you can add the new code at the end of the byte node list. However, I'm not sure this would be the correct semantics if, for example, the code

     new_keyword a_reference

appears inside some other instruction, e.g.


    if something then

          new_keyword a_reference

    end


Regards,

Alexander Kogtenkov

Jimmy Johnson

Feb 24, 2015, 3:15:06 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com
Thank you Alexander.  I am eager to try your suggestions.  Just a couple of follow-up questions below. 

On Tuesday, February 24, 2015 at 7:05:18 AM UTC-5, Alexander Kogtenkov wrote:

> 1)

>    x := y.foobar (z) 

> becomes

>    x := y.foobar (z)

>    if property_holds (Current) and is_attribute (x) then

>        do_procedure_1 (Current, x)

>    end


If "property_holds (Current) and is_attribute (x)" is computed at compile time, you just need to add a call to do_procedure_1. It would be like you write:

What I really mean is, if construct "x" is an attribute (not a local variable) then ... I call a procedure.  How do I query this in the compiler so that if it is not an attribute the `do_procedure_1' is not inserted?

 
where "do_procedure_1_feature_i" is the corresponding FEATURE_I for "do_procedure_1" that you can get from CLASS_C using either "feature_of_name_id", "feature_named" or, better, "feature_of_rout_id". The latter is better because it works for inherited/renamed/selected features as well, but then you need to find the feature in its origin by name, lookup for its routine id and then used this id to find an inherited

Where do I get CLASS_C?  Is there an object at this location in the compiler (I think we are in `process_assign_as'?) or do I create a new CLASS_C for use here?


> Ideally, I only want this code added if some class in the system contains “new_keyword”; if not then compile as it does now.

This is possible only when the whole system is recompiled from scratch. If you want your project to be compiled incrementally, there is no other way as to generate the additional code all the time. An alternative would be to use a project option, but this seems to be an overkill at this stage.

I see and agree. 

Again thanks for spending so much time to give me the details I needed.
jjj

Alexander Kogtenkov

Feb 27, 2015, 3:52:19 AM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, Emmanuel Stapf
> How do I query this in the compiler so that if it is not an attribute the `do_procedure_1' is not inserted?

First you check that it is not a local, an argument, an object-test local, etc. If this is done after the corresponding ACCESS_ID_AS has been processed, some flags are set on it; the details can be found in "process_access_id_as". Then the query "is_attribute" on the corresponding FEATURE_I object can be used to make sure it is not a routine.

> Where do I get CLASS_C?

All CLASS_C objects have been computed at this stage already. The current class can be accessed using "context.current_class", the class where the code is originally written - using "context.written_class". Some classes known by the compiler can be accessed directly, e.g. ANY - "system.any_class.compiled_class". Finally, you can reach any class by name, but the name depends on the current class, or, more precisely, its group. You can have a look at AST_CONTEXT.find_iteration_classes for an example.

Regards,

Alexander Kogtenkov


Jimmy Johnson

Mar 3, 2015, 8:52:19 PM
to eiffelst...@googlegroups.com, jjj...@g.uky.edu, ma...@eiffel.com
Alexander,

Thank you for all the help.  I'm flattered that you think I can understand the compiler, but I am overwhelmed by features such as AST_FEATURE_CHECKER_GENERATOR.process_call which has over 700 lines of code.  I have tried various approaches: modifying ANY, overriding ANY, trying to understand `process_call', and modifying `process_assign_as' as previously discussed.  It is not working.

So, for myself, let me just describe one part of the problem again in more detail.  This project relates to automatic persistence (i.e. long-term data storage).  I want to call a feature, named `store', after an assignment if the assignment is to an attribute (i.e. not a local or object-test local) and, if possible, only if the target of the assignment is not a basic type.  (My code may be able to ignore basic types.)  In my library, `store' resides in class PERSISTENCE_MANAGER, of which I create a single instance as a once feature, `persistence_manager', in class PERSISTENCE_FACILITIES (which is shared throughout the library).  Feature `store' takes one argument of type ANY; in this case, the target of the assignment.  Finally, and obviously, if the persistence classes are not in the universe, then compilation should proceed as if the compiler had not been modified.
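
To make the intended effect concrete at the source level, here is a hand-written sketch of what the modified compiler should produce the equivalent of. It is not compiler code. PERSISTENCE_MANAGER, PERSISTENCE_FACILITIES, `persistence_manager' and `store' are the names described above; class ACCOUNT, its attribute `owner' and its creation procedure `make' are hypothetical, and the body of `store' is only a placeholder. Each class would live in its own file.

```eiffel
class PERSISTENCE_MANAGER

feature -- Basic operations

	store (a_object: ANY)
			-- Record `a_object' for persistence.
			-- Placeholder body, for illustration only.
		do
			print ("store called%N")
		end

end

class PERSISTENCE_FACILITIES

feature -- Access

	persistence_manager: PERSISTENCE_MANAGER
			-- Single shared manager (created once).
		once
			create Result
		end

end

class ACCOUNT
		-- Hypothetical client class with one reference attribute.

inherit
	PERSISTENCE_FACILITIES

create
	make

feature {NONE} -- Initialization

	make (a_owner: STRING)
			-- Set `owner' to `a_owner'.
		do
				-- Original instruction: assignment to an attribute.
			owner := a_owner
				-- Call the modified compiler would add after the assignment,
				-- passing the target of the assignment.
			persistence_manager.store (owner)
		end

feature -- Access

	owner: STRING
			-- Attribute of a reference (non-basic) type.

end
```

An assignment to a local variable, or to an attribute of a basic type such as INTEGER, would be left unchanged under the rules described above.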

If it is not too much trouble, can you show me the code to do this?

Thanks and best regards,
Jimmy J. Johnson

