I don't know if it's helpful, but a few years back I played with writing a
grammar with the following attributes:
* Was written test first (to see if TDD grammar design was doable)
* No keywords at all
* Used indentation in a similar way to python
* Used a ":" to indicate the start of an indentation block
Due to using no keywords at all, everything was treated as a function call.
(The aim in a way was to see if it was possible to put a pythonlike syntax on
the front of a lisp-like language - using indentation & infix rather than
prefix / brackets)
The key result I found was that it was possible, but that indented blocks
needed some form of end keyword. That keyword didn't need to be defined
(though it's nice to have conventions :), but it did need to be there.
I never attached a backend to the parser/lexer, but the code works and is
still accessible. I even gave a (slightly tongue in cheek :) lightning talk
on it at Europython in 2005. As a result:
Slides:
http://www.slideshare.net/kamaelian/swp-a-generic-language-parser
Code:
http://www.cerenity.org/SWP-0.0.0.tar.gz
Other relevant stuff:
http://www.cerenity.org/SWP/
I'm only posting this in case it's useful :)
It may be of interest mainly because it fundamentally treats
everything as a function call.
Regards,
Michael
--
http://yeoldeclue.com/blog
http://twitter.com/kamaelian
http://www.kamaelia.org/Home
I'd rather go for "coding guidelines" than implement coding restrictions in the language.
The key result I found was that it was possible, but that indented blocks
needed some form of end keyword. That keyword didn't need to be defined
(though it's nice to have conventions :), but it did need to be there.
For what it is worth, I am in favour of the Logix/Haskell approach.
--
Magic is insufficiently advanced technology.
if x
  puts("x==true")
puts("outside all ifs")
Let's scan it with the following commands:
load('lib/file.re')
(~ok, scanned, _) = reia_scan::scan(File.read('indent.re').to_list())
This currently scans to the following tokens:
[(~if,1),(~identifier,1,~x),(~eol,1),
(~indent,2),(~identifier,2,~puts),(~'(',2),(~string,2,"x==true"),(~')',2),(~eol,2),
(~dedent,3),
(~identifier,3,~puts),(~'(',3),(~string,3,"outside all ifs"),(~')',3),(~eol,3)]
And it looks like it should be easy to parse.
But reia_parser reports "syntax error before: puts" on row 3.
Let's look at why. This is how the parser expects an if expression to look:
if_expr -> if_op expr eol indent statements dedent :
(~if,1) matches if_op,
(~identifier,1,~x) matches expr, etc.
But if we look at the very beginning of the grammar, we can see that:
statements -> statement : ['$1'].
statements -> statement statement_ending : ['$1'].
statements -> statement statement_ending statements : ['$1'|'$3'].
...
statement_ending -> ending_token : '$1'.
statement_ending -> statement_ending ending_token : '$1'.
ending_token -> ';' : '$1'.
ending_token -> eol : '$1'.
Let's modify our scanned token list:
scanned = [(~if,1),(~identifier,1,~x),(~eol,1),
(~indent,2),(~identifier,2,~puts),(~'(',2),(~string,2,"x==true"),(~')',2),(~eol,2),
(~dedent,3),(~eol,3),
(~identifier,3,~puts),(~'(',3),(~string,3,"outside all ifs"),(~')',3),(~eol,3)]
Yes, that's right: we've added an extra EOL after the dedent, because STATEMENTS expects a statement ending before the next STATEMENT.
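The manual edit above can be automated as a pass over the scanner output. The following is a minimal Python sketch, not Reia code: tokens are modelled as plain tuples mirroring the ones shown above, and the function name add_eol_after_dedent and its continuations parameter are hypothetical. It is a workaround on the token side and does not claim to be the right fix for the grammar.

```python
def add_eol_after_dedent(tokens, continuations=("else",)):
    """Return a copy of `tokens` with an ("eol", line) inserted after each dedent.

    Tokens are tuples whose first item is the kind and second the line number.
    No EOL is added when the dedent is followed by another dedent, an existing
    eol, the end of input, or a token kind listed in `continuations` (assumed
    here to be tokens that continue the same statement, such as `else`).
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if tok[0] != "dedent":
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if nxt is None:
            continue
        if nxt[0] in ("dedent", "eol") or nxt[0] in continuations:
            continue
        out.append(("eol", tok[1]))
    return out


# The token list from the example, rewritten as Python tuples.
scanned = [
    ("if", 1), ("identifier", 1, "x"), ("eol", 1),
    ("indent", 2), ("identifier", 2, "puts"), ("(", 2),
    ("string", 2, "x==true"), (")", 2), ("eol", 2),
    ("dedent", 3),
    ("identifier", 3, "puts"), ("(", 3),
    ("string", 3, "outside all ifs"), (")", 3), ("eol", 3),
]
patched = add_eol_after_dedent(scanned)
```

With this input the only change is a new ("eol", 3) directly after ("dedent", 3), which is the same edit made by hand above.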
Let's parse it
reia_parse::parse(scanned)
(~ok,[(~if,1,(~identifier,1,~x),
       [(~funcall,2,(~identifier,2,~puts),[(~string,2,"x==true")])],
       (~else_clause,1,[(~atom,1,nil)])),
      (~funcall,3,(~identifier,3,~puts),[(~string,3,"outside all ifs")])])
YES!
We've found a snake; now it's time to think about what to do with it.
It's quite clear that most statements end with EOL (end-of-line), but some end with EOL _and_ DEDENT,
while the grammar expects an EOL even after a multi-line, multi-level STATEMENT.
It's already late at night, and I can't see the keyboard very clearly; the same goes for the solution.
Cheers, Phil
Is there a special reason for that?
At a glance, it looks as though
if x
  if y
    puts("y")
  else
    puts("not y")
  puts("i'm only seen when x is true")
puts("outside all ifs")
can be parsed, and I don't see any possible conflicts here.
statement ::= stmt_list NEWLINE | compound_stmt
So here we see that Python's stmt_lists (single-line statements separated by semicolons) need a NEWLINE at the end, but compound_stmts (which have indentation blocks) do not. In fact, Python's statements with indentation blocks have no statement separators whatsoever.
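This can be checked against Python's own tokenizer. The sketch below uses the standard library tokenize module; the helper name layout_tokens and the sample source are made up for illustration. It shows that the DEDENT closing a block is followed directly by the first token of the next statement, with no separator in between.

```python
import io
import tokenize

SOURCE = 'if x:\n    print("x is true")\nprint("outside all ifs")\n'


def layout_tokens(source):
    """Return the token type names produced by Python's tokenizer for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


names = layout_tokens(SOURCE)
i = names.index("DEDENT")
# The NEWLINE before DEDENT ends the simple statement inside the block;
# nothing separates the DEDENT from the NAME that starts the next statement.
print(names[i - 1], names[i], names[i + 1])
```

The NEWLINE seen before the DEDENT belongs to the print call inside the block, not to the if statement itself.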
Empty lines containing any number of tabs/spaces, possibly ending with a comment,
should be skipped ('$empty') during parsing.
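As a rough illustration of that rule, here is a minimal Python sketch of a line-based layout scanner. It is not the Reia scanner: the function name scan_layout, the "#" comment prefix, and the token shapes are all assumptions. Blank and comment-only lines are dropped before any indentation is compared, so they can never produce an indent or dedent.

```python
def scan_layout(source, comment_prefix="#"):
    """Return layout tokens for `source`, ignoring blank and comment-only lines.

    Each token is a tuple: ("indent", n), ("dedent", n), ("eol", n) or
    ("line", n, text). Indentation width counts each leading space or tab as
    one column; a dedent to a width that was never opened is not reported as
    an error in this sketch.
    """
    tokens = []
    stack = [0]  # indentation widths of the currently open blocks
    last_line = 0
    for line_no, raw in enumerate(source.splitlines(), start=1):
        stripped = raw.strip()
        # Blank or comment-only lines carry no layout information: skip them.
        if not stripped or stripped.startswith(comment_prefix):
            continue
        width = len(raw) - len(raw.lstrip(" \t"))
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("indent", line_no))
        while width < stack[-1]:
            stack.pop()
            tokens.append(("dedent", line_no))
        tokens.append(("line", line_no, stripped))
        tokens.append(("eol", line_no))
        last_line = line_no
    # Close any blocks still open at the end of the input.
    while len(stack) > 1:
        stack.pop()
        tokens.append(("dedent", last_line + 1))
    return tokens


example = 'if x\n\n   # a comment on its own line\n  puts("x==true")\nputs("outside all ifs")\n'
kinds = [tok[0] for tok in scan_layout(example)]
```

Line numbers still refer to the original file, so skipped lines leave gaps in the numbering rather than shifting later tokens.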
For what it is worth, I am in favour of the Logix/Haskell approach.
Correct me if I'm wrong.
The final goal here is to allow:
bar * 2
foo(1,2,3) do |bar|
.print()
and
if x
  "yes"
else
  "no"
.print()
mappings.add(Person)
.add(Account)
.add(Bank)
What prevents adding a
puts("ground") at the same level of indentation?
This shouldn't confuse the parser much, since the continuation line begins with a dot.
I don't really think the following should be supported:
if if s == "30"
30
else
20
> 25
...
ugly
Hi Tony
The Haskell-like indentation rules in Logix worked out pretty well, but I never did anything like
.puts()
if true
'works'
else
'doesnt'
Although I can't remember if that was impossible or I just always parenthesised such things.
These days I find I actually prefer Ruby's syntax, with the 'end' markers, to something like python.
Cheers, Phil
+1
These days I find I actually prefer Ruby's syntax, with the 'end' markers, to something like python.
No objections.
The main reason I chose Ruby over Python some time ago was Python's compulsory indentation.
I totally agree with this as well
+1