[proposal] Tree-Sitter grammar parser for Elixir

João Paulo Silva de Souza

28 Mar 2019, 22:10:14
to elixir-lang-core
It seems I wrongly opened an issue instead of posting here first: https://github.com/elixir-lang/elixir/issues/8917

My reasoning was that this is not a language feature, so I thought I could skip the proposal process. My apologies.

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

https://tree-sitter.github.io/tree-sitter/#available-parsers

It seems Elixir is not on the roadmap for language support.

Atom currently has the most complete implementation. I expect other editors (e.g. Vim, VSCode, Emacs) to follow once the API stabilizes and more bugs are fixed.

Project: https://github.com/tree-sitter/tree-sitter
Neovim: https://github.com/neovim/neovim/pull/9219
Emacs: https://github.com/karlotness/tree-sitter.el
VSCode: https://github.com/Microsoft/vscode/issues/50140

Example grammar for JavaScript: https://github.com/tree-sitter/tree-sitter-javascript/blob/master/grammar.js

Ben Wilson

29 Mar 2019, 06:41:09
to elixir-lang-core
Can you elaborate for those of us with less context? What is tree-sitter? What are you proposing?

João Paulo Silva de Souza

3 Apr 2019, 04:46:11
to elixir-lang-core
Ben, thank you for the attention. I held off on elaborating further because the introductory content on the project's page does a much better job.

Tree-Sitter parses code and builds a syntax tree out of the nodes. That same tree can then be updated continuously by telling the parser where the code has changed, so re-evaluating the tree is much cheaper and faster, resulting in pretty much instant feedback and lower CPU usage.
Tree-Sitter can also keep highlighting code that contains syntax errors, where regex-based grammars would often break completely.
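
To make the incremental idea concrete, here is a toy sketch in Python. It is not Tree-Sitter's algorithm or API (Tree-Sitter updates a real syntax tree from edit ranges); it only illustrates the principle that the work done after an edit can be limited to the part that changed. The names `IncrementalDoc`, `edit_line` and `tokenize` are made up for this illustration, and a per-line token cache stands in for the tree.

```python
import re

# Hypothetical toy tokenizer: identifiers, integers, or any single symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(line):
    """Split one line of source into a flat list of token strings."""
    return TOKEN.findall(line)


class IncrementalDoc:
    """Caches tokens per line and re-tokenizes only the lines that change."""

    def __init__(self, text):
        self.lines = text.split("\n")
        self.tokens = [tokenize(line) for line in self.lines]
        # Count how many lines have been tokenized in total.
        self.work = len(self.lines)

    def edit_line(self, row, new_text):
        """Replace one line; every other line keeps its cached tokens."""
        self.lines[row] = new_text
        self.tokens[row] = tokenize(new_text)
        self.work += 1


doc = IncrementalDoc("defmodule Foo do\n  def bar, do: 1\nend")
doc.edit_line(1, "  def bar, do: 42")
print(doc.tokens[1], doc.work)
```

After the edit, only line 1 is tokenized again; the cached results for the other two lines are reused.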

More from the main author of Tree-Sitter: https://news.ycombinator.com/item?id=18349488

Tree-Sitter is already usable in Atom and Emacs, and other editors are very likely to follow. Neovim's integration is progressing quickly.

The problem: Elixir is not scheduled to be supported officially by the Tree-Sitter team.
What I am proposing: A grammar for Elixir to be written in JavaScript.

Rich Morin

3 Apr 2019, 05:32:08
to elixir-l...@googlegroups.com
Writing regexes to parse programming languages is painful at best
and impossible at worst. (I looked into trying to improve BBEdit's
rules for Elixir and backed away quickly :-/). So, I'm glad to see
a Real Parser (TM) being applied to the problem of code parsing in
text editors.
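
A small Python illustration of the kind of trouble regexes run into, assuming a simplified, made-up input in which blocks are delimited only by the words `do` and `end`. The helper `outer_block` is hypothetical and far simpler than a real parser; it just tracks nesting depth, which a pattern for Python's `re` module cannot do for arbitrary nesting.

```python
import re

SOURCE = "if a do if b do x end end"

# A non-greedy regex pairs the outer `do` with the first `end` it meets,
# so the captured "block" is cut off in the middle of the nested one.
regex_block = re.search(r"\bdo\b(.*?)\bend\b", SOURCE).group(1).strip()


def outer_block(source):
    """Return the body of the outermost do/end block by counting depth."""
    words = source.split()
    depth = 0
    start = None
    for i, word in enumerate(words):
        if word == "do":
            if depth == 0:
                start = i + 1  # body begins right after the outer `do`
            depth += 1
        elif word == "end":
            depth -= 1
            if depth == 0:
                return " ".join(words[start:i])
    return None  # unbalanced input


print(regex_block)          # body as seen by the regex
print(outer_block(SOURCE))  # body as seen by the depth counter
```

The regex stops at the inner `end`, while the depth counter returns the whole nested body.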

That said, I wonder whether Tree-Sitter can handle the effects of
Elixir macros on the program syntax. Could this be a problem?

-r

João Paulo Silva de Souza

3 Apr 2019, 06:12:19
to elixir-lang-core
What exactly are those effects that could pose issues? Nested macro blocks inside of another macro? Ambiguous syntax due to macro usage?

IMO Tree-Sitter will be at least as correct as the current regex-based syntaxes while being faster, with the potential to turn out much better. Even the worst case, a faster sidegrade, is still an improvement, so I think this is worth considering.

João Paulo Silva de Souza

3 Apr 2019, 06:31:16
to elixir-lang-core
I should also mention that Tree-Sitter still isn't as mature as its makers intend it to be (e.g. it doesn't gracefully handle many megabytes of code in a single file), but it's already plenty usable for the usual cases.

The reason for bringing this topic up is to put it on the radar. I doubt the grammar API is going to change drastically, so the grammar could be written now, but I think it would also be fine to postpone consideration until VSCode & friends are all fully integrated with it.

unif...@gmail.com

3 Apr 2019, 07:29:37
to elixir-lang-core
Hi Rich,
I am new to this whole ecosystem and don't use BBEdit,
but regarding development tools, have you had a look at
CodeRunner and their language setup:

They use TextMate language grammars, which seem to be regular-expression based.


This is what you can use for CodeRunner ( if that is your thing ):


R.
Fridrik.

Norbert Melzer

3 Apr 2019, 08:50:31
to elixir-lang-core
You said you haven't elaborated on Tree-Sitter's benefits because you think the linked page does a better job than you could.

But the web page tells me next to nothing.

In the first paragraph it looks as if it wants to be a modern replacement for Yacc/Bison, without even telling us what language it can generate code for.

One or two paragraphs later, it looks more as if it wants to be a language server, but with a different protocol and fewer features.

As we already have elixir-ls, which does its job pretty well, and LSP clients exist for every major editor, I do not think we need yet another editor plugin.

Also, as the core team declined to take the LS under the org, I think they should decline this as well, for the same reason they gave for the LS last year (TL;DR: the burden of maintenance is too high).

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/899f266d-6506-4aa6-bf31-0807a6d82786%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Allen Madsen

3 Apr 2019, 09:55:22
to elixir-l...@googlegroups.com
I don't see this as something core wants to take on, but it would be great if there were a group of people dedicated to editor support. The closest thing to that right now is: https://github.com/elixir-lsp

Rich Morin

3 Apr 2019, 11:19:36
to elixir-l...@googlegroups.com
> On Apr 3, 2019, at 3:12 AM, João Paulo Silva de Souza <joao.paulo...@hotmail.com> wrote:
>
> What exactly are those effects that could pose issues?
> Nested macro blocks inside of another macro?
> Ambiguous syntax due to macro usage?

Sorry, but if I had had a real effect to suggest, I would have done so.
I simply wondered about whether Tree-Sitter's general approach is able
to handle all the syntax variations that defining macros might cause.

> IMO Tree-Sitter will be at least as correct as current RegExp syntaxes,
> while being faster, with the upside of potentially turning out much better.

Point taken. BBEdit has something called a Codeless Language Module (CLM):

https://www.barebones.com/support/develop/clm.html

Basically, this is a data structure containing regular expressions.
Using this, BBEdit can handle the recognition needs for some languages.
However, Ruby is one of the languages which cannot use a CLM. AFAICT,
this is because of irregularities in its syntax. Because Elixir's
syntax is similar to Ruby's, it may have similar irregularities. On
the other hand, José was pretty careful in designing Elixir's syntax,
so he may have avoided these sorts of issues. I simply don't know...

In any event, there doesn't seem to be any compelling reason for the
Elixir core to get involved with this. However, if a group of Elixir
users is interested, they could certainly get started on a grammar.

-r

João Paulo Silva de Souza

3 Apr 2019, 13:05:33
to elixir-lang-core


On Wednesday, April 3, 2019 at 9:50:31 AM UTC-3, Norbert Melzer wrote:
> You said you haven't elaborated on Tree-Sitter's benefits because you think the linked page does a better job than you could.
>
> But the web page tells me next to nothing.
>
> In the first paragraph it looks as if it wants to be a modern replacement for Yacc/Bison, without even telling us what language it can generate code for.
>
> One or two paragraphs later, it looks more as if it wants to be a language server, but with a different protocol and fewer features.
>
> As we already have elixir-ls, which does its job pretty well, and LSP clients exist for every major editor, I do not think we need yet another editor plugin.

To my knowledge, an LSP server is not a monolith with a bunch of features coded into it; it's simply meant to be a service that complies with the protocol. Any LSP server could make use of Tree-Sitter underneath (or any other tool, for that matter).

Tree-Sitter's parsed nodes can be used for things like autocompletion, but that's beside the point. It doesn't do contextual analysis or anything of the sort (as can be read here). It simply parses and classifies tokens based on the grammar given, and only for the document supplied.

Granted, this proposal is not even for a plug-in, IDE integration or anything of the sort. It's meant to be a call to action on building a grammar that could be used in editors. I had the idea to bring it here first because the core team is more likely to understand the grammar (and its ambiguous/corner cases) better.


> Also, as the core team declined to take the LS under the org, I think they should decline this as well, for the same reason they gave for the LS last year (TL;DR: the burden of maintenance is too high).
