Understanding the Analyzer Package


Isaac

Mar 30, 2021, 3:26:48 PM
to Dart Analyzer Discussion
Hi,

I am fairly new to static analysis in general and am trying to find a good place to start with the Dart analyzer. I found a couple of conversations here about this, but they are several years old and the repo has clearly changed since then. I have two main questions: first, how does the analyzer work in general (the modification part), and second, how can I extend the analyzer's refactoring support for custom edits?

Around a year ago I attempted to write a program that would dynamically modify Dart code. I did this by creating a handwritten parser for Dart (v2.2) whose output could be converted back to source code. I then created several wrapper libraries which understood how to modify the parsed tree and thus create modified source code. This actually worked quite well. However, about 6 months ago this stopped being very useful or reliable because of several updates to the Dart grammar.

Thus, for a short while (I have since become busy with other things) I started working on a program to automatically generate a parser (similar to ANTLR) to avoid future grammar inconsistencies. Now I'm beginning to work on this problem again and realized I should probably just extend the functionality of the analyzer, since it already seems to do what I want, but I would like to know in greater detail how the Dart analyzer actually works. While looking in this group, I found the following quote:

"I'm still curious as to why you want to modify the AST. If you're doing it to edit files, our experience is that that generally isn't the best way (from a user's perspective) to do so. And we have some generalized support for editing files (used for quick fixes and refactorings) that might be of use to you."

As well as:
"Many people assume that the best way to write code editing applications is to build an AST, modify the AST, then write the AST out as source code. But after many decades of experience writing such tools we have come to realize that it's actually much easier to modify the source code directly and then rebuild the AST from source in order to continue editing."

-----------------------------------------------------------------

So I'm clearly missing something when it comes to modifying source code if it is easier to NOT modify the AST. What I'm wondering is how is it possible to reliably refactor code without using the AST? Or is the AST still used in some helper role like identifying locations and then just updating the locations? I'm especially curious because I actually had a fairly positive/successful experience creating libraries to modify my custom AST - issues arose because of the inflexibility of updating my parser by hand.

So that's kinda my general how does it work question :). Secondly, I'd love to know if there are any examples of custom analyzer code so I can start understanding how to use the analyzer to modify source code. Does it use the AST or not?

Sorry for such a long post, I'm just very curious about this and trying to find a better way to accomplish my program now that I'm starting to get into it again - cheers!

-----------------------------------------------------------------

TL;DR:
1. How does the analyzer modify code if it doesn't modify the AST? Or if it does, where does it do that?
2. Where to get started with creating custom refactor code using the analyzer?

Thanks!
Isaac

Brian Wilkerson

Mar 30, 2021, 5:55:38 PM
to Dart Analyzer Discussion

How does the analyzer modify code if it doesn't modify the AST? Or if it does, where does it do that?

The `analyzer` package (if that's what you meant by "analyzer") doesn't support code modification. It exists to perform static analysis of Dart code and it represents the results of that analysis in a few ways, including an AST structure representing the syntactic structure of the code, an element / type model representing the semantic structure of the code, and diagnostics representing the problems found during static analysis. If you haven't seen it, there is some documentation about how to use the analyzer package in https://github.com/dart-lang/sdk/tree/master/pkg/analyzer/doc/tutor.

The modification of code is done by the `analyzer_plugin` and `analysis_server` packages. (The `analyzer_plugin` package contains the portions of the analysis server code base that are useful for plugin authors.) They produce descriptions of edits that clients (such as command-line tools and IDEs) can apply to the files. (So technically they don't modify files, they only specify how to modify them.) This approach is used to support quick fixes and quick assists (aka code actions) and refactorings.

This code doesn't operate by modifying the AST. It works by examining the AST, which contains offsets for every token, and determining what changes would need to be made, then generates a sequence of textual replacement operations that clients can use to update the source code in the desired way. The concrete examples of how this is done are in the implementations of the fixes and assists (https://github.com/dart-lang/sdk/tree/master/pkg/analysis_server/lib/src/services/correction/dart) and the refactorings (https://github.com/dart-lang/sdk/tree/master/pkg/analysis_server/lib/src/services/refactoring).
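
The approach described above (read the AST, compute offsets, emit textual replacements, then re-parse) can be sketched in a few lines. The example below is only an analogy: it uses Python's standard `ast` module rather than the Dart `analyzer` package, and `SourceEdit`, `rename_edits`, and `apply_edits` are names made up for this sketch, not the server's classes.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceEdit:
    offset: int       # start of the region to replace
    length: int       # number of characters to replace
    replacement: str  # text that takes the region's place


def line_starts(source: str) -> List[int]:
    """Absolute offset of the first character of every line."""
    starts = [0]
    for i, ch in enumerate(source):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def rename_edits(source: str, old: str, new: str) -> List[SourceEdit]:
    """Examine the AST (read-only) and describe the edits; change nothing."""
    starts = line_starts(source)
    edits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == old:
            # col_offset counts UTF-8 bytes, which is fine for ASCII-only input.
            offset = starts[node.lineno - 1] + node.col_offset
            edits.append(SourceEdit(offset, len(old), new))
    return edits


def apply_edits(source: str, edits: List[SourceEdit]) -> str:
    """What a client would do: apply last-to-first so offsets stay valid."""
    for e in sorted(edits, key=lambda e: e.offset, reverse=True):
        source = source[:e.offset] + e.replacement + source[e.offset + e.length:]
    return source


source = "total = count + 1  # count stays in comments\nprint(count)\n"
edits = rename_edits(source, "count", "n")
updated = apply_edits(source, edits)
ast.parse(updated)  # rebuild the AST from the edited text to keep editing
```

Because the edits are applied to the text rather than printed from a tree, the comment and spacing on the first line survive unchanged.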

Where to get started with creating custom refactor code using the analyzer?

I'm not sure whether you're asking about contributing to the analysis server or writing your own refactoring tool. If it's the former, then the locations above will give you a fairly good starting point, but I'm also happy to have a more detailed discussion about your specific use case. If it's the latter, then the locations above will give you one possible direction, but you might find other approaches more to your liking.

This feels like kind of a terse answer, but I'm happy to answer any additional or more specific questions that I might have missed answering.


Isaac

Mar 31, 2021, 11:33:28 PM
to Dart Analyzer Discussion, brianwilkerson
Thank you - this was already very helpful in clearing up some of my questions. I have a few more as a follow-up (again very long, sorry...):

Regarding using/extending the analysis_server package:

they don't modify files, they only specify how to modify them

I believe I understand where this is done - but just want to confirm. The analysis_server packages would simply return a json response as specified here https://htmlpreview.github.io/?https://github.com/dart-lang/sdk/blob/master/pkg/analysis_server/doc/api.html. I can see from the links you posted how this works when traversing the AST. However, I want to make sure I am understanding this correctly, please correct me if I'm wrong:

To create a refactor, extend from some refactor class and override (among other functions) the "createChange()" function to return my own "SourceChange" object from the analyzer_plugin package.
To create a correction, similarly, extend from "CorrectionProducer" and override (among other functions) the "compute(ChangeBuilder)" and use the builder to make modifications.

One thing that isn't completely clear to me is the difference between a refactor and a correction. I understand conceptually that a correction is for an error while a refactor isn't - but how are they different when returned by analysis_server to the client (plugin)? For example, with the API are they directly correlated to edit.getAvailableRefactorings and edit.getFixes respectively?

It's pretty clear to me how to create a plugin from the docs, I'll just need to spend some more time playing around with it.

I'm not sure whether you're asking about contributing to the analysis server or writing your own refactoring tool.

Right now I'm just working on my own refactoring tool and don't think it really applies to general use - but should I find cases or if that changes I would definitely be interested in contributing generally when applicable, just getting my feet wet right now :).
Regarding this, I'm assuming I can just fork - but want to be sure there won't be major breaking changes all the time. I assume not given this library is a language but just checking.

At this point, I feel I understand the overall process much better and now just need to clone and start tinkering to get a better understanding but please correct me if anything I've mentioned so far is inaccurate or way off, etc.

---------------------------------------------------------------------

Lastly, regarding how the analyzer package (not server or plugin) works:

I found the "_fe_analyzer_shared" package and assume this is the parser that was created for the analyzer since it is used in "ast" part of the analyzer package.
The analyzer package notes that:

the structure of the tree is similar but not identical to the grammar of the language

This leads me to a few questions: 
  1. Is there somewhere I can find the grammar for the analyzer AST?
  2. The readme mentions the important note of how parameters are different. Is there a place to find all divergences between the formal specification and the analyzer's grammar (I couldn't find one)?
  3. Last I was aware, Dart's parser was generated using ANTLR4 - is the analyzer parser the same? Or is it handmade? I'd love to learn how this is handled when the Dart language changes (for example when the grammar was updated in 2.3 with the "collection for") without breaking the code that relies on the areas of the AST that changed - or do these need to be fixed by hand? Sorry if this is a large or ignorant question - but I ran into a lot of problems with this when I was first creating my tool and am very curious about how you guys handled these problems.
Thanks for the response thus far, it's been a big help for understanding where to start.

Brian Wilkerson

Apr 1, 2021, 12:08:27 PM
to Isaac, Dart Analyzer Discussion
Regarding using/extending the analysis_server package:

Just to be clear, we don't publish the `analysis_server` package, so you can't have a pub dependency on it. We do publish both the `analyzer` and `analyzer_plugin` packages should either of those prove useful to you. While we do, of necessity, publish the `_fe_analyzer_shared` package, we would prefer that you not depend on it directly.

The analysis_server packages would simply return a json response as specified here https://htmlpreview.github.io/?https://github.com/dart-lang/sdk/blob/master/pkg/analysis_server/doc/api.html.

Correct.
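
For orientation, a change carried in such a response is roughly shaped like the dictionary below. This is a sketch built from the `SourceChange`, `SourceFileEdit`, and `SourceEdit` types in the API document linked above; the file path, message text, and offsets are invented for illustration, and the specification remains the authority on which fields are required.

```python
import json

# One textual replacement: insert "required " at offset 8 (length 0 = pure insert).
source_edit = {"offset": 8, "length": 0, "replacement": "required "}

# All edits for one file. The path and stamp are hypothetical values.
source_file_edit = {
    "file": "/project/lib/a.dart",
    "fileStamp": 0,
    "edits": [source_edit],
}

# The change as a whole: a human-readable message plus per-file edits.
source_change = {
    "message": "Add 'required' keyword",  # invented message text
    "edits": [source_file_edit],
    "linkedEditGroups": [],
}

wire = json.dumps(source_change, indent=2)  # what travels to the client


def edits_by_file(change: dict) -> dict:
    """Client side: collect (offset, length, replacement) tuples per file."""
    return {
        f["file"]: [(e["offset"], e["length"], e["replacement"]) for e in f["edits"]]
        for f in change["edits"]
    }


received = edits_by_file(json.loads(wire))
```

The server's job ends at producing this description; applying the tuples to the file contents is left to the client.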

To create a refactor, extend from some refactor class and override (among other functions) the "createChange()" function to return my own "SourceChange" object from the analyzer_plugin package.

I'd probably recommend subclassing `RenameRefactoringImpl`, but essentially, yes.

However, I'll point out that it isn't sufficient to implement a refactoring on the server side. The legacy protocol that some clients use (most notably IntelliJ) wasn't as well designed as I'd like, so there's a lot of work that needs to be done on the client side to implement a refactoring as well. And LSP, which is used by most other clients, doesn't support any refactorings other than 'rename'. I'm hoping to improve the situation at some point, but at the moment that effectively means that the only place refactorings can be implemented is in IntelliJ.

To create a correction, similarly, extend from "CorrectionProducer" and override (among other functions) the "compute(ChangeBuilder)" and use the builder to make modifications.

Correct.
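
The producer/builder split can be pictured with a toy model. The classes below only borrow the names `CorrectionProducer` and `ChangeBuilder`; their methods (`add_insertion`, `add_replacement`) and the `AddConst` producer are hypothetical stand-ins written for this sketch, not the `analyzer_plugin` API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Edit = Tuple[int, int, str]  # (offset, length, replacement)


@dataclass
class ChangeBuilder:
    """Collects edits per file; it never touches the files themselves."""
    edits: Dict[str, List[Edit]] = field(default_factory=dict)

    def add_insertion(self, path: str, offset: int, text: str) -> None:
        self.edits.setdefault(path, []).append((offset, 0, text))

    def add_replacement(self, path: str, offset: int, length: int, text: str) -> None:
        self.edits.setdefault(path, []).append((offset, length, text))


class CorrectionProducer(ABC):
    """Base class: subclasses decide which edits are needed, nothing more."""

    def __init__(self, path: str, source: str, offset: int) -> None:
        self.path = path
        self.source = source
        self.offset = offset  # where the diagnostic or selection starts

    @abstractmethod
    def compute(self, builder: ChangeBuilder) -> None:
        ...


class AddConst(CorrectionProducer):
    """Toy correction: put `const ` in front of the expression at `offset`."""

    def compute(self, builder: ChangeBuilder) -> None:
        builder.add_insertion(self.path, self.offset, "const ")


source = "var p = Point(1, 2);"
builder = ChangeBuilder()
AddConst("lib/a.dart", source, source.index("Point")).compute(builder)
```

Keeping the producer ignorant of how edits are stored or sent lets the same producer serve different clients and protocols.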

One thing that isn't completely clear to me is the difference between a refactor and a correction. I understand conceptually that a correction is for an error while a refactor isn't - but how are they different when returned by analysis_server to the client (plugin)? For example, with the API are they directly correlated to edit.getAvailableRefactorings and edit.getFixes respectively?

We divide the code modifying operations into three groups:
  • quick fix - these are changes that are available in response to a diagnostic (error, warning, hint or lint). The intent is for the change to fix the problem. For example, adding the keyword `required` when a parameter is not valid without it is a quick fix.
  • quick assist - these are changes that are always available and perform a small local change that doesn't require user input. The intent is to make some kinds of changes easier to make. For example, converting a string from being a single line string (delimited by one quote mark) to a multi-line string (delimited by three quote marks) is an assist. Assists are implemented just like fixes by subclassing `CorrectionProducer`.
  • refactoring - these are changes that either require user input or that require non-local analysis. For example, renaming a method requires that the user provide the new name for the method and also requires looking for invocations of the method and overrides of the method in order to change them to match.
All three have their own protocol in order to give clients more flexibility in terms of how to present the options to users.
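
One way to see the difference between the three groups is by what each needs as input. The sketch below reuses the examples from the list; every function name is hypothetical, and the regular-expression search in the rename is a crude stand-in for the semantic search a real refactoring performs.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Edit:
    offset: int
    length: int
    replacement: str


def apply(source: str, edits: List[Edit]) -> str:
    # Apply last-to-first so earlier offsets remain valid.
    for e in sorted(edits, key=lambda e: e.offset, reverse=True):
        source = source[:e.offset] + e.replacement + source[e.offset + e.length:]
    return source


# Quick fix: input is a diagnostic; offered only where the problem was reported.
def add_required_fix(diagnostic_offset: int) -> List[Edit]:
    return [Edit(diagnostic_offset, 0, "required ")]


# Quick assist: input is just the selection; local and needs no user input.
def to_multiline_string_assist(offset: int, length: int) -> List[Edit]:
    # The literal occupies [offset, offset + length) and is delimited by
    # one quote mark on each side; swap each for three.
    return [Edit(offset, 1, "'''"), Edit(offset + length - 1, 1, "'''")]


# Refactoring: needs user input (the new name) and a non-local search.
def rename_refactoring(source: str, old: str, new: str) -> List[Edit]:
    pattern = rf"\b{re.escape(old)}\b"
    return [Edit(m.start(), len(old), new) for m in re.finditer(pattern, source)]


param_src = "void f({int x}) {}"
fixed = apply(param_src, add_required_fix(param_src.index("int")))

string_src = "var s = 'hi';"
assisted = apply(string_src, to_multiline_string_assist(string_src.index("'"), 4))

call_src = "m(); x.m();"
renamed = apply(call_src, rename_refactoring(call_src, "m", "n"))
```

All three end up as the same kind of textual edits; what differs is the trigger and how much information has to be gathered first.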

It's pretty clear to me how to create a plugin from the docs, I'll just need to spend some more time playing around with it.

I'll just note that the plugin API doesn't support plugins contributing refactorings. This is because of the additional work required on the clients in order to support them.

I'm not sure whether you're asking about contributing to the analysis server or writing your own refactoring tool.

Right now I'm just working on my own refactoring tool and don't think it really applies to general use - but should I find cases or if that changes I would definitely be interested in contributing generally when applicable, just getting my feet wet right now :).

You're certainly welcome to use any ideas or code from server that would be of use to you, but if you're writing your own tool you don't need to be constrained to doing things the same way we have.

Regarding this, I'm assuming I can just fork - but want to be sure there won't be major breaking changes all the time. I assume not given this library is a language but just checking.

Yes, you can fork the `sdk` repo (the `analysis_server` package is in that repo), but that's a lot of code. However, because we don't publish the package, we make no guarantees about breaking changes, nor any guarantees that you'll be able to merge your changes back into a later version of this package. I can fairly confidently say that we will eventually make changes that will be breaking changes. (We're in the process of migrating the package to null safety at the moment, and that alone is likely to result in breaking changes.)

I found the "_fe_analyzer_shared" package and assume this is the parser that was created for the analyzer since it is used in "ast" part of the analyzer package.

Yes, the `_fe_analyzer_shared` package is where the parser is implemented. It's used by all of our tools, not just the `analyzer` package.

Is there somewhere I can find the grammar for the analyzer AST?

For the most part it's in the comments for the various subclasses of `AstNode`. The differences are all extensions to the grammar. They allow us to represent invalid code in a way that allows us to provide a better editing experience.

The readme mentions the important note of how parameters are different. Is there a place to find all divergences between the formal specification and the analyzer's grammar (I couldn't find one)?

No, but all of them allow the AST to represent invalid code. The nodes represent a superset of the grammar of Dart.

Last I was aware, Dart's parser was generated using ANTLR4 - is the analyzer parser the same? Or is it handmade? I'd love to learn how this is handled when the Dart language changes (for example when the grammar was updated in 2.3 with the "collection for") without breaking the code that relies on the areas of the AST that changed - or do these need to be fixed by hand? Sorry if this is a large or ignorant question - but I ran into a lot of problems with this when I was first creating my tool and am very curious about how you guys handled these problems.

The parser in `_fe_analyzer_shared` is a handwritten recursive-descent parser with a lot of additional logic to allow it to recover in the face of errors. When the language is changed we update the parser by hand. We also update the AST structure as necessary. That does, unfortunately, sometimes require breaking changes to the AST structure, but we do our best to minimize the breakage for the sake of our clients. It also sometimes requires breaking changes to the element / type model. If you're writing tools for an evolving language then dealing with changes in the language is an unavoidable part of the cost.
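
To make "recursive descent with error recovery" concrete, here is a minimal sketch for a made-up grammar, `'(' [ident (',' ident)*] ')'`. It is not the `_fe_analyzer_shared` parser; `Token`, `Ident`, `ArgList`, and `Parser` are invented for the example. The point is that a missing piece becomes a synthetic node plus a recorded error instead of aborting, so the tree can represent invalid code.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    kind: str    # "ident", "(", ")", ",", or "eof"
    text: str
    offset: int


@dataclass
class Ident:
    name: str
    offset: int
    synthetic: bool = False  # True when invented by error recovery


@dataclass
class ArgList:
    args: List[Ident]
    offset: int
    end: int


def tokenize(source: str) -> List[Token]:
    tokens = []
    for m in re.finditer(r"[A-Za-z_]\w*|[(),]", source):
        text = m.group()
        kind = text if text in ("(", ")", ",") else "ident"
        tokens.append(Token(kind, text, m.start()))
    tokens.append(Token("eof", "", len(source)))
    return tokens


class Parser:
    def __init__(self, source: str) -> None:
        self.tokens = tokenize(source)
        self.pos = 0
        self.errors: List[str] = []

    @property
    def current(self) -> Token:
        return self.tokens[self.pos]

    def expect(self, kind: str) -> Token:
        tok = self.current
        if tok.kind == kind:
            self.pos += 1
            return tok
        # Recovery: report the problem and pretend the token was there.
        self.errors.append(f"expected {kind!r} at offset {tok.offset}")
        return Token(kind, "", tok.offset)

    def parse_ident(self) -> Ident:
        tok = self.current
        if tok.kind == "ident":
            self.pos += 1
            return Ident(tok.text, tok.offset)
        self.errors.append(f"expected identifier at offset {tok.offset}")
        return Ident("", tok.offset, synthetic=True)

    def parse_arg_list(self) -> ArgList:
        open_paren = self.expect("(")
        args: List[Ident] = []
        if self.current.kind not in (")", "eof"):
            args.append(self.parse_ident())
            while self.current.kind == ",":
                self.pos += 1
                args.append(self.parse_ident())
        close = self.expect(")")
        return ArgList(args, open_paren.offset, close.offset + len(close.text))


parser = Parser("(a, , b")  # missing argument and missing ")"
tree = parser.parse_arg_list()
```

Here the parser still returns a three-argument list whose middle entry is synthetic, and it records two errors; a tool working on half-typed code can keep using the offsets of the valid nodes.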

Isaac

Apr 1, 2021, 11:53:34 PM
to Dart Analyzer Discussion, brianwilkerson, Dart Analyzer Discussion, Isaac
Thank you! This was super helpful and I look forward to discovering more questions as I work through stuff :).