New Java AST API: "UAST"


Tor Norbye

Feb 24, 2017, 12:17:31 PM
to lint-dev
In the next version of lint, there will be a new API for analyzing Java files. 

Originally, lint used the Lombok AST API to describe Java files, and in 2.2 it switched over to PSI. I didn't advertise this widely, since I already knew that we wanted to switch over to UAST (which was still in development), so I didn't want to make everyone jump through the hoops of first porting from Lombok to PSI, and then from PSI to UAST. However, if you've already switched to PSI, the good news is that the port to UAST will be much simpler (since UAST is pretty similar to PSI), and in fact UAST actually augments PSI, so there's quite a bit of overlap.

Briefly, UAST is a "universal AST" library. When you write a lint check analyzing the AST, that lint check will work on both .java files and .kt (Kotlin) files -- and in theory any other similar languages that also implement a UAST bridge.

I'll provide more information about this later, but I wanted to give a heads up about it.

César Puerta

Feb 25, 2017, 1:04:01 AM
to lint-dev
When you say the next version of lint, do you mean 2.4?

Tor Norbye

Feb 27, 2017, 7:46:36 PM
to lint-dev
Yes.

Lin Wang

Mar 1, 2017, 6:40:52 PM
to lint-dev
Interesting, would the LintDetectorTest be able to create a Kotlin file for testing accordingly? Just like java() and xml() for Java and XML files.

Tor Norbye

Mar 1, 2017, 7:08:13 PM
to Lin Wang, lint-dev
We should add that too - especially to catch cases where you've written a lint check mostly targeting UAST but you've done some accidental direct PSI usage (which would only work on Java, not Kotlin.)   But the main value add of UAST is that you can write your lint check targeting UAST, and then the same rule will work on both Java and Kotlin files, hiding all the syntactic differences from your check.

--
You received this message because you are subscribed to the Google Groups "lint-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lint-dev+u...@googlegroups.com.
To post to this group, send email to lint...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lint-dev/6e61621f-1c85-41b0-805c-c46b4524461b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michael Bailey

Mar 2, 2017, 1:58:10 PM
to lint-dev
What is the best source of docs for understanding the UAST APIs?

Tor Norbye

Mar 2, 2017, 2:05:12 PM
to Michael Bailey, lint-dev
The source code is all here:

There are quite a few implications for lint. Here's the javadoc I wrote for the new UastScanner interface (when you port from PSI, you'll change the implements JavaPsiScanner declaration to UastScanner, and then follow the various tips described below):

public static interface Detector.UastScanner
Interface to be implemented by lint detectors that want to analyze Java source files (or other similar source files, such as Kotlin files.)

There are several different common patterns for detecting issues:

Detector.UastScanner exposes the UAST API to lint checks. UAST is short for "Universal AST" and is an abstract syntax tree library which abstracts away details about Java versus Kotlin versus other similar languages and lets the client of the library access the AST in a unified way.

UAST isn't actually a full replacement for PSI; it augments PSI. Essentially, UAST is used for the inside of methods (e.g. method bodies), and things like field initializers. PSI continues to be used at the outer level: for packages, classes, and methods (declarations and signatures). There are also wrappers around some of these for convenience.

The Detector.UastScanner interface reflects this fact. For example, when you indicate that you want to check calls to a method named foo, the call site node is a UAST node (in this case, UCallExpression), but the called method itself is a PsiMethod, since that method might be anywhere (including in a library that we don't have source for, so UAST doesn't make sense).

Migrating JavaPsiScanner to UastScanner

As described above, PSI is still used, so a lot of code will remain the same. For example, all resolve methods, including those in UAST, will continue to return PsiElement, not necessarily a UElement. For example, if you resolve a method call or field reference, you'll get a PsiMethod or PsiField back.

However, the visitor methods have all changed, generally to change to UAST types. For example, the signature Detector.JavaPsiScanner.visitMethod(JavaContext, JavaElementVisitor, PsiMethodCallExpression, PsiMethod) should be changed to visitMethod(JavaContext, UCallExpression, PsiMethod).

There are a bunch of new methods on classes like JavaContext which let you pass in a UElement to match the existing PsiElement methods.

If you have code which does something specific with PSI classes, the following mapping table in alphabetical order might be helpful, since it lists the corresponding UAST classes.

PSI (com.intellij.psi.)         UAST (org.jetbrains.uast.)
IElementType                    UastBinaryOperator
PsiAnnotation                   UAnnotation
PsiAnonymousClass               UAnonymousClass
PsiArrayAccessExpression        UArrayAccessExpression
PsiBinaryExpression             UBinaryExpression
PsiCallExpression               UCallExpression
PsiCatchSection                 UCatchClause
PsiClass                        UClass
PsiClassObjectAccessExpression  UClassLiteralExpression
PsiConditionalExpression        UIfExpression
PsiDeclarationStatement         UDeclarationsExpression
PsiDoWhileStatement             UDoWhileExpression
PsiElement                      UElement
PsiExpression                   UExpression
PsiForeachStatement             UForEachExpression
PsiIdentifier                   USimpleNameReferenceExpression
PsiIfStatement                  UIfExpression
PsiImportStatement              UImportStatement
PsiImportStaticStatement        UImportStatement
PsiJavaCodeReferenceElement     UReferenceExpression
PsiLiteral                      ULiteralExpression
PsiLocalVariable                ULocalVariable
PsiMethod                       UMethod
PsiMethodCallExpression         UCallExpression
PsiNameValuePair                UNamedExpression
PsiNewExpression                UCallExpression
PsiParameter                    UParameter
PsiParenthesizedExpression      UParenthesizedExpression
PsiPolyadicExpression           UPolyadicExpression
PsiPostfixExpression            UPostfixExpression or UUnaryExpression
PsiPrefixExpression             UPrefixExpression or UUnaryExpression
PsiReference                    UReferenceExpression
PsiReference                    UResolvable
PsiReferenceExpression          UReferenceExpression
PsiReturnStatement              UReturnExpression
PsiSuperExpression              USuperExpression
PsiSwitchLabelStatement         USwitchClauseExpression
PsiSwitchStatement              USwitchExpression
PsiThisExpression               UThisExpression
PsiThrowStatement               UThrowExpression
PsiTryStatement                 UTryExpression
PsiTypeCastExpression           UBinaryExpressionWithType
PsiWhileStatement               UWhileExpression
Note however that UAST isn't just a "renaming of classes"; there are some changes to the structure of the AST as well, particularly around calls.

Parents

In UAST, you get your parent UElement by calling getUastParent instead of getParent. This is to avoid method name clashes on some elements which are both UAST elements and PSI elements at the same time - such as UMethod.

Children

When you're going in the opposite direction (e.g. you have a PsiMethod and you want to look at its content), you should not use PsiMethod.getBody(). This will only give you the PSI child content, which won't work, for example, when dealing with Kotlin methods. Normally lint passes you the UMethod, which you should be processing instead. But if for some reason you need to look up the UAST method body from a PsiMethod, use this:
     UastContext context = UastUtils.getUastContext(element);
     UExpression body = context.getMethodBody(method);
 
Similarly if you have a PsiField and you want to look up its field initializer, use this:
     UastContext context = UastUtils.getUastContext(element);
     UExpression initializer = context.getInitializerBody(field);
 

Call names

In PSI, a call was represented by a PsiCallExpression, and to get to things like the method called or to the operand/qualifier, you'd first need to get the "method expression". In UAST there is no method expression and this information is available directly on the UCallExpression element. Therefore, here's how you'd change the code:
 <    call.getMethodExpression().getReferenceName();
 ---
 >    call.getMethodName()
 

Call qualifiers

Similarly,
 <    call.getMethodExpression().getQualifierExpression();
 ---
 >    call.getReceiver()
 

Call arguments

PSI had a separate PsiArgumentList element you had to look up before you could get to the actual arguments, as an array. In UAST these are available directly on the call, and are represented as a list instead of an array.
 <    PsiExpression[] args = call.getArgumentList().getExpressions();
 ---
 >    List<UExpression> args = call.getValueArguments();
 
Typically you also need to go through your code and replace array access, args[i], with list access, args.get(i). Or in Kotlin, just args[i]...

Instanceof

You may have code which does something like "parent instanceof PsiAssignmentExpression" to see if something is an assignment. Instead, use one of the many utilities in UastExpressionUtils - such as UastExpressionUtils.isAssignment(UElement). Take a look at all the methods there now - there are methods for checking whether a call is a constructor, whether an expression is an array initializer, etc etc.

Android Resources

Don't do your own AST lookup to figure out if something is a reference to an Android resource (e.g. see if the class refers to an inner class of a class named "R" etc.) There is now a new utility class which handles this: ResourceReference. Here's an example of code which has a UExpression and wants to know whether it's referencing an R.styleable resource:
        ResourceReference reference = ResourceReference.get(expression);
        if (reference == null || reference.getType() != ResourceType.STYLEABLE) {
            return;
        }
        ...
 

Binary Expressions

If you had been using PsiBinaryExpression for things like checking comparator operators or arithmetic combination of operands, you can replace this with UBinaryExpression. But you normally shouldn't; you should use UPolyadicExpression instead. A polyadic expression is just like a binary expression, but possibly with more than two terms. With the old parser backend, an expression like "A + B + C" would be represented by nested binary expressions (first A + B, then a parent element which combined that binary expression with C). However, this will now be provided as a UPolyadicExpression instead. And the binary case is handled trivially without the need to special case it.
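
To make the nested-versus-polyadic difference concrete, here is a minimal, self-contained sketch. It is a toy model only: Expr, Leaf and Binary are hypothetical stand-ins invented for illustration, not PSI or UAST classes. It flattens a left-nested chain such as (A + B) + C into one operand list, which is roughly the shape a polyadic expression hands you directly.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: Expr, Leaf and Binary are hypothetical stand-ins,
// not classes from PSI or UAST.
public class PolyadicSketch {

    interface Expr {
        String text();
    }

    static final class Leaf implements Expr {
        private final String name;
        Leaf(String name) { this.name = name; }
        @Override public String text() { return name; }
    }

    static final class Binary implements Expr {
        final Expr left;
        final String op;
        final Expr right;
        Binary(Expr left, String op, Expr right) {
            this.left = left;
            this.op = op;
            this.right = right;
        }
        @Override public String text() {
            return "(" + left.text() + " " + op + " " + right.text() + ")";
        }
    }

    /** Returns the operands of a left-nested chain of one operator, in source order. */
    static List<String> operands(Expr e, String op) {
        List<String> out = new ArrayList<>();
        collect(e, op, out);
        return out;
    }

    private static void collect(Expr e, String op, List<String> out) {
        if (e instanceof Binary && ((Binary) e).op.equals(op)) {
            Binary b = (Binary) e;
            collect(b.left, op, out);   // walk down the left-nested chain
            out.add(b.right.text());    // the right operand is kept as one term
        } else {
            out.add(e.text());
        }
    }

    /** Builds "A + B + C" the nested way: (A + B) + C. */
    static Expr nestedSum() {
        return new Binary(new Binary(new Leaf("A"), "+", new Leaf("B")), "+", new Leaf("C"));
    }

    public static void main(String[] args) {
        System.out.println(operands(nestedSum(), "+"));
        System.out.println(operands(new Binary(new Leaf("A"), "+", new Leaf("B")), "+"));
    }
}
```

Note that the plain two-operand case goes through the same code path and simply yields a two-element list, which is the sense in which the binary case needs no special handling.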

Method name changes

The following table maps some common method names and what their corresponding names are in UAST.

PSI                     UAST
createPsiVisitor        createUastVisitor
getApplicablePsiTypes   getApplicableUastTypes
getArgumentList         getValueArguments
getCatchSections        getCatchClauses
getDeclaredElements     getDeclarations
getElseBranch           getElseExpression
getInitializer          getUastInitializer
getLExpression          getLeftOperand
getOperationTokenType   getOperator
getOwner                getUastParent
getParent               getUastParent
getRExpression          getRightOperand
getReturnValue          getReturnExpression
getText                 asSourceString
getThenBranch           getThenExpression
getType                 getExpressionType
getTypeParameters       getTypeArguments
resolveMethod           resolve

Handlers versus visitors

If you are processing a method on your own, or even a full class, you should switch from JavaRecursiveElementVisitor to AbstractUastVisitor. However, most lint checks don't do their own full AST traversal; they instead participate in a shared traversal of the tree, registering the element types they're interested in using getApplicableUastTypes() and then providing a visitor where they implement the corresponding visit methods. However, from these visitors you should not be calling super.visitX. To remove this whole confusion, lint now provides a separate class, UElementHandler. For the shared traversal, just provide this handler instead and implement the appropriate visit methods. It will throw an error if you register element types in getApplicableUastTypes() that you don't override.
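
The "throws an error if you don't override" behavior can be pictured with a small, self-contained sketch. This is a toy model under stated assumptions: ElementHandler, CallCounter and dispatch are hypothetical names made up for illustration, and elements are plain strings. It is not lint's UElementHandler, only the same idea in miniature.

```java
// Toy model only: ElementHandler and CallCounter are hypothetical stand-ins,
// not lint's UElementHandler.
public class HandlerSketch {

    /** Base handler: each callback fails loudly unless a subclass overrides it. */
    static class ElementHandler {
        void visitCall(String source) {
            throw new IllegalStateException("visitCall registered but not overridden");
        }
        void visitMethod(String source) {
            throw new IllegalStateException("visitMethod registered but not overridden");
        }
    }

    /** Overrides only visitCall, so it should only register interest in calls. */
    static final class CallCounter extends ElementHandler {
        int calls;
        @Override
        void visitCall(String source) {
            calls++;
        }
    }

    /** Dispatches one element to the handler and reports what happened. */
    static String dispatch(ElementHandler handler, String kind, String source) {
        try {
            if (kind.equals("call")) {
                handler.visitCall(source);
            } else {
                handler.visitMethod(source);
            }
            return "handled";
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        CallCounter counter = new CallCounter();
        System.out.println(dispatch(counter, "call", "foo()"));
        System.out.println(dispatch(counter, "method", "void bar() {}"));
    }
}
```

The second dispatch fails because the handler was sent an element kind it never implemented, which is the mismatch the real handler is described as catching.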

Migrating JavaScanner to UastScanner

First read the javadoc on how to convert from the older Detector.JavaScanner interface over to Detector.JavaPsiScanner. While Detector.JavaPsiScanner is itself deprecated, it's a lot closer to Detector.UastScanner so a lot of the same concepts apply; then follow the above section. 

Shintaro Katafuchi

Sep 9, 2017, 5:07:22 PM
to lint-dev
Thank you for your great Info and work!

One quick question about migrating PsiMethod.getBody(): right now I have code like the following to traverse each statement in a method to detect something.
---
PsiCodeBlock codeBlock = method.getBody(); // method: UMethod
PsiStatement[] statements = codeBlock.getStatements();
for (PsiStatement statement : statements) {}
---
But after migrating to UExpression I found there's no equivalent. Is using a UastVisitor on the UExpression the way to go?

Thanks in advance,

Tor Norbye

Sep 9, 2017, 5:24:37 PM
to Shintaro Katafuchi, lint-dev
On the UMethod, you can get the body by calling getUastBody().
That method returns a UExpression. 

That's an important point -- it's not returning a block -- because in Kotlin for example a method declaration may not have a body, it may just use an expression, like this:
fun double(x: Int) = x * 2 

Here your getUastBody() method would return the binary expression for x * 2.

But in the "normal" case when you have a block, the getUastBody() method will return a UBlockExpression. That's similar to your PsiCodeBlock. UBlockExpression has a getExpressions() method which returns the expressions in the block -- that's similar to your statements above.
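
The block-or-single-expression distinction can be sketched in a few lines. This is a toy model only: Expr, Simple and Block are hypothetical stand-ins for UExpression and UBlockExpression, invented for illustration, and topLevel is a made-up helper, not a UAST method.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: Expr, Simple and Block are hypothetical stand-ins for
// UExpression and UBlockExpression, not the real UAST classes.
public class MethodBodySketch {

    interface Expr {
        String text();
    }

    static final class Simple implements Expr {
        private final String text;
        Simple(String text) { this.text = text; }
        @Override public String text() { return text; }
    }

    static final class Block implements Expr {
        final List<Expr> expressions;
        Block(Expr... expressions) { this.expressions = Arrays.asList(expressions); }
        @Override public String text() { return "{ ... }"; }
    }

    /** Returns the top-level expressions of a method body, block or not. */
    static List<Expr> topLevel(Expr body) {
        if (body instanceof Block) {
            // The usual case: a braced body with a list of expressions.
            return ((Block) body).expressions;
        }
        // Expression body, as in: fun double(x: Int) = x * 2
        return Collections.singletonList(body);
    }

    public static void main(String[] args) {
        Expr blockBody = new Block(new Simple("int y = x * 2"), new Simple("return y"));
        Expr expressionBody = new Simple("x * 2");
        System.out.println(topLevel(blockBody).size());
        System.out.println(topLevel(expressionBody).size());
    }
}
```

Checking the body's type before iterating means a check written against block bodies does not silently skip expression-bodied Kotlin functions.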

-- Tor

Shintaro Katafuchi

Sep 9, 2017, 11:36:41 PM
to lint-dev
I see! That totally makes sense, thank you!


Snow(Siruo) Zhao

Sep 11, 2017, 3:19:28 PM
to lint-dev
Hi Tor,

I had a detector before like this:

public class MyDetector extends Detector implements Detector.JavaPsiScanner{
  ...
  @Override 
  public List<Class<? extends PsiElement>> getApplicablePsiTypes() {
    ...
  }

  @Override 
  public JavaElementVisitor createPsiVisitor(JavaContext context) {
    ...
  }

  private static class MyChecker extends JavaElementVisitor {
    ...
  }
}

And now I have migrated it to:
public class MyDetector extends Detector implements Detector.UastScanner{
  ...
  @Override 
  public List<Class<? extends UElement>> getApplicableUastTypes() {
    ...
  }

  @Override 
  public UElementHandler createUastHandler(@NonNull JavaContext context) {
    ...
  }

  private static class MyChecker extends UElementHandler {
    ...
  }
}

My tests all pass, but I wonder why I need to change createPsiVisitor() to createUastHandler() instead of createUastVisitor(), as mentioned in "Method name changes". Looking at Detector.class, the UastScanner interface has no createUastVisitor() method, so I have to implement createUastHandler(). Am I getting this right, or am I missing something?

Thanks,
Snow 


Tor Norbye

Sep 12, 2017, 5:37:44 AM
to lint-dev
Sorry, I think that documentation is obsolete (and I don't see it in that table in the javadoc sources anymore, I must have deleted it a while back).

Basically, I took this opportunity to make the API a bit clearer.

When you implemented a "PSI visitor", you had to extend one of the PSI visitor classes. At that point you may be very tempted to call "super.visitWhatever()" too, the way you normally would in visitors. I've seen many people do this -- but that's wrong.  That's because the "visitor" you're creating here is really only supposed to be checking this specific element, NOT cause iteration into child elements. The reason for that is that lint makes a single shared visit of the entire file, and for each node it calls the detectors that have pre-registered an interest in this element type. 

Consider what would happen if lint didn't do this - if for example one lint check wants to look at all PsiMethod elements. There are hundreds of detectors; if each detector had to do its own tree traversal, then the whole AST tree would be visited hundreds of times, and for each detector, only one or two element types would be relevant.

So instead, lint sets up this scheme where it does a single iteration of the whole AST and each element is passed only to the detectors that care about it.

For expediency these "callbacks" were handed a "visitor", since there already was an interface for this. But then some users would accidentally call "super" which can cause the visitor superclass to start iterating into children, which we don't want.

So for UAST I've made a whole separate interface, UElementHandler, which is nearly identical to the UastVisitor, but all the methods are abstract, so there's no way to call "super" on it or in any other way accidentally start recursing.

So that's why it's now called a "handler", not a "visitor" :-)
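
The single shared traversal described above can be sketched as a self-contained toy. Everything here is an assumption made for illustration: Node, Handler and Driver are hypothetical names, node kinds are plain strings, and none of this is lint's actual implementation. The point is only that the driver owns the recursion while handlers look at one node at a time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: Node, Handler and Driver are hypothetical stand-ins,
// not classes from lint or UAST.
public class SharedTraversalSketch {

    /** A minimal tree node with a kind (e.g. "method", "call") and children. */
    static final class Node {
        final String kind;
        final List<Node> children = new ArrayList<>();
        Node(String kind, Node... kids) {
            this.kind = kind;
            Collections.addAll(children, kids);
        }
    }

    /** A handler inspects one node only; it never recurses into children. */
    interface Handler {
        void handle(Node node);
    }

    /** Owns the one traversal and passes each node to interested handlers. */
    static final class Driver {
        private final Map<String, List<Handler>> byKind = new HashMap<>();
        int nodesVisited = 0;

        void register(String kind, Handler handler) {
            byKind.computeIfAbsent(kind, k -> new ArrayList<>()).add(handler);
        }

        void traverse(Node node) {
            nodesVisited++;
            List<Handler> interested = byKind.get(node.kind);
            if (interested != null) {
                for (Handler h : interested) {
                    h.handle(node);
                }
            }
            for (Node child : node.children) {
                traverse(child);
            }
        }
    }

    /** Runs two handlers over one small tree; returns {calls, methods, nodesVisited}. */
    static int[] run() {
        Node tree = new Node("class",
                new Node("method", new Node("call"), new Node("call")),
                new Node("method", new Node("call")));
        int[] counts = new int[3];
        Driver driver = new Driver();
        driver.register("call", n -> counts[0]++);
        driver.register("method", n -> counts[1]++);
        driver.traverse(tree);
        counts[2] = driver.nodesVisited;
        return counts;
    }

    public static void main(String[] args) {
        int[] c = run();
        System.out.println("calls=" + c[0] + " methods=" + c[1] + " visited=" + c[2]);
    }
}
```

With two handlers registered, each of the six nodes is still visited exactly once; adding more handlers grows the dispatch lists, not the number of tree walks.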

-- Tor

Snow(Siruo) Zhao

Sep 12, 2017, 7:40:02 PM
to lint-dev
Hi Tor,

Thanks for the clarifications. So here I tried to put up all the most up to date docs for lint, please let me know if something is not right:




Source code for built-in checks (not sure if this is the most up-to-date one, since deprecated APIs are still being used here): https://android.googlesource.com/platform/tools/base/+/master/lint/libs/lint-checks/src/main/java/com/android/tools/lint/checks


Happy to learn about other good sources on lint.

Thanks,
Snow

Snow(Siruo) Zhao

Sep 13, 2017, 2:37:31 PM
to lint-dev
Hi Tor,

I sent out another email yesterday asking for your advice on a custom lint check to be used for an educational tech talk. Although I am not seeing it in my sent box now, wondering if you have received it. If not I will try to reproduce the details.

Thanks,
Snow



Tor Norbye

Sep 19, 2017, 1:44:22 PM
to lint-dev
25.3.0 is for lint from the 2.3.0 plugin, e.g. it's missing the UAST APIs etc. It's better to use 26.0.0-beta<latest>. I'm not sure those are available as javadocs, but I'm told we include source jars on maven.google.com (where the 26.* libraries are published) so javadoc should be extractable from those (in fact when IDEs show quickdocs they often pull them from the source jars, not from the javadoc jars.)

-- Tor

Tor Norbye

Sep 19, 2017, 1:44:53 PM
to lint-dev
Yes, I see that I received it -- I've been traveling the last few weeks so I'm just now starting to catch up :-)

Tor Norbye

Sep 19, 2017, 3:43:59 PM
to lint-dev

In particular, since we now publish sources too you can probably find them following the maven directory structure below maven.google.com, but (if you've already built with Gradle using the dependency) they're also available in Gradle's cache; on my machine for example I have this:
~/.gradle/caches/modules-2/files-2.1/com.android.tools.lint/lint/26.0.0-beta5/f170befa8d345f9a6a0c66f1851e0bad002ac8c1/lint-26.0.0-beta5-sources.jar
~/.gradle/caches/modules-2/files-2.1/com.android.tools.lint/lint-api/26.0.0-beta5/de5068faaafa10cb4db11f220f97ff9e5d2cd6b7/lint-api-26.0.0-beta5-sources.jar
~/.gradle/caches/modules-2/files-2.1/com.android.tools.lint/lint-checks/26.0.0-beta5/35dcc463a20f33334bdef947dec4e9f8de86fa56/lint-checks-26.0.0-beta5-sources.jar 
...

Tony Robalik

Jan 16, 2020, 6:23:00 PM
to lint-dev
Sorry for resurrecting an old thread, but I can't find any good docs on UAST. Do I have to use it with lint, or can I use it on its own? I want to be able to understand Java/Kotlin code, but not for the purposes of lint checks. Right now I have an algorithm that literally greps for particular import statements. Ideally, I'd replace that with an AST of some kind. Thanks.

Matthew Gharrity

Jan 16, 2020, 6:24:00 PM
to Tony Robalik, lint-dev