[llvm-dev] TableGen enhancements

Paul C. Anagnostopoulos via llvm-dev

Aug 29, 2020, 3:45:16 PM
to llvm...@lists.llvm.org
Now that I've learned my way around TableGen just a bit, I'd like to solicit
suggestions for improving and enhancing it.

Perhaps there are some lexical changes that could improve readability of .td
files (e.g., I'm planning to enhance the lexer to allow an apostrophe as a
digit group separator in integers, a la C++).
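
A minimal sketch of what such separator handling could look like, written as standalone C++ rather than as a change to the real TableGen lexer. The function name and the exact rules (decimal only, a single apostrophe allowed only between two digits) are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, not TableGen's lexer. Parses a decimal literal in which
// a single apostrophe may sit between two digits (1'000'000). Returns true and
// stores the value in Result on success; returns false on malformed input or
// on overflow.
bool parseDecimalWithSeparators(const std::string &Text, int64_t &Result) {
  int64_t Value = 0;
  bool PrevWasDigit = false;
  for (char C : Text) {
    if (C == '\'') {
      // Reject a leading separator and two separators in a row.
      if (!PrevWasDigit)
        return false;
      PrevWasDigit = false;
      continue;
    }
    if (C < '0' || C > '9')
      return false;
    int Digit = C - '0';
    // Refuse to overflow int64_t instead of wrapping silently.
    if (Value > (INT64_MAX - Digit) / 10)
      return false;
    Value = Value * 10 + Digit;
    PrevWasDigit = true;
  }
  // An empty string or a trailing separator leaves PrevWasDigit false.
  if (!PrevWasDigit)
    return false;
  // The overflow check above guarantees a non-negative value.
  assert(Value >= 0);
  Result = Value;
  return true;
}
```

With these rules, 1'000'000 is accepted and yields 1000000, while '1, 1''0 and 10' are rejected.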

Perhaps there are some syntactic enhancements that would make .td files
easier to read and write.

Perhaps there are common portions of .td files that can be factored out to
reduce future duplications, as with Automaton.td and SearchableTable.td.

Perhaps there are common portions of TableGen backends that can be factored
out to reduce future efforts, resulting in some general-purpose library
methods.

Perhaps there are new features in TableGen that, coupled with enhanced or
new C++ files, would open up possibilities for using TableGen in new areas
of the target-independent code generator.

I don't know how much people have thought about this, but I'm interested in
any ideas you may have.

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Chris Lattner via llvm-dev

Aug 29, 2020, 6:51:02 PM
to Paul C. Anagnostopoulos, llvm...@lists.llvm.org
Instead of syntactic enhancements, I think it would be great to invest in the internal infrastructure in the implementation of TableGen.

People frequently complain about the quality of error messages in TableGen. One big reason for this is that we don’t track source locations very well in the “tablegen AST”. I think that fixing that would be a really nice step towards upgrading the individual diagnostics.

-Chris

Paul C. Anagnostopoulos via llvm-dev

Aug 29, 2020, 7:05:17 PM
to Chris Lattner, llvm...@lists.llvm.org
It's interesting that you bring this up. I've seen a lot of TableGen error messages over the past couple of weeks and I don't recall being confused about any error or its location. I will give this a closer look.

I agree that syntactic enhancements aren't particularly exciting by themselves. I was wondering whether some new features coupled with new backends would pave the way for additional uses for TableGen in the area of code generation (or any other areas, for that matter).

At 8/29/2020 06:50 PM, Chris Lattner wrote:
>Instead of syntactic enhancements, I think it would be great to invest in the internal infrastructure in the implementation of TableGen.
>

>People frequently complain about the quality of error messages in TableGen. One big reason for this is that we don’t track source locations very well in the “tablegen AST”. I think that fixing that would be a really nice step towards upgrading the individual diagnostics.
>
>-Chris

Chris Lattner via llvm-dev

Aug 29, 2020, 9:23:20 PM
to Paul C. Anagnostopoulos, llvm...@lists.llvm.org
My sense (which is mostly historic, I haven’t worked on the code generators for a long time sadly) is that tblgen is reasonable with syntactic and other errors. However, it doesn’t maintain the location info in the AST, so if a tblgen backend wants to report something that is wrong, it points back up to the top level records more often than not.

For example, consider if someone writes an invalid pattern like:

(ADDrr32 EAX, AL)

The AL def is for an 8 bit register, but the instruction requires a 32-bit register. The error message should point to the “AL” token on that line when it complains about it.

This is a very dated memory; it is possible someone has already fixed this up.

-Chris

Paul C. Anagnostopoulos via llvm-dev

Aug 30, 2020, 12:05:24 PM
to Chris Lattner, llvm...@lists.llvm.org
Ah, yes, it appears that the locations are saved only with records, not with any components of them. For better error messages, locations probably need to be saved with Inits. This should be interesting.
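
To make the idea concrete, here is a toy model of per-operand locations for the (ADDrr32 EAX, AL) case. It is only a sketch: SrcLoc, Operand, Pattern and checkOperandWidths are hypothetical stand-ins, not TableGen's Init classes, and the file name and columns used with it are invented.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration; none of these are TableGen's real classes.
struct SrcLoc {
  std::string File;
  unsigned Line;
  unsigned Col;
};

struct Operand {
  std::string Name;
  unsigned Bits; // width of the register this operand names
  SrcLoc Loc;    // where the operand token itself appeared
};

struct Pattern {
  std::string Opcode;
  unsigned RequiredBits; // width the instruction expects of every operand
  std::vector<Operand> Operands;
  SrcLoc Loc; // where the enclosing record starts
};

std::string formatLoc(const SrcLoc &L) {
  assert(!L.File.empty() && "a location needs a file name");
  return L.File + ":" + std::to_string(L.Line) + ":" + std::to_string(L.Col);
}

// Returns one diagnostic per mismatched operand, each anchored at the
// operand's own location rather than at the location of the whole pattern.
std::vector<std::string> checkOperandWidths(const Pattern &P) {
  std::vector<std::string> Diags;
  for (const Operand &Op : P.Operands)
    if (Op.Bits != P.RequiredBits)
      Diags.push_back(formatLoc(Op.Loc) + ": error: operand '" + Op.Name +
                      "' is " + std::to_string(Op.Bits) + " bits, but " +
                      P.Opcode + " requires " +
                      std::to_string(P.RequiredBits) + " bits");
  return Diags;
}
```

The point of the model is that the diagnostic is built from Op.Loc, so it can name the AL token; with only Pattern::Loc available, the best it could do is point at the start of the record.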

At 8/29/2020 09:23 PM, Chris Lattner wrote:
>My sense (which is mostly historic, I haven’t worked on the code generators for a long time sadly) is that tblgen is reasonable with syntactic and other errors. However, it doesn’t maintain the location info in the AST, so if a tblgen backend wants to report something that is wrong, it points back up to the top level records more often than not.


>
>For example, consider if someone writes an invalid pattern like:
>
> (ADDrr32 EAX, AL)
>

>The AL def is for an 8 bit register, but the instruction requires a 32-bit register. The error message should point to the “AL” token on that line when it complains about it.


>
>This is a very dated memory; it is possible someone has already fixed this up.
>
>-Chris

Paul C. Anagnostopoulos via llvm-dev

Aug 31, 2020, 2:32:53 PM
to llvm...@lists.llvm.org
I puttered a bit with the error handling in TableGen. The Searchable Tables backend currently produces the following error message when a generic enum specifies an unknown FilterClass (an error that the parser cannot detect):

search-test.td:3:1: error: Enum FilterClass 'BEntryX' does not exist
def BValues : GenericEnum {
^

Now it produces this message:

search-test.td:4:21: error: Enum FilterClass 'BEntryX' does not exist
  let FilterClass = "BEntryX";
                    ^ [points to the open quote]

That was not a particularly difficult accomplishment. The horror comes when facing all the error message calls throughout the backends.

Paul C. Anagnostopoulos via llvm-dev

Sep 8, 2020, 1:17:26 PM
to llvm...@lists.llvm.org
I spent some time playing with TableGen in order to improve the ability of backends to generate error messages with more precise source locations. The main requirement was to add a slot to the RecordVal class to hold the source location of the statement that defined the field. To make that easier to use, I added a third PrintFatalError() method that accepts a RecordVal and grabs the source location from it.
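
The shape of that change can be sketched in standalone C++. This is not the patch itself: FieldValue and RecordModel are hypothetical stand-ins for RecordVal and Record, and formatError is an invented helper that returns the message text instead of printing it and exiting as PrintFatalError() does.

```cpp
#include <cassert>
#include <string>

struct SrcLoc {
  std::string File;
  unsigned Line;
  unsigned Col;
};

// Stand-in for Record: knows only where the def itself starts.
struct RecordModel {
  std::string Name;
  SrcLoc Loc;
};

// Stand-in for RecordVal, with the added slot for the location of the
// statement that defined the field.
struct FieldValue {
  std::string Name;
  std::string Value;
  SrcLoc Loc;
};

// Overload 1: the caller already has a location in hand.
std::string formatError(const SrcLoc &Loc, const std::string &Msg) {
  assert(!Loc.File.empty() && "a location needs a file name");
  return Loc.File + ":" + std::to_string(Loc.Line) + ":" +
         std::to_string(Loc.Col) + ": error: " + Msg;
}

// Overload 2: report against the record as a whole.
std::string formatError(const RecordModel &Rec, const std::string &Msg) {
  return formatError(Rec.Loc, Msg);
}

// Overload 3: report against one field, using the field's own location.
std::string formatError(const FieldValue &Field, const std::string &Msg) {
  return formatError(Field.Loc, Msg);
}
```

Using the locations from the search-test.td example earlier in the thread, the record overload yields a message starting with search-test.td:3:1, and the field overload one starting with search-test.td:4:21.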

To test these ideas, I worked on the SearchableTable backend. It now includes in every message at least the record location and often the field definition location. I also reworded some messages to make them more consistent.

I will submit a patch to Phabricator soon. The problem now is that every backend has to be modified to take advantage of these changes. I plan to start working on a document about how to write a TableGen backend.

Chris Lattner via llvm-dev

Sep 8, 2020, 1:55:13 PM
to Paul C. Anagnostopoulos, llvm...@lists.llvm.org
Very nice Paul!

I think your approach makes sense. First we need the generic infrastructure to preserve more location information, then each of the backends will need to be updated to take advantage of it.

This will make a lot of developers’ lives better, thank you for working on this!

-Chris

Paul C. Anagnostopoulos via llvm-dev

Sep 8, 2020, 2:12:22 PM
to Chris Lattner, llvm...@lists.llvm.org
I'm happy to give this some attention.

Can you or someone else recommend a moderate-size backend that doesn't have a lot of work going on now, as a second backend to try to improve its error messages?

At 9/8/2020 01:55 PM, Chris Lattner wrote:
>Very nice Paul!
>
>I think your approach makes sense. First we need the generic infrastructure to preserve more location information, then each of the backends will need to be updated to take advantage of it.
>

>This will make a lot of developers’ lives better, thank you for working on this!
>
>-Chris

Simon Pilgrim via llvm-dev

Sep 13, 2020, 10:00:27 AM
to llvm...@lists.llvm.org
Hi Paul,

I'm not sure if you're still looking for additional TableGen tasks, but
something that has been an irritant for years is the performance of
llvm-tblgen on larger targets (X86 and AMDGPU in particular).

https://bugs.llvm.org/show_bug.cgi?id=28222 and
https://bugs.llvm.org/show_bug.cgi?id=44628 are perfect examples: if
you have to debug a later stage of a -gen-dag-isel run, you can find
yourself waiting for 10+ minutes.

There are some pretty nasty O(N^2) loops in
MatcherTableEmitter::EmitMatcherList for instance :-(
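
As a generic illustration of how this kind of quadratic behaviour tends to arise and get fixed (this is not taken from MatcherTableEmitter, and every name below is hypothetical): assigning each distinct item an index by scanning a vector costs O(N) per lookup and O(N^2) over a run, while keeping a hash index next to the vector makes each lookup O(1) on average.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Linear version: every call scans the whole table, so interning N items
// costs O(N^2) comparisons in total.
unsigned internLinear(std::vector<std::string> &Table,
                      const std::string &Item) {
  for (unsigned I = 0; I != Table.size(); ++I)
    if (Table[I] == Item)
      return I;
  Table.push_back(Item);
  return static_cast<unsigned>(Table.size() - 1);
}

// Indexed version: same numbering, but lookups go through a hash map.
struct InternTable {
  std::vector<std::string> Items;                  // index -> item
  std::unordered_map<std::string, unsigned> Index; // item -> index

  unsigned intern(const std::string &Item) {
    // insert() leaves the map untouched if the key is already present.
    auto Result =
        Index.insert({Item, static_cast<unsigned>(Items.size())});
    if (Result.second)
      Items.push_back(Item);
    // Both containers always describe the same set of items.
    assert(Items.size() == Index.size());
    return Result.first->second;
  }
};
```

Both versions hand out identical indices in first-seen order, so a table emitted from either one is the same; only the lookup cost differs.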

Simon.
