Hi everyone,
There was idea to add new directives to FileCheck:
1. Directive to use some patterns as named template with or without parameters.
2. CHECK-INCLUDE - Directive to include other file with checks to another.
3. Expressions repeat for CHECK - If statement should be checked several times repeat modifiers {n}, {n,m} , {,n}, {n,}, *, + can be used.
4. Repeat in regexs - Repeat with current number should become available by using {n}, {n,m} , {,n}, {n,}
5. CHECK-LABEL-DAG - Not sequential order of labels.
6. Check statement for words only - // CHECK-WORD, // CHECK-WORD-NEXT, // CHECK-WORD-SAME, // CHECK-WORD-DAG, // CHECK-WORD-NOT.
7. Wildcard for prefixes - If some statements should be checked regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME and etc.
8. Prefix with regular expressions - If statement should be checked if prefix matches some regular expression, it should be used {{regex}}:, {{regex}}-NEXT and etc.
More information in file https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.
Now we have prototype with these features. It’s tested on LLVM 3.8.
There was found unsupported before directive in old test. Bug about this - https://llvm.org/bugs/show_bug.cgi?id=27852.
There is about 6% slowdown with new features when we tested them on 3.8.
I see that there are some changes in FileCheck LLVM 3.9 with new features too. We can publish patch for 3.8 and it can be adapted for LLVM 3.9. Is it interesting for anyone? And how will be better to publish patch as for 3.8 or for 3.9?
Thanks,
Elena.
On 5/24/16 7:51 AM, Elena Lepilkina via llvm-dev wrote:
> Hi everyone,
>
> There was idea to add new directives to FileCheck:
>
> 1.Directive to use some patterns as named template with or without
> parameters.
>
> 2.CHECK-INCLUDE - Directive to include other file with checks to another.
>
> 3.Expressions repeat for CHECK - If statement should be checked several
> times repeat modifiers {n}, {n,m} , {,n}, {n,}, *, + can be used.
>
> 4.Repeat in regexs - Repeat with current number should become available
> by using {n}, {n,m} , {,n}, {n,}
>
> 5.CHECK-LABEL-DAG - Not sequential order of labels.
>
> 6.Check statement for words only - // CHECK-WORD, // CHECK-WORD-NEXT, //
> CHECK-WORD-SAME, // CHECK-WORD-DAG, // CHECK-WORD-NOT.
What does this ^ do?
>
> 7.Wildcard for prefixes - If some statements should be checked
> regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME
> and etc.
I'm not a fan of this ^ feature. I think it'll make testcases much
harder to understand.
>
> 8.Prefix with regular expressions - If statement should be checked if
> prefix matches some regular expression, it should be used {{regex}}:,
> {{regex}}-NEXT and etc.
I'm not a fan of this ^ feature. I think it'll make testcases much
harder to understand.
>
> More information in file
> https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.
>
> Now we have prototype with these features. It’s tested on LLVM 3.8.
>
> There was found unsupported before directive in old test. Bug about this
> - https://llvm.org/bugs/show_bug.cgi?id=27852.
>
> There is about 6% slowdown with new features when we tested them on 3.8.
>
> I see that there are some changes in FileCheck LLVM 3.9 with new
> features too. We can publish patch for 3.8 and it can be adapted for
> LLVM 3.9. Is it interesting for anyone? And how will be better to
> publish patch as for 3.8 or for 3.9?
Patches that apply on trunk are preferred (assuming the community
accepts these changes).
Jon
>
> Thanks,
>
> Elena.
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm...@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
--
Jon Roelofs
jona...@codesourcery.com
CodeSourcery / Mentor Embedded
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
On 5/24/16 8:10 AM, Elena Lepilkina wrote:
> Hi,
>
> CHECK-WORD - If you want find some string in file, but you want to be sure, that this string should be a separate word.
Is this functionally equivalent to doing:
// CHECK: {{\s}}whatever{{\s}}
Or is there some other subtlety about it?
Jon
8. Prefix with regular expressions - If statement should be checked if prefix matches some regular expression, it should be used {{regex}}:, {{regex}}-NEXT and etc.
Okay, if this feature is undesirable by most of people, I can change published patch by removing this feature. But I would like to know opinion of people developing FileCheck before.
Elena.
From: Ehsan Amiri [mailto:ehsan...@gmail.com]
Sent: Thursday, May 26, 2016 5:35 PM
To: Elena Lepilkina <Elena.L...@synopsys.com>
Cc: llvm-dev <llvm...@lists.llvm.org>
Subject: Re: [llvm-dev] RFC: FileCheck Enhancements
7. Wildcard for prefixes - If some statements should be checked regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME and etc.
But then I should write
// CHECK: something
// SSE: something
// SSE3: something
With this feature it can be write // {{[A-Z0-9]+}} : something
From: James Y Knight [mailto:jykn...@google.com]
Sent: Thursday, May 26, 2016 5:53 PM
To: Ehsan Amiri <ehsan...@gmail.com>
Cc: Elena Lepilkina <Elena.L...@synopsys.com>; llvm-dev <llvm...@lists.llvm.org>
Subject: Re: [llvm-dev] RFC: FileCheck Enhancements
On Thu, May 26, 2016 at 10:35 AM, Ehsan Amiri via llvm-dev <llvm...@lists.llvm.org> wrote:
On May 26, 2016, at 8:01 AM, Elena Lepilkina via llvm-dev <llvm...@lists.llvm.org> wrote:But then I should write// CHECK: something// SSE: something// SSE3: something
> 7. Wildcard for prefixes - If some statements should be checked regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME and etc.
For this one I agree that multiple check prefixes already provides this. The MIPS tests frequently have something like '--check-prefix=ALL --check-prefix=FOO' on one command and '—check-prefix=ALL –check-prefix=BAR'.
> 8. Prefix with regular expressions - If statement should be checked if prefix matches some regular expression, it should be used {{regex}}:, {{regex}}-NEXT and etc.
The previous example isn't very compelling but I can see how this feature could be useful to me. I have a number of tests that do something like:
// O32: something for O32
// N32: something for N32
// N64: something for N64
// NEW: something for both N32 and N64
But this is a bit clearer:
// O32: something for O32
// N32: something for N32
// N64: something for N64
// {{N32|N64}}: something for both N32 and N64
There's also some that define O32, O32EL, and O32EB which could drop the O32 and do:
// {{O32(EL|EB)}}: any endian
// {{O32(EL)}}: little endian
// {{O32(EB)}}: big endian
In this example, I've included redundant parenthesis so that vim's '*' key can find me all the O32 lines, all the little endian lines, etc.
One last example is that I have some tests that define MIPS32R1, MIPS32R2, MIPS32R3, MIPS32R5, MIPS32R6 and MIPS64 equivalents of each.
{{MIPS32R[2-5]}} would match MIPS32R2 through to MIPS32R6
{{MIPS(32|64)R6}} would match MIPS32R6 and MIPS64R6
{{MIPS64.*}} would match any MIPS64
This would remove a lot of redundancy but it's starting to harm readability. I'm not sure where I draw the line on that trade-off but I definitely wouldn't want complicated regexes.
{{MIPS32R[2-5]}} would match MIPS32R2 through to MIPS32R6
{{MIPS(32|64)R6}} would match MIPS32R6 and MIPS64R6
{{MIPS64.*}} would match any MIPS64
Thanks,
Sergey
Seems plausible, although I find the proposed syntax for a template call
with parameters to be very awkward.
> 2. CHECK-INCLUDE - Directive to include other file with checks to another.
> 3. Expressions repeat for CHECK - If statement should be checked several times repeat modifiers {n}, {n,m} , {,n}, {n,}, *, + can be used.
> 4. Repeat in regexs - Repeat with current number should become available by using {n}, {n,m} , {,n}, {n,}
> 5. CHECK-LABEL-DAG - Not sequential order of labels.
After reading through the google doc I see what you are trying to do.
The motivation for CHECK-DAG is to reduce future test "churn" in case
someone makes a change that influences the output order of something,
but the order is not important to the test. The motivation is NOT to
support instability in the output order across runs of the same compiler.
Just so we're clear on that.
In principle I could see how the order of blocks or entire functions
could change over time, with no real relevance to a test, but how
necessary is it really? I have seen work go into Clang/LLVM over the
years to ensure stable output order, and my impression has been that
these changes typically have little complexity or performance effect
while significantly simplifying test effort.
> 6. Check statement for words only - // CHECK-WORD, // CHECK-WORD-NEXT, // CHECK-WORD-SAME, // CHECK-WORD-DAG, // CHECK-WORD-NOT.
I would expect a regex package to provide a word-break meta symbol,
in which case this feature would be redundant. I admit I have not
checked whether the package we use has this feature, or how it would
be spelled.
> 7. Wildcard for prefixes - If some statements should be checked regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME and etc.
> 8. Prefix with regular expressions - If statement should be checked if prefix matches some regular expression, it should be used {{regex}}:, {{regex}}-NEXT and etc.
>
> More information in file https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.
I noticed the google doc stated that multi-line patterns are not
supported. That's not actually the case, although it's a bit obscure:
the [[:space:]] character-class will match EOL and allow you to write
a multi-line CHECK.
>
> Now we have prototype with these features. It's tested on LLVM 3.8.
> There was found unsupported before directive in old test. Bug about this - https://llvm.org/bugs/show_bug.cgi?id=27852.
>
> There is about 6% slowdown with new features when we tested them on 3.8.
Is that a 6% slowdown just in FileCheck, or when running the entire
Clang/LLVM test suite? Making the entire test run 6% more expensive
seems like a lot.
--paulr
>
> I see that there are some changes in FileCheck LLVM 3.9 with new features too. We can publish patch for 3.8 and it can be adapted for LLVM 3.9. Is it interesting for anyone? And how will be better to publish patch as for 3.8 or for 3.9?
>
> Thanks,
> Elena.
Thank you for information about the [[:space:]] character-class.
About performance I tested on Clang/LLVM test suite. I try to profile and the problem is that I used regular expressions a lot for supporting some new features and functions in your regex library are very slow . Regex library is very old and quite awkward, in my opinion.
May be you will see some ways to improve performance, if some features are decided to include in LLVM FileCheck and I publish patches for each feature separately(I saw your comment in published patch).
I see that there are a lot of opinions, may be it will be better to vote. I suggest vote in file https://docs.google.com/spreadsheets/d/1p8Hi_PH3Nd2kEtYveCwKXENmJGOzYWW6us0XK7eVvOw/edit?usp=sharing. There will be history and it's possible to check that nobody votes twice. Then somebody can stop voting at one moment and I will be able to understand what patches I should do and publish if there are some changes you would like.
Thanks,
Elena.
-----Original Message-----
From: Robinson, Paul [mailto:paul.r...@sony.com]
Sent: Friday, May 27, 2016 2:26 AM
To: Elena Lepilkina <Elena.L...@synopsys.com>; llvm...@lists.llvm.org
Subject: RE: [llvm-dev] RFC: FileCheck Enhancements
On 5/26/16 5:25 PM, Robinson, Paul via llvm-dev wrote:
>
>> 7. Wildcard for prefixes - If some statements should be checked regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME and etc.
>> 8. Prefix with regular expressions - If statement should be checked if prefix matches some regular expression, it should be used {{regex}}:, {{regex}}-NEXT and etc.
>>
>> More information in file https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.
>
> I noticed the google doc stated that multi-line patterns are not
> supported. That's not actually the case, although it's a bit obscure:
> the [[:space:]] character-class will match EOL and allow you to write
> a multi-line CHECK.
Also: CHECK-SAME.
Jon
--
Jon Roelofs
jona...@codesourcery.com
CodeSourcery / Mentor Embedded
On 5/27/16 7:41 AM, Jonathan Roelofs via llvm-dev wrote:
>
>
> On 5/26/16 5:25 PM, Robinson, Paul via llvm-dev wrote:
>>
>>> 7. Wildcard for prefixes - If some statements should be checked
>>> regardless prefix, it should be used //{{*}}, //{{*}}-NEXT,
>>> //{{*}}-SAME and etc.
>>> 8. Prefix with regular expressions - If statement should be
>>> checked if prefix matches some regular expression, it should be used
>>> {{regex}}:, {{regex}}-NEXT and etc.
>>>
>>> More information in file
>>> https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.
>>>
>>
>> I noticed the google doc stated that multi-line patterns are not
>> supported. That's not actually the case, although it's a bit obscure:
>> the [[:space:]] character-class will match EOL and allow you to write
>> a multi-line CHECK.
>
> Also: CHECK-SAME.
Whoops... I misread... Ignore me.
Jon
I support this. I found it very confusing when
; CHECK-NEXT: br
to check for a branch instruction also matched a line eg.
%2 = add i32 %subregion, i32 5
However, it is a breaking change and therefore might require changes
in existing tests.
Michael
Adding llvm-commits after the fact ends up in the mailing list not archiving the context of the reviews.
If these patches are fresh (no significant review occurred), they should be closed and new revisions need to be opened, with llvm-commits as a subscriber from the start.
—
Mehdi
Ah, oops. I've started reviews of a few of the patches. If you plan on re-creating the reviews, please mark the relevant in-line comments on these as "Done" so it's easier to track what's changed: D22353, D22403, D22344.
thanks
vedant
Hi,
I thought documentation will be changed after review if some changes are accepted. But as I need to create new patches because of next comments, I’ll add changes in documentation.
Thanks,
Elena.
From: Nemanja Ivanovic [mailto:nemanj...@gmail.com]
Sent: Friday, July 15, 2016 7:19 PM
To: Robinson, Paul <paul.r...@sony.com>
It does send a patch to the list if you upload a new diff to the existing review after adding the list as a subscriber. That seems the best way to go usually?
Thank you, James.
Then I’ll update all patches.
Pardon me for missing that discussion, this may have already been asked before: but is it possible to make arcanist default subscribe the correct commits mailing list in the process? This should make it at least harder to forget.
Cheers
It seems certainly possible, and has been asked before as well, and my memory of the answer is along the line of “patch welcome”.
(I.e. some community member are spending time to customize and maintain our fabricator instance, but they have some real work to do as well, so we’re resource limited).
—
Mehdi
Cool, thanks!
PS. I did some searching for this, and it seems there's no "easy" way to do this customisation via the project config. Probably something that has to be done on the server side. (See https://secure.phabricator.com/Q268).
But as I understood current library is quite old and it is branched from OpenBSD library, and there are no new versions.
Do you suggest to make changes in this version?
> On Jul 19, 2016, at 6:36 AM, Elena Lepilkina via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> Hi all,
>
> I made new patches for most of changes with llvm-commits subscriber. But two patches were updated, because there are a lot of comments (patch for CHECK-WORD and patch for templates pattern). Will it be ok?
IMO it's fine to keep some of the original reviews if you don't want to
discard/recreate their state.
Please list the most up-to-date set of Phab URL's here, with a little note next
to the ones which did not initially CC llvm-commits.
Thanks again for working on this!
vedant
I would rather not introduce churn in our tests by turning on --check-word by
default. I'm also not convinced that turning on --check-word at the test level
is the right move: having a CHECK-WORD directive is more flexible, and not a
serious inconvenience (as compared to writing "CHECK").
>
> 2. FileCheck Enhancement - pattern templates ( https://reviews.llvm.org/D22403)
> There are some doubts about syntax of templates. I agree that use of \#, \:, \= is quite different from usual characters in FileCheck and was chosen because of same approach for escaping in regexp. Adrian Prantl suggested to use double-brackets "[[" to escape.
> Old syntax:
> \#(template_name) - use of template 'template_name'. It can occur in CHECK-PATTERN line, when description of one template includes other templates described before. (Without quote, I don't know how escape # here)
> \:(Variable_name)- template variable with name 'variable_name'
> \:(variable_name)\=(value) - current value of template variable(it's needed when you use template with variables).
> Suggested new syntax:
> [[#template_name]] - use of template 'template_name'. It can occur in CHECK-PATTERN line, when description of one template includes other templates described before. (Without quote, I don't know how escape # here)
> [[:Variable_name]] - template variable with name 'variable_name'
> [[:variable_name=value]] - current value of template variable(it's needed when you use template with variables).
> It'll be great to hear more opinions and suggestions about syntax. May be someone has really good ideas. Then I'll be able to change it.
First, I want to recap the FileCheck workflow Elena is proposing:
1. Define patterns using the CHECK-DEFINE-PATTERN directive. Defined patterns
have a name and may optionally have parameters.
2. Use defined patterns in the usual CHECK* directives.
This is similar to how FileCheck patterns work already. The difference is
that the patterns are defined using a dedicated directive, *not* when the
pattern is first encountered. E.g, here is what you can do today:
// RUN: echo "%r1 %r2" | FileCheck %s
// CHECK: %[[register:[a-z]+]]1
// CHECK-SAME: %[[register]]2
With the proposed changes, we'll be able to write something like:
// RUN: echo "%cmp %cmp" | FileCheck %s
// CHECK-DEFINE-PATERN: register(n): {{[a-z]+}}n
// CHECK: %[[register("1")]]
// CHECK-SAME: %[[register("2")]]
I saw "something like" because we haven't decided on the syntax for defining
and using patterns (that's what this thread is for). Briefly, here's the syntax
I'd like to use:
// Defining patterns.
CHECK-DEFINE-PATERN: <Name>(<Ident>, ...)?: <Pattern>
Where <Pattern> is a list of <PatternElement>, and a <PatternElement> is
either a regex ('{{' POSIX_REGEX '}}') or an argument identifier (IDENT).
// Using patterns.
CHECK: [[<Name>(<Argument>, ...)?]]
Fleshing this out some more, here is my candidate grammar (see the end of this
email for the current grammar):
ACTION <- CHECK ':' MATCH '\n' ;
ACTION <- CHECK-DEFINE-PATTERN ':' IDENT PARAMLIST? ':' PATTERN_ELEMENT* '\n' ;
PARAMLIST <- '(' IDENT (',' IDENT)* ')' ;
PATTERN_ELEMENT <- IDENT ;
PATTERN_ELEMENT <- REGEX ;
MATCH <- ;
MATCH <- TEXT MATCH ;
MATCH <- REGEX MATCH ;
MATCH <- VAR MATCH ;
REGEX <- '{{' POSIX_REGEX '}}' ;
VAR <- '[[' IDENT ':' POSIX_REGEX ']]' ;
VAR <- '[[' IDENT ARGLIST? ']]' ;
ARGLIST <- '(' ARG (',' ARG)* ')' ;
ARG <- "([^"]|\\")*" ;
> 3. FileCheck Enhancement - repeats in regular expressions (https://reviews.llvm.org/D22454), FileCheck Enhancement - Including files (https://reviews.llvm.org/D22500), FileCheck Enhancement - Expressions repeat for CHECK and CHECK-NEXT(https://reviews.llvm.org/D22501), FileCheck Enhancement - CHECK-LABEL-DAG(https://reviews.llvm.org/D22502), FileCheck Enhancement - prefixes-regular expressions (https://reviews.llvm.org/D22503)
> There were no comments about these enhancements at all. Your opinions are very important.
I personally am waiting for some version of D22403 to land in-tree before
starting on the other reviews. This would help me gauge what others in the
community are thinking and what they need.
>
> I hope that some of these changes will be useful for FileCheck users, so I need your opinions to get opportunity for review to be resumed.
thanks,
vedant
Original FileCheck grammar (shamelessly copied from the grammar Adrian posted
to D22403):
ACTION <- CHECK ':' MATCH '\n' ;
MATCH <- ;
MATCH <- TEXT MATCH ;
MATCH <- REGEX MATCH ;
MATCH <- VAR MATCH ;
REGEX <- '{{' POSIX_REGEX '}}' ;
VAR <- '[[' IDENT ':' POSIX_REGEX ']]' ;
VAR <- '[[' IDENT ']]' ;
I'd rather drop the parens when there are no parameters, but I have no hard
arguments for why that's a better approach. I'll post a follow-up to Mehdi's
email with a suggestion for how to disambiguate pattern usages from normal
matches.
> I have doubts about syntax of pattern usage. I think that '?' at the end will be inconspicuous.
The '?' just means: "the preceding bit of the grammar is optional". It isn't a
literal '?'. So, '[[' IDENT ARGLIST? ']]' would match both '[[' IDENT ']]' and
'[[' IDENT ARGLIST ']]'. Maybe I'm misunderstanding -- did you mean that having
an optional ARGLIST is the real issue?
> Moreover this is not similar to current syntax in FileCheck. Metacharacters are usually used at the beginning, aren't they? Why not then describe usage as ?[[<pattern>]] ?
>
> One advantages of current syntax is that if pattern is used inside regular expression there is no need to interrupt regular expression and then start it again. But with new syntax I will need to write something like this {{regex_part1}}[[<pattern>?]]{{regex_part2}}.
Yes. But, I don't view this as a disadvantage of the new syntax. IMO this is a
bit easier to read.
thanks,
This is a problem with my suggested gramamr (specifically: it doesn't provide a
way to use defined patterns to capture text for later reference). Fred brought
up that he was confused by this too.
One way to fix it is to use '@' before using patterns. I'll recap the suggested
grammar and work through another example. Here's how you'd define a pattern:
CHECK-DEFINE-PATTERN: car(make, model, year): {{Found a }} make model {{, from }} year
And here's how you'd use it, *without* capturing the text into a variable:
CHECK: [[@car("Honda", "Accord", "2009")]]
(Note, this matches the text: "Found a Honda Accord, from 2009".)
To use a pattern and capture the matched text in "MY_CAR", you'd write:
CHECK: [[MY_CAR @car("Honda", "Accord", "2009")]]
This gives us an unambiguous way to capture text matched via a defined pattern,
which is visually distinct from how normal regex-based capturing works.
Note that subsequent uses of "MY_CAR" should work as expected (i.e, you can do
'[[MY_CAR]]' later in the test).
Revised grammar:
ACTION <- CHECK ':' MATCH '\n' ;
ACTION <- CHECK-DEFINE-PATTERN ':' IDENT PARAMLIST? ':' PATTERN_ELEMENT* '\n' ;
PARAMLIST <- '(' IDENT (',' IDENT)* ')' ;
PATTERN_ELEMENT <- IDENT | REGEX;
MATCH <- (TEXT | REGEX | PATTERN_USE | VAR)* ;
REGEX <- '{{' POSIX_REGEX '}}' ;
PATTERN_USE <- '[[' '@' IDENT ']]' ;
VAR <- '[[' IDENT ':' POSIX_REGEX ']]' ;
VAR <- '[[' IDENT '@' IDENT ARGLIST? ']]' ;
ARGLIST <- '(' ARG (',' ARG)* ')' ;
ARG <- "([^"]|\\")*" ;
vedant
—
Mehdi
About CHECK-WORD. There is no problem to return first patch with CHECK-WORD, CHECK-WORD-SAME and etc.
I added option because there were a lot of opinions that add a lot of new directives is a bad idea.
Thanks,
Elena.
With the syntax I'm suggesting, there is a clear distinction between parameters
and regexes inside of pattern definitions. This is really valuable to me (and
probably others): it means that I don't have to learn an alternate,
FileCheck-specific flavor of regexes. That could also make it easier to switch
to a different regex engine in the future.
The tradeoff here is that 'simple strings' inside of pattern definitions need
to be wrapped in '{{' '}}', i.e they are actually just regexes. We'd also need
logic to escape metacharacters in pattern arguments. IMO, this is a fair
tradeoff. Note that you can still apply regexes to parameters, like this:
CHECK-DEFINE-PATTERN: one_or_more(x): x {{+}}
Wdyt?
[[REGISTER:%[[register(1)]]]]
Is this saying that we should match a literal '%' character, followed by a
literal '[' character etc.? Or, is it a pattern application? There's no way to
disambiguate the two.
That is the role of the '@' character in my suggested grammar. It makes the
grammar unambiguous, and it makes it easy to notice pattern applications. So:
[[REGISTER @register("1")]]
and
[[REGISTER: register]]]
Are two totally different things, and it should be obvious that they are
different.
vedant
> Sergey wrote:
>
> I also like proposed grammar. One minor correction, I guess you want to allow pattern use with parameters as well:
>
> PATTERN_USE <- '[[' '@' IDENT ARGLIST? ']]' ;
Thanks, that looks better!
vedant
So, to match one or more things you could write:
CHECK-DEFINE-PATTERN: one_or_more(x): #x {{+}}
CHECK: [[@one_or_more("1")]]
Without the '#', you'd see a syntax error because '+' isn't a valid regex.
I like this approach because it doesn't require changing the definitions of
REGEX or POSIX_REGEX. It'd be interesting to hear what other people think.
We should be able to hash this out in parallel to the work on D22403, since the
plan is to defer work on pattern arguments until basic support for pattern
definitions has landed.
Revised grammar (** proposal **):
ACTION <- CHECK ':' MATCH '\n' ;
ACTION <- CHECK-DEFINE-PATTERN ':' IDENT PARAMLIST? ':' PATTERN_ELEMENT* '\n' ;
PARAMLIST <- '(' IDENT (',' IDENT)* ')' ;
PATTERN_ELEMENT <- '#'? IDENT ;
PATTERN_ELEMENT <- REGEX ;
MATCH <- (TEXT | REGEX | PATTERN_USE | VAR)* ;
REGEX <- '{{' POSIX_REGEX '}}' ;
PATTERN_USE <- '[[' '@' IDENT ARGLIST? ']]' ;
VAR <- '[[' IDENT ':' POSIX_REGEX ']]' ;
VAR <- '[[' IDENT '@' IDENT ARGLIST? ']]' ;
ARGLIST <- '(' ARG (',' ARG)* ')' ;
ARG <- "([^"]|\\")*" ;
best,
vedant
This seems like a really complicated set of extensions to filecheck for marginal gain. I’m not enthused by the idea that you’d have to actually utter this in every individual test that needs to use the feature, because that would lead to boilerplate.
Instead of doing this, has anyone considered baking “modes” into Filecheck to support the important clients (e.g. LLVM IR, MC, clang, etc)? This would mean that a test could just add a “--mode=llvmir” flag to filecheck and get a bunch of baked in patterns, potentially with magic syntax. Something like this would avoid having to redundantly enter "%[0-9]+” a kajillion times all over the place.
-Chris
-----Original Message-----
From: v...@apple.com [mailto:v...@apple.com]
Sent: Tuesday, September 13, 2016 4:05 AM
To: Elena Lepilkina <Elena.L...@synopsys.com>
Cc: Mehdi Amini <mehdi...@apple.com>; llvm...@lists.llvm.org
Subject: Re: [llvm-dev] RFC: FileCheck Enhancements
I hadn't considered adding modes to FileCheck. That seems like it could work
pretty well.
Even if that's the route we're going to take, I'd prefer that the FileCheck
modes simply expose pre-defined patterns instead of introducing magic syntax.
It would be a shame to have to hack FileCheck to use check patterns for clients
that aren't LLVM IR, clang, etc.
Having check patterns available would let me simplify some of my tests, making
them easier to update. I hope that we can find a minimal/unintrusive way of
adding at least a basic version of this feature. But, you're right that we
shouldn't do it unless it's generally useful.
vedant
Why?
This is a tradeoff: the more extensible you make filecheck (by introducing ever more complicated DSLs into it) the more esoteric it gets, and the less likely that someone will be able to actually UNDERSTAND a random filecheck invocation that they come across.
The advantage of adding a few hard coded modes to filecheck is that we can have specific standardized solutions to important use cases that the LLVM project has. FileCheck lives in service of LLVM and its downstream projects, it shouldn’t aspire to be some general tool that needs a DSL. If a downstream project outside of the LLVM project (e.g. the Rust compiler) wanted its own mode, I think it would be fine to add it, and having a gaggle of modes seems preferable to me than building in a complex DSL for describing named patterns.
Let me start by saying that adding a few hard-coded patterns to FileCheck could
work well for many of llvm's tests. I'd be perfectly fine with that approach.
My thinking was that if it isn't easy to teach FileCheck about new patterns,
it's less likely that people will do it. I'm not interested in making FileCheck
a general-purpose testing tool, but I would like to use custom patterns to
simplify in-tree tests that aren't in *.ll or *.s files.
I don't think that all extensions to the FileCheck syntax would cause serious
maintainability issues. Being able to define patterns in some tests could
actually make them more readable. However, if there isn't a way to extend
FileCheck without creating serious readability issues, I agree that we
shouldn't do it.
vedant
Ok, it sounds like you agree that modes would be a good way to solve the most common case. How about we start with that. After that lands and gets adopted, we can see how big the middle zone served by custom pattern is, and what we are leaving on the table by not supporting it.
-Chris
How you would define the “modes” without a DSL? Hardcoding them inside FileCheck itself?
It seems that it would make the bar higher to add new pattern, and lock-us with this particular hard-coded solution (i.e. moving to a DSL solution after the fact would be harder).
On the opposite, I would expect that having the “DSL” support would make it easier to create/define these “modes”, even if we don’t use the “advanced DSL" in the tests themselves (we may even be able to forbid the use of these patterns outside of “mode file” if it is a strong concern).
I tend to think that with a low entry bar to create a “mode file”, it may be easier to have for example a “mode file” that defines patterns for some target-specific register format, and include it only in the tests for this particular target.
—
Mehdi
Yes.
> It seems that it would make the bar higher to add new pattern,
How so? It is as easy to change filecheck as it is to change a testcase, at least in the llvm repo.
> and lock-us with this particular hard-coded solution (i.e. moving to a DSL solution after the fact would be harder).
I don’t see how. The uses of filecheck in practice are very stratified based on what is being tested (e.g. llc output vs .ll files), I think that < 12 modes would cover the vast majority of the cases we care about.
> On the opposite, I would expect that having the “DSL” support would make it easier to create/define these “modes”, even if we don’t use the “advanced DSL" in the tests themselves (we may even be able to forbid the use of these patterns outside of “mode file” if it is a strong concern).
Right, but this is exactly what I *don’t* want. I don’t want it to be “too easy” to create micro DSLs, I want the bar to be somewhat higher so people use fewer better modes, rather than geeking out and getting crazy with a custom DSL that is only used in one or two tests.
Most people encounter a random variety of tests based on when they break them in their development. We don’t want them to have to spend an excessive amount of time to understand “how the test works” before they are able to dig into “what they did that broke the test”. Fewer but better modes is good this way.
-Chris