This is an archive of the discontinued LLVM Phabricator instance.

FileCheck: Allow CHECK-{SAME,NEXT,EMPTY} after initial CHECK-DAG
AbandonedPublic

Authored by thopre on Jan 16 2019, 9:03 AM.

Details

Reviewers
jdenny
probinson
Summary

When following a CHECK-DAG block, FileCheck correctly matches
CHECK-NEXT, CHECK-SAME and CHECK-EMPTY relative to the CHECK-DAG that
matches the latest line in the input file, as one would expect. However,
FileCheck refuses to run if the CHECK-DAG directives are the first in
the file. This commit lift this restriction.

Note: there is debate in the comments over whether the inconsistency should be fixed by forbidding CHECK-DAG before a CHECK-EMPTY, CHECK-SAME or CHECK-NEXT always. More opinions welcome.

Diff Detail

Event Timeline

thopre created this revision.Jan 16 2019, 9:03 AM

The revision title and summary contradict each other. The title says to allow something, but the summary says that already works; then the summary says CHECK-DAG can't be first, while the check-dag.txt test clearly already has CHECK-DAG as the first directive.
Please fix up the title and summary so that they correctly describe what you are trying to accomplish. This will help reviewers to make sure the patch does what you want.

Hi Thomas,

Thanks for working to clean up this area.

As I understand it, CHECK-NEXT, CHECK-SAME, and CHECK-EMPTY currently can appear after non-initial CHECK-DAG blocks, and this patch intends to permit them after initial CHECK-DAG blocks as well. I agree with Paul that the title should be adjusted to reflect that.

However, I remember some disagreement in the past as to whether these directives should be permitted after any CHECK-DAG at all. We should probably resolve any remaining disagreement before relaxing things further.

Consider the example that this patch adds to the docs:

// CHECK-DAG: mov r0, #0
// CHECK-DAG: mov r1, #1
// CHECK-NEXT: add r2, r0, r1

Unmatched lines are permitted to appear between CHECK-DAG matches, but unmatched lines are not permitted to appear between whichever CHECK-DAG match happens to occur last and the CHECK-NEXT match. Is that a realistic use case?

A realistic use case would probably involve something like a CHECK-DAG-NEXT directive that doesn't permit intervening lines anywhere in the block. Or should the above example actually indicate exactly that?

Do you have a use case from a real test to inform this discussion?

llvm/docs/CommandGuide/FileCheck.rst
255–259

"ie." -> "i.e.,"

If "ie." is common in some context, please give me a pointer.

284–287

"ie." -> "i.e.,"

452

"word" -> "words"

llvm/lib/Support/FileCheck.cpp
845

"ie." -> "i.e.,"

Hi Thomas,

Thanks for working to clean up this area.

As I understand it, CHECK-NEXT, CHECK-SAME, and CHECK-EMPTY currently can appear after non-initial CHECK-DAG blocks, and this patch intends to permit them after initial CHECK-DAG blocks as well. I agree with Paul that the title should be adjusted to reflect that.

Correct.

However, I remember some disagreement in the past as to whether these directives should be permitted after any CHECK-DAG at all. We should probably resolve any remaining disagreement before relaxing things further.

Indeed, I was not aware of the disagreement.

Consider the example that this patch adds to the docs:

// CHECK-DAG: mov r0, #0
// CHECK-DAG: mov r1, #1
// CHECK-NEXT: add r2, r0, r1

Unmatched lines are permitted to appear between CHECK-DAG matches, but unmatched lines are not permitted to appear between whichever CHECK-DAG match happens to occur last and the CHECK-NEXT match. Is that a realistic use case?

Fair enough, any clobber can happen between the two, forgot about that. Note that I can see several existing tests in the testsuite that have CHECK-NEXT right after CHECK-DAG which are probably not as strict in terms of testing as their author were expecting, e.g.

test/Analysis/LazyValueAnalysis/lvi-after-jumpthreading.ll
test/Bitcode/thinlto-function-summary-originalnames.ll
test/CodeGen/AArch64/arm64-ccmp.ll
test/CodeGen/AArch64/arm64-indexed-vector-ldst.ll (though it's a single DAG between a CHECK and CHECK-NEXT so it's a special case)
test/CodeGen/AArch64/f16-instructions.ll

and many more.

A realistic use case would probably involve something like a CHECK-DAG-NEXT directive that doesn't permit intervening lines anywhere in the block. Or should the above example actually indicate exactly that?

Do you have a use case from a real test to inform this discussion?

Agreed, something like CHECK-DAG-NEXT would be better suited. The requirement of CHECK-DAG with CHECK-NEXT was mentioned to me by a coworker and I noticed the weird inconsistency regarding whether CHECK-DAG is the first check directives or not and thought I'd fix it. The behaviour we need is indeed reordering without any line not checked for in between using CHECK-DAG is not the right solution. So I guess the correct approach would be to error out on CHECK-DAG with CHECK-NEXT, CHECK-SAME or CHECK-EMPTY, probably with some flag to enable the old behavior while tests are updated.

Regarding CHECK-DAG-NEXT, I wonder how it would interact with CHECK-DAG? Perhaps we should introduce some sort of only-tested region (e.g. CHECK-ONLY-START and CHECK-ONLY-END with better name) where CHECK-DAG inside would still be able to be reordered but only what is tested by a CHECK-DAG is allowed. It could also allow CHECK directive inside which would behave as CHECK-NEXT. How does that sound?

thopre updated this revision to Diff 183610.Jan 25 2019, 1:26 PM

Fix typos and improve title

thopre retitled this revision from Allow CHECK-SAME, NEXT and EMPTY after CHECK-DAG to FileCheck: Allow CHECK-{SAME,NEXT,EMPTY} after initial CHECK-DAG.Jan 25 2019, 1:27 PM
thopre edited the summary of this revision. (Show Details)

Fair enough, any clobber can happen between the two, forgot about that. Note that I can see several existing tests in the testsuite that have CHECK-NEXT right after CHECK-DAG which are probably not as strict in terms of testing as their author were expecting, e.g.

test/Analysis/LazyValueAnalysis/lvi-after-jumpthreading.ll
test/Bitcode/thinlto-function-summary-originalnames.ll
test/CodeGen/AArch64/arm64-ccmp.ll
test/CodeGen/AArch64/arm64-indexed-vector-ldst.ll (though it's a single DAG between a CHECK and CHECK-NEXT so it's a special case)
test/CodeGen/AArch64/f16-instructions.ll

and many more.

Not surprising. I remember now that I've put a CHECK-SAME after a CHECK-DAG because I wanted to look for items that could occur in any order on a single line. I don't remember now if I did that as a best approximation of what I wanted or if I mistakenly thought it already achieved what I wanted.

A realistic use case would probably involve something like a CHECK-DAG-NEXT directive that doesn't permit intervening lines anywhere in the block. Or should the above example actually indicate exactly that?

Do you have a use case from a real test to inform this discussion?

Agreed, something like CHECK-DAG-NEXT would be better suited. The requirement of CHECK-DAG with CHECK-NEXT was mentioned to me by a coworker and I noticed the weird inconsistency regarding whether CHECK-DAG is the first check directives or not and thought I'd fix it. The behaviour we need is indeed reordering without any line not checked for in between using CHECK-DAG is not the right solution. So I guess the correct approach would be to error out on CHECK-DAG with CHECK-NEXT, CHECK-SAME or CHECK-EMPTY, probably with some flag to enable the old behavior while tests are updated.

Possibly so. Given that many of those existing cases were probably intending the stricter behavior, let's first create a better solution for the stricter behavior.

Regarding CHECK-DAG-NEXT, I wonder how it would interact with CHECK-DAG?

I think that any kind of CHECK-DAG* following a different kind of CHECK-DAG* would start a new block such that there's no reordering among blocks. Otherwise, the semantics become very unclear to me.

It seems intuitive that the first match in a CHECK-DAG-NEXT block would have to appear on the line after whatever match came before the block. However, if you don't want that behavior, then I'm not sure how you would express that.

Perhaps we should introduce some sort of only-tested region (e.g. CHECK-ONLY-START and CHECK-ONLY-END with better name) where CHECK-DAG inside would still be able to be reordered but only what is tested by a CHECK-DAG is allowed. It could also allow CHECK directive inside which would behave as CHECK-NEXT. How does that sound?

Or would that CHECK inside behave as a CHECK-SAME? In general, how would whitespace be handled? Error or ignored?

What you propose sounds very similar to something I suggested at one point: CHECK-NOT-PUSH and CHECK-NOT-POP. The idea is you would push a pattern into -implicit-check-not so the pattern would be rejected among matches between the push and the pop. This would be useful in any context where you want -implicit-check-not but don't want it for the entire input.

To achieve something like CHECK-DAG-SAME, the pattern would specify a single newline, and you'd put CHECK-DAG directives in between the push and pop. To achieve something like CHECK-DAG-NEXT, you'd do the same but each CHECK-DAG's pattern would be specified to gobble up the next newline. Then again, those two cases might be easier to write and diagnostics might be clearer if we created special-purpose directives, but they could probably be built on the same infrastructure.

If all that makes sense to you, there are some details we could discuss about how CHECK-NOT-PUSH and CHECK-NOT-POP would interact with neighboring CHECK-NOT directives and neighboring matches.

I've thought about these ideas some more. While ideas like CHECK-NOT-{PUSH,POP} might be more general, I'm inclined to believe that CHECK-DAG-NEXT blocks and CHECK-DAG-SAME blocks could be significantly more user-friendly for the use cases we've been discussing. In any case, before anyone starts implementing one of these ideas, we should probably discuss the details and reach an agreement.

[SNIP]

Agreed, something like CHECK-DAG-NEXT would be better suited. The requirement of CHECK-DAG with CHECK-NEXT was mentioned to me by a coworker and I noticed the weird inconsistency regarding whether CHECK-DAG is the first check directives or not and thought I'd fix it. The behaviour we need is indeed reordering without any line not checked for in between using CHECK-DAG is not the right solution. So I guess the correct approach would be to error out on CHECK-DAG with CHECK-NEXT, CHECK-SAME or CHECK-EMPTY, probably with some flag to enable the old behavior while tests are updated.

Possibly so. Given that many of those existing cases were probably intending the stricter behavior, let's first create a better solution for the stricter behavior.

Regarding CHECK-DAG-NEXT, I wonder how it would interact with CHECK-DAG?

I think that any kind of CHECK-DAG* following a different kind of CHECK-DAG* would start a new block such that there's no reordering among blocks. Otherwise, the semantics become very unclear to me.

It seems intuitive that the first match in a CHECK-DAG-NEXT block would have to appear on the line after whatever match came before the block. However, if you don't want that behavior, then I'm not sure how you would express that.

No that makes sense.

Perhaps we should introduce some sort of only-tested region (e.g. CHECK-ONLY-START and CHECK-ONLY-END with better name) where CHECK-DAG inside would still be able to be reordered but only what is tested by a CHECK-DAG is allowed. It could also allow CHECK directive inside which would behave as CHECK-NEXT. How does that sound?

Or would that CHECK inside behave as a CHECK-SAME? In general, how would whitespace be handled? Error or ignored?

Whatever is consistent with the concept of CHECK-DAG-NEXT I guess. Am suddenly wondering whether 2 CHECK-DAG can match on the same line if they don't overlap? Anyway, this is just an idea, we should have a discussion on llvm-dev to gather requirements from people who used CHECK-DAG when they wanted something else and come up with a good design.

What you propose sounds very similar to something I suggested at one point: CHECK-NOT-PUSH and CHECK-NOT-POP. The idea is you would push a pattern into -implicit-check-not so the pattern would be rejected among matches between the push and the pop. This would be useful in any context where you want -implicit-check-not but don't want it for the entire input.

Oh nice, it would be useful indeed.

To achieve something like CHECK-DAG-SAME, the pattern would specify a single newline, and you'd put CHECK-DAG directives in between the push and pop. To achieve something like CHECK-DAG-NEXT, you'd do the same but each CHECK-DAG's pattern would be specified to gobble up the next newline. Then again, those two cases might be easier to write and diagnostics might be clearer if we created special-purpose directives, but they could probably be built on the same infrastructure.

If all that makes sense to you, there are some details we could discuss about how CHECK-NOT-PUSH and CHECK-NOT-POP would interact with neighboring CHECK-NOT directives and neighboring matches.

I'd love to participate in the discussion yes. If you create a thread please Cc me.

Perhaps we should introduce some sort of only-tested region (e.g. CHECK-ONLY-START and CHECK-ONLY-END with better name) where CHECK-DAG inside would still be able to be reordered but only what is tested by a CHECK-DAG is allowed. It could also allow CHECK directive inside which would behave as CHECK-NEXT. How does that sound?

Or would that CHECK inside behave as a CHECK-SAME? In general, how would whitespace be handled? Error or ignored?

Whatever is consistent with the concept of CHECK-DAG-NEXT I guess. Am suddenly wondering whether 2 CHECK-DAG can match on the same line if they don't overlap?

Yep:

$ cat check 
CHECK-DAG: abc
CHECK-DAG: def
$ echo 'def abc' | ./bin/FileCheck -v -dump-input=always check |& tail -5
<<<<<<
       1: def abc
dag:1         ^~~
dag:2     ^~~
>>>>>>

Anyway, this is just an idea, we should have a discussion on llvm-dev to gather requirements from people who used CHECK-DAG when they wanted something else and come up with a good design.

Agreed. Would you like to lead that discussion? While I'm happy to comment, I don't have time for much more here at the moment.

thopre abandoned this revision.Feb 5 2019, 3:37 AM

As suggested by Joel, I'll start a discussion on llvm-dev to solve the real issue.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2019, 3:37 AM