Page MenuHomePhabricator

Fix PR46880: Fail CHECK-NOT with undefined variable
AcceptedPublic

Authored by thopre on Aug 19 2020, 8:48 AM.

Details

Summary

Currently a CHECK-NOT directive succeeds whenever the corresponding
match fails. However match can fail due to an error rather than a lack
of match, for instance if a variable is undefined. This commit makes match
error a failure for CHECK-NOT.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jhenderson added inline comments.Sep 8 2020, 1:09 AM
llvm/lib/Support/FileCheck.cpp
2014 ↗(On Diff #288570)

Could you investigate a bit more and then update the comment with your findings? This code at the moment also skips some whitespace, which isn't always desirable (imagine if you had a regex match looking for some explicit whitespace before some string).

thopre added inline comments.Sep 15 2020, 7:03 AM
llvm/lib/Support/FileCheck.cpp
2014 ↗(On Diff #288570)

The code was added in da108b4ed4e6e7267701e76d5fd3b87609c9ab77 when CHECK-NEXT was added and at the same time as the "scanning from here" diagnostic was added. There was no CHECK-SAME at the time and I guess the goal was to avoid confusion by showing the last matching line (even though the column info would show it's the end of the line). There was no regular expression match either, so no way to match a newline.

I think this should simply be removed, in a separate patch. What is your opinion?

jhenderson added inline comments.Sep 16 2020, 12:38 AM
llvm/lib/Support/FileCheck.cpp
2014 ↗(On Diff #288570)

I'm inclined to agree with you. Perhaps worth adding a TODO or FIXME note here and at the other location, referencing each other, saying that this should probably be removed?

@probinson, (or anybody else) do you have any thoughts on this?

Thanks for working on this. I'm sorry to start reviewing so late. I haven't digested the implementation fully yet, but I have a few comments to get started.

llvm/include/llvm/Support/FileCheck.h
107 ↗(On Diff #290106)

The name MatchError and its comment seem too general. It could easily describe many of the following enumerators. I suggest:

/// Indicates a match that can never succeed because the pattern is found to
/// be undefined at match time (e.g., contains an undefined variable).
MatchUndefined,

Undefined variables are probably the only such case we have right now, and maybe that will always be true, but I'm trying to ensure this can handle more cases if needed in the future.

Also, the middle paragraph in the preceding comments is now incorrect. Here's a possible revision:

/// A directive's supplied pattern is said to be either undefined, expected,
/// or excluded depending on whether the pattern is well formed at match time
/// and whether it must have or must not have a match in order for the 
/// directive to succeed.  For example, a CHECK directive's pattern is 
/// normally expected, a CHECK-NOT directive's pattern is normally excluded, 
/// but either directive's pattern is undefined if it contains an undefined
/// variable at match time.  The match result type MatchUndefined is for 
/// undefined patterns.  All match result types whose names end with 
/// "Excluded" are for excluded patterns.  All other match result types are 
/// for expected patterns.

Here, I'm assuming you consistently use the new enumerator for undefined variables, regardless of whether a directive is positive or negative. I haven't checked the implementation, but the test suite changes seems to confirm that.

llvm/test/FileCheck/dump-input-annotations.txt
626 ↗(On Diff #290106)

This test used to cover printing of variable definitions when a positive directive fails. It no longer does.

A separate test could be added to cover that case. But do we need to drop that info when a variable is undefined? It could be helpful for debugging.

llvm/utils/FileCheck/FileCheck.cpp
201 ↗(On Diff #290106)

If you agree to changing MatchError to MatchUndefined, I suggest changing error: match error to error: match undefined.

thopre added inline comments.Oct 15 2020, 3:19 AM
llvm/test/FileCheck/dump-input-annotations.txt
626 ↗(On Diff #290106)

We currently only print substitutions if there are no errors. I think if we want to print substitutions in case of undefined variable we should do it regardless of whether there are errors to be consistent. Obviously that means ignoring substitution errors in printSubstitutions

jdenny added inline comments.Oct 15 2020, 12:10 PM
llvm/test/FileCheck/dump-input-annotations.txt
626 ↗(On Diff #290106)

What's an example of an error besides undefined variable for which we currently don't print substitutions but you're suggesting we would?

thopre marked an inline comment as done.Oct 16 2020, 1:30 AM
thopre added inline comments.
llvm/test/FileCheck/dump-input-annotations.txt
626 ↗(On Diff #290106)

Overflow and underflow come to mind. There might be others.

% cat overflow_undef.test
BIGVAR=10000000000000000
CHECK: BIGVAR: [[#BIGVAR:0x8000000000000000+0x8000000000000000]] [[#UNDEFVAR]]
% ./bin/FileCheck --dump-input=never --input-file overflow_undef.test overflow_undef.test
overflow_undef.test:2:19: error: unable to substitute variable or numeric expression: overflow error
CHECK: BIGVAR: [[#BIGVAR:0x8000000000000000+0x8000000000000000]] [[#UNDEFVAR]]
                  ^
zsh: exit 1     ./bin/FileCheck --dump-input=never --input-file overflow_undef.test
jdenny added inline comments.Oct 16 2020, 8:27 AM
llvm/include/llvm/Support/FileCheck.h
107 ↗(On Diff #290106)

Another possibility that might be clearer is MatchPatternMalformed.

llvm/test/FileCheck/dump-input-annotations.txt
626 ↗(On Diff #290106)

In the case of overflow/underflow, I'm in favor of printing used variable definitions. It would be especially helpful if it would print any variable definitions used in the expression that overflowed/underflowed, but printing other used variable definitions could be helpful as well.

Your example is a bit different than the case I was pointing out originally: in your example, there are no used variable definitions to report. There is however an undefined variable. I am also in favor of printing all errors (e.g., overflow and undefined variable) from a directive instead of just the first (overflow), where error recovery makes that feasible.

Again, if supporting all this info isn't feasible, then please at least add another test so we don't lose coverage of -dump-input printing used variable definitions when a positive directive with a well formed pattern fails to match.

(By the way, I just tried a few examples (without the current patch), and I'm noticing issues with overflow errors with respect to -dump-input. The traditional error message doesn't print undefined variables, but -dump-input does. -dump-input reports error: no match found twice and doesn't report the overflow. My master is a couple of months old already, so ignore me if all this has been fixed.)

llvm/utils/FileCheck/FileCheck.cpp
201 ↗(On Diff #290106)

And MatchPatternMalformed would obviously become something like error: match pattern malformed.

thopre marked an inline comment as done.Dec 15 2020, 2:45 PM

@jhenderson I've made a patch to remove that skip-past-end-of-line code: https://reviews.llvm.org/D93341. Sorry for the long delay.

thopre updated this revision to Diff 312072.Dec 15 2020, 4:37 PM
thopre marked 6 inline comments as done.

Address most of the comments.

Given that I've gone cold on this change and need to go through it again properly, and the number of other reviews that are on my plate, in addition to higher priority internal work I need to do before the Christmas break (my last day is Friday), I doubt I'll get a chance to look at this until some time in the New Year. Sorry! If people feel it's important to land sooner than that, please don't let my lack of review block this (I can always do post-commit reviews if needed).

thopre added inline comments.Dec 16 2020, 2:02 AM
llvm/lib/Support/FileCheck.cpp
2014 ↗(On Diff #288570)

I've created https://reviews.llvm.org/D93341. I'll rebase this patch on top of it once the approach is agreed upon.

Given that I've gone cold on this change and need to go through it again properly, and the number of other reviews that are on my plate, in addition to higher priority internal work I need to do before the Christmas break (my last day is Friday), I doubt I'll get a chance to look at this until some time in the New Year. Sorry! If people feel it's important to land sooner than that, please don't let my lack of review block this (I can always do post-commit reviews if needed).

Tests still need to be fixed prior to landing this patch anyway, so don't worry it can wait. I've started working on that and sent 3 diff for review. A lot more will be needed though.

thopre updated this revision to Diff 333458.Thu, Mar 25, 4:24 PM
thopre marked an inline comment as done.

Rebase and massively simply thanks to Joel's work.

llvm/test/FileCheck/dump-input-annotations.txt
626 ↗(On Diff #290106)

Does this updated patch solve your concern? Ignore all the tests failures, I still haven't gotten round to look at those.

I haven't looked through the tests yet, but the implementation logic looks right to me. Thanks!

llvm/lib/FileCheck/FileCheck.cpp
2128

Is this change intentional? There's already a NotFoundError handler below (I'd prefer to keep its comment).

llvm/lib/FileCheck/FileCheckImpl.h
232

When is this function called now?

758

It looks like everything after the "or" is no longer true.

thopre updated this revision to Diff 333518.Fri, Mar 26, 3:03 AM
thopre marked 3 inline comments as done.

Address comments

llvm/lib/FileCheck/FileCheckImpl.h
232

It is a virtual function and thus needs to be defined. It's not called anywhere but I cannot remove it.

You say there are many test failures. Should I hold off reviewing the test changes already in this patch, or are those fine?

llvm/lib/FileCheck/FileCheckImpl.h
232

Makes sense. So the point of the change here is, in case this function is one day called, to have a log message that is consistent with other ErrorInfo log messages now that old message here is no longer used by printSubstitutions. Right?

thopre marked an inline comment as done.Fri, Mar 26, 8:16 AM

You say there are many test failures. Should I hold off reviewing the test changes already in this patch, or are those fine?

Oh no you can review, those are errors in tests that this patch uncover from what I could see (I only looked at the first 10 or so). I'll mail llvm-def to ask people's help. The only thing missing are unit tests.

llvm/lib/FileCheck/FileCheckImpl.h
232

This change was made because I wanted to have a general error catcher in printNoMatch that would catch both UndefVarError and ErrorDiagnostic using the log() and message() methods from ErrorInfoBase. When changing the message to fit the current one I thought the "uses" is not really needed and is not consistent with other error message and the quotes and escaping are not needed since variable names are known to be well behaved.

However UndefVarError has no source manager attached to it and thus cannot print an error message with location info because there's no SourceMgr available in places where the error is created. Therefore I had to do this translation from UndefVarError to ErrorDiagnostic. But I've just found a way to reuse the log message (see last update) so it's all good.

thopre updated this revision to Diff 333558.Fri, Mar 26, 8:16 AM
thopre marked an inline comment as done.

Avoid error message duplication

jdenny added inline comments.Fri, Mar 26, 9:23 AM
llvm/lib/FileCheck/FileCheckImpl.h
232

Ah, looks good.

llvm/test/FileCheck/dump-input/annotations.txt
701

Please restore the correct alignment here and elsewhere in this file.

(I'm starting to think it might have been better to use -strict-whitespace in most of these tests. I find myself turning it on temporarily anyway while adjusting tests because the squashed whitespace makes debugging confusing. My original goal for leaving it off was to make test updates easier if we were to adjust formatting, but that rarely happens. What do you think?)

705–706

This fixme can go now.

thopre updated this revision to Diff 333572.Fri, Mar 26, 9:47 AM
thopre marked an inline comment as done.

Fix whitespace

llvm/test/FileCheck/dump-input/annotations.txt
701

If whitespace is important it should be on. If formatting changes then we do want things to fail where whitespace is important.

It seems the main issue left to address is tests that this patch breaks.

llvm/test/FileCheck/dump-input/annotations.txt
701

The first test in this file checks whitespace in formatting. The remaining tests do not check whitespace, but I find them to be more legible when at least aligned, such as aligning the error and possible intended match lines above with the other diagnostics. To help maintain that and make debugging easier, maybe -strict-whitespace is worthwhile. One difficulty is that some tests involve multiple verbosity levels, across which whitespace varies, but regexes could help there.

It seems we're accumulating many stylistic changes to make to these tests, so maybe I'll write a patch one day to tackle it all.

It seems the main issue left to address is tests that this patch breaks.

A lot of the failures (e.g. llvm/test/Transforms/GVN/big-endian.ll, llvm/test/Transforms/LoopVectorize/pr34681.ll, llvm/test/Transforms/LoopVectorize/X86/x86-pr39099.ll) are tests that use several CHECK-NOT with one line defining a variable and another one using it. The intent seems to be to check that a sequence of lines are not happening. These could be converted to a combined CHECK-NOT with {{(.*[[:space:]])+}} between each of the individual CHECK-NOT pattern involved but it quickly becomes unreadable. I think this highlight the need for a new feature like "CHECK-NOT-NEXT" if that makes sense. It might not be the best name, because I'm sure there will be case where they want successive line so we'd want the CHECK-NOT equivalent of (1) two consecutive CHECK, (2) a CHECK and a CHECK-NEXT and (3) a CHECK and a CHECK-SAME.

What do you think? Is adding a new feature first the best way forward? If yes, how would you design it?

A lot of the failures (e.g. llvm/test/Transforms/GVN/big-endian.ll, llvm/test/Transforms/LoopVectorize/pr34681.ll, llvm/test/Transforms/LoopVectorize/X86/x86-pr39099.ll) are tests that use several CHECK-NOT with one line defining a variable and another one using it.

How many tests does this patch break?

What do you think? Is adding a new feature first the best way forward? If yes, how would you design it?

Those tests are passing without this patch, right? That should mean the CHECK-NOT that defines the variable never matches, so it should be impossible for the CHECK-NOT that uses the variable to match even if we had the feature you're suggesting. A quick fix is then to remove or comment out the CHECK-NOT that uses the variable. I believe the tests would be just as robust as now.

Before trying to design a new feature, I'd pursue that quick fix and see what the test authors say.

A lot of the failures (e.g. llvm/test/Transforms/GVN/big-endian.ll, llvm/test/Transforms/LoopVectorize/pr34681.ll, llvm/test/Transforms/LoopVectorize/X86/x86-pr39099.ll) are tests that use several CHECK-NOT with one line defining a variable and another one using it.

How many tests does this patch break?

60, one of them being the FileCheck unittest which I haven't adapted yet. I've already made 4 patches solving 5 tests:

https://reviews.llvm.org/D99581
https://reviews.llvm.org/D99582
https://reviews.llvm.org/D99583
https://reviews.llvm.org/D99589

What do you think? Is adding a new feature first the best way forward? If yes, how would you design it?

Those tests are passing without this patch, right? That should mean the CHECK-NOT that defines the variable never matches, so it should be impossible for the CHECK-NOT that uses the variable to match even if we had the feature you're suggesting. A quick fix is then to remove or comment out the CHECK-NOT that uses the variable. I believe the tests would be just as robust as now.

Before trying to design a new feature, I'd pursue that quick fix and see what the test authors say.

You're right, let's try that.

thopre added a comment.Thu, Apr 1, 2:51 PM

How many tests does this patch break?

60, one of them being the FileCheck unittest which I haven't adapted yet. I've already made 4 patches solving 5 tests:

Down to 19 when applying these diffs on top of current main branch:

https://reviews.llvm.org/D99581
https://reviews.llvm.org/D99582
https://reviews.llvm.org/D99589
https://reviews.llvm.org/D99602
https://reviews.llvm.org/D99770
https://reviews.llvm.org/D99771

thopre updated this revision to Diff 335320.Mon, Apr 5, 1:57 PM

Fix unit test failure, still need to add unit tests to cover new code

thopre added a comment.Mon, Apr 5, 1:59 PM

How many tests does this patch break?

60, one of them being the FileCheck unittest which I haven't adapted yet. I've already made 4 patches solving 5 tests:

All tests fixed except for 2 related ones but with a RFC review. Most fix have already landed, 12 still in review (incl. the RFC one). We're almost there. :-)

thopre updated this revision to Diff 335543.Tue, Apr 6, 9:10 AM
  • Add unit test for multiple errors in match()
  • Rebase
thopre added a comment.Thu, Apr 8, 3:41 AM

@jdenny I have no more changes to make to this patch and the CI is now green. Are you ok with the patch?

jdenny accepted this revision.Thu, Apr 8, 1:10 PM

Thanks for all that work on the other tests! Still LGTM.

This revision is now accepted and ready to land.Thu, Apr 8, 1:10 PM
thopre added a comment.Fri, Apr 9, 2:07 AM

Thanks for all that work on the other tests! Still LGTM.

Thanks! Note there's still 4 dependent patches waiting for review so don't be surprised if it takes me a few days before I can land this.

thopre updated this revision to Diff 336630.Sat, Apr 10, 1:47 PM

Fix comment