Page MenuHomePhabricator

Fix PR46880: Fail CHECK-NOT with undefined variable
Needs ReviewPublic

Authored by thopre on Aug 19 2020, 8:48 AM.

Details

Summary

Currently a CHECK-NOT directive succeeds whenever the corresponding
match fails. However match can fail due to an error rather than a lack
of match, for instance if a variable is undefined. This commit makes match
error a failure for CHECK-NOT.

Diff Detail

Event Timeline

thopre created this revision.Aug 19 2020, 8:48 AM
thopre requested review of this revision.Aug 19 2020, 8:48 AM
thopre planned changes to this revision.Aug 19 2020, 3:46 PM

FYI: I've only looked at a couple of the testsuite failures and they are all genuine use of undefined variable. There's 59 of them in total.

I am not well familar with the FileCheck logic probably to verify the code,
but following tests are failing for me with this diff applied:
(e.g.: Transforms/LoopVectorize/pr34681.ll fails with "error: uses undefined variable(s): "TEST"")

LLVM :: CodeGen/AArch64/aarch64-tbz.ll
LLVM :: CodeGen/AArch64/machine-outliner-retaddr-sign-thunk.ll
LLVM :: CodeGen/AArch64/speculation-hardening.ll
LLVM :: CodeGen/AMDGPU/amdgpu.private-memory.ll
LLVM :: CodeGen/AMDGPU/private-memory-r600.ll
LLVM :: CodeGen/Hexagon/hwloop3.ll
LLVM :: CodeGen/Mips/sr1.ll
LLVM :: CodeGen/PowerPC/ctrloops-softfloat.ll
LLVM :: CodeGen/PowerPC/ppc-disable-non-volatile-cr.ll
LLVM :: CodeGen/PowerPC/ppc-shrink-wrapping.ll
LLVM :: CodeGen/PowerPC/ppc64-i128-abi.ll
LLVM :: Transforms/CodeGenPrepare/ARM/sink-add-mul-shufflevector.ll
LLVM :: Transforms/Coroutines/coro-split-sink-lifetime-02.ll
LLVM :: Transforms/GVN/PRE/load-pre-nonlocal.ll
LLVM :: Transforms/GVN/big-endian.ll
LLVM :: Transforms/HardwareLoops/ARM/structure.ll
LLVM :: Transforms/InferFunctionAttrs/annotate.ll
LLVM :: Transforms/LoopVectorize/X86/x86-pr39099.ll
LLVM :: Transforms/LoopVectorize/pr34681.ll
LLVM :: Transforms/NewGVN/big-endian.ll

looks they all have undefined variables. So, seems this patch works well!
I've leaved 2 nits inline too.

llvm/lib/Support/FileCheck.cpp
2023

Perhaps a bit cleaner would be to return false, since it it known that UndefVar is always false here.

llvm/test/FileCheck/var-scope.txt
10

So, there are no ERRUNDEF check line anymore and seems you can remove it from here (and below) and switch to --check-prefix (just here)?

grimar added a comment.EditedAug 20 2020, 2:16 AM

FYI: I've only looked at a couple of the testsuite failures and they are all genuine use of undefined variable. There's 59 of them in total.

Oh, OK, I did not see this comment when wrote mine one.

thopre updated this revision to Diff 286750.Aug 20 2020, 2:22 AM
thopre marked 2 inline comments as done.
  • Address review comments
  • rename PrintNoMatch to printNoMatch to satisfy clang-tidy

There seems to be a number of failures like due to pattern like (taken from clang/test/CodeGen/debug-info-extern-call.c):

DECLS-FOR-EXTERN-NOT: !DICompileUnit({{.*}}retainedTypes: ![[RETTYPES:[0-9]+]]
DECLS-FOR-EXTERN-NOT: ![[RETTYPES]] = !{

I think FileCheck should forbid variable definition in a CHECK-NOT pattern. Granted with this patch we'll catch all the uses of variables defined in CHECK-NOT but we won't catch if they are not used.

There seems to be a number of failures like due to pattern like (taken from clang/test/CodeGen/debug-info-extern-call.c):

DECLS-FOR-EXTERN-NOT: !DICompileUnit({{.*}}retainedTypes: ![[RETTYPES:[0-9]+]]
DECLS-FOR-EXTERN-NOT: ![[RETTYPES]] = !{

I think FileCheck should forbid variable definition in a CHECK-NOT pattern. Granted with this patch we'll catch all the uses of variables defined in CHECK-NOT but we won't catch if they are not used.

I have no idea what that is even trying to do. A definition within a CHECK-NOT doesn't make sense - either the pattern will match, enabling the capture of the variable, and FileCheck to fail (because of a positive CHECK-NOT match), or it will not match, meaning the variable can't be used. In other words, I think forbidding variable definitions in CHECK-NOT makes sense. It's probably worth bringing this up on llvm-dev and cfe-dev though, to give people a head's up/ask for opinions. There may be a way this actually is useful, but if so, I don't know it.

This commit makes match error a failure for CHECK-NOT and changes PrintNoMatch to print all CHECK-NOT failure in a CHECK-NOT block rather than just the first one.

Are these two inherently linked, or could you split them into two separate changes?

llvm/lib/Support/FileCheck.cpp
1997

I'm not a fan on using booleans to indicate success or failure of a function, for two reasons:

  1. It's easy to forget to check the result, meaning errors will be missed.
  2. It's not obvious without reading whether a true indicates success or failure.

Can we use an Error instead?

2002

I'm not a fan of how this variable is being used throughout this function - it feels like it would be easy to forget to check it when doing something or check it incorrectly, making this function hard to maintain.

Could we bail out early if an undefined variable is detected?

llvm/test/FileCheck/numeric-expression.txt
249

Is this change related?

There seems to be a number of failures like due to pattern like (taken from clang/test/CodeGen/debug-info-extern-call.c):

DECLS-FOR-EXTERN-NOT: !DICompileUnit({{.*}}retainedTypes: ![[RETTYPES:[0-9]+]]
DECLS-FOR-EXTERN-NOT: ![[RETTYPES]] = !{

I think FileCheck should forbid variable definition in a CHECK-NOT pattern. Granted with this patch we'll catch all the uses of variables defined in CHECK-NOT but we won't catch if they are not used.

I have no idea what that is even trying to do. A definition within a CHECK-NOT doesn't make sense - either the pattern will match, enabling the capture of the variable, and FileCheck to fail (because of a positive CHECK-NOT match), or it will not match, meaning the variable can't be used. In other words, I think forbidding variable definitions in CHECK-NOT makes sense. It's probably worth bringing this up on llvm-dev and cfe-dev though, to give people a head's up/ask for opinions. There may be a way this actually is useful, but if so, I don't know it.

I have no idea either. I've found another case of 2 different invocation of FileCheck with different prefix where one invocation sets a variable and the other use it. I understand the intent having seen the output but I have so far no idea on how to fix it.

This commit makes match error a failure for CHECK-NOT and changes PrintNoMatch to print all CHECK-NOT failure in a CHECK-NOT block rather than just the first one.

Are these two inherently linked, or could you split them into two separate changes?

I think they can be separated if that change land first.

thopre updated this revision to Diff 287246.Aug 23 2020, 2:45 AM
thopre marked 3 inline comments as done.
  • Address comments
  • Restructure
thopre edited the summary of this revision. (Show Details)Aug 23 2020, 2:45 AM
jhenderson added inline comments.Aug 24 2020, 12:46 AM
llvm/lib/Support/FileCheck.cpp
2030–2031

Why is there commented out code here?

2302–2306

Aside from the DirectiveFail = true line here, this and the other two instances of this pattern look identical to me. I guess that's partly my fault for the suggestion earlier of not returning a boolean to indicate an error. Perhaps part of the problem previously was the suggestion that the printNoMatch function even returned anything. I certainly wouldn't expect it.

How about we wrap this common code in a helper function, called handleMatchFailure or something like that, and that can return a boolean saying whether the failure was due to an error or regular failure? We could invent an enum that indicates the same thing but with better words, to avoid the boolean, but that might be going further than necessary. An alternative would be to pass in a lambda OnMatchError or something, which does nothing in the two other cases, but here sets the DirectiveFail variable.

thopre updated this revision to Diff 288570.Fri, Aug 28, 3:30 AM
thopre marked 2 inline comments as done.

Address review comments

thopre added inline comments.Fri, Aug 28, 1:55 PM
llvm/lib/Support/FileCheck.cpp
2030–2031

Oops.

Code mostly looks fine now, but there are still plenty of test failures being reported by the pre-merge bot. What's the situation with those currently?

llvm/lib/Support/FileCheck.cpp
2064

This might be a dumb question, but I'm not following why we're skipping '\t' and ' ' here. It certainly doesn't line up with the comment above.

2081–2096

Can't these two be folded into a single handleAllErrors call? IIRC, handleAllErrors takes any number of handler lambdas, and iterates through them until one matches, or it runs out. Thus the above code should look like the something like the suggested edit.

thopre marked an inline comment as done.Thu, Sep 3, 3:42 PM

Code mostly looks fine now, but there are still plenty of test failures being reported by the pre-merge bot. What's the situation with those currently?

I've started trying to fix them but sadly most of them are non trivial which is even more difficult when one is not familiar with what the test is trying to check. It seems a fair amount are using several CHECK-NOT as a way to say "I don't want THIS followed by THAT to match the input", where THAT uses a variable defined in THIS. It basically expects a back reference. This can be fixed by merging the two CHECK-NOT when they don't use different prefix but it seems to be yet another example where a grouping directive would be useful, e.g.:

CHECK-REGION-START: REGION-A
CHECK: THIS
CHECK: THAT
CHECK-REGION-END: REGION-A
CHECK-NOT: REGION-A

This is just an example, better syntax would need to be use. The reason I suggest a region rather than a new special-purpose directive is I wanted to have something like REGION-DAG where directives in a region are CHECK: that need to appear linearly but directives in different regions can be interleaved. Anyway, my plan is to try to merge those CHECK-NOT directives and then we can discuss how to introduce new syntax to make this more readable.

llvm/lib/Support/FileCheck.cpp
2064

I'm not sure, this is copied over from PrintNoMatch. I thought it is used so that the "scanning from here" shows the first non blank line after last match but I haven't looked in details.

2081–2096

It can be merged yes. It's a relic from when I would group all undefined variables in one error message in which case the undefined variable errors needs to be processed together. I'll change that once I'm back to work on Monday.

thopre updated this revision to Diff 290106.Sat, Sep 5, 12:37 PM
thopre marked an inline comment as done.

Merge handleErrors in handleAllErrors below

jhenderson added inline comments.Tue, Sep 8, 1:09 AM
llvm/lib/Support/FileCheck.cpp
2064

Could you investigate a bit more and then update the comment with your findings? This code at the moment also skips some whitespace, which isn't always desirable (imagine if you had a regex match looking for some explicit whitespace before some string).

thopre added inline comments.Tue, Sep 15, 7:03 AM
llvm/lib/Support/FileCheck.cpp
2064

The code was added in da108b4ed4e6e7267701e76d5fd3b87609c9ab77 when CHECK-NEXT was added and at the same time as the "scanning from here" diagnostic was added. There was no CHECK-SAME at the time and I guess the goal was to avoid confusion by showing the last matching line (even though the column info would show it's the end of the line). There was no regular expression match either, so no way to match a newline.

I think this should simply be removed, in a separate patch. What is your opinion?

jhenderson added inline comments.Wed, Sep 16, 12:38 AM
llvm/lib/Support/FileCheck.cpp
2064

I'm inclined to agree with you. Perhaps worth adding a TODO or FIXME note here and at the other location, referencing each other, saying that this should probably be removed?

@probinson, (or anybody else) do you have any thoughts on this?