Introduce CHECK modifiers that change the behavior of the CHECK directive. Also add a LITERAL modifier for cases where matching could end requiring escaping strings interpreted as regex where only literal/fixed string matching is desired (where LITERAL would make the CHECK's easier to write, less fragile/difficult to interpret).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
You should add examples for the other directives as well (CHECK-LITERAL-DAG, CHECK-LITERAL-NOT, CHECK-LITERAL-COUNT)
llvm/docs/CommandGuide/FileCheck.rst | ||
---|---|---|
663–669 | Proposed rewording. If not convinced by it you should at least fix the syntax for mentioning a directive and the typo on "escaping". | |
llvm/include/llvm/FileCheck/FileCheck.h | ||
79 | I know that's what the above line did but I'd simply phrase it as per suggestion. | |
llvm/lib/FileCheck/FileCheck.cpp | ||
1597–1600 | Given how simple is this I would put it in the class declaration. |
On a more general note, I wonder if having new directives is the right way to go. Why not introduce a concept of directive modifier? It could allow multiple modifier to cumulate. We could even introduce modifier in the middle of a directive at some point (though obviously not for a directive that was marked literal).
How about: CHECK{LITERAL}? Or perhaps CHECK{\L}. Other directives would follow naturally and CHECK-EMPTY{\L} would be a nop.
Good point, indirectly this is what this is doing, LITERAL is more a modifier than a directive (and yes CHECK-EMPTY is a nop - I was in between asserting that CHECK-LITERAL-EMPTY is not used, or just allowing it as it means the same). This pushes the modifier parsing to the end. Let me give that a go.
I have to say, I'm not super enthused about this feature. Typically we've added directives when there's no other way to achieve the desired effect; the LITERAL feature doesn't meet that bar. It's a way to simplify something that you could do already. Not to say I'm going to insist on killing it, but I would like to hear more justification.
Does this excessive-escaping situation come up very often in practice? I saw a relatively simple case recently, {{{.*}}} which is, a pair of braces with anything in between. (Which FileCheck doesn't parse in the intuitive way, but it does work as intended.) But that's about the most complicated thing I've come across lately.
SAME could be achieved without needing SAME, COUNT too, and NEXT could be folded in too by extending to multi-line regex and verifying no newlines so too DAG. So I believe there is a goal with directives to make checks readable for users & simple to interpret rather than just possible. Escaping is possible but obscures what is being tested and easy to get wrong.
Does this excessive-escaping situation come up very often in practice?
Very often indeed in MLIR as dense constants use [] pairs for values, so if you have a multi-dimensional tensor value you end up with a lot of escaping.
Fair. I'd forgotten about COUNT and that multi-line regexes are possible. (Although I will point out that SAME does not always do the same thing as pasting the directives together with .* separators; see my "FileCheck Follies" lightning talk from the Fall 2016 Dev Meeting.)
"Easy to get wrong" is a worthwhile point too.
Does this excessive-escaping situation come up very often in practice?
Very often indeed in MLIR as dense constants use [] pairs for values, so if you have a multi-dimensional tensor value you end up with a lot of escaping.
Okay.
Nice talk, I had not seen it before.
"Easy to get wrong" is a worthwhile point too.
Yes, I had a couple of O(10) chat messages exchange with different folks that were frustrated and thought "mmm, could we make this better/easier"
llvm/docs/CommandGuide/FileCheck.rst | ||
---|---|---|
664 | Fix this line to have the right number of ~ or the doc build will break. If you don't have it setup, I recommend you enable the docs build on your machine and build the html pages, to flush out any problems. It would be nice to have an example of how to use the LITERAL modifier, both in conjunction with a normal check and a more complicated directive. I think this was there before, looking at the diff history, but it's disappeared again. | |
llvm/include/llvm/FileCheck/FileCheck.h | ||
96 | literal -> Literal, as per clang-tidy warning. Same issue in various other cases in the new code (variables in the core LLVM project use UpperCase) | |
llvm/lib/FileCheck/FileCheck.cpp | ||
1611 | ||
1612 | Would it be simpler to just do Prefix + str + getModifiersDescription()? Not sure formatv really gets us anything here. | |
llvm/test/FileCheck/check-literal.txt | ||
2 | It would be nice to have a brief comment at the start of the test explaining what this test is actually testing (specifically, it's testing the literal modifier, and not something else that might be considered "literal"). You should probably have a case with a suffix following {LITERAL} to show the behaviour when the : doesn't immediately follow it. Also, where there's no closing } after LITERAL. Is it worth checking the use with -NOT too? Not sure on that one, but maybe. |
llvm/test/FileCheck/check-literal.txt | ||
---|---|---|
4 | Here and in other comments, I suggest using ;; for the comment marker - the double marker helps distinguish it from an actual FileCheck or lit directive. We have tended to do that in some newer tests in the LLVM binutils at least, and I think others I've been involved in reviewing. | |
28–29 | I think this needs %ProtectFileCheckOutput, but am not 100% sure if that applies for "not" cases (I think it does). Also, I have a personal preference for continuation lines to be formatted as suggested in the inline edit. The idea is that it clearly shows from looking at the first line that you've got to the end of that command, whilst the second line is indented shwoing that it is a continuation. |
Updated test comments to be ;; , using ProtectFileCheckOutput for piped case, switch | location for continuation
The implementation of the feature looks fine to me, but I'd like to defer to @thopre and @probinson on whether the feature actually gets accepted or not, so not marking as accepted myself.
Could you add some more tests:
- negative tests for the modifier syntax (e.g. "{LITERAL,}", "{}", "{LITTERAL}", "{LITERAL"
- positive tests with several modifiers (i.e. "{LITERAL,LITERAL}")
It's looking good otherwise. Thanks for making the code already prepared for multiple modifiers. I think we might want to have a shorter version of the modifier but better keep it for later when the need is felt.
llvm/docs/CommandGuide/FileCheck.rst | ||
---|---|---|
667 | I suggest to match the formalism used to introduced numeric expression, as per edit | |
669 | When there'll be more than one I'll suggest using a list but since there's only one supported value I propose folding it inside the text above (as per edit) | |
llvm/include/llvm/FileCheck/FileCheck.h | ||
73 | I'd remove the value, it'll automatically take the last enumerator + 1 and won't require changing for every new modifier. | |
llvm/lib/FileCheck/FileCheck.cpp | ||
1659–1684 | This comment needs updating | |
1668–1673 | Perhaps allow whitespaces? | |
llvm/test/FileCheck/check-literal.txt | ||
32 | It took me a while to parse that sentence but I'm not sure how to improve it. Any suggestion @jhenderson ? Perhaps "This ensures an error is correctly reported when a NOT directive with a LITERAL modifier matches"? |
Adding more negative tests, updating doc & comments, allow whitespace in modifier list.
llvm/lib/FileCheck/FileCheck.cpp | ||
---|---|---|
1689–1690 | ||
1691–1692 | Looking at this again, I'm not sure these two should be folded together. I think it would make more sense to have the : check separate to the { check, and then not to call ConsumeModifiers in the event of the :, since there are none. | |
llvm/test/FileCheck/check-literal.txt | ||
21 | Also A{,LITERAL}? | |
31 | Also add a space before the comma, so that the before and after aspects of the whitespace trimming are tested. | |
32 | The latest version seems reasonable to me. I might suggest "an error" -> "a failure" since error could be implied to mean a problem (e.g. undefined variable) as opposed to a failed CHECK directive. | |
45 | I'd make this more obvious, e.g.INVALID. |
llvm/lib/FileCheck/FileCheck.cpp | ||
---|---|---|
1662–1663 | No because these are used by CHECK-NEXT and the rest, so I would need to repeat this for each of them there. |
The test is currently failing on the pre-merge bot :-)
llvm/test/FileCheck/check-literal.txt | ||
---|---|---|
47 | Should this be 'INVALID'? |
llvm/test/FileCheck/check-literal.txt | ||
---|---|---|
47 | Any reason, you're not testing the closing '? |
llvm/test/FileCheck/check-literal.txt | ||
---|---|---|
47 | Yes, as it ends with :' which I feel could be improved (either by just dropping the : or also adding a regex for modifiers as optional there, preference towards the former at the moment). So wanted to just rest the essential part here and have the update of error message in follow up as it could affect more tests (e.g., make this and it more directed changes). |
No more comments from me at this time, but as before, I'd prefer others to confirm they're happy with the concept.
llvm/test/FileCheck/check-literal.txt | ||
---|---|---|
47 | Thanks for the explanation. |
I suggested making it a modifier rather than a new directive so I'm of course happy with it. Will let others speak up if they want but you're good to go from me
Closed in https://github.com/llvm/llvm-project/commit/44f399ccc12e27d20bae1ea7e712ef7f71e2ff3a (unfortunately omitted from commit description :-/)
Fix this line to have the right number of ~ or the doc build will break. If you don't have it setup, I recommend you enable the docs build on your machine and build the html pages, to flush out any problems.
It would be nice to have an example of how to use the LITERAL modifier, both in conjunction with a normal check and a more complicated directive. I think this was there before, looking at the diff history, but it's disappeared again.