This is an archive of the discontinued LLVM Phabricator instance.

[FileCheck] Do not skip end of line in diagnostics
ClosedPublic

Authored by thopre on Dec 15 2020, 2:43 PM.

Details

Summary

When commit da108b4ed4e6e7267701e76d5fd3b87609c9ab77 introduced
the CHECK-NEXT directive, it added logic to skip to the next line when
printing a diagnostic if the current matching position is at the end of
a line. This was fine while FileCheck did not support regular expression
but since it does now it can be confusing when the pattern to match
starts with the expectation of a newline (e.g. CHECK-NEXT: {{\n}}foo).
It is also inconsistent with the column information in the diagnostic
which does point to the end of line.

This commit removes this logic altogether, such that failure to match
diagnostic for such cases would show the end of line and be consistent
with the column information. The commit also adapts all existing
testcases accordingly.

Note to reviewers: An alternative approach would be to restrict the code
to only skip to the next line if the first character of the pattern is
known not to match a whitespace-like character. This would respect the
original intent but keep the inconsistency in terms of column info and
requires more code. I've only chosen this current approach by laziness
and would be happy to restrict the logic instead.

Diff Detail

Event Timeline

thopre created this revision.Dec 15 2020, 2:43 PM
thopre requested review of this revision.Dec 15 2020, 2:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2020, 2:43 PM
jhenderson accepted this revision.Jan 7 2021, 12:38 AM

Personally, I think the new behaviour you're adding makes more sense. The old behaviour could get particularly confusing, I imagine, when matching literal tabs/spaces (e.g. due to --strict-whitespace) etc if the marker moves on later than expected. LGTM, but I think it would be good to get one or two others to give their input.

This revision is now accepted and ready to land.Jan 7 2021, 12:38 AM
jdenny added a comment.Jan 7 2021, 9:46 AM

Thanks, I like this change as well.

Note to reviewers: An alternative approach would be to restrict the code
to only skip to the next line if the first character of the pattern is
known not to match a whitespace-like character. This would respect the
original intent but keep the inconsistency in terms of column info and
requires more code. I've only chosen this current approach by laziness
and would be happy to restrict the logic instead.

It's tempting to do something like that given that X so often appears by itself at the end of an input dump line now. On the other hand, given that FileCheck's matching behavior can be difficult to understand, an accurate representation of its behavior seems more important than a small improvement to input dump readability.

Another alternative to consider is to skip only newlines. In my experience, it's rare to have patterns that match newlines. It's not impossible, so it might be worthwhile to not skip newlines when the pattern contains [[:space:]].

Sorry for so many nits in the test expected output. I intentionally didn't use --strict-whitespace in many of the input dump tests so that there won't need to be too many updates if we later change something trivial like whitespace padding. For test case readability, I'd still like it to reflect the correct semantics and basic alignment. If we're actually missing bugs as a result of not using --strict-whitespace, then maybe there should be a few more --strict-whitespace tests.

llvm/test/FileCheck/dump-input-annotations.txt
445

Messages aren't aligned properly here. To fix this, the input dump implementation needs to reserve a column for newlines.

633

Hopefully this isn't how it looks in the actual dump. The X and error message should be at the end of the line. Please verify and update the test case for readability.

llvm/test/FileCheck/dump-input-context.txt
123

Again, X and the message should be at the end of the line. Please verify.

126

The message should be at the end of the line. Not sure about the ?.

(Interesting that your change adds a suggested match.)

llvm/test/FileCheck/dump-input-enable.txt
254

Message at end of line.

llvm/test/FileCheck/dump-input-filter.txt
71

Shouldn't the X be at the newline?

131

X at newline?

191

X at newline?

223

X at newline?

thopre updated this revision to Diff 315229.Jan 7 2021, 1:30 PM
thopre marked 9 inline comments as done.

Fix alignment of diagnostics and update testcases

jdenny added inline comments.Jan 7 2021, 2:42 PM
llvm/test/FileCheck/dump-input-annotations.txt
451–453

For consistency with the rest of this patch, I think the newline here and on subsequent lines should be shown to be part of the dag:4 search range. That is, there should be another ~. See my comments in FileCheck.cpp.

llvm/utils/FileCheck/FileCheck.cpp
666 ↗(On Diff #315229)

There should be a COS << ' '; here.

673 ↗(On Diff #315229)

I think - NewLine should be removed here so that the ~ loop below has enough iterations.

696 ↗(On Diff #315229)

With my above suggestion, this + Newline should be removed.

thopre added inline comments.Jan 7 2021, 3:02 PM
llvm/utils/FileCheck/FileCheck.cpp
666 ↗(On Diff #315229)

I'm not sure this is a good idea because people might wonder why there's a space in the input. Right now the shown input matches the actual input in that regards.

I don't know what to think about the ~ marks though. How do annotations even work for a single CHECK directive that matches several lines (e.g. CHECK: foo{{[\n]}}bar)?

jdenny added inline comments.Jan 7 2021, 4:47 PM
llvm/utils/FileCheck/FileCheck.cpp
666 ↗(On Diff #315229)

I'm not sure this is a good idea because people might wonder why there's a space in the input. Right now the shown input matches the actual input in that regards.

Good point. I was trying to ensure that unmatched text highlighting would also apply to newlines, but maybe it's not worth it.

I don't know what to think about the ~ marks though. How do annotations even work for a single CHECK directive that matches several lines (e.g. CHECK: foo{{[\n]}}bar)?

Search ranges and match ranges are the same in that regard:

<<<<<<
         1: FOO 
check:1     ^~~~
         2: BAR 
check:1     ~~~
>>>>>>

I do not think there should be an inconsistency between whether newlines are annotated at the beginning of a range (^, X, etc.) and whether they're annotated somewhere within a range (~), especially at the end of a range.

For example, where $[[:space:]] is a little known way to exclusively match a newline (the \n in your patch summary doesn't work for me):

$ cat check
CHECK: {{FOO[[:space:]]*}}
CHECK: {{$[[:space:]]BAR}}

$ FileCheck -vv check < input  |& tail -6
<<<<<<
         1: FOO
check:1     ^~~
         2: BAR
check:2     X~~ error: no match found
>>>>>>

Why did the second directive skip the newline on line 1? An extra tilde at the end of line 1 would make it clearer.

thopre updated this revision to Diff 321106.Feb 3 2021, 8:42 AM
thopre marked 2 inline comments as done.

Add ~ marks under newlines in input

llvm/utils/FileCheck/FileCheck.cpp
666 ↗(On Diff #315229)

Sorry for the late reply. I'm not sure how to read your answer. Do you agree about not putting a space at the end of each input line? I'm not sure what you refer by "unmatched text highlighting".

jdenny added inline comments.Feb 3 2021, 12:33 PM
llvm/utils/FileCheck/FileCheck.cpp
666 ↗(On Diff #315229)

When input text has no matching pattern, FileCheck highlights it if color is enabled. My concern was that, if we don't add a space for newlines, then that highlighting cannot happen for newlines, and that's inconsistent with the way annotations will mark newlines after this patch.

You pointed out that adding the space makes it look like the input actually had an extra space. For that reason, I was agreeing not to add the space.

Really, I don't know whether the inconsistency is worse or the extra space is worse. I see two possible solutions:

  1. Add the space, and add a bullet to -dump-input=help that states that each input line is printed with an extra trailing space to represent the newline for the sake of highlighting.
  2. Don't add the space but add a bullet to -dump-input=help that states that unmatched input highlighting is not shown for newlines.

At this point, I'd be happy if you'd pick either of the above. We can always change it later.

thopre updated this revision to Diff 321891.Feb 5 2021, 4:05 PM
thopre marked 4 inline comments as done.

Add space at the end of each input line dumped and explain it in dump input help

thopre added inline comments.Feb 5 2021, 4:08 PM
llvm/utils/FileCheck/FileCheck.cpp
666 ↗(On Diff #315229)

Although I'm sure a small portion of people using -dump-input have read the help (esp. since it's the default in case of failure to match), I really dislike inconsistencies. I would have liked to have a visual clue for the newline instead of a space but any character or sequence of character could be part of the input.

How about using a special color (e.g. lighter green/red/blue/etc) for newlines? It would not help for diagnostics without color but then space is invisible unless one look in an editor of sorts. What do you think?

jdenny added a comment.Feb 5 2021, 5:57 PM

I'm adding @mehdi_amini as a reviewer. He's shown interest in -dump-input in the past, and I'd like another opinion on whether the changes made in this patch will be bearable to others as they will impact most FileCheck failures.

llvm/utils/FileCheck/FileCheck.cpp
666 ↗(On Diff #315229)

Interesting idea. Would we need two colors, one for a matched newline and one for an unmatched newline?

Another possibility is something like "•" (hopefully that renders as a bullet for you). It's unlikely to occur in FileCheck input in LLVM's test suites, and people will quickly learn what it means because it will be on every line. Of course, we'd want to pick something easy on the eyes and something likely to be visible in most viewers, but I'm not confident about either of those points.

jdenny accepted this revision.Feb 5 2021, 5:59 PM

Right now, this LGTM. Let's give others a few days to comment, and I'm happy to iterate more on the newline issue.

I'm adding @mehdi_amini as a reviewer. He's shown interest in -dump-input in the past, and I'd like another opinion on whether the changes made in this patch will be bearable to others as they will impact most FileCheck failures.

Ping @mehdi_amini ?

jdenny added a comment.Mar 2 2021, 1:20 PM

@thopre It's been almost a month. I suggest we move forward with this and address any complaints post-commit.

This revision was landed with ongoing or failed builds.Mar 3 2021, 12:20 AM
This revision was automatically updated to reflect the committed changes.