This is an archive of the discontinued LLVM Phabricator instance.

[VerifyDiagnosticConsumer] support -verify=<prefixes>
ClosedPublic

Authored by jdenny on Nov 6 2017, 1:00 PM.

Details

Summary

This mimics FileCheck's --check-prefixes option.

The default prefix is "expected". That is, "-verify" is equivalent to
"-verify=expected".

The goal is to permit exercising a single test suite source file with
different compiler options producing different sets of diagnostics.
While cpp can be combined with the existing -verify to accomplish the
same goal, source is often easier to maintain when it's not cluttered
with preprocessor directives or duplicate passages of code. For
example, this patch also rewrites some existing clang tests to
demonstrate the benefit of this feature.

Diff Detail

Event Timeline

jdenny created this revision.Nov 6 2017, 1:00 PM
jdenny updated this revision to Diff 122172.Nov 8 2017, 3:56 PM
jdenny retitled this revision from [VerifyDiagnosticConsumer] support -verify=PREFIX to [VerifyDiagnosticConsumer] support -verify=<prefixes>.
jdenny edited the summary of this revision. (Show Details)
  1. Extended -verify to accept multiple prefixes, like FileCheck's --check-prefixes.
  1. Demonstrated the benefits of this feature by using it to clean up existing clang tests.
  1. Rebased to a more recent master.

Now that this patch is non-trivial, I'll wait a bit for a review before making more revisions.

jdenny updated this revision to Diff 123027.Nov 15 2017, 8:33 AM
  1. Capitalized some of the new local variables according to coding standards.
  1. Rebased on master/trunk fetched today.
jdenny updated this revision to Diff 125223.Dec 1 2017, 2:14 PM

Rebased on master/trunk fetched today.

hfinkel added a subscriber: hfinkel.Dec 4 2017, 7:28 PM

I think this is a good idea.

include/clang/Driver/CC1Options.td
407

"Similar to" seems unfortunately vague. Can it say, "Equivalent to ..."?

lib/Frontend/CompilerInvocation.cpp
1081

I'd prefer to have spaces around the != operators.

1095

If this were going to be an error, then it should have error message that explains the problem (e.g., duplicate --verify prefix). I don't believe that we should make this an error (i.e., I think that we should allow duplicate --verify options with the same prefixes).

lib/Frontend/VerifyDiagnosticConsumer.cpp
231–232

Please document here what FinishDirectiveWord means (and, while you're at it, documenting what EnsureStartOfWord means would be nice too).

387

This block is to limit the scope of the DType StringRef? That doesn't seem worthwhile.

jdenny added a comment.Dec 5 2017, 8:19 AM

I think this is a good idea.

Thanks for the review. I've replied to each comment and will make revisions after we agree on the behavioral issue you raised.

include/clang/Driver/CC1Options.td
407

I agree I should have made it clearer.

"Equivalent to -verify=expected" works if we decide that duplicate explicit prefixes are permitted, as you've suggested in a later comment.

With the current implementation, it should be "All occurrences together are equivalent to one occurrence of -verify=expected". That is, I chose to permit multiple occurrences of -verify without explicit prefixes for backward compatibility, but I chose not to permit duplicate explicit prefixes for reasons I'll discuss in the other comment.

I'll clean up the documentation once we agree on the right behavior.

lib/Frontend/CompilerInvocation.cpp
1081

Will change.

1095

If this were going to be an error, then it should have error message that explains the problem (e.g., duplicate --verify prefix).

Are you saying you prefer that to be in the error instead of the note (where it is now)?

I don't believe that we should make this an error (i.e., I think that we should allow duplicate --verify options with the same prefixes).

I see two reasons to permit duplicate explicit prefixes:

  1. It simplifies the documentation some (see previous comment).
  1. Typically, it's convenient to permit command-line options to be repeated so that groups of options can be collected in shell or make variables without having to worry about conflicts between groups. On the other hand, I'm having trouble imagining that use case for -verify options, which I believe would normally appear directly in command lines in test source files. Have you seen use cases (perhaps with FileCheck) where it would be useful?

I see three reasons not to permit duplicate explicit prefixes:

  1. Not permitting duplicates is consistent with FileCheck's --check-prefix[es].
  1. If we change our mind, we can later relax the restriction without breaking backward compatibility, but we cannot go the other direction.
  1. Suppose a developer wants to extend an existing test case by adding new -verify prefixes to existing clang command lines that already uses many -verify prefixes. If the developer accidentally duplicates an existing prefix, the test case surely will not behave as expected, but it should be easier to understand what has gone wrong if the compiler complains about duplicate prefixes.

I'm not adamant about the current behavior, but I think we should consider these points before deciding.

lib/Frontend/VerifyDiagnosticConsumer.cpp
231–232

Sure, I'll work on it.

387

Will change.

hfinkel added inline comments.Dec 5 2017, 11:38 AM
lib/Frontend/CompilerInvocation.cpp
1095

If this were going to be an error, then it should have error message that explains the problem (e.g., duplicate --verify prefix).

Are you saying you prefer that to be in the error instead of the note (where it is now)?

No, I'm saying that if we retain this as an error at all, then it needs better text. I'd prefer it not be an error.

I see three reasons not to permit duplicate explicit prefixes:

I don't have a strong opinion, but in general, prohibiting duplicating command-line options hurts composability of command lines and makes scripting more difficult. That's a general statement (i.e., not tied to this use case), but as a result, I feel that should be the default unless a good reason (technical or otherwise) is presented.

In this case, checking for duplicates adds complexity to the implementation, and as far as I can tell, adds little value. Obviously, when writing checks, the author should check that they work. Moreover, the implementation will already complain if there are unmatched diagnostics, or if nothing matches, so the chance of accidentally mistyping the verify prefix seems low.

  1. Not permitting duplicates is consistent with FileCheck's --check-prefix[es].

I don't find this compelling. In part, this is because FileCheck can't complain about unmatched output (that wouldn't make sense), and so the chance of error with FileCheck is much higher.

  1. If we change our mind, we can later relax the restriction without breaking backward compatibility, but we cannot go the other direction.

Not for this kind of option, really. This is a tool for Clang developers. If we found in the future that allowing duplicates where a large source of errors, we'd just change it.

  1. Suppose a developer wants to extend an existing test case by adding new -verify prefixes to existing clang command lines that already uses many -verify prefixes. If the developer accidentally duplicates an existing prefix, the test case surely will not behave as expected, but it should be easier to understand what has gone wrong if the compiler complains about duplicate prefixes.

If someone else feels strongly about this, please speak up. I don't see this as worth the implementation complexity nor sufficient justification to override what I see as the best practice of allowing duplicate options.

I'm not adamant about the current behavior, but I think we should consider these points before deciding.

Sure.

lib/Frontend/CompilerInvocation.cpp
1095

I'm not sure why, but Phabricator doesn't give me the menu option to quote your comment in full this time, so I'll copy and paste just the most important lines.

No, I'm saying that if we retain this as an error at all, then it needs better text. I'd prefer it not be an error.

Does the text for spelling errors, which are diagnosed immediately before this, look good to you?

In this case, checking for duplicates adds complexity to the implementation

But really not very much.

FileCheck can't complain about unmatched output (that wouldn't make sense), and so the chance of error with FileCheck is much higher.

That's an important distinction I hadn't appreciated.

This is a tool for Clang developers. If we found in the future that allowing duplicates where a large source of errors, we'd just change it.

That argument convinces me. Unless someone objects before I get to it, I'll remove the restriction for duplicates.

jdenny updated this revision to Diff 125640.EditedDec 5 2017, 4:03 PM
jdenny edited the summary of this revision. (Show Details)

This update includes all of Hal's suggestions.

I'm also thinking of converting prefix storage from a std::vector to a std::set so that lookup should be faster during parsing.

rsmith edited edge metadata.Dec 7 2017, 4:57 PM

I've not done a detailed review of the string manipulation here, but this looks like a great feature, thanks!

include/clang/Basic/DiagnosticDriverKinds.td
342

Please wrap this to 80 columns.

include/clang/Driver/CC1Options.td
407

I don't think we need to worry about backwards compatibility with people passing -verify more than once; it seems OK to disallow that if we need to.

jdenny marked 13 inline comments as done.Dec 7 2017, 6:17 PM

I marked the comments related to Hal's suggestions as done to avoid confusion for future reviews. I'm not used to using this sort of tool for reviews. Hopefully it's appropriate for the author to do that rather than the reviewer.

jdenny added a comment.Dec 7 2017, 6:19 PM

I've not done a detailed review of the string manipulation here, but this looks like a great feature, thanks!

Hi Richard. Thanks for your feedback. I'll fix the line wrapping you pointed out.

jdenny updated this revision to Diff 126086.Dec 7 2017, 6:47 PM

This update does the following:

  1. Fixes the line wrapping that Richard pointed out.
  1. Converts from std::vector to std::set for more efficient prefix lookup.
  1. Grows a dependence on D40995 (which is a small patch) because the test case here now exercises multiple invalid prefixes in one command line.
  1. Rebases onto a more recent master. One of the test cases rewritten here recently changed significantly.
hfinkel added inline comments.Dec 14 2017, 2:41 PM
lib/Frontend/VerifyDiagnosticConsumer.cpp
398

Converts from std::vector to std::set for more efficient prefix lookup.

Don't do that. First, std::find is O(N) here anyway. You'd want to use Prefixes.find(...) instead. However, this set is essentially constant and you're trying to optimize the lookup. In such a case, std::set is not a good choice. Just use a sorted vector instead (e.g., call std::sort on the vector), and then use std::binary_search here. That's likely much faster.

jdenny updated this revision to Diff 127088.Dec 15 2017, 3:48 AM
  1. Use std::binary_search, as suggested by Hal.
  1. Fix another case of line wrapping.
  1. Rebase onto a recent master, and remove rewrites of tests that have recently changed.
jdenny marked an inline comment as done.Dec 15 2017, 3:51 AM
jdenny added inline comments.
lib/Frontend/VerifyDiagnosticConsumer.cpp
398

Thanks.

jdenny marked 2 inline comments as done.Dec 15 2017, 3:59 AM
hfinkel accepted this revision.Dec 15 2017, 5:46 AM

LGTM

This revision is now accepted and ready to land.Dec 15 2017, 5:46 AM

LGTM

Thanks for accepting. I don't have commit privileges. Would you please commit for me? This depends on D40995, which also needs to be committed.

This revision was automatically updated to reflect the committed changes.