This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Avoid running aliased checks twice
AbandonedPublic

Authored by malcolm.parsons on Oct 16 2016, 7:39 AM.

Details

Summary

It is easy to configure clang-tidy with both aliased and original
checks enabled. This causes the check to be appear under both names
in -list-checks and -dump-config, and the check is run twice.

This changeset tells the CheckFactory about aliases so that an alias
is only enabled when the original is not enabled.

Event Timeline

malcolm.parsons retitled this revision from to [clang-tidy] Avoid running aliased checks twice.
malcolm.parsons updated this object.
malcolm.parsons added a subscriber: cfe-commits.
aaron.ballman edited edge metadata.Nov 1 2016, 11:22 AM

I *really* like this idea, thank you for working on this! A few things:

(0) I'm uncomfortable with the lack of tests in the patch. I'm not certain of a good way to test this, however. @alexfh, do you have ideas?
(1) I think that the aliases and the originals should be listed with -list-checks, because these are names under which the checks may be run (and sometimes the name may even imply different semantics for the check).
(2) I'm not as certain about -dump-config, since I am not really familiar with that option, but I think we want to list the alias and the original under that as well because different check names may have different configuration options.

(1) I think that the aliases and the originals should be listed with -list-checks, because these are names under which the checks may be run (and sometimes the name may even imply different semantics for the check).

I've always found it odd that -list-checks only lists the enabled checks.
I think a list of enabled checks shouldn't report aliases that will not be used.

(2) I'm not as certain about -dump-config, since I am not really familiar with that option, but I think we want to list the alias and the original under that as well because different check names may have different configuration options.

It was because of -dump-config that I noticed the checks were running twice.
I had to configure a check in two places to be sure it was configured how I wanted.
I don't think there is a use case for running the same check with different options at the same time.
Dumping options that won't be used isn't good either.

I'd like to remove cert-oop11-cpp's UseCERTSemantics option - see my comment on D12839.

An alternative to clang-tidy preferring the original check would be to complain if both are enabled.

(1) I think that the aliases and the originals should be listed with -list-checks, because these are names under which the checks may be run (and sometimes the name may even imply different semantics for the check).

I've always found it odd that -list-checks only lists the enabled checks.
I think a list of enabled checks shouldn't report aliases that will not be used.

Okay, that's a good point, I forgot that -list-checks only lists the enabled checks, not all of them. With that in mind, here's what I would expect to see:

(0) -checks=* -list-checks
(1) -checks=-*, cert-* -list-checks
(2) -checks=*, -cert-* -list checks

In (0), I would expect all checks to be listed, even aliases. Aliases may have semantic distinctions from other checks and the user said "run everything." However, for aliases which have no semantic distinction, there's no point to running the check twice even if the user said to run everything. Leaving aliases in or out results in an unsatisfying situation either way, unless we know whether an alias supplies different option values.
In (1), I would expect only the cert checks to be listed, even if they alias to something outside of the cert module. It would be strange to use those options and see something outside of the cert module listed, even if it had no semantic distinction, just because of cert rules that are an alias to something else.
In (2), I would expect to see every check *except* cert checks listed, regardless of aliasing in or out of the cert module. Similar rationale as (1).

If I understand the output from this patch, only (0) is modified to not list the alias name because aliases are never enabled with -checks=* in this patch, but (1) and (2) do not change behavior?

(2) I'm not as certain about -dump-config, since I am not really familiar with that option, but I think we want to list the alias and the original under that as well because different check names may have different configuration options.

It was because of -dump-config that I noticed the checks were running twice.
I had to configure a check in two places to be sure it was configured how I wanted.
I don't think there is a use case for running the same check with different options at the same time.

Does -dump-config only list enabled checks as well, or does it list the configuration for all checks? If it's all checks, then the alias should be present because it may be configured differently. If it's only enabled checks, then I agree that it should only list what is being used.

Dumping options that won't be used isn't good either.

I'd like to remove cert-oop11-cpp's UseCERTSemantics option - see my comment on D12839.

Removing that option may be possible so long as we still get the correct semantics when running cert-oop11-cpp.

An alternative to clang-tidy preferring the original check would be to complain if both are enabled.

I don't think that preferring the original check is bad so long as we get the semantic options correct; the ability for an alias to be defined as "that check with these options being set this way" is important.

(1) I think that the aliases and the originals should be listed with -list-checks, because these are names under which the checks may be run (and sometimes the name may even imply different semantics for the check).

I've always found it odd that -list-checks only lists the enabled checks.
I think a list of enabled checks shouldn't report aliases that will not be used.

Okay, that's a good point, I forgot that -list-checks only lists the enabled checks, not all of them. With that in mind, here's what I would expect to see:

(0) -checks=* -list-checks
(1) -checks=-*, cert-* -list-checks
(2) -checks=*, -cert-* -list checks

In (0), I would expect all checks to be listed, even aliases. Aliases may have semantic distinctions from other checks and the user said "run everything." However, for aliases which have no semantic distinction, there's no point to running the check twice even if the user said to run everything. Leaving aliases in or out results in an unsatisfying situation either way, unless we know whether an alias supplies different option values.
In (1), I would expect only the cert checks to be listed, even if they alias to something outside of the cert module. It would be strange to use those options and see something outside of the cert module listed, even if it had no semantic distinction, just because of cert rules that are an alias to something else.
In (2), I would expect to see every check *except* cert checks listed, regardless of aliasing in or out of the cert module. Similar rationale as (1).

If I understand the output from this patch, only (0) is modified to not list the alias name because aliases are never enabled with -checks=* in this patch, but (1) and (2) do not change behavior?

0) doesn't list aliases. I'd like a -list-all-checks option.

  1. lists all the cert checks, as they aren't aliases for other cert checks.
  2. doesn't list any cert checks, as none are enabled.

Does -dump-config only list enabled checks as well, or does it list the configuration for all checks? If it's all checks, then the alias should be present because it may be configured differently. If it's only enabled checks, then I agree that it should only list what is being used.

The config is only dumped for the enabled checks.

I don't think that preferring the original check is bad so long as we get the semantic options correct; the ability for an alias to be defined as "that check with these options being set this way" is important.

That might mean preferring the alias. But there might be more than one.

alexfh edited edge metadata.Nov 7 2016, 2:11 PM

I think, silently choosing one of the checks may be confusing and counter-intuitive. Should we just warn in case we see the same check enabled by multiple aliases?

alexfh requested changes to this revision.Nov 7 2016, 2:20 PM
alexfh edited edge metadata.

I think, silently choosing one of the checks may be confusing and counter-intuitive. Should we just warn in case we see the same check enabled by multiple aliases?

And by "aliases" I mean different names under which the check is registered. I'm not sure I see the benefit of introducing a separate concept and a separate API for registering checks under multiple names. This introduces another sort of dynamic binding, which we have to have good reasons for.

This revision now requires changes to proceed.Nov 7 2016, 2:20 PM

I think, silently choosing one of the checks may be confusing and counter-intuitive. Should we just warn in case we see the same check enabled by multiple aliases?

And by "aliases" I mean different names under which the check is registered. I'm not sure I see the benefit of introducing a separate concept and a separate API for registering checks under multiple names. This introduces another sort of dynamic binding, which we have to have good reasons for.

Is using typeid() on the constructed checks OK?

alexfh added a comment.Nov 8 2016, 8:55 AM

I think, silently choosing one of the checks may be confusing and counter-intuitive. Should we just warn in case we see the same check enabled by multiple aliases?

And by "aliases" I mean different names under which the check is registered. I'm not sure I see the benefit of introducing a separate concept and a separate API for registering checks under multiple names. This introduces another sort of dynamic binding, which we have to have good reasons for.

Is using typeid() on the constructed checks OK?

I don't think rtti is allowed in LLVM. But we can find a solution, I believe. The important thing is to decide what behavior is desired.

I think, silently choosing one of the checks may be confusing and counter-intuitive. Should we just warn in case we see the same check enabled by multiple aliases?

A warning is useful if the checks have different settings, e.g. google-readability-braces-around-statements and readability-braces-around-statements.

But a warning is just noise for other aliases, e.g. cert-dcl54-cpp and misc-new-delete-overloads.

I think, silently choosing one of the checks may be confusing and counter-intuitive. Should we just warn in case we see the same check enabled by multiple aliases?

A warning is useful if the checks have different settings, e.g. google-readability-braces-around-statements and readability-braces-around-statements.

But a warning is just noise for other aliases, e.g. cert-dcl54-cpp and misc-new-delete-overloads.

That's why I don't think there's a good default behavior without capturing further information about whether an alias is an identical copy under a different name or a partial copy with differing semantics. If we have that information, then my preference is for warning the user about enabling the (effectively) same check twice.

malcolm.parsons abandoned this revision.Dec 18 2016, 7:28 AM

Abandon until we have a plan.