This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Suppress reports to patternedly named parameters in 'bugprone-easily-swappable-parameters'
ClosedPublic

Authored by whisperity on Feb 23 2021, 7:47 AM.

Details

Summary

While the original check's (D69560) purpose is to identify potentially dangerous functions based on the parameter types (as identifier names do not mean anything when it comes to the language rules), unfortunately, such a plain interface check rule can be incredibly noisy. While the previous "filtering heuristic" in D78652 is able to find many similar usages, there is an entire class of parameters that should not be warned about very easily mixed by that check: parameters that have a name and their name follows a pattern, e.g. text1, text2, text3, .... [1]

This patch implements a simple, but powerful rule, that allows us to detect such cases and ensure that no warnings are emitted for parameter sequences that follow a pattern, even if their types allow for them to be potentially mixed at a call site.

Given a threshold k, warnings about two parameters are filtered from the result set if the names of the parameters are either prefixes or suffixes of each other, with at most k letters difference on the non-common end. (Assuming that the names themselves are at least k long.)

The above text1, text2 is an example of this. (Live finding from Xerces.)
LHS and RHS are also fitting the bill here. (Live finding from... virtually any project.)
So does Qmat, Tmat, Rmat. (Live finding from I think OpenCV.)

On average, 15% reduction in findings were achieved with this feature turned on.

Similarly to the other filtering patch on the "relatedness modelling", this option is implemented as a check option, NamePrefixSuffixSilenceDissimilarityTreshold. If 0, the heuristic is turned off. Defaults to 1, i.e. a single letter difference ("pattern") on either end of the parameter name suffice for silence. This is the most reasonable default.


[1] It has been identified before by other research that such patterned names carry no viable information and such names are useless for a name-based analysis where argument vs. parameter names are contrasted with one another. [Rice2017]

[Rice2017]: Andrew Rice, et al.: Detecting argument selection defects. In: Proceedings of the ACM on Programming Languages, 1, pp. 104:1-104:23, 2017.

Diff Detail

Event Timeline

whisperity created this revision.Feb 23 2021, 7:47 AM
  • [NFC] Realised I forgot to add the new option to the check's documentation. This is fixed now.

Please mention new option in Release Notes.

clang-tools-extra/docs/clang-tidy/checks/bugprone-easily-swappable-parameters.rst
89

Please use single back-ticks for option values. Same below.

Please mention new option in Release Notes.

The base check is not yet accepted or in. I plan to land the entire thing in one go once everything is accepted because a partial landing would result in a useless check (in one way or another).

clang-tools-extra/docs/clang-tidy/checks/bugprone-easily-swappable-parameters.rst
89

All values (including above the `_Bool and such, or only the numeric constants? The LHS and RHS` below are examples of source code, not a valid value for the option.

Eugene.Zelenko added inline comments.Feb 24 2021, 7:19 AM
clang-tools-extra/docs/clang-tidy/checks/bugprone-easily-swappable-parameters.rst
89

All option values.

whisperity marked 2 inline comments as done.
  • [NFC] Reformat the documentation so option values are in single backticks.
whisperity retitled this revision from [clang-tidy] Suppress reports to patternedly named parameters in 'bugprone-easily-swappable-parameters' to [clang-tidy] Suppress reports to patternedly named parameters in 'bugprone-easily-swappable-parameters'.Mar 3 2021, 5:53 AM
aaron.ballman added inline comments.Mar 16 2021, 8:50 AM
clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp
364–365

Should we be checking that .size() - N doesn't cause wraparound?

371

Same question here as above.

whisperity added inline comments.Mar 16 2021, 10:39 AM
clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp
364–365

Wraparound as in underflow? Like 2 - 8 becoming multiple millions? The incoming strings are padded to the common length.

take_xxx won't let you go out of bounds, even if the parameter is greater than the string's length. It will just silently return the entire string.

aaron.ballman accepted this revision.Mar 16 2021, 11:02 AM

LGTM!

clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp
364–365

Wraparound as in underflow? Like 2 - 8 becoming multiple millions? The incoming strings are padded to the common length.

They are, but I didn't see anything checking that Threshhold (which gets passed as N) isn't larger than the common string length.

take_xxx won't let you go out of bounds, even if the parameter is greater than the string's length. It will just silently return the entire string.

Aha, that's the important bit, thanks for verifying that this is safe!

This revision is now accepted and ready to land.Mar 16 2021, 11:02 AM
whisperity marked an inline comment as not done.Mar 16 2021, 11:07 AM
whisperity added inline comments.
clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp
364–365

Nevertheless I'll clarify this, specifically because I have to make sure (I was thinking for a few minutes then realised the !S1Prefix.empty() is actually making sure of this...) that if you got parameter names of length 1 (let's say) and a threshold of 1, it really shouldn't say that A and B are "related" because the common prefix is "" and the non-common end differs in one character, A vs. B.

384

@aaron.ballman I think I'll do something like if (BiggerLength < Threshold) return false;, how does that sound? The comparison is moot in that case, imagine having strings of 2 and 4, padded to 4, with a threshold of 6, for example.

That way even if someone accidentally makes StringRef overflow the buffer, we'll be safe on the call paths we have here.

aaron.ballman added inline comments.Mar 16 2021, 11:13 AM
clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp
384

I think that'd make the behavior much more obvious, but should that be <=?

whisperity added inline comments.Mar 16 2021, 11:22 AM
clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp
384

No, because being = to the threshold means that the common prefix/suffix is empty string. It'd make variables such as "A" and "X" deemed related. Generally not something we want, because that's too broad of an assumption that they are "meant to be used together". (Generally you shouldn't name variables like so, but still...)

aaron.ballman added inline comments.Mar 16 2021, 11:27 AM
clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp
384

Ah, that's a good point, thank you!

whisperity marked 5 inline comments as done.
  • Made a precondition check explicit, and added an early return to not evaluate a case where it would return false anyways...

NFC: Rebase & update.

While this patch conceptually only depends on D69560, the diff itself has become linear during the code review and subsequent updates, and as such, must be applied on top of D78652.

NC Rebase and buildbot trigger.