This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] New check bugprone-map-subscript-operator-lookup
Needs ReviewPublic

Authored by khuttun on May 1 2018, 8:05 AM.

Details

Summary

Warns on std::map and std::unordered_map lookups done with operator[]. Inserting values with operator[] is still allowed.

Diff Detail

Event Timeline

khuttun created this revision.May 1 2018, 8:05 AM
khuttun added a comment.EditedMay 1 2018, 8:15 AM

This checker is inspired by Louis Brandy's excellent CppCon 2017 talk (https://www.youtube.com/watch?v=lkgszkPnV8g) where he describes the side-effects of std::map::operator[] as being one of the most common source of bugs in Facebook's C++ code. In the talk he mentions that banning the usage of the operator was discussed at Facebook. This checker bans reading the map with it, writing is still allowed.

The documentation is intentionally still missing from the commit. If the reviewers feel that this sort of checker would be useful for clang-tidy, I'll add the documentation and more tests.

lebedev.ri added inline comments.
clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.cpp
25 ↗(On Diff #144723)

I would imagine it might be best to make this configurable from the start.

Eugene.Zelenko added inline comments.
clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.cpp
36 ↗(On Diff #144723)

Will be good idea to use actual container type.

docs/ReleaseNotes.rst
63 ↗(On Diff #144723)

Please add short description and place with other new checks in alphabetical order.

docs/clang-tidy/checks/bugprone-map-subscript-operator-lookup.rst
6 ↗(On Diff #144723)

Please add description and examples.

Eugene.Zelenko added a project: Restricted Project.

I am generally in favor of this but would be curious to see how the check reacts to real world code bases. Facebook's usage experience is compelling, but I'm also wondering how bad the false-positive rate is on existing code. Can you try running the check over LLVM itself (perhaps after adding the option to add custom data types to the list and adding DenseMap and friends) and report back on what you find?

clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.cpp
21 ↗(On Diff #144723)

No need to register the matchers unless in C++ mode.

clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.h
19 ↗(On Diff #144723)

Should fix up the description.

docs/clang-tidy/checks/list.rst
96 ↗(On Diff #144723)

This change looks unrelated (but reasonable -- you can commit this change without review).

khuttun abandoned this revision.Jul 9 2018, 12:32 PM

Abandoning this. The false positive rate would be too high for this checker.

IMHO we should proceed with this patch.
There's been several patches recently that seem to not care about false-positive rate,
and in this case the cost of true-positive can be humongous, as i/we've been finding
lots of bugs exactly like this in openmp optimization pass, attributor framework :/

mgehre added inline comments.Apr 17 2020, 1:50 AM
test/clang-tidy/bugprone-map-subscript-operator-lookup.cpp
60 ↗(On Diff #144723)

We often have code like

auto &V = Map[Idx];
if (!V) {
  V = init();
}
use(V);

Would that be flagged by this check? Could you add a test for it (possibly with FIXME)?

IMHO we should proceed with this patch.
There's been several patches recently that seem to not care about false-positive rate,

I don't think that's accurate. We still care about false-positive rates and ask people to report on them when we think there's going to be an issue. That's not to say that we shouldn't proceed with this patch.

khuttun marked an inline comment as done.Apr 17 2020, 4:45 AM

IMHO we should proceed with this patch.
There's been several patches recently that seem to not care about false-positive rate,
and in this case the cost of true-positive can be humongous, as i/we've been finding
lots of bugs exactly like this in openmp optimization pass, attributor framework :/

When developing this checker I tested it on LLVM code base, and there was hundreds of warnings. Looking at my notes from then, the ones that I analyzed fell in to couple of different categories:

  • The program logic is such that it's known that the key is found in the map (in some of these cases count() is asserted before the lookup)
  • The code does find() vs. end() comparison before the lookup

From these the at least the first category could be considered false positives.

I also think that this checker could bring value, though. For example when enabled on a new project from the start. I should have time to finish it if reviewers feel that we should reconsider it. Also, go ahead and share if you have any ideas on how to make the checker more accurate.

test/clang-tidy/bugprone-map-subscript-operator-lookup.cpp
60 ↗(On Diff #144723)

This would not be flagged, as the reference is non-const

Quuxplusone added inline comments.
test/clang-tidy/bugprone-map-subscript-operator-lookup.cpp
48 ↗(On Diff #144723)

Besides mgehre's request for auto& test cases, I'd like to see some test cases involving these member functions. I assume that's why these lines of code are here (but no tests use them yet).

khuttun updated this revision to Diff 259656.Apr 23 2020, 11:47 AM

The patch is now updated to be based on the monorepo. I also fixed all the comments from the previous review.

I ran the check for LLVM code using configuration key: bugprone-map-subscript-operator-lookup.MapTypes, value: "::std::map;::std::unordered_map;::llvm::DenseMapBase;::llvm::MapVector;::llvm::StringMap". It gave 1212 warnings (see the attachment).

khuttun updated this revision to Diff 259843.Apr 24 2020, 3:07 AM

Documented the MapTypes check option.

Eugene.Zelenko added inline comments.Apr 24 2020, 6:55 AM
clang-tools-extra/clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.cpp
32

Should be in isLanguageVersionSupported() method.

clang-tools-extra/docs/ReleaseNotes.rst
78

Please separate with empty line.

81

Please synchronize with first statement in documentation.

khuttun marked 3 inline comments as done.Apr 30 2020, 1:06 AM
khuttun updated this revision to Diff 261142.Apr 30 2020, 1:12 AM

Accidentally removed the options documentation in the previous patch... Now added it back.

Any comments on this? Is this checker something that could be part of clang-tidy?

Any comments on this? Is this checker something that could be part of clang-tidy?

Thank you for posting some of the diagnostics found by the check, that was really helpful information! I spot-checked ~10 of the issues it reported and all of them were false positives. Were you able to find any true positives from that list? I think 1200 reports without any true positives indicates that the check may be too chatty to include (it may also suggest that bugprone is the wrong place for the check).

Any comments on this? Is this checker something that could be part of clang-tidy?

Thank you for posting some of the diagnostics found by the check, that was really helpful information! I spot-checked ~10 of the issues it reported and all of them were false positives. Were you able to find any true positives from that list? I think 1200 reports without any true positives indicates that the check may be too chatty to include (it may also suggest that bugprone is the wrong place for the check).

It's difficult to spot actual functionality bugs without knowing the code better, but there's plenty of unnecessary double-lookups (count()/find() + operator[]) reported, for example:

clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp:75:30
clang-tools-extra/clang-tidy/bugprone/ForwardDeclarationNamespaceCheck.cpp:157:33
clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp:106:29

Any comments on this? Is this checker something that could be part of clang-tidy?

Thank you for posting some of the diagnostics found by the check, that was really helpful information! I spot-checked ~10 of the issues it reported and all of them were false positives. Were you able to find any true positives from that list? I think 1200 reports without any true positives indicates that the check may be too chatty to include (it may also suggest that bugprone is the wrong place for the check).

It's difficult to spot actual functionality bugs without knowing the code better, but there's plenty of unnecessary double-lookups (count()/find() + operator[]) reported, for example:

Definitely agreed that it's labor-intensive. :-)

clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp:75:30
clang-tools-extra/clang-tidy/bugprone/ForwardDeclarationNamespaceCheck.cpp:157:33
clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp:106:29

I hadn't spot checked those, but did see at least one similar case with what I checked. However, I consider those false positives because none of these are cases where the code gives wrong results. It is possible they are performance concerns (though I'd be curious whether these patterns optimize well or not), so if the check was specific to finding that scenario, that would be pretty useful (though it likely requires codeflow analysis).

smeenai added a subscriber: smeenai.May 8 2020, 7:19 PM
lbrandy added a subscriber: lbrandy.May 8 2020, 8:16 PM

Someone pointed this patch/thread out to me so I wanted to give some background on what prompted the mention of this in my talk.

I think this particular bug is a _huge_ and _recurring_ problem (we've had yet more examples recently, internally) but I also believe that the legitimate uses of map::operator[] far outnumber the problematic cases. Data would sway me but my guess is that the false positive rate of this check would make it exceptionally difficult to ever enable this in anger. I don't think this is something we (facebook) could or would turn on, by default, in our codebase. I mentioned we'd had people seriously discuss banning it but none of those arguments actually turned into a consensus to act. I'd be super curious if anyone had experience actually trying to use a rule like this in their codebase. I just didn't want ya'll to have the impression that internal FB guidance and/or rules enforce this successfuly. We don't. And the reason we don't is the same reasons expressed above: that the signal-to-noise ratio seems way, way off. Obviously that doesn't mean a rule like this for clang-tidy wouldn't be useful, maybe it is, but it's a tough rule for me to believe would be a good idea to have enabled by default.

gkm added a subscriber: gkm.May 19 2020, 7:25 AM