Warns on std::map and std::unordered_map lookups done with operator[]. Inserting values with operator[] is still allowed.
Details
Diff Detail
Event Timeline
This checker is inspired by Louis Brandy's excellent CppCon 2017 talk (https://www.youtube.com/watch?v=lkgszkPnV8g) where he describes the side-effects of std::map::operator[] as being one of the most common source of bugs in Facebook's C++ code. In the talk he mentions that banning the usage of the operator was discussed at Facebook. This checker bans reading the map with it, writing is still allowed.
The documentation is intentionally still missing from the commit. If the reviewers feel that this sort of checker would be useful for clang-tidy, I'll add the documentation and more tests.
clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.cpp | ||
---|---|---|
25 ↗ | (On Diff #144723) | I would imagine it might be best to make this configurable from the start. |
clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.cpp | ||
---|---|---|
36 ↗ | (On Diff #144723) | Will be good idea to use actual container type. |
docs/ReleaseNotes.rst | ||
63 ↗ | (On Diff #144723) | Please add short description and place with other new checks in alphabetical order. |
docs/clang-tidy/checks/bugprone-map-subscript-operator-lookup.rst | ||
6 ↗ | (On Diff #144723) | Please add description and examples. |
I am generally in favor of this but would be curious to see how the check reacts to real world code bases. Facebook's usage experience is compelling, but I'm also wondering how bad the false-positive rate is on existing code. Can you try running the check over LLVM itself (perhaps after adding the option to add custom data types to the list and adding DenseMap and friends) and report back on what you find?
clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.cpp | ||
---|---|---|
21 ↗ | (On Diff #144723) | No need to register the matchers unless in C++ mode. |
clang-tidy/bugprone/MapSubscriptOperatorLookupCheck.h | ||
19 ↗ | (On Diff #144723) | Should fix up the description. |
docs/clang-tidy/checks/list.rst | ||
96 ↗ | (On Diff #144723) | This change looks unrelated (but reasonable -- you can commit this change without review). |
IMHO we should proceed with this patch.
There's been several patches recently that seem to not care about false-positive rate,
and in this case the cost of true-positive can be humongous, as i/we've been finding
lots of bugs exactly like this in openmp optimization pass, attributor framework :/
test/clang-tidy/bugprone-map-subscript-operator-lookup.cpp | ||
---|---|---|
60 ↗ | (On Diff #144723) | We often have code like auto &V = Map[Idx]; if (!V) { V = init(); } use(V); Would that be flagged by this check? Could you add a test for it (possibly with FIXME)? |
I don't think that's accurate. We still care about false-positive rates and ask people to report on them when we think there's going to be an issue. That's not to say that we shouldn't proceed with this patch.
When developing this checker I tested it on LLVM code base, and there was hundreds of warnings. Looking at my notes from then, the ones that I analyzed fell in to couple of different categories:
- The program logic is such that it's known that the key is found in the map (in some of these cases count() is asserted before the lookup)
- The code does find() vs. end() comparison before the lookup
From these the at least the first category could be considered false positives.
I also think that this checker could bring value, though. For example when enabled on a new project from the start. I should have time to finish it if reviewers feel that we should reconsider it. Also, go ahead and share if you have any ideas on how to make the checker more accurate.
test/clang-tidy/bugprone-map-subscript-operator-lookup.cpp | ||
---|---|---|
60 ↗ | (On Diff #144723) | This would not be flagged, as the reference is non-const |
test/clang-tidy/bugprone-map-subscript-operator-lookup.cpp | ||
---|---|---|
48 ↗ | (On Diff #144723) | Besides mgehre's request for auto& test cases, I'd like to see some test cases involving these member functions. I assume that's why these lines of code are here (but no tests use them yet). |
The patch is now updated to be based on the monorepo. I also fixed all the comments from the previous review.
I ran the check for LLVM code using configuration key: bugprone-map-subscript-operator-lookup.MapTypes, value: "::std::map;::std::unordered_map;::llvm::DenseMapBase;::llvm::MapVector;::llvm::StringMap". It gave 1212 warnings (see the attachment).
Accidentally removed the options documentation in the previous patch... Now added it back.
Thank you for posting some of the diagnostics found by the check, that was really helpful information! I spot-checked ~10 of the issues it reported and all of them were false positives. Were you able to find any true positives from that list? I think 1200 reports without any true positives indicates that the check may be too chatty to include (it may also suggest that bugprone is the wrong place for the check).
It's difficult to spot actual functionality bugs without knowing the code better, but there's plenty of unnecessary double-lookups (count()/find() + operator[]) reported, for example:
clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp:75:30
clang-tools-extra/clang-tidy/bugprone/ForwardDeclarationNamespaceCheck.cpp:157:33
clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp:106:29
Definitely agreed that it's labor-intensive. :-)
clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp:75:30
clang-tools-extra/clang-tidy/bugprone/ForwardDeclarationNamespaceCheck.cpp:157:33
clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp:106:29
I hadn't spot checked those, but did see at least one similar case with what I checked. However, I consider those false positives because none of these are cases where the code gives wrong results. It is possible they are performance concerns (though I'd be curious whether these patterns optimize well or not), so if the check was specific to finding that scenario, that would be pretty useful (though it likely requires codeflow analysis).
Someone pointed this patch/thread out to me so I wanted to give some background on what prompted the mention of this in my talk.
I think this particular bug is a _huge_ and _recurring_ problem (we've had yet more examples recently, internally) but I also believe that the legitimate uses of map::operator[] far outnumber the problematic cases. Data would sway me but my guess is that the false positive rate of this check would make it exceptionally difficult to ever enable this in anger. I don't think this is something we (facebook) could or would turn on, by default, in our codebase. I mentioned we'd had people seriously discuss banning it but none of those arguments actually turned into a consensus to act. I'd be super curious if anyone had experience actually trying to use a rule like this in their codebase. I just didn't want ya'll to have the impression that internal FB guidance and/or rules enforce this successfuly. We don't. And the reason we don't is the same reasons expressed above: that the signal-to-noise ratio seems way, way off. Obviously that doesn't mean a rule like this for clang-tidy wouldn't be useful, maybe it is, but it's a tough rule for me to believe would be a good idea to have enabled by default.
Should be in isLanguageVersionSupported() method.