This commit fixes bug #48739. The bug was caused by the way static_casts
on pointer-to-member caused the CXXBaseSpecifier list of a
MemberToPointer to grow instead of shrink.
The list is now grown by implicit casts and corresponding entries are
removed by static_casts. No-op static_casts cause no effect.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Thanks for taking your time and getting to the root cause of it!
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
221 | std::list is like the slowest container, and it should be avoided in almost every case. | |
234 | llvm::find_if with a lambda capturing PathBase? | |
235 | It's better to be more verbose in the assertions. | |
236 | In order to avoid removals from the middle, you can use a more functional approach to collections and iterators and use llvm::make_filter_range and iterate over it in the loop below, | |
clang/test/Analysis/pointer-to-member.cpp | ||
304 | What about other type of casts? |
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
235 | Well, the PointerToMember contains a PointerToMemberData, which has a list of CXXBaseSpecifier. Should I put something like: "PointerToMemberData has insufficient number of base specifiers"? |
Remoevd the use of std::list with llvm::SmallVector. Removed the need to delete elements.
@vsavchenko could you please see the current logic if it is better now?
clang/test/Analysis/pointer-to-member.cpp | ||
---|---|---|
304 | I am looking into this. |
Also, all these PathRange and PathList don't really reflect what they stand for semantically and talk more about implementation. It's not the problem of this patch, but it doesn't make your logic easier to comprehend.
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
236 | I think it's all a bit too complicated now. Right now you have two nested loops:
It won't be a problem to put them in another order, so when you make a filter_range, you can have llvm::find_if which will iterate over PathRange and captures the given BaseList element. So, yes nested lambdas. Here is a sketch of the solution: auto BaseListFilteredRange = llvm::make_filter_range( BaseList, // you don't need to make_range here, BaseList IS a range [&PathRange](const CXXBaseSpecifier *Base) { return llvm::find_if(PathRange, [Base](const auto &It) { return Base->getType() == It.first->getType(); }) == PathRange.end(); }); Aaaand while I was writing that, it got me thinking that we don't need BaseList in the first place: for (const CXXBaseSpecifier *Base : PathList) if (llvm::none_of(PathRange, [Base](const auto &It) { return Base->getType() == It.first->getType(); }) PathList = CXXBaseListFactory.add(Base, PathList); |
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
236 | Hmm, I really should brush up on my functional programming and think more in terms of STL/LLVM algorithms. Thank you for pointing out this solution, @vsavchenko! One thing I noticed though, the shortened logic that you suggested removes all matching instances from PathList, while my intention was to remove only the first one of every kind. If there are no repetitions, it is the same. And, normally, repetitions should not be present in the PathList retrieved from PTM. It is a bug elsewhere if that happens. So should I add an assert for that here? |
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
236 | Yes, that's true. | |
clang/lib/StaticAnalyzer/Core/ExprEngineC.cpp | ||
528–531 | BTW, it looks like this can be replaced by CastE->path() |
@vsavchenko , I have used the logic that you suggested and put in a better assertion.
@vsavchenko, after some experimentation, I found out that handling reinterpret_cast will be a real pain. Consider the following:
struct Base { int field; }; struct Derived : public Base {}; struct DoubleDerived : public Derived {}; struct Some {}; void f() { int DoubleDerived::* ddf = &Base::field; int Base::* bf = reinterpret_cast<int Base::*>(reinterpret_cast<int Derived::*>(reinterpret_cast<int Base::*>(ddf))); int Some::* sf = reinterpret_cast<int Some::*>(ddf); Base base; base.*bf; }
The definition of bf can be handled I guess by manually discovering when to insert a sub-class or when to remove. It will require a bit of code but I guess is doable.
As for the next one, also it has to be manually worked out whether it makes sense to process this statement further and add a node to the Exploded Graph. For the example I gave I don't think it makes a lot of sense. Multiple inheritance will make the task a lot worse as far as I can tell.
Should I try to achieve this in another commit? What are your thoughts on this?
Yes, another commit is fine. However, I'd say that it is still good to add test cases with TODOs with this commit.
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
27 | We don't need this anymore :) |
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
222 | Containers like map and set usually return a pair for insert, where the second element tells us whether that element was inserted or was already there. It is usually a bad practice to use find in the vicinity of insert for this very reason, you double the work. Also, a not about this whole part of the code, if it's done only for assertions, none of this code should be executed in the build without assertions. You'll need to extract this into a function that you call from assert. It means that in release builds it will be unused and should be marked with LLVM_MAYBE_UNUSED. |
@vsavchenko , does it look okay now?
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
27 | Wow, sharp eyes! |
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
180 | nit: functions represent actions, in languages verbs act the same way, so it's recommended to use verbs in functions (hasNoRepeatedElements) + LLVM_MAYBE_UNUSED because in the build without assertions it would be a warning otherwise. | |
181 | ImmutableList is one of those value types that prefers copying for clarity. | |
184 | LLVM Coding Style calls for very limited use of auto, so here and the next line would be better with actual types. | |
184 | Let's also reiterate on using functional-style predicates and reimplement it in terms of llvm::all_of or llvm::none_of. | |
232 | The same thing about auto | |
234 | No need | |
clang/test/Analysis/pointer-to-member.cpp | ||
314–333 | Uncomment it, and expect the actual current result. This is where TODO will come in handy. |
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
180 |
How do I put in the LLVM_MAYBE_UNUSED? Sorry for being annoying, but ripgrepping or Googling didn't return anything. | |
clang/test/Analysis/pointer-to-member.cpp | ||
314–333 |
Will do it. |
I downloaded and tested this patch. On the whole it works OK. See my suggestions below.
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
217 | I suggest to use explicit types instead of auto to increase readability. | |
229–232 | It will look clearer if we use an intermediate variable for predicate here. | |
237–241 | We can remove else branch here and move its body out unconditionally as the main one already has return at the end. |
Some more aesthetic cleanup, added a failing test.
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp | ||
---|---|---|
184 | I couldn't do this since calling llvm::all_of (or std::all_of) on an ImmutableList causes a compile error saying that the function std::__iterator_category cannot be found for its iterator. | |
clang/test/Analysis/pointer-to-member.cpp | ||
314–333 | So right now we have a failing test for this, I guess its my job in the next commit to make it pass. :) |
I think we had a bit of misunderstanding about the test. It still should pass (we can't have broken tests, it will disrupt CI bots). Try to put the test case, so it is THERE and you CHANGE the expected outcome when the analyzer behaves correctly. Right now, as I see above, the test crashes.
Actually, that is the reason why I initially had commented out the crashing test. Until reinterpret_cast is properly handled, the test will crash due to an assertion failure, irrespective of what is put as the expected outcome. Is there a way to flag that a test should crash? (I see that after the tests are run, there is an expectedly-failed category). The crash is caused when the static analyzer tries to evaluate base.*bf.
@RedDocMD Please can you take a look at this buildbot failure: http://lab.llvm.org:8011/#/builders/124
The test seems to be broken on Fedora 33 (x86-64 with clang-11):
XPASS: Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp (6531 of 74360) ******************** TEST 'Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp' FAILED ******************** Script: -- : 'RUN: at line 1'; /tmp/_update_lc/r/bin/clang -cc1 -internal-isystem /tmp/_update_lc/r/lib/clang/13.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,debug.ExprInspection -verify /home/dave/ro_s/lp/clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp -- Exit Code: 0
Yes this has been fixed by @vsavchenko in https://github.com/llvm/llvm-project/commit/6f21adac6dd7082f7231ae342d40ed04f4885e79
Hi there, is it happening after this commit?
https://github.com/llvm/llvm-project/commit/6f21adac6dd7082f7231ae342d40ed04f4885e79
We don't need this anymore :)