This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Fix static_cast on pointer-to-member handling
ClosedPublic

Authored by RedDocMD on Feb 2 2021, 9:02 AM.

Details

Summary

This commit fixes bug #48739. The bug was caused by the way static_casts
on pointer-to-member caused the CXXBaseSpecifier list of a
MemberToPointer to grow instead of shrink.
The list is now grown by implicit casts and corresponding entries are
removed by static_casts. No-op static_casts cause no effect.

Diff Detail

Event Timeline

RedDocMD created this revision.Feb 2 2021, 9:02 AM
RedDocMD requested review of this revision.Feb 2 2021, 9:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2021, 9:02 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
RedDocMD retitled this revision from [analyer] Fix static_cast on pointer-to-member handling to [analyzer] Fix static_cast on pointer-to-member handling.Feb 2 2021, 9:05 AM

Thanks for taking your time and getting to the root cause of it!

clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
209

std::list is like the slowest container, and it should be avoided in almost every case.
Considering the semantics of it, you can definitely use llvm::SmallVector.

222

llvm::find_if with a lambda capturing PathBase?

223

It's better to be more verbose in the assertions.
Additionally, I'm not sure that it is clear what it is all about because pointer-to-members do not have base specifiers.

224

In order to avoid removals from the middle, you can use a more functional approach to collections and iterators and use llvm::make_filter_range and iterate over it in the loop below,

clang/test/Analysis/pointer-to-member.cpp
304

What about other type of casts?

RedDocMD added inline comments.Feb 2 2021, 7:14 PM
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
223

Well, the PointerToMember contains a PointerToMemberData, which has a list of CXXBaseSpecifier. Should I put something like: "PointerToMemberData has insufficient number of base specifiers"?

RedDocMD updated this revision to Diff 321074.Feb 3 2021, 5:40 AM

Remoevd the use of std::list with llvm::SmallVector. Removed the need to delete elements.

RedDocMD marked 3 inline comments as done.Feb 3 2021, 5:41 AM
RedDocMD updated this revision to Diff 321078.Feb 3 2021, 5:51 AM

Made assertion more verbose

RedDocMD marked an inline comment as done.Feb 3 2021, 5:51 AM

@vsavchenko could you please see the current logic if it is better now?

clang/test/Analysis/pointer-to-member.cpp
304

I am looking into this.

Also, all these PathRange and PathList don't really reflect what they stand for semantically and talk more about implementation. It's not the problem of this patch, but it doesn't make your logic easier to comprehend.

clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
224

I think it's all a bit too complicated now.

Right now you have two nested loops:

  1. over PathRange
  2. over BaseList (in find_if)

It won't be a problem to put them in another order, so when you make a filter_range, you can have llvm::find_if which will iterate over PathRange and captures the given BaseList element. So, yes nested lambdas.

Here is a sketch of the solution:

auto BaseListFilteredRange = llvm::make_filter_range(
    BaseList, // you don't need to make_range here, BaseList IS a range
    [&PathRange](const CXXBaseSpecifier *Base) {
      return llvm::find_if(PathRange, [Base](const auto &It) {
        return Base->getType() == It.first->getType();
      }) == PathRange.end();
    });

Aaaand while I was writing that, it got me thinking that we don't need BaseList in the first place:

for (const CXXBaseSpecifier *Base : PathList)
  if (llvm::none_of(PathRange,
                    [Base](const auto &It) { return Base->getType() == It.first->getType(); })
      PathList = CXXBaseListFactory.add(Base, PathList);
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
224

Hmm, I really should brush up on my functional programming and think more in terms of STL/LLVM algorithms. Thank you for pointing out this solution, @vsavchenko!

One thing I noticed though, the shortened logic that you suggested removes all matching instances from PathList, while my intention was to remove only the first one of every kind. If there are no repetitions, it is the same. And, normally, repetitions should not be present in the PathList retrieved from PTM. It is a bug elsewhere if that happens. So should I add an assert for that here?

vsavchenko added inline comments.Feb 3 2021, 6:54 AM
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
224

Yes, that's true.
Assertions will usually make intensions of the original author more clear for future readers.
Plus, I think it can be also good to add a comment (something similar to your summary for this change) to describe what is going on here and WHY it is done.

clang/lib/StaticAnalyzer/Core/ExprEngineC.cpp
531–533

BTW, it looks like this can be replaced by CastE->path()

Harbormaster completed remote builds in B87692: Diff 321074.
RedDocMD updated this revision to Diff 321139.Feb 3 2021, 10:02 AM

Cleaner implementation of the popping logic, using STL/LLVM algorithms

RedDocMD marked 2 inline comments as done.Feb 3 2021, 10:04 AM

@vsavchenko , I have used the logic that you suggested and put in a better assertion.

RedDocMD marked an inline comment as done.Feb 4 2021, 9:03 AM
RedDocMD updated this revision to Diff 321476.Feb 4 2021, 9:05 AM

Updated to use new function

@vsavchenko, after some experimentation, I found out that handling reinterpret_cast will be a real pain. Consider the following:

struct Base {
	int field;
};

struct Derived : public Base {};

struct DoubleDerived : public Derived {};

struct Some {};

void f() {
	int DoubleDerived::* ddf = &Base::field;
	int Base::* bf = reinterpret_cast<int Base::*>(reinterpret_cast<int Derived::*>(reinterpret_cast<int Base::*>(ddf)));
	int Some::* sf = reinterpret_cast<int Some::*>(ddf);
	Base base;
	base.*bf;
}

The definition of bf can be handled I guess by manually discovering when to insert a sub-class or when to remove. It will require a bit of code but I guess is doable.
As for the next one, also it has to be manually worked out whether it makes sense to process this statement further and add a node to the Exploded Graph. For the example I gave I don't think it makes a lot of sense. Multiple inheritance will make the task a lot worse as far as I can tell.
Should I try to achieve this in another commit? What are your thoughts on this?

@vsavchenko, after some experimentation, I found out that handling reinterpret_cast will be a real pain. Consider the following:

struct Base {
	int field;
};

struct Derived : public Base {};

struct DoubleDerived : public Derived {};

struct Some {};

void f() {
	int DoubleDerived::* ddf = &Base::field;
	int Base::* bf = reinterpret_cast<int Base::*>(reinterpret_cast<int Derived::*>(reinterpret_cast<int Base::*>(ddf)));
	int Some::* sf = reinterpret_cast<int Some::*>(ddf);
	Base base;
	base.*bf;
}

The definition of bf can be handled I guess by manually discovering when to insert a sub-class or when to remove. It will require a bit of code but I guess is doable.
As for the next one, also it has to be manually worked out whether it makes sense to process this statement further and add a node to the Exploded Graph. For the example I gave I don't think it makes a lot of sense. Multiple inheritance will make the task a lot worse as far as I can tell.
Should I try to achieve this in another commit? What are your thoughts on this?

Yes, another commit is fine. However, I'd say that it is still good to add test cases with TODOs with this commit.

clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
27

We don't need this anymore :)

vsavchenko added inline comments.Feb 5 2021, 12:21 AM
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
210

Containers like map and set usually return a pair for insert, where the second element tells us whether that element was inserted or was already there. It is usually a bad practice to use find in the vicinity of insert for this very reason, you double the work.

Also, a not about this whole part of the code, if it's done only for assertions, none of this code should be executed in the build without assertions. You'll need to extract this into a function that you call from assert. It means that in release builds it will be unused and should be marked with LLVM_MAYBE_UNUSED.

RedDocMD updated this revision to Diff 321714.Feb 5 2021, 5:03 AM
RedDocMD marked an inline comment as done.

Added TODO for reinterpret_cast, moved assertion code to its own block

RedDocMD marked 2 inline comments as done.Feb 5 2021, 5:05 AM

@vsavchenko , does it look okay now?

clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
27

Wow, sharp eyes!

vsavchenko added inline comments.Feb 5 2021, 6:05 AM
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
181

nit: functions represent actions, in languages verbs act the same way, so it's recommended to use verbs in functions (hasNoRepeatedElements)

+ LLVM_MAYBE_UNUSED because in the build without assertions it would be a warning otherwise.

182

ImmutableList is one of those value types that prefers copying for clarity.

185

LLVM Coding Style calls for very limited use of auto, so here and the next line would be better with actual types.

185

Let's also reiterate on using functional-style predicates and reimplement it in terms of llvm::all_of or llvm::none_of.

220

The same thing about auto

222

No need

clang/test/Analysis/pointer-to-member.cpp
314–333

Uncomment it, and expect the actual current result. This is where TODO will come in handy.

RedDocMD marked an inline comment as done.Feb 5 2021, 6:44 AM
RedDocMD added inline comments.
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
181

nit: functions represent actions, in languages verbs act the same way, so it's recommended to use verbs in functions (hasNoRepeatedElements)

+ LLVM_MAYBE_UNUSED because in the build without assertions it would be a warning otherwise.

How do I put in the LLVM_MAYBE_UNUSED? Sorry for being annoying, but ripgrepping or Googling didn't return anything.

clang/test/Analysis/pointer-to-member.cpp
314–333

Uncomment it, and expect the actual current result. This is where TODO will come in handy.

Will do it.
Just one clarification: the static analyzer tests only serve to check whether the Static Analyzer crashes or hits an assertion error, or is it something more?

vsavchenko added inline comments.Feb 5 2021, 8:09 AM
clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
181

Sorry, my bad, it's LLVM_ATTRIBUTE_UNUSED

clang/test/Analysis/pointer-to-member.cpp
314–333

Mostly they check for reported results, you can see special comments like // expect-warning{...} in almost every test.

I downloaded and tested this patch. On the whole it works OK. See my suggestions below.

clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
205

I suggest to use explicit types instead of auto to increase readability.

217–220

It will look clearer if we use an intermediate variable for predicate here.

225–229

We can remove else branch here and move its body out unconditionally as the main one already has return at the end.

RedDocMD updated this revision to Diff 321816.Feb 5 2021, 9:59 AM
RedDocMD marked 10 inline comments as done.

Some more aesthetic cleanup, added a failing test.

clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp
185

I couldn't do this since calling llvm::all_of (or std::all_of) on an ImmutableList causes a compile error saying that the function std::__iterator_category cannot be found for its iterator.

clang/test/Analysis/pointer-to-member.cpp
314–333

So right now we have a failing test for this, I guess its my job in the next commit to make it pass. :)

@vsavchenko are there some more changes you want done?

@vsavchenko are there some more changes you want done?

I think we had a bit of misunderstanding about the test. It still should pass (we can't have broken tests, it will disrupt CI bots). Try to put the test case, so it is THERE and you CHANGE the expected outcome when the analyzer behaves correctly. Right now, as I see above, the test crashes.

I think we had a bit of misunderstanding about the test. It still should pass (we can't have broken tests, it will disrupt CI bots). Try to put the test case, so it is THERE and you CHANGE the expected outcome when the analyzer behaves correctly. Right now, as I see above, the test crashes.

Actually, that is the reason why I initially had commented out the crashing test. Until reinterpret_cast is properly handled, the test will crash due to an assertion failure, irrespective of what is put as the expected outcome. Is there a way to flag that a test should crash? (I see that after the tests are run, there is an expectedly-failed category). The crash is caused when the static analyzer tries to evaluate base.*bf.

I think we had a bit of misunderstanding about the test. It still should pass (we can't have broken tests, it will disrupt CI bots). Try to put the test case, so it is THERE and you CHANGE the expected outcome when the analyzer behaves correctly. Right now, as I see above, the test crashes.

Actually, that is the reason why I initially had commented out the crashing test. Until reinterpret_cast is properly handled, the test will crash due to an assertion failure, irrespective of what is put as the expected outcome. Is there a way to flag that a test should crash? (I see that after the tests are run, there is an expectedly-failed category). The crash is caused when the static analyzer tries to evaluate base.*bf.

Yes, it'd be better to extract this case into a separate file and mark it as XFAIL

RedDocMD updated this revision to Diff 322094.Feb 8 2021, 6:11 AM

Moved failing test of reinterpret_cast to its own file, marked to failx

Yes, it'd be better to extract this case into a separate file and mark it as XFAIL

Done that.

@vsavchenko, anything else to add?

vsavchenko accepted this revision.Feb 9 2021, 12:59 AM

Great job! Thanks for dealing with all my nit-picking!

This revision is now accepted and ready to land.Feb 9 2021, 12:59 AM
This revision was landed with ongoing or failed builds.Feb 15 2021, 12:48 AM
This revision was automatically updated to reflect the committed changes.

@RedDocMD Please can you take a look at this buildbot failure: http://lab.llvm.org:8011/#/builders/124

@RedDocMD Please can you take a look at this buildbot failure: http://lab.llvm.org:8011/#/builders/124

Yes. Thanks for the ping 😀

The test seems to be broken on Fedora 33 (x86-64 with clang-11):

XPASS: Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp (6531 of 74360)
******************** TEST 'Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   /tmp/_update_lc/r/bin/clang -cc1 -internal-isystem /tmp/_update_lc/r/lib/clang/13.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,debug.ExprInspection -verify /home/dave/ro_s/lp/clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp
--
Exit Code: 0

The test seems to be broken on Fedora 33 (x86-64 with clang-11):

XPASS: Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp (6531 of 74360)
******************** TEST 'Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   /tmp/_update_lc/r/bin/clang -cc1 -internal-isystem /tmp/_update_lc/r/lib/clang/13.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,debug.ExprInspection -verify /home/dave/ro_s/lp/clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp
--
Exit Code: 0

Yes this has been fixed by @vsavchenko in https://github.com/llvm/llvm-project/commit/6f21adac6dd7082f7231ae342d40ed04f4885e79

The test seems to be broken on Fedora 33 (x86-64 with clang-11):

XPASS: Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp (6531 of 74360)
******************** TEST 'Clang :: Analysis/reinterpret-cast-pointer-to-member.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   /tmp/_update_lc/r/bin/clang -cc1 -internal-isystem /tmp/_update_lc/r/lib/clang/13.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,debug.ExprInspection -verify /home/dave/ro_s/lp/clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp
--
Exit Code: 0

Hi there, is it happening after this commit?
https://github.com/llvm/llvm-project/commit/6f21adac6dd7082f7231ae342d40ed04f4885e79