This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Refactor makeNull to makeNullWithWidth (NFC)
ClosedPublic

Authored by vabridgers on Feb 11 2022, 3:35 PM.

Details

Summary

Usages of makeNull need to be deprecated in favor of makeNullWithWidth
for architectures where the pointer size should not be assumed. This can
occur when pointer sizes can be of different sizes, depending on address
space for example. See https://reviews.llvm.org/D118050 as an example.

This was uncovered initially in a downstream compiler project, and
tested through those systems tests.

steakhal performed systems testing across a large set of open source
projects.

Co-authored-by: steakhal
Resolves: https://github.com/llvm/llvm-project/issues/53664

Diff Detail

Event Timeline

vabridgers created this revision.Feb 11 2022, 3:35 PM
vabridgers requested review of this revision.Feb 11 2022, 3:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 11 2022, 3:35 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
NoQ added inline comments.Feb 11 2022, 8:15 PM
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
659–661

I don't think this is the right type. This should be the raw pointer type for the corresponding smart pointer, not a pointer to the smart pointer.

Call.getResultType() should be good.

Also general advice, do not use SVal::getType() when you can obtain the type from the AST instead. SVal::getType() is speculative and it inevitably loses some information (in particular, it doesn't discriminate between lvalues and pointers).

798

Same here.

826

You're using dyn_cast but you're unconditionally dereferencing the result later. Looking at other copies of similar code, you probably need a null check.

(Does running the static analyzer produce a warning here? We have a checker for this!)

892–893

I suspect that this type is going to be bool which is probably not what you want.

NoQ added a comment.Feb 12 2022, 6:24 PM

steakhal performed systems testing across a large set of open source projects.

I'm curious if any findings there caused you to make changes in the patch. If so it'd be great to reduce them into test cases so that other people didn't make the same mistake when they edit your new code.

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
892–893

I think it makes sense to add an assertion in makeNullWithType() to protect against such cases. This will probably also allow you to cover this with a test case.

Thanks for the comments. I'll address and revert back soon.

vabridgers added a comment.EditedFeb 13 2022, 3:45 AM

steakhal performed systems testing across a large set of open source projects.

I'm curious if any findings there caused you to make changes in the patch. If so it'd be great to reduce them into test cases so that other people didn't make the same mistake when they edit your new code.

This patch was driven mostly by the negative cases we uncovered as represented in patch https://reviews.llvm.org/D118050. Basically, we have an internal proprietary multicore CPU with a CPU-local address space and a "global" common address space. We support memcpy intrinsics to copy to/from these memories with the different address space pointers - which also happen to be different pointer widths. The CString checker does an overlap check, and an assert caught a pointer comparison using different pointer widths. So the basic fix is to avoid overlap checks when the pointers are of different address spaces, and then that change ripples into these changes and the required supporting test cases. We also see similar problems in the SMTConv layer that we're just patching up internally for now in a non-principled way that are not suitable for upstreaming (yet).

Since our target is downstream, it's challenging to find equivalent and relevant test cases, but we've made some headway since we found some GPUs also have similar properties. https://reviews.llvm.org/D118050 has one such test case. Perhaps it makes sense to squash that change with this one? @NoQ, do you have thoughts or a recommendation for squashing these two changes?

Looks like our internal test cases can be refactored to use GPU address spaces for supporting test cases, I'll add those.

As always, thanks for the guidance. Best

I explored some test cases, and the rabbit hole gets deeper :) Continuing to explore. Best

Address NoQ comments

vabridgers marked 5 inline comments as done.Feb 17 2022, 12:34 PM

I believe I've addressed all of Artem's comments thus far. This is a NFC patch, so will include no test cases. We detected these inconsistencies in getting NULL pointers in our downstream target, that supports 2 different pointer sizes depending on address space. That review is here, https://reviews.llvm.org/D118050, and is dependent upon these fixes (makeNull -> makeNullWithWidth refactoring). These two patches will be kept separate, with this patch reviewed and submitted first followed by https://reviews.llvm.org/D118050.

Please let me know what else needs to be done to move this patch forward. Thanks - Vince

vabridgers retitled this revision from [analyzer] Refactor makeNull to makeNullWithWidth to [analyzer] Refactor makeNull to makeNullWithWidth (NFC).Feb 17 2022, 12:34 PM
steakhal added inline comments.Feb 18 2022, 3:39 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
368

Add a comment that isAnyPointerType() won't work here.

371

But you also let reference types.

372

You should also remove the BasicValueFactor::getZeroWithPtrWidth() along with BasicValueFactor::getIntWithPtrWidth() since those suffer from the same issue.
Feel free to do that in a separate patch.

372–376
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
123

You don't need the whole CheckerContext. Pass the ASTContext instead.
I would also recommend using Idx instead of ndx

133

You should just return it. Oh wait, you also wrap it inside a pointer. The name of the function does not reflect this. Either rename it or hoist this to the callsite.

402

You should use the CC->getDecl()->getThisType() instead.

660

Similarly to the previous one, use getThisType().

764

Same here.

799

At first glance, all callsites have access to the getThisType(), so we shouldn't dig this up from the specialization decl.

892–893

You probably never want to use Call.getResultType() in this file for this context.
Something like this would be more appropriate: cast<CXXMethodDecl>(Call.getDecl())->getThisType()

refactor based on @steakhal comments

vabridgers marked 10 inline comments as done.Feb 18 2022, 5:37 AM
vabridgers marked an inline comment as done.Feb 18 2022, 5:53 AM
vabridgers added inline comments.
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
372

Will do. I had also left "TODO" comments in a different source file. I'll find those, clean those up also.

steakhal accepted this revision.Feb 23 2022, 4:12 AM

I'm satisfied. I cannot see any issues.
AFAI remember the tests runs uncovered no report changes - as expected from an NFC change.
So it's good to go on my part.
Wait for a review from @NoQ, just to be sure. Unless Artem has objections, just commit this and the whole patch stack next week.

This revision is now accepted and ready to land.Feb 23 2022, 4:12 AM
vabridgers marked an inline comment as done.Mar 1 2022, 5:09 PM

@NoQ - ping!

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 5:09 PM

Is it ok to land this change? Or shall we continue to wait on @NoQ ?

NoQ added inline comments.Mar 8 2022, 4:57 PM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
378

How do you discover the pointer width by looking only at the pointee type? Are objects of specific type always restricted to a specific address space? I.e., does every int * have the same width everywhere in the program? If not, how is this code supposed to work?

vabridgers added inline comments.Mar 18 2022, 5:17 PM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
378

The case this code addresses is represented in the test case clang/test/Analysis/cast-value-notes.cpp, this function ...

24 void evalReferences(const Shape &S) {
25   const auto &C = dyn_cast<Circle>(S);
26   // expected-note@-1 {{Assuming 'S' is not a 'Circle'}}
27   // expected-note@-2 {{Dereference of null pointer}}
28   // expected-warning@-3 {{Dereference of null pointer}}
29 }

The crash occurs from line 25 above.

Debug printing what I see at this point in the code for this case ...

QualType type : LValueReferenceType 0xffea820 'const class clang::Circle &'
`-QualType 0xffea7c1 'const class clang::Circle' const
  `-SubstTemplateTypeParmType 0xffea7c0 'class clang::Circle' sugar
    |-TemplateTypeParmType 0xffe4f90 'X' dependent depth 0 index 0
    | `-TemplateTypeParm 0xffe4f40 'X'
    `-RecordType 0xffea090 'class clang::Circle'
      `-CXXRecord 0xffea000 'Circle'

Context.getPointerType(type) : PointerType 0x10006ab0 'const class clang::Circle &*'
`-LValueReferenceType 0xffea820 'const class clang::Circle &'
  `-QualType 0xffea7c1 'const class clang::Circle' const
    `-SubstTemplateTypeParmType 0xffea7c0 'class clang::Circle' sugar
      |-TemplateTypeParmType 0xffe4f90 'X' dependent depth 0 index 0
      | `-TemplateTypeParm 0xffe4f40 'X'
      `-RecordType 0xffea090 'class clang::Circle'
        `-CXXRecord 0xffea000 'Circle'

Context.getPointerType(type->getPointeeType()) : PointerType 0x10006ae0 'const class clang::Circle *'
`-QualType 0xffea7c1 'const class clang::Circle' const
  `-SubstTemplateTypeParmType 0xffea7c0 'class clang::Circle' sugar
    |-TemplateTypeParmType 0xffe4f90 'X' dependent depth 0 index 0
    | `-TemplateTypeParm 0xffe4f40 'X'
    `-RecordType 0xffea090 'class clang::Circle'
      `-CXXRecord 0xffea000 'Circle'

The LValueReferenceType causes a crash if I do not get a pointer type, looks like ...

clang:  <root>/clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h:219: const llvm::APSInt& clang::ento::BasicValueFactory::getZeroWithTypeSize(clang::QualType): Assertion `T->isScalarType()' failed.

I'm assuming the pointer type retains the address space attribute of the LValueReferenceType.

Context.getPointerType(type->getPointeeType()) produces a QualType : PointerType 'const class clang::Circle *' from an original QualType : LValueReferenceType 'const class clang::Circle &'

Is this not what's wanted - a pointer type instead of a reference, with the same address space qualifiers, or am I missing something in the query?

Thanks

NoQ added inline comments.Mar 18 2022, 10:18 PM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
378

Do I understand correctly that __attribute__((address_space)) is a type attribute that gets applied to values rather than to pointers, so __attribute__((address_space(3))) int *x defines x as "pointer to (int that lives in address_space(3))", so when you unwrap the pointer type you get an "int that lives in address_space(3)"?

Can you share some dumps to demonstrate that the attribute is correctly preserved by this procedure? Damn, a test case could be great if we could have them.

vabridgers added inline comments.Mar 19 2022, 6:24 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
378

I'll see what can be done about a test case, and provide the dumps for evaluation. Best

vabridgers marked an inline comment as not done.

add updated test cases per comments from NoQ

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
378

I developed "a" test methodology to demonstrate this at work. I modified the LIT case to use "dyn_cast<Circle>(S)" and "dyn_cast<attribute((address_space(3))) Circle>(S)", checking the program state using clang_analyzer_printState(). Here's the info I see from the program state and from debug prints in each case.

First, the "dyn_cast<Circle>(S)" case.

void evalReferences(const Shape &S) {
  const auto &C = dyn_cast<Circle>(S);
  clang_analyzer_printState();
  ...
}

  "dynamic_types": [
    { "region": "SymRegion{reg_$0<const struct clang::Shape & S>}", "dyn_type": "const class clang::Circle &", "sub_classable": true }


QualType type : LValueReferenceType 0x1118ef60 'const class clang::Circle &'
`-QualType 0x1118ef01 'const class clang::Circle' const
  `-SubstTemplateTypeParmType 0x1118ef00 'class clang::Circle' sugar
    |-TemplateTypeParmType 0x111887d0 'X' dependent depth 0 index 0
    | `-TemplateTypeParm 0x11188780 'X'
    `-RecordType 0x1118e710 'class clang::Circle'
      `-CXXRecord 0x1118e680 'Circle'

Context.getPointerType(type) : PointerType 0x111a94d0 'const class clang::Circle &*'
`-LValueReferenceType 0x1118ef60 'const class clang::Circle &'
  `-QualType 0x1118ef01 'const class clang::Circle' const
    `-SubstTemplateTypeParmType 0x1118ef00 'class clang::Circle' sugar
      |-TemplateTypeParmType 0x111887d0 'X' dependent depth 0 index 0
      | `-TemplateTypeParm 0x11188780 'X'
      `-RecordType 0x1118e710 'class clang::Circle'
        `-CXXRecord 0x1118e680 'Circle'

Context.getPointerType(type->getPointeeType()) : PointerType 0x111a9500 'const class clang::Circle *'
`-QualType 0x1118ef01 'const class clang::Circle' const
  `-SubstTemplateTypeParmType 0x1118ef00 'class clang::Circle' sugar
    |-TemplateTypeParmType 0x111887d0 'X' dependent depth 0 index 0
    | `-TemplateTypeParm 0x11188780 'X'
    `-RecordType 0x1118e710 'class clang::Circle'
      `-CXXRecord 0x1118e680 'Circle'

then the "dyn_cast<attribute((address_space(3))) Circle>(S)" case

void evalReferences(const Shape &S) {
  const auto &C = dyn_cast<__attribute__((address_space(3))) Circle>(S);
  clang_analyzer_printState();
  ...
}

 "dynamic_types": [
    { "region": "SymRegion{reg_$0<const struct clang::Shape & S>}", "dyn_type": "const __attribute__((address_space(3))) class clang::Circle &", "sub_classable": true }


QualType type : LValueReferenceType 0x11527030 'const __attribute__((address_space(3))) class clang::Circle &'
`-QualType 0x11526fd1 'const __attribute__((address_space(3))) class clang::Circle' const
  `-SubstTemplateTypeParmType 0x11526fd0 '__attribute__((address_space(3))) class clang::Circle' sugar
    |-TemplateTypeParmType 0x115207d0 'X' dependent depth 0 index 0
    | `-TemplateTypeParm 0x11520780 'X'
    `-QualType 0x11526ce8 '__attribute__((address_space(3))) class clang::Circle' __attribute__((address_space(3)))
      `-RecordType 0x11526710 'class clang::Circle'
        `-CXXRecord 0x11526680 'Circle'

Context.getPointerType(type) : PointerType 0x115424e0 'const __attribute__((address_space(3))) class clang::Circle &*'
`-LValueReferenceType 0x11527030 'const __attribute__((address_space(3))) class clang::Circle &'
  `-QualType 0x11526fd1 'const __attribute__((address_space(3))) class clang::Circle' const
    `-SubstTemplateTypeParmType 0x11526fd0 '__attribute__((address_space(3))) class clang::Circle' sugar
      |-TemplateTypeParmType 0x115207d0 'X' dependent depth 0 index 0
      | `-TemplateTypeParm 0x11520780 'X'
      `-QualType 0x11526ce8 '__attribute__((address_space(3))) class clang::Circle' __attribute__((address_space(3)))
        `-RecordType 0x11526710 'class clang::Circle'
          `-CXXRecord 0x11526680 'Circle'

Context.getPointerType(type->getPointeeType()) : PointerType 0x11542510 'const __attribute__((address_space(3))) class clang::Circle *'
`-QualType 0x11526fd1 'const __attribute__((address_space(3))) class clang::Circle' const
  `-SubstTemplateTypeParmType 0x11526fd0 '__attribute__((address_space(3))) class clang::Circle' sugar
    |-TemplateTypeParmType 0x115207d0 'X' dependent depth 0 index 0
    | `-TemplateTypeParm 0x11520780 'X'
    `-QualType 0x11526ce8 '__attribute__((address_space(3))) class clang::Circle' __attribute__((address_space(3)))
      `-RecordType 0x11526710 'class clang::Circle'
        `-CXXRecord 0x11526680 'Circle'

So it looks to me like the address space attribute is retained through both the "Context.getPointerType(type)" and "Context.getPointerType(type->getPointeeType())" invocations.

This also revealed the core.NullDereference checker is ignoring pointers with address space attributes - so the test cases need to be different for this change (at least for now).

I think this is enough to prove this is a NFC. Please let me know if there's anything else I need to do here.

Best

NoQ accepted this revision.Mar 21 2022, 11:12 PM

All clear now. Amazing, thank you for your detailed response, I've corrected my understanding of address spaces so I'll be able to understand future patches better.