This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Fix assertion in getAPSIntType
ClosedPublic

Authored by vabridgers on Dec 9 2022, 5:27 PM.

Details

Summary

getAPSIntType crashes when analzying a simple case that uses a fixed
point type. getAPSIntType needs to handle fixed point types differently
to get sign information. LIT and Unittests were added since there were
none previously added.

clang: <root>/clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h:155:
   clang::ento::APSIntType clang::ento::BasicValueFactory::getAPSIntType(clang::QualType) const:
   Assertion `T->isIntegralOrEnumerationType() || Loc::isLocType(T)' failed.
 
 Program received signal SIGABRT, Aborted.
 0x00007ffff66e2387 in raise () from /lib64/libc.so.6
 (gdb) bt
     at <root>/clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h:155
     at <root>/clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h:172
       LHS=0x108965a0, op=clang::BO_Shr, RHS=..., resultTy=...) at
       <root>/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:213
       (this=0x1088e460, state=..., op=clang::BO_Shr, lhs=..., rhs=..., resultTy=...)
       at <root>/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:681

Diff Detail

Event Timeline

vabridgers created this revision.Dec 9 2022, 5:27 PM
Herald added a project: Restricted Project. · View Herald Transcript
vabridgers requested review of this revision.Dec 9 2022, 5:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2022, 5:27 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
vabridgers updated this revision to Diff 481809.Dec 9 2022, 5:52 PM

update commit header

vabridgers edited the summary of this revision. (Show Details)Dec 9 2022, 5:52 PM

Some debug context ...

(gdb) frame 4
#4  0x0000000006f55b73 in clang::ento::BasicValueFactory::getAPSIntType (this=0x1088e470, T=...)
    at <root>/clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h:155
155	    assert(T->isIntegralOrEnumerationType() || Loc::isLocType(T));
(gdb) p T
$1 = {Value = {Value = 276884496}}
(gdb) p T.dump()
BuiltinType 0x1080ec10 'long _Accum'
$2 = void
(gdb) p T->isIntegralOrEnumerationType()
$3 = false
(gdb) p Loc::isLocType(T)
$4 = false
(gdb) p Ctx.getIntWidth(T)
$5 = 64
(gdb) p !T->isSignedIntegerOrEnumerationType()
$6 = true
steakhal requested changes to this revision.Dec 10 2022, 11:18 AM

Nice catch.

I had a look at https://lists.llvm.org/pipermail/llvm-dev/2021-March/149216.html, and http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf.
Your change makes sense to me.

However, I'm a bit worried that other parts of the analyzer might suffer from the same phenomenon. When I grepped for isIntegralOrEnumerationType(), there were like 45 mentions.
I guess, you will worry about it more than I do, but I'd recommend you to have a look at them. You might be able to craft even more crashes.

PS: I was a bit shocked that we don't have any tests mentioning _Accum and _Fract.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h
154–155

I was thinking of using isFixedPointOrIntegerType(). However, that would not accept scoped enums.

inline bool Type::isFixedPointOrIntegerType() const {
  return isFixedPointType() || isIntegerType();
}

inline bool Type::isIntegerType() const {
  if (const auto *BT = dyn_cast<BuiltinType>(CanonicalType))
    return BT->getKind() >= BuiltinType::Bool &&
           BT->getKind() <= BuiltinType::Int128;
  if (const EnumType *ET = dyn_cast<EnumType>(CanonicalType)) {
    // Incomplete enum types are not treated as integer types.
    // FIXME: In C++, enum types are never integer types.
    return IsEnumDeclComplete(ET->getDecl()) &&
      !IsEnumDeclScoped(ET->getDecl());     // <------------ rejects scoped enums :(
  }
  return isBitIntType();
}

inline bool Type::isIntegralOrEnumerationType() const {
  if (const auto *BT = dyn_cast<BuiltinType>(CanonicalType))
    return BT->getKind() >= BuiltinType::Bool &&
           BT->getKind() <= BuiltinType::Int128;

  // Check for a complete enum type; incomplete enum types are not properly an
  // enumeration type in the sense required here.
  if (const auto *ET = dyn_cast<EnumType>(CanonicalType))
    return IsEnumDeclComplete(ET->getDecl());     // <-------- accepts scoped enums!

  return isBitIntType();
}

It looked so neat at first. Ah. I just had to share this.

154–161

I don't think you are supposed to call isSignedIntegerOrEnumerationType() if you have a fixed-point type.

inline bool Type::isSignedFixedPointType() const {
  if (const auto *BT = dyn_cast<BuiltinType>(CanonicalType)) {
    return ((BT->getKind() >= BuiltinType::ShortAccum &&
             BT->getKind() <= BuiltinType::LongAccum) ||
            (BT->getKind() >= BuiltinType::ShortFract &&
             BT->getKind() <= BuiltinType::LongFract) ||
            (BT->getKind() >= BuiltinType::SatShortAccum &&
             BT->getKind() <= BuiltinType::SatLongAccum) ||
            (BT->getKind() >= BuiltinType::SatShortFract &&
             BT->getKind() <= BuiltinType::SatLongFract));
  }
  return false;
}

By looking at the implementation of this, I don't think you could substitute that with isSignedIntegerOrEnumerationType().
Am I wrong about this?

Please demonstrate this by tests.

clang/test/Analysis/fixed-point.c
7–10

It had a few extra spaces. And by casting it to void we could eliminate -Wno-unused as well.

This revision now requires changes to proceed.Dec 10 2022, 11:18 AM

Thanks for the comments. I assumed git-clang-format cleaned up the cruft, but it didn't - that's disappointing. I'll try these things and update the review. Best!

vabridgers edited the summary of this revision. (Show Details)

correct formatting of test case and expand test cases

vabridgers marked 2 inline comments as done.Dec 10 2022, 3:16 PM

Thanks Balazs, I think the comments have been addressed. Let me know if there's anything else to do, or if this is ready to land.

Best!

clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h
154–161

I tried using isSignedIntegerOrEnumerationType() instead of (T->isIntegralOrEnumerationType() || T->isFixedPointType() ... ), but got the same assert :/

I corrected the formatting and expanded the test cases.

clang/test/Analysis/fixed-point.c
7–10

Dang, I used git-clang-format to try cleaning the patch thinking it would clean up formatting, but I guess it didn't :/

I cleaned up the test case and removed the -Wno-unused option.

steakhal added inline comments.Dec 10 2022, 11:12 PM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h
154–161

Is hould have clarified, sorry.

My point is that for constructing the APSIntType, we calculate the bitwidth and the signedness.

My problem is that the calculation is wrong for the signedness in case we have a signed fixedpointtype.
It is wrong because we reach isSignedIntegerOrEnumerationType() with a fixedpoint type. For that even though its signed, it will return false!

And in the end we will have an APSIntType with the wrong signednss.

So my point is that we should probably handle fixedpoint types separately to have a distict return statement for it.
But im jumping to the solution, what I originally wanted to highlight was this.
That was why I requested changes.
And this is what I wanted to see some how intests, but I wont insist if its too difficult to craft.

vabridgers marked an inline comment as done.Dec 11 2022, 7:02 AM
vabridgers added inline comments.
clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h
154–161

In retrospect, your original comment was clear. I did not fully comprehend the comment.

I'll explore a few options. I checked the unittests for a "reference", and it seems our static analysis unittest coverage is sparse. Not sure I need to go that far, but I'll explore the possibility of crafting a unittest for this case that demonstrates the problem and solution.

Thanks again - best!

I think we could settle on something like this:

   APSIntType getAPSIntType(QualType T) const {
+    if (T->isFixedPointType())
+      return APSIntType(Ctx.getIntWidth(T), T->isUnsignedFixedPointType());
+
     // For the purposes of the analysis and constraints, we treat atomics
     // as their underlying types.
     if (const AtomicType *AT = T->getAs<AtomicType>()) {
       T = AT->getValueType();
     }
 
     assert(T->isIntegralOrEnumerationType() || Loc::isLocType(T));
     return APSIntType(Ctx.getIntWidth(T),
-                      !T->isSignedIntegerOrEnumerationType());
+                      T->isUnsignedIntegerOrEnumerationType());
   }

Your original test should be enough.
In a follow-up patch, you could extend the cases and maybe uncover more crashes.
I wouldn't want to block you. It's good enough.

correct handling of sign type per @steakhal comments. Thank you, sir :)

vabridgers edited the summary of this revision. (Show Details)Dec 11 2022, 11:50 AM
vabridgers marked an inline comment as done.Dec 11 2022, 11:55 AM
vabridgers added inline comments.
clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h
154–161

Thanks for your insightful comments, in retrospect I should have dug into this more from the very beginning. Turns out there is an API get the fixed point sign I should be using, and creating a simple unittest had value in working this out (imagine that ! :) ).

So, I updated with the correction and a simple unittest.

Really good progress. Nicestuff. I really appreciate the unittest.
However Ive got some minor comments there :D

clang/test/Analysis/fixed-point.c
10

I would suggest 'Kind' or something similar. So by reading it would feel natural that is an enum.
Its probably a personal preference though. Disregard if you want.

clang/unittests/StaticAnalyzer/APSIntTypeTest.cpp
2

Copypaste typo: SA Options?. Check the rest.

14

Do we really need checkers here? Check the rest of the includes. It will reduce compile times inthe long run.

21

Should the content really be inside the ento namespace?
I would probably advocate for anonymous namespaces instead.

24

Give a meaningful name for this.

30

We use Upper Camel Case for variables as per llvm. In addition we usually refer to QualTypes by 'Ty'.

vabridgers marked 7 inline comments as done.

clean up pass per comments from @steakhal

I think that addresses the last comments. Thanks again :)

steakhal accepted this revision.Dec 11 2022, 2:29 PM

The fix feels suboptimal in readability but let it be whatever.

Functionally feels good. Tests are there and gets the job done.
I wont object.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h
158

else, or else if just complicates the control flow after a return statement.

And I also intentionally recommended to use the isUnsigned.. stuff to make it easier to read withoutthe negation.

This revision is now accepted and ready to land.Dec 11 2022, 2:29 PM