This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Produce SymbolCast symbols for integral types in SValBuilder::evalCast
ClosedPublic

Authored by ASDenysPetrov on Jul 2 2021, 2:47 AM.

Details

Summary

Produce SymbolCast for integral types in evalCast function. Apply several simplification techniques while producing the symbols. Added a boolean option support-symbolic-integer-casts under -analyzer-config flag. Disabled the feature by default.

Due to suggestions for splitting D103096 this patch has appeared. It's the first part of the D103096 revision.

Diff Detail

Event Timeline

ASDenysPetrov created this revision.Jul 2 2021, 2:47 AM
ASDenysPetrov requested review of this revision.Jul 2 2021, 2:47 AM

Hey, thanks for starting on splitting into more pieces!

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
96

What does it do and what should I give it?

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
510–546

I'd like to see the motivation about why this code is removed.
My main concern is this:

  • If removing evalIntegralCast is essential for this feature and is not an NFC: it should also obey the new analyzer option.
  • If it is NFC, and we can safely remove this function no matter what the value of the option is, it should be a separate patch.
clang/test/Analysis/bool-assignment.c
46–50

If Z3 and not Z3 are the same now, we can simply merge two cases and remove preprocessor directive.

clang/test/Analysis/produce-symbolcast.cpp
1

This test is failing on my desktop, when I downloaded your patch:

error: 'warning' diagnostics expected but not seen: 
  File .../produce-symbolcast.cpp Line 67: (long) (reg_$0<char x>)
  File .../produce-symbolcast.cpp Line 79: (long long) ((unsigned long) (reg_$0<char x>))
  File .../produce-symbolcast.cpp Line 122: (unsigned long) (reg_$0<char x>)
  File .../produce-symbolcast.cpp Line 134: (unsigned long long) ((unsigned long) (reg_$0<char x>))
  File .../produce-symbolcast.cpp Line 192: (long) (reg_$0<short x>)
  File .../produce-symbolcast.cpp Line 204: (long long) ((unsigned long) (reg_$0<short x>))
  File .../produce-symbolcast.cpp Line 247: (unsigned long) (reg_$0<short x>)
  File .../produce-symbolcast.cpp Line 259: (unsigned long long) ((unsigned long) (reg_$0<short x>))
  File .../produce-symbolcast.cpp Line 317: (long) (reg_$0<int x>)
  File .../produce-symbolcast.cpp Line 329: (long long) ((unsigned long) (reg_$0<int x>))
  File .../produce-symbolcast.cpp Line 372: (unsigned long) (reg_$0<int x>)
  File .../produce-symbolcast.cpp Line 384: (unsigned long long) ((unsigned long) (reg_$0<int x>))
  File .../produce-symbolcast.cpp Line 448: (long long) (reg_$0<long x>)
  File .../produce-symbolcast.cpp Line 454: (long long) ((unsigned long) (reg_$0<long x>))
  File .../produce-symbolcast.cpp Line 492: (unsigned long) (reg_$0<long x>)
  File .../produce-symbolcast.cpp Line 497: (unsigned long) (reg_$0<long x>)
  File .../produce-symbolcast.cpp Line 503: (unsigned long long) (reg_$0<long x>)
  File .../produce-symbolcast.cpp Line 509: (unsigned long long) ((unsigned long) (reg_$0<long x>))
  File .../produce-symbolcast.cpp Line 562: (long) (reg_$0<llong x>)
  File .../produce-symbolcast.cpp Line 567: (long) (reg_$0<llong x>)
  File .../produce-symbolcast.cpp Line 574: (long) (reg_$0<llong x>)
  File .../produce-symbolcast.cpp Line 579: (unsigned long) (reg_$0<llong x>)
  File .../produce-symbolcast.cpp Line 617: (unsigned long) (reg_$0<llong x>)
  File .../produce-symbolcast.cpp Line 622: (unsigned long) (reg_$0<llong x>)
  File .../produce-symbolcast.cpp Line 629: (unsigned long long) ((long) (reg_$0<llong x>))
  File .../produce-symbolcast.cpp Line 634: (unsigned long long) ((unsigned long) (reg_$0<llong x>))
  File .../produce-symbolcast.cpp Line 937: (long) (reg_$0<uint x>)
  File .../produce-symbolcast.cpp Line 949: (long long) ((long) (reg_$0<uint x>))
  File .../produce-symbolcast.cpp Line 992: (unsigned long) (reg_$0<uint x>)
  File .../produce-symbolcast.cpp Line 1004: (unsigned long long) ((long) (reg_$0<uint x>))
  File .../produce-symbolcast.cpp Line 1062: (long) (reg_$0<ulong x>)
  File .../produce-symbolcast.cpp Line 1067: (long) (reg_$0<ulong x>)
  File .../produce-symbolcast.cpp Line 1074: (long long) ((long) (reg_$0<ulong x>))
  File .../produce-symbolcast.cpp Line 1078: (long long) (reg_$0<ulong x>)
  File .../produce-symbolcast.cpp Line 1129: (unsigned long long) ((long) (reg_$0<ulong x>))
  File .../produce-symbolcast.cpp Line 1133: (unsigned long long) (reg_$0<ulong x>)
  File .../produce-symbolcast.cpp Line 1187: (long) (reg_$0<ullong x>)
  File .../produce-symbolcast.cpp Line 1192: (long) (reg_$0<ullong x>)
  File .../produce-symbolcast.cpp Line 1199: (long long) ((long) (reg_$0<ullong x>))
  File .../produce-symbolcast.cpp Line 1204: (long long) ((unsigned long) (reg_$0<ullong x>))
  File .../produce-symbolcast.cpp Line 1242: (unsigned long) (reg_$0<ullong x>)
  File .../produce-symbolcast.cpp Line 1247: (unsigned long) (reg_$0<ullong x>)
  File .../produce-symbolcast.cpp Line 1254: (unsigned long long) ((long) (reg_$0<ullong x>))
  File .../produce-symbolcast.cpp Line 1259: (unsigned long long) ((unsigned long) (reg_$0<ullong x>))
error: 'warning' diagnostics seen but not expected: 
  File .../produce-symbolcast.cpp Line 67: (long) ((unsigned int) (reg_$0<char x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 79: (long long) (reg_$0<char x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 122: (unsigned long) ((unsigned int) (reg_$0<char x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 134: (unsigned long long) (reg_$0<char x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 192: (long) ((unsigned int) (reg_$0<short x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 204: (long long) (reg_$0<short x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 247: (unsigned long) ((unsigned int) (reg_$0<short x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 259: (unsigned long long) (reg_$0<short x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 317: (long) ((unsigned int) (reg_$0<int x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 329: (long long) (reg_$0<int x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 372: (unsigned long) ((unsigned int) (reg_$0<int x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 384: (unsigned long long) (reg_$0<int x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 448: (long long) ((int) (reg_$0<long x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 454: (long long) (reg_$0<long x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 492: (unsigned long) ((int) (reg_$0<long x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 497: (unsigned long) ((unsigned int) (reg_$0<long x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 503: (unsigned long long) ((int) (reg_$0<long x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 509: (unsigned long long) (reg_$0<long x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 562: (long) ((int) (reg_$0<llong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 567: (long) ((unsigned int) (reg_$0<llong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 574: reg_$0<llong x> [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 579: reg_$0<llong x> [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 617: (unsigned long) ((int) (reg_$0<llong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 622: (unsigned long) ((unsigned int) (reg_$0<llong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 629: (unsigned long long) (reg_$0<llong x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 634: (unsigned long long) (reg_$0<llong x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 937: (long) ((int) (reg_$0<uint x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 949: (long long) (reg_$0<uint x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 992: (unsigned long) ((int) (reg_$0<uint x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1004: (unsigned long long) (reg_$0<uint x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1062: (long) ((int) (reg_$0<ulong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1067: (long) ((unsigned int) (reg_$0<ulong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1074: (long long) (reg_$0<ulong x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1078: (long long) ((unsigned int) (reg_$0<ulong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1129: (unsigned long long) (reg_$0<ulong x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1133: (unsigned long long) ((unsigned int) (reg_$0<ulong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1187: (long) ((int) (reg_$0<ullong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1192: (long) ((unsigned int) (reg_$0<ullong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1199: (long long) (reg_$0<ullong x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1204: (long long) (reg_$0<ullong x>) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1242: (unsigned long) ((int) (reg_$0<ullong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1247: (unsigned long) ((unsigned int) (reg_$0<ullong x>)) [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1254: reg_$0<ullong x> [debug.ExprInspection]
  File .../produce-symbolcast.cpp Line 1259: reg_$0<ullong x> [debug.ExprInspection]
88 errors generated.
vsavchenko added inline comments.Jul 2 2021, 4:14 AM
clang/test/Analysis/range_casts.c
125–126

The main purpose of the new option is exactly to prevent "temporary regressions" so that everything works just as it did before

I will update the patch soon according to your suggestions.

clang/test/Analysis/bool-assignment.c
46–50

+1

clang/test/Analysis/produce-symbolcast.cpp
1

I expected something of that. Thank you for the log. Could you please tell me what is sizeof(long) on you PC while test running?

vsavchenko added inline comments.Jul 2 2021, 6:29 AM
clang/test/Analysis/produce-symbolcast.cpp
1

sizeof(long) == 8

PS: buildbot also found it
https://reviews.llvm.org/harbormaster/unit/view/800270/

steakhal added inline comments.Jul 2 2021, 7:08 AM
clang/test/Analysis/produce-symbolcast.cpp
1

Maybe we shall explicitly pass the triple, and test for the major archs, such as linux 32 bits, linux 64 bits, and probably at least one from Microsoft?
@ASDenysPetrov, you use Windows, don't you?

16

char and signed char mean different types.
We should probably have schar and test it as well.

Retruned back SValBuilder::evalIntegralCast to eliminated regression. Fixed tests when long == 4bytes. Added doc-comment to simplifySymbolCast.

ASDenysPetrov added inline comments.Jul 2 2021, 7:46 AM
clang/test/Analysis/produce-symbolcast.cpp
1

Fair point! I'll see into it.

16

I'll add tests for schar. Thnx.

vsavchenko added a comment.EditedJul 2 2021, 8:13 AM

Also, although the test is very extensive, it is pretty lopsided at the same time. C-style cast is only one case out of the myriad of all explicit and, more importantly, implicit casts.

vsavchenko added inline comments.Jul 2 2021, 8:17 AM
clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
318

BTW, mb it should be less specific? Something like ShouldSupportSymbolicIntegerCasts?
BTW 2, do you even plan on supporting symbolic casts in other cases?

Also, although the test is very extensive, it is pretty lopsided at the same time. C-style cast is only one case out of the myriad of all explicit and, more importantly, implicit casts.

I agree in a part of size, but these C-style casts generates the same AST tree as all other ex/implicit forms of casts . Though, casts can look completely differently but at a low level they are very similar.
Here is an example:

void test(long x) {
  (llong)(short)(char)x; 
}

FunctionDecl 0xbc306b8 <test.cpp:1:1, line:3:1> line:1:6 test 'void (long)'
  |-ParmVarDecl 0xbc305f0 <col:11, col:16> col:16 used x 'long'
  `-CompoundStmt 0xbc9f4b0 <col:19, line:3:1>
    `-CStyleCastExpr 0xbc9f488 <line:2:3, col:27> 'long long' <NoOp>
      `-ImplicitCastExpr 0xbc9f470 <col:14, col:27> 'long long' <IntegralCast> part_of_explicit_cast
        `-CStyleCastExpr 0xbc9f430 <col:14, col:27> 'short' <NoOp>
          `-ImplicitCastExpr 0xbc9f418 <col:21, col:27> 'short' <IntegralCast> part_of_explicit_cast
            `-CStyleCastExpr 0xbc9f3d8 <col:21, col:27> 'char' <NoOp>
              `-ImplicitCastExpr 0xbc9f3c0 <col:27> 'char' <IntegralCast> part_of_explicit_cast
                `-ImplicitCastExpr 0xbc9f3a8 <col:27> 'long' <LValueToRValue> part_of_explicit_cast
                  `-DeclRefExpr 0xbc9f378 <col:27> 'long' lvalue ParmVar 0xbc305f0 'x' 'long'

and

void test(long x) {  
  char c = x;
  if(static_cast<short>(c) == -1ll);
}

`-FunctionDecl 0xbc606b8 <test.cpp:1:1, line:4:1> line:1:6 test 'void (long)'
  |-ParmVarDecl 0xbc605f0 <col:11, col:16> col:16 used x 'long'
  `-CompoundStmt 0xbcd0598 <col:19, line:4:1>
    |-DeclStmt 0xbcd0450 <line:2:3, col:13>
    | `-VarDecl 0xbcd0398 <col:3, col:12> col:8 used c 'char' cinit
    |   `-ImplicitCastExpr 0xbcd0438 <col:12> 'char' <IntegralCast>
    |     `-ImplicitCastExpr 0xbcd0420 <col:12> 'long' <LValueToRValue>
    |       `-DeclRefExpr 0xbcd0400 <col:12> 'long' lvalue ParmVar 0xbc605f0 'x' 'long'
    `-IfStmt 0xbcd0578 <line:3:3, col:36>
      |-BinaryOperator 0xbcd0550 <col:6, col:32> 'bool' '=='
      | |-ImplicitCastExpr 0xbcd0538 <col:6, col:26> 'long long' <IntegralCast>
      | | `-CXXStaticCastExpr 0xbcd04d0 <col:6, col:26> 'short' static_cast<short> <NoOp>
      | |   `-ImplicitCastExpr 0xbcd04b8 <col:25> 'short' <IntegralCast> part_of_explicit_cast
      | |     `-ImplicitCastExpr 0xbcd04a0 <col:25> 'char' <LValueToRValue> part_of_explicit_cast
      | |       `-DeclRefExpr 0xbcd0468 <col:25> 'char' lvalue Var 0xbcd0398 'c' 'char'
      | `-UnaryOperator 0xbcd0520 <col:31, col:32> 'long long' prefix '-'
      |   `-IntegerLiteral 0xbcd0500 <col:32> 'long long' 1
      `-NullStmt 0xbcd0570 <col:36>

We can see similar IntegralCast for both variants. I was aiming to generate all the cases of IntegralCast and C-style cast is enough for me.

clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
318

ShouldSupportSymbolicIntegerCasts?

Thanks. I thought about an appropriate name but failed to come up. That's what we need!

BTW 2, do you even plan on supporting symbolic casts in other cases?

I think about adding bool-int, int-bool, ptr-int, int-ptr to the list.

ASDenysPetrov edited the summary of this revision. (Show Details)

Changed option name to support-symbolic-integer-casts. Added signed char to tests. Splitted tests into x86 and x64 versions, where sizeof(long) is 4 and 8 correspondingly.
P.S. splitting tests is due to absence of any reliable way to include specific code with preprocessor directives based on size of long on a specific platform.

Just a ping. I'd like to have this patch stack loaded somewhen :)

NoQ accepted this revision.Jan 10 2022, 1:50 PM

This looks great with the option flag. Landing this patch will enable more people to test the new mode and produce feedback on whether the constraint solver keeps working well enough in presence of the new symbols.

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
942

I really hope these are correct with respect to sign extension 🤞

clang/test/Analysis/produce-symbolcast_x64.cpp
15 ↗(On Diff #388513)

There's clang_analyzer_denote()/clang_analyzer_express() which is slightly better because it avoids testing the specific dump format (so we could change it more easily if we want). No pressure though, if it's too much work to re-do then let's keep them as is.

This revision is now accepted and ready to land.Jan 10 2022, 1:50 PM

This looks great with the option flag. Landing this patch will enable more people to test the new mode and produce feedback on whether the constraint solver keeps working well enough in presence of the new symbols.

Many thanks for your approval, @NoQ! The upset thing is that to get this loaded we also should close this parent revision D103094 :-(

This looks great with the option flag. Landing this patch will enable more people to test the new mode and produce feedback on whether the constraint solver keeps working well enough in presence of the new symbols.

Many thanks for your approval, @NoQ! The upset thing is that to get this loaded we also should close this parent revision D103094 :-(

Actually, this patch does not depend on castTo and thus it is independent from the rest of the patch stack. So we could just land it as it is. (In this patch, you just create the SymbolCasts but there is no handling of that in the constraint manager which needs the castTo.)

martong accepted this revision.Jan 12 2022, 9:10 AM

Thank you @martong.
I'll load it ASAP. It's great to see symcasts is closer.

Shame on me :) I'm mixed up in my own patches. I forgot that I separated them consciously for this purpose.

Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2022, 6:44 AM