Page MenuHomePhabricator

[analyzer] Fix cast evaluation on scoped enums in ExprEngine
Needs ReviewPublic

Authored by steakhal on Aug 7 2020, 8:10 AM.

Details

Summary

We ignored the cast if the enum was scoped.
This is bad since there is no implicit conversion from the scoped enum to the corresponding underlying type.

The fix is basically: isIntegralOrEnumerationType() -> isIntegralOr**Unscoped**EnumerationType()

This materialized in crashes on analyzing the LLVM itself using the Z3 refutation.
Refutation synthesized the given Z3 Binary expression (BO_And of unsigned char aka. 8bits and an int 32 bits) with the wrong bitwidth in the end, which triggered an assert.

Now, we evaluate the cast according to the standard.

This bug could have been triggered using the Z3 CM according to https://bugs.llvm.org/show_bug.cgi?id=44030

Diff Detail

Unit TestsFailed

TimeTest
710 mslinux > LLVM.Transforms/OpenMP::parallel_deletion.ll
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/opt -S -attributor -openmpopt < /mnt/disks/ssd0/agent/llvm-project/llvm/test/Transforms/OpenMP/parallel_deletion.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/Transforms/OpenMP/parallel_deletion.ll
50 mswindows > LLVM.DebugInfo/X86::addr-tu-to-non-tu.ll
Script: -- : 'RUN: at line 1'; c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llc.exe -filetype=obj -O0 -generate-type-units -split-dwarf-file=x.dwo < C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\DebugInfo\X86\addr-tu-to-non-tu.ll | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llvm-dwarfdump.exe -debug-info -debug-types - | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\filecheck.exe --implicit-check-not=Unit --implicit-check-not=contents --implicit-check-not=declaration C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\DebugInfo\X86\addr-tu-to-non-tu.ll
580 mswindows > LLVM.Transforms/OpenMP::parallel_deletion.ll
Script: -- : 'RUN: at line 2'; c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\opt.exe -S -attributor -openmpopt < C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\Transforms\OpenMP\parallel_deletion.ll | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\Transforms\OpenMP\parallel_deletion.ll

Event Timeline

steakhal created this revision.Aug 7 2020, 8:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2020, 8:10 AM
steakhal requested review of this revision.Aug 7 2020, 8:10 AM

Good catch!

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
98–104

I don't really see any reasons why is this a lambda and not a free function

101

I think it's better to add tests for this condition.

103–105

Maybe this comment should be moved closer to the return statement?

steakhal planned changes to this revision.Aug 7 2020, 9:14 AM
steakhal added inline comments.
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
98–104

It's somewhat domain-specific that we require:

  • bool-uint128 builtin types
  • complete, scoped enum types ((I think we need completeness for the analysis))

The one that comes quite close to these requirements was the isIntegralOrEnumerationType, but that does not check if the enum is unscoped.

We can extend the Type header with this.
Should we go on that route?

101

You are right, I will add such cases as well.

103–105

Yes, I will move it closer.

whisperity added inline comments.Aug 7 2020, 9:22 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
98–104

I think they meant that this function should be a lambda, but a function inside the current translation unit (either in namespace anonymous, or a member function of SValBuilder).

98–104

Edit: should not be
It does not have to go into Type.h.

steakhal updated this revision to Diff 283936.Aug 7 2020, 9:52 AM
  • Moved the FIXME closer to the subject.
  • Added tests for covering incomplete enums as well.
  • Added REQUIRES: z3. This will mark the test case unsupported on every buildbots. See my notes about this behavior at D83677.
  • Refined test file.
steakhal marked 2 inline comments as done.Aug 7 2020, 9:53 AM
NoQ added a comment.Aug 7 2020, 11:12 AM

Aha, ok, sounds like the right thing to do. Like, for Z3 it's actually the wrong thing to do (you'd prefer to evaluate the cast perfectly by adding SymbolCast) but for pure RangeConstraintManager this is the lesser of two evils.

Because this patch changes the behavior of regular analysis (without Z3), i expect tests to reflect that. Please add ExprInspection-based tests to test values produced by casts.

I don't understand why should the behavior be different for incomplete types. Can you explain?

steakhal updated this revision to Diff 284126.Aug 8 2020, 2:22 AM
steakhal marked 3 inline comments as done.
steakhal edited the summary of this revision. (Show Details)
  • Using dumps instead of reaching in tests.
  • Not requiring complete enums anymore (unlike we did before the patch).
In D85528#2203325, @NoQ wrote:

Aha, ok, sounds like the right thing to do. Like, for Z3 it's actually the wrong thing to do (you'd prefer to evaluate the cast perfectly by adding SymbolCast) but for pure RangeConstraintManager this is the lesser of two evils.

My primary objective is to fix all the crashes related to Range CM + Z3 refutation.

Because this patch changes the behavior of regular analysis (without Z3), i expect tests to reflect that.

What do you expect exactly?

REQUIRES: z3 is necessary for the refutation.
However, adding this requirement would not mean that this test will run if you have Z3 installed though.
You should add the extra llvm-lit param to enable such tests.
I don't want to repeat myself too much but D83677 describes all the details of this test infra fiasco.
I would appreciate some feedback there.

Please add ExprInspection-based tests to test values produced by casts.

Ok, I fixed that - thanks.

I don't understand why should the behavior be different for incomplete types. Can you explain?

You should be right. Fixed that.

Looks reasonable to me, but I am not very familiar with the impacts of the additional casts. Do we lose some modeling power when we are using the regular constraint solver?

For example, when we have constraints both on the original and the cased symbol can the analyzer "merge" them?

Something like:

ScopedPrimitive sym = conjure<ScopedPrimitive>();
if (sym == ScopedPrimitive::Max)
  return;
int sym2 = static_cast<unsigned char>(sym);
if (sym2 == 0)
  return;
// Do we know here that both sym and sym2 has the same range?
// Is there a change in the behavior compared to before the patch?

Looks reasonable to me, but I am not very familiar with the impacts of the additional casts. Do we lose some modeling power when we are using the regular constraint solver?

For example, when we have constraints both on the original and the cased symbol can the analyzer "merge" them?

Something like:

ScopedPrimitive sym = conjure<ScopedPrimitive>();
if (sym == ScopedPrimitive::Max)
  return;
int sym2 = static_cast<unsigned char>(sym);
if (sym2 == 0)
  return;
// Do we know here that both sym and sym2 has the same range?
// Is there a change in the behavior compared to before the patch?

Huh, it's a really interesting question.
Here is the result:


Here is the test code:

enum class ScopedPrimitive : unsigned char { Min = 2, Max = 8 };
void foo() {
  auto sym = conjure<ScopedPrimitive>();
  if (sym == ScopedPrimitive::Max)
    return;

  int sym2 = static_cast<unsigned char>(sym);
  if (sym2 == 0)
    return;

  // Do we know here that both sym and sym2 has the same range?
  // Is there a change in the behavior compared to before the patch?
  clang_analyzer_printState();
  (void)sym;
  (void)sym2;
}

Before the patch:

"constraints": [
    { "symbol": "conj_$2{enum ScopedPrimitive, LC1, S1083, #1}", "range": "{ [1, 7], [9, 255] }" }
  ]

After the patch:

"constraints": [
    { "symbol": "conj_$2{enum ScopedPrimitive, LC1, S1881, #1}", "range": "{ [0, 7], [9, 255] }" },
    { "symbol": "(unsigned char) (conj_$2{enum ScopedPrimitive, LC1, S1881, #1})", "range": "{ [1, 255] }" }
  ]

For example, when we have constraints both on the original and the cased symbol can the analyzer "merge" them?

Apparently, not xD.

I mean, this shouldn't be an issue. Since we already omitted the 'unnecessary' cast expressions... That decision is the root cause of this, we should have expected that.

IMO we do the right thing here. If we want to treat sym and sym2 to refer to the same symbol, we should patch the CM to canonize, and remove such casts before storing the constraint.
But the cast symbols should exist for casting scoped enums.

steakhal edited the summary of this revision. (Show Details)Aug 9 2020, 4:41 AM
NoQ added a comment.Aug 9 2020, 11:33 PM
In D85528#2203325, @NoQ wrote:

Because this patch changes the behavior of regular analysis (without Z3), i expect tests to reflect that.

What do you expect exactly?

REQUIRES: z3 is necessary for the refutation.
However, adding this requirement would not mean that this test will run if you have Z3 installed though.
You should add the extra llvm-lit param to enable such tests.
I don't want to repeat myself too much but D83677 describes all the details of this test infra fiasco.
I would appreciate some feedback there.

I expect at least one LIT test without -analyzer-config crosscheck-with-z3=true (i.e., tests the default behavior, not the Z3 behavior) and works differently before and after the patch. Because you are introducing a change in the default behavior: an unknown value is now denoted by a different symbolic value. And the default behavior is much more important to cover with tests than the non-default behavior - simply because it's the default behavior, which means the vast majority of our users will notice the change.

steakhal edited the summary of this revision. (Show Details)Aug 10 2020, 5:23 AM
whisperity added inline comments.Aug 10 2020, 6:26 AM
clang/test/Analysis/z3-refute-enum-crash.cpp
6

You can have multiple RUN lines in the same test file, which will result in the tests potentially executed with multiple configuration.

In D85528#2205799, @NoQ wrote:

I expect at least one LIT test without -analyzer-config crosscheck-with-z3=true

Agreed. I think the code snippet I proposed would be a great test to include with this revision.

lebedev.ri resigned from this revision.Aug 12 2020, 2:34 PM
steakhal updated this revision to Diff 286209.Aug 18 2020, 12:44 AM
steakhal marked an inline comment as done.

Add an extra RUN line without refutation.
The expected result is the same as with refutation.

It would be nice to have this fix in clang11.
Do you think it's qualified for it?

It would be nice to have this fix in clang11.
Do you think it's qualified for it?

Unfortunately, this is not a fix only change. This is a fix for refutation, but also a behavioral change for the default settings (so for most users). With this little time left until release, I would be uncomfortable landing a behavioral change. I'm not opposed to landing this to master, as we will have more time there to see whether there are any unwanted side effects in practice.

I'm not opposed to landing this to master, as we will have more time there to see whether there are any unwanted side effects in practice.

I made some experiments on the following projects:
symengine,oatpp,zstd,simbody,duckdb,drogon,fmt,cppcheck,capnproto

Only these projects were C++ projects using enum class constructs under the clang/utils/analyzer/projects/projects.json enumeration.
According to the results, no reports changed using the SATest.py tool for the measurement.
I was using this script to collect the log:

NoQ added a comment.Oct 1 2020, 9:03 PM

Aha, ok, thanks, I now understand what the problem is because I was able to run the test before the patch and see how the patch changes the behavior.

What do you think about flattening the enum type out entirely? I.e., instead of (unsigned char) conj_$2{enum ScopedSugared} have conj_$2{unsigned char} from the start. Possibly same for unscoped enums. I don't think we actually extract any value from knowing that a symbol is an enum value. We also don't track enum types for concrete values (i.e., nonloc::ConcreteInt doesn't know whether it belongs to an enum).

In D85528#2307785, @NoQ wrote:

Aha, ok, thanks, I now understand what the problem is because I was able to run the test before the patch and see how the patch changes the behavior.

What do you think about flattening the enum type out entirely? I.e., instead of (unsigned char) conj_$2{enum ScopedSugared} have conj_$2{unsigned char} from the start. Possibly same for unscoped enums. I don't think we actually extract any value from knowing that a symbol is an enum value. We also don't track enum types for concrete values (i.e., nonloc::ConcreteInt doesn't know whether it belongs to an enum).

That's a great idea. It should work. I will investigate this direction.


By the same token - being cheap on modeling explicit casts - I have seen other cases where we do not represent casts explicitly. Just like in this example, at the inner-most if branch we will simply assume that a == b instead of (uchar)a == b.

void should_warn_twice(int a, unsigned char b) {
  // Consider this interpretation: {a: -250, b: 6}
  // It should satisfy both branches since (unsigned char)-250 == 6.
  if (a < -200) {
    clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}
    if ((unsigned char)a == b) {
      clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}, no report shown WITH refutation
    }
  }
}

Refutation does not cooperate well with our constraint manager if we don't emit the symbolic casts in the Z3 model. And we won't since we store constraints without casts.
Without this cast, Z3 will find the constraints of the second bugreport to be unsatisfiable, thus invalidating a true-positive bugreport.

To come around this problem, I would suggest to evaluate any casts as-is in the analyzed source code and make sure that we can recognize and reuse the constraints on any form of a symbol.
We might be able to do that by extending the Equivalence class of the constraint map with the notion of casts of:

  • which do not change the castee's value range (eg. int -> long, unsigned char -> unsigned long)
  • which do change the value range (eg. int -> signed char, int -> unsigned int)

I might will raise this on cfe-dev - that's probably a better place to discuss such ideas.

clang/test/Analysis/z3-refute-enum-crash.cpp
25

I will document the step-by-step reasoning in the test code to make sure why the crash happened.

NoQ added a comment.Oct 5 2020, 9:46 PM

With Z3 constraint manager you absolutely want your constraints to be as precise as possible. The only reason we don't add these casts is because it confuses the constraint manager a lot. With a better constraint manager we would have spared ourselves a lot of suffering in this area.

With Z3 refutation manager - probably same but not sure, experimental data needed. Z3 occasionally refuting correct reports is not that big of a deal; the overall picture is still much better than without refutation due to sheer numbers of eliminated false positives. Improved cast modeling will also allow it to occasionally eliminate more false positives because it's now being fed correct formulas. However most of the decisions *during* the path are still made with RangeConstraintManager, which includes deciding whether to make a state split. RangeConstraintManager fails to solve constraints => double up the remaining analysis time. Exponentially with respect to the amount of constraints it couldn't solve. We'll have to weigh that performance cost against the improved quality.

We might be able to do that by extending the Equivalence class of the constraint map with the notion of casts of...

In my previous life i once did an experiment with RangeConstraintManager in which i added truncating cast symbols but not widening cast symbols. I believe it worked fairly well but i'm not sure, i was too young to properly assess the situation. So i believe something like this might actually work but there's still a high risk of failure (behaving overall worse than before) and it won't solve the entirety of the problem anyway (for instance, it won't help us solve https://bugs.llvm.org/show_bug.cgi?id=44114 by making our SVals type-correct so that extents didn't have to be stored separately - and that's currently one of the main sources of false positives we have).