This is an archive of the discontinued LLVM Phabricator instance.

[clang][ubsan] Implicit Cast Sanitizer - integer truncation - clang part
ClosedPublic

Authored by lebedev.ri on Jul 5 2018, 1:18 AM.

Details

Summary

C and C++ are interesting languages. They are statically typed, but weakly.
The implicit conversions are allowed. This is nice, allows to write code
while balancing between getting drowned in everything being convertible,
and nothing being convertible. As usual, this comes with a price:

unsigned char store = 0;

bool consume(unsigned int val);

void test(unsigned long val) {
  if (consume(val)) {
    // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`.
    // If their bit widths are different on this platform, the implicit
    // truncation happens. And if that `unsigned long` had a value bigger
    // than UINT_MAX, then you may or may not have a bug.

    // Similarly, integer addition happens on `int`s, so `store` will
    // be promoted to an `int`, the sum calculated (0+768=768),
    // and the result demoted to `unsigned char`, and stored to `store`.
    // In this case, the `store` will still be 0. Again, not always intended.
    store = store + 768; // before addition, 'store' was promoted to int.
  }

  // But yes, sometimes this is intentional.
  // You can either make the conversion explicit
  (void)consume((unsigned int)val);
  // or mask the value so no bits will be *implicitly* lost.
  (void)consume((~((unsigned int)0)) & val);
}

Yes, there is a -Wconversion` diagnostic group, but first, it is kinda
noisy, since it warns on everything (unlike sanitizers, warning on an
actual issues), and second, there are cases where it does not warn.
So a Sanitizer is needed. I don't have any motivational numbers, but i know
i had this kind of problem 10-20 times, and it was never easy to track down.

The logic to detect whether an truncation has happened is pretty simple
if you think about it - https://godbolt.org/g/NEzXbb - basically, just
extend (using the new, not original!, signedness) the 'truncated' value
back to it's original width, and equality-compare it with the original value.

The most non-trivial thing here is the logic to detect whether this
ImplicitCastExpr AST node is actually an implicit conversion, or
part of an explicit cast. Because the explicit casts are modeled as an outer
ExplicitCastExpr with some ImplicitCastExpr's as direct children.
https://godbolt.org/g/eE1GkJ

Nowadays, we can just use the new part_of_explicit_cast flag, which is set
on all the implicitly-added ImplicitCastExpr's of an ExplicitCastExpr.
So if that flag is not set, then it is an actual implicit conversion.

As you may have noted, this isn't just named -fsanitize=implicit-integer-truncation.
There are potentially some more implicit conversions to be warned about.
Namely, implicit conversions that result in sign change; implicit conversion
between different floating point types, or between fp and an integer,
when again, that conversion is lossy.

One thing i know isn't handled is bitfields.

This is a clang part.
The compiler-rt part is D48959.

Fixes PR21530, PR37552, PR35409.
Partially fixes PR9821.
Fixes https://github.com/google/sanitizers/issues/940. (other than sign-changing implicit conversions)

Diff Detail

Repository
rC Clang

Event Timeline

lebedev.ri edited the summary of this revision. (Show Details)Jul 5 2018, 1:29 AM
lebedev.ri updated this revision to Diff 154591.Jul 9 2018, 7:27 AM
lebedev.ri retitled this revision from [private][clang] Implicit Cast Sanitizer - integer truncation - clang part to [clang][ubsan] Implicit Cast Sanitizer - integer truncation - clang part.
lebedev.ri edited the summary of this revision. (Show Details)
lebedev.ri added a project: Restricted Project.

C and C++ are interesting languages. They are statically typed, but weakly.
The implicit conversions are allowed. This is nice, allows to write code
while balancing between getting drowned in everything being convertible,
and nothing being convertible. As usual, this comes with a price:

unsigned char store = 0;

bool consume(unsigned int val);

void test(unsigned long val) {
  if (consume(val)) {
    // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`.
    // If their bit widths are different on this platform, the implicit
    // truncation happens. And if that `unsigned long` had a value bigger
    // than UINT_MAX, then you may or may not have a bug.

    // Similarly, integer addition happens on `int`s, so `store` will
    // be promoted to an `int`, the sum calculated (0+768=768),
    // and the result demoted to `unsigned char`, and stored to `store`.
    // In this case, the `store` will still be 0. Again, not always intended.
    store = store + 768; // before addition, 'store' was promoted to int.
  }

  // But yes, sometimes this is intentional.
  // You can either make the cast explicit
  (void)consume((unsigned int)val);
  // or mask the value so no bits will be *implicitly* lost.
  (void)consume((~((unsigned int)0)) & val);
}

Yes, there is a -Wconversion` diagnostic group, but first, it is kinda
noisy, since it warns on everything (unlike sanitizers, warning on an
actual issues), and second, there are cases where it does not warn.
So a Sanitizer is needed. I don't have any motivational numbers, but i know
i had this kind of problem 10-20 times, and it was never easy to track down.

The logic to detect whether an truncation has happened is pretty simple
if you think about it - https://godbolt.org/g/NEzXbb - basically, just
extend (using the new, not original!, signedness) the 'truncated' value
back to it's original width, and equality-compare it with the original value.

The most non-trivial thing here is the logic to detect whether this
ImplicitCastExpr AST node is actually an implicit cast, or part of
an explicit cast. Because the explicit casts are modeled as an outer
ExplicitCastExpr with some ImplicitCastExpr's as direct children.
https://godbolt.org/g/eE1GkJ

It would seem, we can just check that the current ImplicitCastExpr
we are processing either has no CastExpr parents, or all of them are
ImplicitCastExpr.

As you may have noted, this isn't just named -fsanitize=implicit-integer-truncation.
There are potentially some more implicit casts to be warned about.
Namely, implicit casts that result in sign change; implicit cast
between different floating point types, or between fp and an integer,
when again, that conversion is lossy.

I suspect, there may be some false-negatives, cases yet to be handled.

This is a clang part.
The compiler-rt part is D48959.

Fixes PR21530, PR37552, PR35409.
Partially fixes PR9821.
Fixes https://github.com/google/sanitizers/issues/940.

lebedev.ri changed the visibility from "All Users" to "Public (No Login Required)".Jul 9 2018, 7:27 AM

Finished running it on a normal testset of my pet project.

  • It fired ~18 times.
  • There were no obvious false-positives (e.g. when an explicit cast was involved).
  • At least 3 of those appear to be a true bugs.
  • 4-5 more are probably bugs, but it is hard to tell.
  • Last 10-11 appear to be mostly OK intentional truncating casts.

This was on a normal test set, i suspect fuzzing will reveal more.

  • Check that sanitizer is actually enabled before doing the AST upwalk. I didn't measure, but it would be logical for this to be better.
vsk added a comment.Jul 10 2018, 1:38 PM

Thanks for working on this!

docs/ReleaseNotes.rst
277

Could you mention whether the group is enabled by -fsanitize=undefined?

lib/CodeGen/CGExprScalar.cpp
350–351

I think the number of overloads here is really unwieldy. There should be a simpler way to structure this. What about consolidating all four overloads into one? Maybe:

struct ScalarConversionsOpts {
  bool TreatBoolAsUnsigned = false;
  bool EmitImplicitIntegerTruncationCheck = false;
};

Value *EmitScalarConversion(Src, SrcTy, DstTy, Loc, Opts = ScalarConversionOpts())

It's not necessary to pass CastExpr in, right? There's only one place where that's done. It seems simpler to just do the SanOpts / isCastPartOfExplicitCast checking there.

983

nit, function names typically begin with a verb: 'isCastPartOf...'

1693–1695

I think maintaining a stack of visited cast exprs in the emitter be cheaper/simpler than using ASTContext::getParents. You could push CE here and use a RAII helper to pop it. The 'isCastPartOfExplicitCast' check then simplifies to a quick stack traversal.

lebedev.ri marked 2 inline comments as done.

@vsk thank you for taking a look!

Addressed the trivial part of nits.

lebedev.ri added inline comments.Jul 10 2018, 2:51 PM
lib/CodeGen/CGExprScalar.cpp
350–351

The number of overloads is indeed unwieldy.

1693–1695

Hmm, two things come to mind:

  1. This pessimizes the (most popular) case when the sanitizer is disabled.
  2. ASTContext::getParents() may return more than one parent. I'm not sure if that matters here?

I'll take a look..

vsk added inline comments.Jul 11 2018, 12:46 PM
lib/CodeGen/CGExprScalar.cpp
1693–1695

As for (1), it's not necessary to push/pop the stack when this sanitizer is disabled. And for (2), IIUC the only interesting case is "explicit-cast <implicit-cast>+", and none of the implicit casts here have more than one parent.

lebedev.ri added inline comments.Jul 11 2018, 2:08 PM
lib/CodeGen/CGExprScalar.cpp
1693–1695

I think maintaining a stack of visited cast exprs in the emitter be cheaper/simpler than using ASTContext::getParents.

So yeah, this could work.
Except sadly it breaks in subtle cases like https://godbolt.org/g/5V2czU
I have added those tests beforehand.

Is ASTContext::getParents() really horribly slow so we want to duplicate/maintain/track
the current AST stack ourselves?
If so, we will need to maintain the *entire* stack, not just `CastExpr''s...

Add some more tricky tests where maintaining just the CastExpr part of AST stack would break them.

vsk added a subscriber: klimek.Jul 11 2018, 3:48 PM
vsk added inline comments.
lib/CodeGen/CGExprScalar.cpp
1693–1695

I think the scan in 'IsTopCastPartOfExplicitCast' can be fixed: while traversing backwards, you'd need to check that the previously-visited cast expr is the child of the current expr. That should address the false negative you pointed out in interference1. I don't yet see what the issue is with interference0.

Could you explain why maintaining a stack of unfinished casts wouldn't work? I don't understand why you'd need the entire stack. My sense is that it's not required to match the "explicit-cast <implicit-cast>+" pattern, but I could easily be missing something here. As for why this might be worth looking into, I think scanning for an explicit cast is much easier to understand when working with a stack.

+ @klimek to comment on what to expect in terms of the overhead of ASTContext::getParents. Regardless of what approach we pick, it would help to see pre/post-patch compile times for a stage2 build of something like clang or llc.

lebedev.ri marked 7 inline comments as done.

Address @vsk's review notes.

  • Maintain the stack of currently-being-visited CastExpr's
  • Use that stack to check whether we are in a ExplicitCastExpr
  • Move logic for deciding whether to emit the check out of EmitScalarConversion()
  • Condense all overloads of EmitScalarConversion() down to one.
lib/CodeGen/CGExprScalar.cpp
1693–1695

I think the scan in 'IsTopCastPartOfExplicitCast' can be fixed: while traversing backwards, you'd need to check that the previously-visited cast expr is the child of the current expr. That should address the false negative you pointed out in interference1.

Oh right. That seems to fix the issues.

Could you explain why maintaining a stack of unfinished casts wouldn't work?

I didn't think it wouldn't work. I just missed the tidbit about checking children.
Then it works.

vsk added a comment.Jul 12 2018, 10:27 AM

I have some minor comments but overall I think this is in good shape. It would be great to see some compile-time numbers just to make sure this is tractable. I'm pretty sure -fsanitize=null would fire more often across a codebase than this check, so I don't anticipate a big surprise here.

lib/CodeGen/CGExprScalar.cpp
222

It would help to have this comment explain that the stack is used/maintained exclusively by the implicit cast sanitizer.

228

Could you make this comment more specific -- maybe by explaining that for efficiency reasons, the cast expr stack is only maintained when a sanitizer check is enabled?

234

I think if you were to use references instead of pointers here, the code would be a bit clearer, and you wouldn't need to assert that CE is non-null.

352

Why not use default member initializers here (e.g, "bool a = false")?

989

The none_of call could safely be replaced by Cast->getSubExpr() != PreviousCast, I think.

Thank you for taking a look!

In D48958#1160381, @vsk wrote:

I have some minor comments but overall I think this is in good shape. It would be great to see some compile-time numbers just to make sure this is tractable. I'm pretty sure -fsanitize=null would fire more often across a codebase than this check, so I don't anticipate a big surprise here.

Could you please clarify, which numbers are you looking for, specifically?
The time it takes to build llvm stage2 with -fsanitize=implicit-cast?
Or the time it takes to build llvm stage3 with compiler built with -fsanitize=implicit-cast?
(The numbers won't be too representable, whole stage-1 takes ~40 minutes here...)

lib/CodeGen/CGExprScalar.cpp
352

I'll double-check, but i'm pretty sure then there were some warnings when i did that,
Or, the default needs to be defined in the actual declaration of EmitScalarConversion(), i think.

vsk added a comment.Jul 12 2018, 11:03 AM

Thank you for taking a look!

In D48958#1160381, @vsk wrote:

I have some minor comments but overall I think this is in good shape. It would be great to see some compile-time numbers just to make sure this is tractable. I'm pretty sure -fsanitize=null would fire more often across a codebase than this check, so I don't anticipate a big surprise here.

Could you please clarify, which numbers are you looking for, specifically?
The time it takes to build llvm stage2 with -fsanitize=implicit-cast?
Or the time it takes to build llvm stage3 with compiler built with -fsanitize=implicit-cast?

I had in mind measuring the difference between -fsanitize=undefined and -fsanitize=undefined,implicit-cast, with a stage2 compiler. I think that captures the expected use case: existing ubsan users enabling this new check.

(The numbers won't be too representable, whole stage-1 takes ~40 minutes here...)

Ah I see, I'll run a few builds and take a stab at it, then.

In D48958#1160479, @vsk wrote:

Thank you for taking a look!

In D48958#1160381, @vsk wrote:

I have some minor comments but overall I think this is in good shape. It would be great to see some compile-time numbers just to make sure this is tractable. I'm pretty sure -fsanitize=null would fire more often across a codebase than this check, so I don't anticipate a big surprise here.

Could you please clarify, which numbers are you looking for, specifically?
The time it takes to build llvm stage2 with -fsanitize=implicit-cast?
Or the time it takes to build llvm stage3 with compiler built with -fsanitize=implicit-cast?

I had in mind measuring the difference between -fsanitize=undefined and -fsanitize=undefined,implicit-cast, with a stage2 compiler. I think that captures the expected use case: existing ubsan users enabling this new check.

FWIW, i'm trying to look into optimizing these new IR patterns right now D49179 D49247.

(The numbers won't be too representable, whole stage-1 takes ~40 minutes here...)

Ah I see, I'll run a few builds and take a stab at it, then.

Yes, please, thank you!

vsk added a comment.Jul 12 2018, 2:31 PM
In D48958#1160479, @vsk wrote:

Thank you for taking a look!

In D48958#1160381, @vsk wrote:

I have some minor comments but overall I think this is in good shape. It would be great to see some compile-time numbers just to make sure this is tractable. I'm pretty sure -fsanitize=null would fire more often across a codebase than this check, so I don't anticipate a big surprise here.

Could you please clarify, which numbers are you looking for, specifically?
The time it takes to build llvm stage2 with -fsanitize=implicit-cast?
Or the time it takes to build llvm stage3 with compiler built with -fsanitize=implicit-cast?

I had in mind measuring the difference between -fsanitize=undefined and -fsanitize=undefined,implicit-cast, with a stage2 compiler. I think that captures the expected use case: existing ubsan users enabling this new check.

FWIW, i'm trying to look into optimizing these new IR patterns right now D49179 D49247.

(The numbers won't be too representable, whole stage-1 takes ~40 minutes here...)

Ah I see, I'll run a few builds and take a stab at it, then.

Yes, please, thank you!

The stage2 build traps before it finishes:

FAILED: lib/IR/AttributesCompatFunc.inc.tmp
cd /Users/vsk/src/builds/llvm.org-lldbsan-stage2-R/tools/clang/stage2-bins && /Users/vsk/src/builds/llvm.org-lldbsan-stage2-R/tools/clang/stage2-bins/bin/llvm-tblgen -gen-attrs -I /Users/vsk/src/llvm.org-lldbsan/llvm/lib/IR -I /Users/vsk/src/llvm.org-lldbsan/llvm/include /Users/vsk/src/llvm.org-lldbsan/llvm/lib/IR/AttributesCompatFunc.td -o lib/IR/AttributesCompatFunc.inc.tmp -d lib/IR/AttributesCompatFunc.inc.d
/Users/vsk/src/llvm.org-lldbsan/llvm/include/llvm/ADT/DenseMap.h:732:66: runtime error: implicit cast from type 'uint64_t' (aka 'unsigned long long') of value 4294967296 (64-bit, unsigned) to type 'unsigned int' changed the value to 0 (32-bit, unsigned)
/bin/sh: line 1: 96848 Abort trap: 6

This looks like a false positive to me. It's complaining about static_cast<unsigned>(NextPowerOf2(...)), but the static_cast is explicit.

In D48958#1160848, @vsk wrote:
In D48958#1160479, @vsk wrote:

Thank you for taking a look!

In D48958#1160381, @vsk wrote:

I have some minor comments but overall I think this is in good shape. It would be great to see some compile-time numbers just to make sure this is tractable. I'm pretty sure -fsanitize=null would fire more often across a codebase than this check, so I don't anticipate a big surprise here.

Could you please clarify, which numbers are you looking for, specifically?
The time it takes to build llvm stage2 with -fsanitize=implicit-cast?
Or the time it takes to build llvm stage3 with compiler built with -fsanitize=implicit-cast?

I had in mind measuring the difference between -fsanitize=undefined and -fsanitize=undefined,implicit-cast, with a stage2 compiler. I think that captures the expected use case: existing ubsan users enabling this new check.

FWIW, i'm trying to look into optimizing these new IR patterns right now D49179 D49247.

(The numbers won't be too representable, whole stage-1 takes ~40 minutes here...)

Ah I see, I'll run a few builds and take a stab at it, then.

Yes, please, thank you!

The stage2 build traps before it finishes:

FAILED: lib/IR/AttributesCompatFunc.inc.tmp
cd /Users/vsk/src/builds/llvm.org-lldbsan-stage2-R/tools/clang/stage2-bins && /Users/vsk/src/builds/llvm.org-lldbsan-stage2-R/tools/clang/stage2-bins/bin/llvm-tblgen -gen-attrs -I /Users/vsk/src/llvm.org-lldbsan/llvm/lib/IR -I /Users/vsk/src/llvm.org-lldbsan/llvm/include /Users/vsk/src/llvm.org-lldbsan/llvm/lib/IR/AttributesCompatFunc.td -o lib/IR/AttributesCompatFunc.inc.tmp -d lib/IR/AttributesCompatFunc.inc.d
/Users/vsk/src/llvm.org-lldbsan/llvm/include/llvm/ADT/DenseMap.h:732:66: runtime error: implicit cast from type 'uint64_t' (aka 'unsigned long long') of value 4294967296 (64-bit, unsigned) to type 'unsigned int' changed the value to 0 (32-bit, unsigned)
/bin/sh: line 1: 96848 Abort trap: 6

This looks like a false positive to me. It's complaining about static_cast<unsigned>(NextPowerOf2(...)), but the static_cast is explicit.

Good to know, so the stack-based logic for ExplicitCastExpr detection needs further tests/refinements..

In D48958#1160848, @vsk wrote:

<...>
The stage2 build traps before it finishes:

FAILED: lib/IR/AttributesCompatFunc.inc.tmp
cd /Users/vsk/src/builds/llvm.org-lldbsan-stage2-R/tools/clang/stage2-bins && /Users/vsk/src/builds/llvm.org-lldbsan-stage2-R/tools/clang/stage2-bins/bin/llvm-tblgen -gen-attrs -I /Users/vsk/src/llvm.org-lldbsan/llvm/lib/IR -I /Users/vsk/src/llvm.org-lldbsan/llvm/include /Users/vsk/src/llvm.org-lldbsan/llvm/lib/IR/AttributesCompatFunc.td -o lib/IR/AttributesCompatFunc.inc.tmp -d lib/IR/AttributesCompatFunc.inc.d
/Users/vsk/src/llvm.org-lldbsan/llvm/include/llvm/ADT/DenseMap.h:732:66: runtime error: implicit cast from type 'uint64_t' (aka 'unsigned long long') of value 4294967296 (64-bit, unsigned) to type 'unsigned int' changed the value to 0 (32-bit, unsigned)
/bin/sh: line 1: 96848 Abort trap: 6

This looks like a false positive to me. It's complaining about static_cast<unsigned>(NextPowerOf2(...)), but the static_cast is explicit.

Good to know, so the stack-based logic for ExplicitCastExpr detection needs further tests/refinements..

creduced down to:

template <typename a> a b(a c, const a &d) {
  if (d)
    ;
  return c;
}
int e = b<unsigned>(4, static_cast<unsigned>(4294967296));
int main() {}

https://godbolt.org/g/1kwGk9

$ ./a.out 
test.cpp:6:46: runtime error: implicit cast from type 'long' of value 4294967296 (64-bit, signed) to type 'unsigned int' changed the value to 0 (32-bit, unsigned)
    #0 0x232f56 in _GLOBAL__sub_I_test.cpp (/home/lebedevri/CREDUCE/a.out+0x232f56)
    #1 0x232fbc in __libc_csu_init (/home/lebedevri/CREDUCE/a.out+0x232fbc)
    #2 0x7fa8c113aaa7 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22aa7)
    #3 0x212029 in _start (/home/lebedevri/CREDUCE/a.out+0x212029)
lebedev.ri marked 5 inline comments as done.Jul 13 2018, 5:42 AM

@vsk so yeah, no wonder that doesn't work.
Somehow in that test case ScalarExprEmitter::VisitExplicitCastExpr() never gets called.
(I'm pretty sure this worked with the naive implementation, so worst case i'll just revert the 'stack' code)
Trying to assess the issue..

lib/CodeGen/CGExprScalar.cpp
352
[2/14 0.3/sec] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/clangCodeGen.dir/CGExprScalar.cpp.o
FAILED: tools/clang/lib/CodeGen/CMakeFiles/clangCodeGen.dir/CGExprScalar.cpp.o 
/usr/bin/clang++-6.0  -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/lib/CodeGen -I/build/clang/lib/CodeGen -I/build/clang/include -Itools/clang/include -I/usr/include/libxml2 -Iinclude -I/build/llvm/include -g0 -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -g0  -fPIC   -UNDEBUG  -fno-exceptions -fno-rtti -MD -MT tools/clang/lib/CodeGen/CMakeFiles/clangCodeGen.dir/CGExprScalar.cpp.o -MF tools/clang/lib/CodeGen/CMakeFiles/clangCodeGen.dir/CGExprScalar.cpp.o.d -o tools/clang/lib/CodeGen/CMakeFiles/clangCodeGen.dir/CGExprScalar.cpp.o -c /build/clang/lib/CodeGen/CGExprScalar.cpp
/build/clang/lib/CodeGen/CGExprScalar.cpp:355:52: error: default member initializer for 'TreatBooleanAsSigned' needed within definition of enclosing class 'ScalarExprEmitter' outside of member functions
                       ScalarConversionOpts Opts = ScalarConversionOpts());
                                                   ^
/build/clang/lib/CodeGen/CGExprScalar.cpp:349:10: note: default member initializer declared here
    bool TreatBooleanAsSigned = false;
         ^
/build/clang/lib/CodeGen/CGExprScalar.cpp:355:52: error: default member initializer for 'EmitImplicitIntegerTruncationChecks' needed within definition of enclosing class 'ScalarExprEmitter' outside of member functions
                       ScalarConversionOpts Opts = ScalarConversionOpts());
                                                   ^
/build/clang/lib/CodeGen/CGExprScalar.cpp:350:10: note: default member initializer declared here
    bool EmitImplicitIntegerTruncationChecks = false;
         ^
2 errors generated.
lebedev.ri marked 3 inline comments as done.

Address @vsk review notes, although this will be revered by the next update dropping the faulty 'stack' optimization.

Well, that's just great, with isCastPartOfExplictCast(), the ASTContext::getParents()
also does not return CXXStaticCastExpr as parent for such cases.
I don't know how to proceed.

Breakthrough: no more false-positives due to the MaterializeTemporaryExpr skipping over NoOp casts. (D49508)
Slight docs update.

Ping, please review!
We are so close :)

vsk added inline comments.Jul 19 2018, 12:58 PM
docs/UndefinedBehaviorSanitizer.rst
97

Nitpicks:
kind of issues -> issue
promotions -> conversions

133

Could you make this more explicit? It would help to point out that this check does not diagnose lossy implicit integer conversions, but that the new check does. Ditto for the comment in the unsigned-integer-overflow section.

lib/CodeGen/CodeGenFunction.h
382

Why not 0 instead of 8, given that in the common case, this stack is unused?

387

I'm not sure the cost of maintaining an extra vector is worth the benefit of the added assertion. Wouldn't it be cheaper to just store the number of pushed casts? You'd only need one constructor which accepts an ArrayRef<const CastExpr *>.

test/CodeGen/catch-implicit-integer-truncations.c
30

There's no need to check the profile metadata here.

160

nit, aren't these true-negatives because we expect to see no errors?

lebedev.ri marked 6 inline comments as done.

Rebased ontop of yet-again rewritten D49508.
Addressed all @vsk's review notes.

More review notes wanted :)

lebedev.ri added inline comments.Jul 20 2018, 5:35 AM
docs/UndefinedBehaviorSanitizer.rst
133

Is this better?

lib/CodeGen/CodeGenFunction.h
382

No longer relevant.

387

No longer relevant.

test/CodeGen/catch-implicit-integer-truncations.c
30

I was checking it because otherwise HANDLER_IMPLICIT_CAST would have over-eagerly consumed , !prof !3 too.
But there is actually a way around that..

160

Right.

vsk added inline comments.Jul 20 2018, 10:34 AM
docs/UndefinedBehaviorSanitizer.rst
133

Looks good.

148

Please add "the implicit-cast group of checks" to this list.

lib/CodeGen/CodeGenFunction.h
382

I'm referring to CastExprStack within ScalarExprEmitter, which still allocates space for 8 pointers inline.

lebedev.ri added inline comments.Jul 20 2018, 10:41 AM
lib/CodeGen/CodeGenFunction.h
382

Ah, you mean in the general case when the sanitizer is disabled?

vsk added inline comments.Jul 20 2018, 10:57 AM
lib/CodeGen/CodeGenFunction.h
382

Yes. It's a relatively minor concern, but clang's stack can get pretty deep inside of CodeGenFunction. At one point we needed to outline code by hand to unbreak the ASan build. Later I think we just increased the stack size rlimit. I don't see a countervailing performance benefit of allocating more space inline, at least not here.

lebedev.ri added inline comments.Jul 20 2018, 11:27 AM
lib/CodeGen/CodeGenFunction.h
382

No, i agree and totally understand.
I just didn't think about that sanitizer-less context.

lebedev.ri marked 11 inline comments as done.

Address @vsk's review notes.

Rebased on top of svn tip / git master, now that D49508 has landed,
which means there shouldn't be any more false-positives (and it's a bit faster to detect that the check shouldn't be emitted, too).

Ping, any more notes? :)

vsk accepted this revision.Jul 24 2018, 11:47 AM

LGTM, although I think it'd be helpful to have another +1 just to be safe.

I did two small experiments with this using a Stage1 Rel+Asserts compiler:

Stage2 Rel+Asserts build of llvm-tblgen:
ninja llvm-tblgen 384.27s user 14.98s system 1467% cpu 27.203 total

Stage2 Rel+Asserts build of llvm-tblgen with implicit-cast checking:
ninja llvm-tblgen 385.15s user 15.02s system 1472% cpu 27.170 total

With caveats about having a small sample size here and testing with an asserts-enabled stage1 build, I don't see any red flags about the compile-time overhead of the new check. I would have liked to measure the check against more code, but I couldn't complete a stage2 build due to a diagnostic which might plausibly point to a real issue in tblgen:

/Users/vsk/src/llvm.org-lldbsan/llvm/utils/TableGen/RegisterInfoEmitter.cpp:604:17: runtime error: implicit cast from type 'int' of value -1 (32-bit, signed) to type 'const unsigned short' changed the value to 65535 (16-bit, unsigned)

With -fno-sanitize-recover=all disabled, I found a few more reports:

/Users/vsk/src/llvm.org-lldbsan/llvm/include/llvm/Object/Archive.h:278:38: runtime error: implicit cast from type 'int' of value -1 (32-bit, signed) to type 'uint16_t' (aka 'unsigned short') changed the value to 65535 (16-bit, unsigned)
--> uint16_t FirstRegularStartOfFile = -1;

/Users/vsk/src/llvm.org-lldbsan/llvm/lib/Analysis/MemorySSA.cpp:199:12: runtime error: implicit cast from type 'size_t' (aka 'unsigned long') of value 4969132974595412838 (64-bit, unsigned) to type 'unsigned int' changed the value to 3765474150 (32-bit, unsigned)
--> hash_combine() result casted to unsigned

/Users/vsk/src/llvm.org-lldbsan/llvm/lib/CodeGen/TargetLoweringBase.cpp:1212:30: runtime error: implicit cast from type 'unsigned int' of value 512 (32-bit, unsigned) to type 'unsigned char' changed the value to 0 (8-bit, unsigned)
--> NumRegistersForVT[i] = getVectorTypeBreakdownMVT(...)

/Users/vsk/src/llvm.org-lldbsan/llvm/lib/Transforms/Scalar/EarlyCSE.cpp:136:12: runtime error: implicit cast from type 'size_t' (aka 'unsigned long') of value 16583795711468875482 (64-bit, unsigned) to type 'unsigned int' changed the value to 3116347098 (32-bit, unsigned)
--> hash_combine() result casted to unsigned
...

These four at least don't look like false positives:

  • Maybe we should consider special-casing assignments of "-1" to unsigned values? This seems somewhat idiomatic.
  • At least a few of these are due to not being explicit about dropping the high bits of hash_combine()'s result. Given that this check is opt-in, that that seems like a legitimate diagnostic (lost entropy).
  • The TargetLoweringBase.cpp diagnostic looks a bit scary.
lib/CodeGen/CGExprScalar.cpp
983

nit, extra parens?

This revision is now accepted and ready to land.Jul 24 2018, 11:47 AM
In D48958#1173860, @vsk wrote:

LGTM, although I think it'd be helpful to have another +1 just to be safe.

Thank you for the review!
It would indeed be great if someone else could take a look, especially since we are so close to the branching point.

In D48958#1173860, @vsk wrote:

...

In D48958#1173860, @vsk wrote:

These four at least don't look like false positives:

  • Maybe we should consider special-casing assignments of "-1" to unsigned values? This seems somewhat idiomatic.

I personally would use ~0U there.
One more datapoint: the implicit-sign-change will/should still complain about that case.
So personally i'd like to keep it.

  • At least a few of these are due to not being explicit about dropping the high bits of hash_combine()'s result. Given that this check is opt-in, that that seems like a legitimate diagnostic (lost entropy).
  • The TargetLoweringBase.cpp diagnostic looks a bit scary.
lebedev.ri added inline comments.Jul 25 2018, 2:46 PM
lib/CodeGen/CGExprScalar.cpp
976–1000

Based on IRC disscussion with @rsmith, it seems this should be just return !Cast->getIsPartOfExplicitCast(); (and inline it), and no need for the CastExprStack and stuff.

lebedev.ri marked an inline comment as done.
lebedev.ri added a reviewer: erichkeane.
lebedev.ri added a subscriber: erichkeane.

Address @rsmith & @erichkeane [IRC] review notes:

  • D49838 - [AST] Sink 'part of explicit cast' down into ImplicitCastExpr
  • D49844 - [AST] Add a isActuallyImplicitCast() helper to the CastExpr class.
  • Drop no longer needed CastExprStackGuard, ScalarExprEmitter::IsTopCastPartOfExplictCast(), just use CastExpr::isActuallyImplicitCast() directly.

This should be a NFC change, there should not be any functionality change because of this.

test/CodeGenCXX/catch-implicit-integer-truncations.cpp
9–34

@rsmith these tests should be equivalent to what you have brought up, so that situation was already tested.

erichkeane accepted this revision.Jul 26 2018, 6:29 AM

1 Nit, otherwise LGTM.

docs/UndefinedBehaviorSanitizer.rst
95

I think the last 2 commas in this sentence are unnecessary?

aaron.ballman added inline comments.Jul 26 2018, 8:16 AM
docs/UndefinedBehaviorSanitizer.rst
93–97

How about: Implicit cast from a value of integral type which results in data loss where the demoted value, when cast back to the original type, would have a different value than the original. This issue may be caused by an implicit conversion.

lebedev.ri marked 2 inline comments as done.

Small rewording in docs/UndefinedBehaviorSanitizer.rst thanks to @erichkeane & @aaron.ballman!

docs/UndefinedBehaviorSanitizer.rst
93–97

Thank you!

Rebase,
Address @rsmith review notes - just inline D49844.

rsmith added inline comments.Jul 26 2018, 4:30 PM
docs/ReleaseNotes.rst
49–52

Regarding the name of this sanitizer: C and C++ refer to these as "implicit conversions" not "implicit casts", and "implicit cast" is a contradiction in terms -- a cast is explicit syntax for performing a conversion. We should use the external terminology here ("implicit-conversion") rather than the slightly-odd clang-specific convention of calling an implicit conversion an "implicit cast".

270

got -> may have been

272

implicit -> explicit

docs/UndefinedBehaviorSanitizer.rst
17

Don't use Title Caps here. "Problematic implicit conversions"

95

I would parenthesize the "where the demoted value [...] would have a different value from the original" clause, since it's just explaining what we mean by "data loss".

95

Is this really the right rule, though? Consider:

unsigned int x = 0x81234567;
int y = x; // does the sanitizer catch this or not?

Here, the value of x is not the same as the value of y (assuming 32-bit int): y is negative. But this is not "data loss" according to the documented meaning of the sanitizer. I think we should produce a sanitizer trap on this case.

131–132

Remove the "Please"s here. We don't need to beg the reader to read the rest of the sentence. Just "Note that this [...]. Also note that integer conversions may result in an unexpected computation result, [...]"

137–138

Likewise here.

include/clang/Basic/Sanitizers.def
134–136

-fsanitize=integer should include -fsanitize=implicit-integer-truncation.

lib/CodeGen/CGExprScalar.cpp
986–987

Check the Clang types here, not the LLVM types. There is no guarantee that only integer types get converted to LLVM IntegerTypes. (But if you like, you can assert that SrcTy and DstTy are IntegerTypes after checking that the clang types are both integer types.)

988–990

I believe this is redundant: we don't get here for an integer to boolean conversion, and a boolean to integer conversion would always be caught by the bit width check below.

994–996

I think you should also catch casts that change signedness in the case if the sign bit is set on the value. (Though if you want to defer this to a follow-up change and maybe give the sanitizer a different name, that's fine with me.)

lebedev.ri marked 9 inline comments as done and 2 inline comments as done.
lebedev.ri edited the summary of this revision. (Show Details)

Hopefully address @rsmith review notes:

  • s/cast/conversion/ where appropriate
  • Some wording in docs
  • Some 'legality' checks in ScalarExprEmitter::EmitIntegerTruncationCheck().

Oops, forgot to submit the inline comments.
(It is inconvenient that they aren't submitted with the rest.)

docs/ReleaseNotes.rst
272

Whoops.

docs/UndefinedBehaviorSanitizer.rst
95

I've reverted this to my original text.
It should now convey the correct idea, but i'm not sure this is correct English.

unsigned int x = 0x81234567;
int y = x; // does the sanitizer catch this or not?

No, it does not. It indeed should. I do plan on following-up with that,
thus i've adding the group (`-fsanitize=implicit-conversion`), not just one check.

include/clang/Basic/Sanitizers.def
134–136

Wow. Ok.

lib/CodeGen/CGExprScalar.cpp
986–987

Interesting, ok.

988–990

Uhm, i'll replace it with an assert then.

994–996

Yes, thank you for bringing this up.
That is certainly the plain, but i always planned to add that later on.

rsmith accepted this revision.Jul 30 2018, 10:50 AM

Only comments on documentation and assertions. Feel free to commit once these are addressed to your satisfaction.

docs/ReleaseNotes.rst
276

"Just like -fsanitize=integer" -> "Just like other -fsanitize=integer checks", now that this is part of -fsanitize=integer.

docs/UndefinedBehaviorSanitizer.rst
17

I don't think it makes sense to list this here, as it's not undefined behavior, and this is a list of undefined behavior that UBSan catches. (And I think it makes sense from a communication perspective to consider the non-UB checks to be "not part of UBSan but handled by the same infrastructure".)

95

bigger -> larger

... would read a bit better.

97

I don't think this last sentence adds anything, and it creates the impression that the issue is sometimes caused by something other than implicit integer conversions (which it isn't, as far as I can tell). Maybe just delete the last sentence here?

And instead, something like this would be useful:

"Issues caught by this sanitizer are not undefined behavior, but are often unintentional."

134–138

And here something like:

Unsigned integer overflow, where the result of an unsigned integer computation cannot be represented in its type. Unlike signed integer overflow, this is not undefined behavior, but it is often unintentional. This sanitizer does not check for lossy implicit conversions performed before such a computation (see ``-fsanitize=implicit-conversion``).
135–136

I don't think that's true (not until you add a sanitizer for signed <-> unsigned conversions that change the value). 4U / -2 produces the unexpected result 0U rather than the mathematically-correct result -2, and -fsanitize=implicit-conversion doesn't catch it. Maybe just strike this sentence for now?

In fact... I think this is too much text to be adding to this bulleted list, which is just supposed to summarize the available checks. Maybe replace the description with

Signed integer overflow, where the result of a signed integer computation cannot be represented in its type. This includes all the checks covered by ``-ftrapv``, as well as checks for signed division overflow (``INT_MIN/-1``), but not checks for lossy implicit conversions performed before the computation (see ``-fsanitize=implicit-conversion``).
153

If we're going to list which sanitizers are enabled here, we should list them all:

Enables ``signed-integer-overflow``, ``unsigned-integer-overflow``, ``shift``, ``integer-divide-by-zero``, and ``implicit-integer-truncation``.
lib/CodeGen/CGExprScalar.cpp
992

I think it's generally better for the text in an assertion to describe the violated assumption directly:

"clang integer type lowered to non-integer llvm type"

1000

I think you should only check DstType here. The point of the assert is that there is no such thing as a truncation to bool (conversion from integer to bool is a comparison against 0, and if we get here for such a case, there's a bug elsewhere). Other than that, bool is a perfectly-normal 1-bit unsigned integer type, and doesn't need to be treated as a special case.

lebedev.ri marked 9 inline comments as done.
lebedev.ri edited the summary of this revision. (Show Details)

Address last portion of @rsmith review notes.

@rsmith, @vsk, @erichkeane - thank you for the review!

lebedev.ri added inline comments.Jul 30 2018, 11:59 AM
docs/UndefinedBehaviorSanitizer.rst
135–136

I will assume you meant "lossy implicit conversions performed *after* the computation".

rsmith added inline comments.Jul 30 2018, 1:58 PM
docs/UndefinedBehaviorSanitizer.rst
135–136

I really meant "performed before", for cases like 4u / -2, where -2 is implicitly converted to UINT_MAX - 2 before the computation. Conversions that are performed after a computation aren't part of the computation at all, so I think it's much clearer that they're not in scope for this sanitizer.

lebedev.ri added inline comments.Jul 30 2018, 2:01 PM
docs/UndefinedBehaviorSanitizer.rst
135–136

Ok, with that additional explanation, i do see the error of my ways, and will re-adjust the docs accordingly.
Sorry.