This is an archive of the discontinued LLVM Phabricator instance.

Add an unsigned shift base sanitizer
ClosedPublic

Authored by jfb on Aug 14 2020, 2:51 PM.

Details

Summary

It's not undefined behavior for an unsigned left shift to overflow (i.e. to
shift bits out), but it has been the source of bugs and exploits in certain
codebases in the past. As we do in other parts of UBSan, this patch adds a
dynamic checker which acts beyond UBSan and checks other sources of errors. The
option is enabled as part of -fsanitize=integer.

The flag is named: -fsanitize=unsigned-shift-base
This matches shift-base and shift-exponent flags.

rdar://problem/46129047

Diff Detail

Event Timeline

jfb created this revision.Aug 14 2020, 2:51 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 14 2020, 2:51 PM
Herald added subscribers: Restricted Project, cfe-commits, ributzka and 2 others. · View Herald Transcript
jfb requested review of this revision.Aug 14 2020, 2:51 PM
jfb added a subscriber: dcoughlin.Aug 14 2020, 2:53 PM
vsk added a comment.Aug 14 2020, 3:42 PM

It'd be nice to fold the new check into an existing sanitizer group to bring this to a wider audience. Do you foresee adoption issues for existing -fsanitize=integer adopters? Fwiw some recently-added implicit conversion checks were folded in without much/any pushback.

clang/test/Driver/fsanitize.c
911

Not sure I follow this one. Why is 'NORECOVER' not expecting to see "-fno-sanitize-recover=unsigned-shift-base"?

jfb added a comment.Aug 14 2020, 4:05 PM
In D86000#2219288, @vsk wrote:

It'd be nice to fold the new check into an existing sanitizer group to bring this to a wider audience. Do you foresee adoption issues for existing -fsanitize=integer adopters? Fwiw some recently-added implicit conversion checks were folded in without much/any pushback.

integer does "not actually UB checks", right? I can certainly put it in there if you think I won't get yelled at 😄

clang/test/Driver/fsanitize.c
911

I have no idea! Other parts of this file do this:

// CHECK-implicit-conversion-NORECOVER-NOT: "-fno-sanitize-recover={{((implicit-unsigned-integer-truncation|implicit-signed-integer-truncation|implicit-integer-sign-change),?){3}"}} // ???
// CHECK-implicit-integer-arithmetic-value-change-NORECOVER-NOT: "-fno-sanitize-recover={{((implicit-signed-integer-truncation|implicit-integer-sign-change),?){2}"}} // ???
// CHECK-implicit-integer-truncation-NORECOVER-NOT: "-fno-sanitize-recover={{((implicit-unsigned-integer-truncation|implicit-signed-integer-truncation),?){2}"}} // ???

I was hoping someone who's touched this before would know.

compiler-rt/test/ubsan/TestCases/Integer/unsigned-shift.cpp
3

I don't understand this test... when I run it with lit it fails (and it seems like the bots agree), but manually it works. Am I doing it wrong?

vsk added inline comments.Aug 14 2020, 4:35 PM
clang/test/Driver/fsanitize.c
911

The CHECK-implicit-conversion-NORECOVER-NOT regex looks fishy. There's more parens in there than I'd expect: perhaps that explains why the test passes?

Just above your change, I see something that's closer to what I expect:

// RUN: %clang -fsanitize=float-divide-by-zero %s -fno-sanitize-recover=float-divide-by-zero -### 2>&1 | FileCheck %s --check-prefixes=CHECK-DIVBYZERO,CHECK-DIVBYZERO-NORECOVER
...
// CHECK-DIVBYZERO-NORECOVER-NOT: "-fsanitize-recover=float-divide-by-zero"
compiler-rt/test/ubsan/TestCases/Integer/unsigned-shift.cpp
3

Do you need to explicitly return 1 or something to get a non-zero exit code?

7

Does the test not work without the volatiles?

vsk added a comment.Aug 14 2020, 4:36 PM
In D86000#2219322, @jfb wrote:
In D86000#2219288, @vsk wrote:

It'd be nice to fold the new check into an existing sanitizer group to bring this to a wider audience. Do you foresee adoption issues for existing -fsanitize=integer adopters? Fwiw some recently-added implicit conversion checks were folded in without much/any pushback.

integer does "not actually UB checks", right? I can certainly put it in there if you think I won't get yelled at 😄

Can't guarantee you won't get yelled at, but it does seem like the natural fit. It already includes other unsigned overflow checks.

In D86000#2219322, @jfb wrote:
In D86000#2219288, @vsk wrote:

It'd be nice to fold the new check into an existing sanitizer group to bring this to a wider audience. Do you foresee adoption issues for existing -fsanitize=integer adopters? Fwiw some recently-added implicit conversion checks were folded in without much/any pushback.

integer does "not actually UB checks", right? I can certainly put it in there if you think I won't get yelled at 😄

I'd support this.
"integer" includes unsigned-integer-overflow, it's only recommended to the truly fearless developers :)

clang/test/Driver/fsanitize.c
911

I think that + the following line mean that the frontend depends on the default behavior (neither recover nor no-recover).

lebedev.ri added a comment.EditedAug 14 2020, 11:27 PM

What's next, a sanitizer that input to signed right-shift is always non-negative? :)

FWIW "fixing" these "bugs" via (x & ~(~1U << (32-shamt))) << shamt is going to be fine,
i've already taught instcombine to drop such pointless masking before left-shift
in PR42563: https://godbolt.org/z/5rar14

jfb updated this revision to Diff 288132.Aug 26 2020, 4:39 PM

As Vedant suggested, make it part of 'integer' sanitizer.

jfb updated this revision to Diff 288133.Aug 26 2020, 4:40 PM

Re-upload with full context.

jfb edited the summary of this revision. (Show Details)Aug 26 2020, 4:41 PM

Release notes missing.
I think the canonical way to silence it (via masking?) should be at least mentioned.

clang/lib/CodeGen/CGExprScalar.cpp
3872–3884

Why is this so complicated? Shouldn't this just be: https://alive2.llvm.org/ce/z/scTqfX

$ /repositories/alive2/build-Clang-release/alive-tv /tmp/test.ll --smt-to=100000 --disable-undef-input

----------------------------------------
@2 = global 32 bytes, align 16

define i32 @src(i32 %arg, i32 %arg1) {
%bb:
  %i = icmp ugt i32 %arg1, 31
  %i2 = sub nsw nuw i32 31, %arg1    ; NOPE
  %i3 = lshr i32 %arg, %i2           ; NOPE
  %i4 = icmp ult i32 %i3, 2          ; NOPE
  %i5 = or i1 %i, %i4
  br i1 %i5, label %bb9, label %bb6

%bb6:
  %i7 = zext i32 %arg to i64
  %i8 = zext i32 %arg1 to i64
  %__constexpr_0 = bitcast * @2 to *
  call void @__ubsan_handle_shift_out_of_bounds(* %__constexpr_0, i64 %i7, i64 %i8)
  br label %bb9

%bb9:
  %i10 = shl i32 %arg, %arg1
  ret i32 %i10
}
=>
@2 = global 32 bytes, align 16

define i32 @tgt(i32 %arg, i32 %arg1) {
%bb:
  %i = icmp ugt i32 %arg1, 31
  %iZZ0 = shl i32 %arg, %arg1        ; HI!
  %iZZ1 = lshr i32 %iZZ0, %arg1      ; OVER HERE
  %i4 = icmp eq i32 %arg, %iZZ1      ; LOOK!
  %i5 = or i1 %i, %i4
  br i1 %i5, label %bb9, label %bb6

%bb6:
  %i7 = zext i32 %arg to i64
  %i8 = zext i32 %arg1 to i64
  %__constexpr_0 = bitcast * @2 to *
  call void @__ubsan_handle_shift_out_of_bounds(* %__constexpr_0, i64 %i7, i64 %i8)
  br label %bb9

%bb9:
  ret i32 %iZZ0
}
Transformation seems to be correct!

which will then be migrated to use @llvm.ushl.with.overflow once it's there.

jfb updated this revision to Diff 288383.Aug 27 2020, 10:19 AM
jfb marked 6 inline comments as done.

Address comments

Herald added a project: Restricted Project. · View Herald TranscriptAug 27 2020, 10:19 AM
jfb added inline comments.Aug 27 2020, 10:20 AM
clang/lib/CodeGen/CGExprScalar.cpp
3872–3884

Sure, but that's pre-existing and I'd rather not change it in this patch.

compiler-rt/test/ubsan/TestCases/Integer/unsigned-shift.cpp
3

That should be implicit, but I added it regardless to make sure. It's happy now 🤷‍♂️

7

It seems that LLVM sees too much without volatile, yes.

lebedev.ri added inline comments.Aug 27 2020, 10:22 AM
llvm/docs/ReleaseNotes.rst
153

Surely not ~1U.

jfb updated this revision to Diff 288388.Aug 27 2020, 10:37 AM
jfb marked an inline comment as done.

Fix notes.

llvm/docs/ReleaseNotes.rst
153

Indeed, fixed.

vsk added inline comments.Aug 27 2020, 10:41 AM
compiler-rt/test/ubsan/TestCases/Integer/unsigned-shift.cpp
7

The optimizer isn't running. Perhaps this is necessary because clang's constant folder kicks in?

llvm/docs/ReleaseNotes.rst
151

This could use some more explanation, maybe s/with masking/by clearing bits that would be shifted out/?

153

I don't think this pattern works when rhs = 0 (https://godbolt.org/z/rvEGqh).

lebedev.ri added inline comments.Aug 27 2020, 10:44 AM
llvm/docs/ReleaseNotes.rst
153

Surely not 1U either :)

jfb updated this revision to Diff 288444.Aug 27 2020, 1:32 PM
jfb marked 4 inline comments as done.

Remove the "suppress this" in release notes.

compiler-rt/test/ubsan/TestCases/Integer/unsigned-shift.cpp
7

Probably, I haven't investigate. It's brittle is the main thing, and this forces it to not be.

llvm/docs/ReleaseNotes.rst
151

I just dropped it, I don't think release notes are the right place for this anyways.

153

Right, I don't think it's something we want to explain here.

vsk accepted this revision.Aug 27 2020, 1:36 PM

Lgtm, thanks.

This revision is now accepted and ready to land.Aug 27 2020, 1:36 PM
This revision was automatically updated to reflect the committed changes.