This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] [ubsan] Respect integer overflow handling in abs builtin
ClosedPublic

Authored by artem on Aug 1 2023, 11:07 AM.

Details

Summary

Currenly both Clang and GCC support the following set of flags that control
code gen of signed overflow:

  • -fwrapv: overflow is defined as in two-complement
  • -ftrapv: overflow traps
  • -fsanitize=signed-integer-overflow: if undefined (no -fwrapv), then overflow behaviour is controlled by UBSan runtime, overrides -ftrapv

Howerver, clang ignores these flags for __builtin_abs(int) and its higher-width
versions, so passing minimum integer value always causes poison.

The same holds for *abs(), which are not handled in frontend at all but folded
to llvm.abs.* intrinsics during InstCombinePass. The intrinsics are not
instrumented by UBSan, so the functions need special handling as well.

This patch does a few things:

  • Handle *abs() in CGBuiltin the same way as __builtin_*abs()
  • -fsanitize=signed-integer-overflow now properly instruments abs() with UBSan
  • -fwrapv and -ftrapv handling for abs() is made consistent with GCC

Fixes #45129 and #45794

Diff Detail

Event Timeline

artem created this revision.Aug 1 2023, 11:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2023, 11:07 AM
artem requested review of this revision.Aug 1 2023, 11:07 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 1 2023, 11:07 AM
Herald added subscribers: Restricted Project, cfe-commits. · View Herald Transcript
craig.topper added inline comments.Aug 1 2023, 12:43 PM
clang/lib/CodeGen/CGBuiltin.cpp
1793

recover -> recovery?

Period at the end of the comment

artem updated this revision to Diff 546261.Aug 1 2023, 4:21 PM

Fixed comments and failing test. Added a new compiler-rt test for ubsan

artem updated this revision to Diff 546269.Aug 1 2023, 5:01 PM

typo fix

artem marked an inline comment as done.Aug 2 2023, 8:19 AM

This looks good to me, but I'm not sure I'm a fully qualified reviewer here.

Adding codegen and ubsan code owners for opinions.

clang/lib/CodeGen/CGBuiltin.cpp
1777

The overall approach here seems reasonable. I mean, technically the undefined behavior is happening in the library, but detecting it early seems like a good idea.

This approach does have a significant limitation, though: CGBuiltin won't detect cases that involve taking the address of abs(). So ubsan won't end up picking up undefined behavior in such calls, and -fwrapv won't apply. Maybe that's rare enough we don't really care, though.

Needs a release note.

Do we want to update ubsan documentation to reflect this?

clang/lib/CodeGen/CGBuiltin.cpp
2679

Putting the behavior under both "builtin" and "signed-integer-overflow" feels a little weird; is there any precedent for that?

2684

Can we land the change to directly generate calls to llvm.abs() in the default case separately? This might end up impacting generated code for a variety of workloads, and I'd prefer to have a more clear bisect point.

artem added a comment.Aug 10 2023, 4:09 AM

Thanks for the feedback! I will resolve the problems in the next revision after some clarifications.

clang/lib/CodeGen/CGBuiltin.cpp
2679

The problem is, we are instrumenting a builtin function, so on the one hand it is reasonable to be controlled by -fsanitize=builtin. On the other hand, GCC treats abs() as an arithmetic operation, so it is being instrumented by -fsanitize=signed-integer-overflow (abs(INT_MIN) causes a negation overflow).

I have decided to enable instrumentation under any of the flags, but I am not sure whether it is the right choice.

2684

I used llvm.abs() for code simplicity, since middle-end combines the instructions to it anyways. I think this part can be dropped entirely because the intrinsic is not the best possible option either (the compiler emits conditional move and fails to elide extra sign checks if the argument is known to be non-negative).

efriedma added inline comments.Aug 11 2023, 12:44 PM
clang/lib/CodeGen/CGBuiltin.cpp
2679

I'd prefer to just follow gcc here, I think, if there isn't any strong reason to pick a different approach.

2684

Sure, that works too.

artem updated this revision to Diff 550013.Aug 14 2023, 10:16 AM
artem marked 5 inline comments as done.
artem edited the summary of this revision. (Show Details)

Fixed review comments

This revision is now accepted and ready to land.Aug 14 2023, 2:39 PM

Thanks for the review.

I do not have commit rights, could you please push it? Artem Labazov <123321artyom@gmail.com>

This revision was automatically updated to reflect the committed changes.

This patch might have broke the buildbots, starting with when it was first built in https://lab.llvm.org/buildbot/#/builders/85/builds/18390

/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/libcxx_build_ubsan/include/c++/v1/stdlib.h:114:25: runtime error: negation of -9223372036854775808 cannot be represented in type 'long'; cast to an unsigned type to negate this value to itself
    #0 0x5586b6dfb9aa in abs /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/libcxx_build_ubsan/include/c++/v1/stdlib.h:114:10
    #1 0x5586b6dfb9aa in (anonymous namespace)::GNUELFDumper<llvm::object::ELFType<(llvm::support::endianness)1, true>>::printRelRelaReloc((anonymous namespace)::Relocation<llvm::object::ELFType<(llvm::support::endianness)1, true>> const&, (anonymous namespace)::RelSymbol<llvm::object::ELFType<(llvm::support::endianness)1, true>> const&) /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/tools/llvm-readobj/ELFDumper.cpp:3778:21

(which is good insofar as it finds a real bug, but nonetheless is turning the buildbots red)

In the latest buildbots, the error persists (along with an unrelated failure): https://lab.llvm.org/buildbot/#/builders/85/builds/18413

The compiler code is a tad too nuanced for me to fix with confidence (e.g., my first instinct would be to widen the type to int128 but I don't know if that will break compatibility).

I just put up a patch for RISC-V failure related to this https://reviews.llvm.org/D158304 change.

MaskRay reopened this revision.Aug 18 2023, 1:44 PM
MaskRay added a subscriber: MaskRay.
MaskRay added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
1787
clang/test/ubsan/TestCases/Misc/abs.cpp
1

Did you mean compiler-rt/ instead of clang/?

// REQUIRES: arch=x86_64 is legacy style.

New style uses // REQUIRES: target={{x86_64.*}}

Actually, this test should be generic and we should remove REQUIRES.

This revision is now accepted and ready to land.Aug 18 2023, 1:44 PM
MaskRay requested changes to this revision.Aug 18 2023, 1:44 PM
This revision now requires changes to proceed.Aug 18 2023, 1:44 PM

This patch might have broke the buildbots, starting with when it was first built in https://lab.llvm.org/buildbot/#/builders/85/builds/18390

/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/libcxx_build_ubsan/include/c++/v1/stdlib.h:114:25: runtime error: negation of -9223372036854775808 cannot be represented in type 'long'; cast to an unsigned type to negate this value to itself
    #0 0x5586b6dfb9aa in abs /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/libcxx_build_ubsan/include/c++/v1/stdlib.h:114:10
    #1 0x5586b6dfb9aa in (anonymous namespace)::GNUELFDumper<llvm::object::ELFType<(llvm::support::endianness)1, true>>::printRelRelaReloc((anonymous namespace)::Relocation<llvm::object::ELFType<(llvm::support::endianness)1, true>> const&, (anonymous namespace)::RelSymbol<llvm::object::ELFType<(llvm::support::endianness)1, true>> const&) /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/tools/llvm-readobj/ELFDumper.cpp:3778:21

(which is good insofar as it finds a real bug, but nonetheless is turning the buildbots red)

In the latest buildbots, the error persists (along with an unrelated failure): https://lab.llvm.org/buildbot/#/builders/85/builds/18413

The compiler code is a tad too nuanced for me to fix with confidence (e.g., my first instinct would be to widen the type to int128 but I don't know if that will break compatibility).

Failed Tests (4):
  LLVM :: CodeGen/RISCV/rvv/fixed-vectors-int.ll
  LLVM :: CodeGen/SystemZ/ctpop-01.ll
  LLVM :: MC/ARM/basic-thumb2-instructions.s
  LLVM :: tools/llvm-readobj/ELF/relocations.test

AFAICT these failures have been fixed and the change can reland after the clang/ test is fixed.

thurston added a comment.EditedAug 18 2023, 2:17 PM
Failed Tests (4):
  LLVM :: CodeGen/RISCV/rvv/fixed-vectors-int.ll
  LLVM :: CodeGen/SystemZ/ctpop-01.ll
  LLVM :: MC/ARM/basic-thumb2-instructions.s
  LLVM :: tools/llvm-readobj/ELF/relocations.test

AFAICT these failures have been fixed and the change can reland after the clang/ test is fixed.

+1. Thanks @MaskRay for fixing the MC/ARM/basic-thumb2-instructions.s and tools/llvm-readobj/ELF/relocations.test cases!

(For completeness, for anyone else following this discussion: the CodeGen/SystemZ/ctpop-01.ll failure is unrelated to this patch and was reverted in https://reviews.llvm.org/rG29b200906155da393b83232dd31d746ba2ad66a5; the RISCV case was fixed by craig.topper in https://reviews.llvm.org/D158304)

artem updated this revision to Diff 551656.Aug 18 2023, 3:02 PM
artem marked 2 inline comments as done.

Fixed the comments. I do not have commit rights

clang/test/ubsan/TestCases/Misc/abs.cpp
1

I think that Aaron has moved the file when committed to the main branch. Anyway, moved.

MaskRay accepted this revision.Aug 18 2023, 3:21 PM
MaskRay added inline comments.
compiler-rt/test/ubsan/TestCases/Misc/abs.cpp
11 ↗(On Diff #551656)

FYI: [[@LINE+3]] is deprecated lit syntax https://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-pseudo-numeric-variables. New code should use [[#@LINE+3]]

{{[0-9]+}} can be simplified as [[#]]

This revision is now accepted and ready to land.Aug 18 2023, 3:21 PM
MaskRay added inline comments.Aug 18 2023, 3:22 PM
clang/lib/CodeGen/CGBuiltin.cpp
1787

Thanks. I think the coding standard omits the outer braces as well.

Currenly both Clang and GCC support the following set of flags that control

code gen of signed overflow:

[...]
Howerver, clang ignores these flags for __builtin_abs(int) and its higher-width

versions, so passing minimum integer value always causes poison.

This paragraph reads as if GCC emits a trap for __builtin_abs in -ftrapv mode, but it doesn't. That said, its -fsanitize=signed-integer-overflow does handle __builtin_abs.

craig.topper added a comment.EditedAug 18 2023, 5:16 PM

Currenly both Clang and GCC support the following set of flags that control

code gen of signed overflow:

[...]
Howerver, clang ignores these flags for __builtin_abs(int) and its higher-width

versions, so passing minimum integer value always causes poison.

This paragraph reads as if GCC emits a trap for __builtin_abs in -ftrapv mode, but it doesn't. That said, its -fsanitize=signed-integer-overflow does handle __builtin_abs.

On X86 at least, gcc does call __negvsi2 for __builtin_abs under -ftrapv. https://godbolt.org/z/8dhn9bsv5

Currenly both Clang and GCC support the following set of flags that control

code gen of signed overflow:

[...]
Howerver, clang ignores these flags for __builtin_abs(int) and its higher-width

versions, so passing minimum integer value always causes poison.

This paragraph reads as if GCC emits a trap for __builtin_abs in -ftrapv mode, but it doesn't. That said, its -fsanitize=signed-integer-overflow does handle __builtin_abs.

On X86 at least, gcc does call __negvsi2 for __builtin_abs under -ftrapv. https://godbolt.org/z/8dhn9bsv5

Thank you. You are right, sorry for my neglection. GCC may use either __negvsi2 or __subvsi3 for different ports.

artem updated this revision to Diff 551966.Aug 21 2023, 3:52 AM
artem marked an inline comment as done.

Rebased and fixed legacy FileCheck syntax. Gentle reminder that I do not have commit rights.

clang/lib/CodeGen/CGBuiltin.cpp
1787

Coding standard says "Use braces on the outer if"

compiler-rt/test/ubsan/TestCases/Misc/abs.cpp
11 ↗(On Diff #551656)

Thanks! Fixed

This revision was landed with ongoing or failed builds.Aug 21 2023, 9:13 AM
This revision was automatically updated to reflect the committed changes.

Just as an FYI note, this found a bug in one of our random-number generators that was taking a random 32-bit integer and applying abs() to it. Thank you!