This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Treat passing undef to noundef params as UB
AcceptedPublic

Authored by aeubanks on Aug 31 2022, 11:26 AM.

Details

Summary

Skip on msan instrumented functions since this hurts msan's usefulness.

Add a flag to disable while ToT LLVM users fix regressions (the flag may be useful for bisecting files). This flag will be removed in the future.

Event Timeline

aeubanks created this revision.Aug 31 2022, 11:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2022, 11:26 AM
aeubanks requested review of this revision.Aug 31 2022, 11:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2022, 11:26 AM
nikic added inline comments.Sep 1 2022, 12:16 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2968

Call.isPassingUndefUB(I). You probably want to flip the checks, because the UndefValue check is cheaper. Also add tests with dereferenceable instead of noundef.

aeubanks added inline comments.Sep 1 2022, 9:16 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2968

isa<UndefValue> seems much less likely to be true though. anyway, flipped

added dereferenceable tests on top of noundef since ints can't be dereferenceable

nikic accepted this revision.Sep 1 2022, 11:22 AM

LG

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2968

nit: unnecessary braces

This revision is now accepted and ready to land.Sep 1 2022, 11:22 AM
aeubanks updated this revision to Diff 457383.Sep 1 2022, 1:49 PM

fix clang test

Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2022, 1:49 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
This revision was landed with ongoing or failed builds.Sep 1 2022, 3:17 PM
This revision was automatically updated to reflect the committed changes.

FYI, this change broke Wine (at least on arm and aarch64). Not saying that it is legit breakage of code that actually was UB to begin with though - it's going to take some time to figure out what's broken though.

omjavaid reopened this revision.Sep 2 2022, 4:12 AM
omjavaid added a subscriber: omjavaid.

This change has broken LLDB Arm/AArch64 Linux buildbots. I dont really understand
the underlying reason. Reverting for now to make buildbot green.

This revision is now accepted and ready to land.Sep 2 2022, 4:12 AM

Apparently some problem occuring due to trouble with ASAN exception
https://lab.llvm.org/buildbot/#/builders/96/builds/28300

TimNN added a subscriber: TimNN.Sep 2 2022, 5:36 AM

This also broke Rust when compiled at LLVM head: https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/13166#0182fb4a-0f2d-4f2e-830f-f62b463b8d48. (I don't know whether this was some existing UB that got only exposed by this patch or not, but wanted to make you aware. Reproduces with 2+ codegen units and a 1+ optimization level).

bjope added a subscriber: bjope.Sep 5 2022, 2:13 AM
aeubanks updated this revision to Diff 462245.Sep 22 2022, 11:17 AM

"fix" lldb test (hopefully)
skip when msan
add flag to disable (to be removed in the future)

Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 11:17 AM
aeubanks updated this revision to Diff 462247.Sep 22 2022, 11:18 AM

add flag to disable (to be removed in the future)
"fix" lldb test
skip on msan instrumented functions

aeubanks edited the summary of this revision. (Show Details)Sep 22 2022, 11:19 AM
aeubanks added a reviewer: vitalybuka.
aeubanks updated this revision to Diff 462249.Sep 22 2022, 11:21 AM

extra parens

nikic requested changes to this revision.Sep 22 2022, 11:34 AM

I'd like to block this on -fsanitize-memory-param-retval (aka msan eager checks) being enabled by default. It seems pretty clear that there is a *lot* of UB due to uninitialized parameters floating around, so we should at least make sure that it is detected in default sanitizer configurations before we start exploiting it this aggressively.

This revision now requires changes to proceed.Sep 22 2022, 11:34 AM

@vitalybuka can we turn on -fsanitize-memory-param-retval by default upstream?

@vitalybuka can we turn on -fsanitize-memory-param-retval by default upstream?

attempt in https://reviews.llvm.org/D134669

nikic accepted this revision.Oct 15 2022, 1:32 PM

The default switch has happened, so unblocking this.

clang/test/CodeGenOpenCL/overload.cl
23

Maybe initialize the relevant variables instead? I'd assume just NULL would work here.

This revision is now accepted and ready to land.Oct 15 2022, 1:32 PM
TimNN added a comment.Nov 12 2022, 9:09 AM

I'm still trying to properly minimize this, but this definitely interacts badly with other optimizations (which triggers the Rust CI failure I mentioned above):

  • We start with two functions, outer and inner. outer calls inner. outer has a noundef argument that it forwards to inner (where the argument is also noundef).
  • PostOrderFunctionAttrsPass decides, for both functions, that the argument is readnone. The readnone attribute is only added to the function definitions. The readnone attribute is not added at the call to inner from outer.
  • The DeadArgumentEliminationPass decides that when outer calls inner the argument should be poison.
  • This patch sees the poison being passed to a noundef argument and kills the call.

I don't know which component should be considered "at fault" here (or if the readnone bit is even relevant).

nikic added a comment.Nov 12 2022, 9:21 AM

@TimNN Do you have an IR sample for this? DAE is supposed to strip UB-implying attributes when converting arguments to poison here: https://github.com/llvm/llvm-project/blob/cbe5b2dd914b7ee47bb4d0f67af154a40be4d208/llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp#LL291C17-L291C37

TimNN added a comment.Nov 12 2022, 9:28 AM

I've included excerpts from the IR below. It will take me a bit to provide something compilable. Though you are right, the noundef did indeed get removed from the call.

*** IR Dump Before DeadArgumentEliminationPass on [module] ***
; Function Attrs: nonlazybind uwtable
define void @_RNvXs2_NtNtCs840rfDNPFol_10proc_macro6bridge3rpcbINtB5_6EncodeuE6encodeB9_(i1 noundef zeroext %0, ptr noalias noundef align 8 dereferenceable(40) %1, ptr noalias nocapture noundef nonnull readnone align 1 %2) unnamed_addr #0 !dbg !18135 {
  %4 = zext i1 %0 to i8, !dbg !18137
  call void @_RNvXs0_NtNtCs840rfDNPFol_10proc_macro6bridge3rpchINtB5_6EncodeuE6encodeB9_(i8 %4, ptr noalias noundef nonnull align 8 dereferenceable(40) %1, ptr noalias noundef nonnull align 1 %2), !dbg !18137
  ret void, !dbg !18138
}
*** IR Dump After DeadArgumentEliminationPass on [module] ***
; Function Attrs: nonlazybind uwtable
define void @_RNvXs2_NtNtCs840rfDNPFol_10proc_macro6bridge3rpcbINtB5_6EncodeuE6encodeB9_(i1 noundef zeroext %0, ptr noalias noundef align 8 dereferenceable(40) %1, ptr noalias nocapture noundef nonnull readnone align 1 %2) unnamed_addr #0 !dbg !18136 {
  %4 = zext i1 %0 to i8, !dbg !18138
  call void @_RNvXs0_NtNtCs840rfDNPFol_10proc_macro6bridge3rpchINtB5_6EncodeuE6encodeB9_(i8 %4, ptr noalias noundef nonnull align 8 dereferenceable(40) %1, ptr noalias nonnull align 1 poison), !dbg !18138
  ret void, !dbg !18139
}
*** IR Dump Before InstCombinePass on _RNvXs2_NtNtCs840rfDNPFol_10proc_macro6bridge3rpcbINtB5_6EncodeuE6encodeB9_ ***
; Function Attrs: nonlazybind uwtable
define available_externally void @_RNvXs2_NtNtCs840rfDNPFol_10proc_macro6bridge3rpcbINtB5_6EncodeuE6encodeB9_(i1 noundef zeroext %0, ptr noalias noundef align 8 dereferenceable(40) %1, ptr noalias nocapture noundef nonnull readnone align 1 %2) unnamed_addr #2 !dbg !20759 {
  %4 = zext i1 %0 to i8, !dbg !20762
  call void @_RNvXs0_NtNtCs840rfDNPFol_10proc_macro6bridge3rpchINtB5_6EncodeuE6encodeB9_(i8 %4, ptr noalias noundef nonnull align 8 dereferenceable(40) %1, ptr noalias nonnull align 1 poison), !dbg !20762
  ret void, !dbg !20763
}
*** IR Dump After InstCombinePass on _RNvXs2_NtNtCs840rfDNPFol_10proc_macro6bridge3rpcbINtB5_6EncodeuE6encodeB9_ ***
; Function Attrs: nonlazybind uwtable
define available_externally void @_RNvXs2_NtNtCs840rfDNPFol_10proc_macro6bridge3rpcbINtB5_6EncodeuE6encodeB9_(i1 noundef zeroext %0, ptr noalias noundef align 8 dereferenceable(40) %1, ptr noalias nocapture noundef nonnull readnone align 1 %2) unnamed_addr #2 !dbg !20759 {
  store i1 true, ptr poison, align 1
  ret void, !dbg !20762
}
TimNN added a comment.EditedNov 12 2022, 4:25 PM

I didn't manage to repro with opt, so still no compilable IR. I did some more debugging, though:

  • Inside removeDeadArgumentsFromCallers, CB->getCalledFunction()->dump() is define void @_RNvXs0_NtNtCs840rfDNPFol_10proc_macro6bridge3rpchINtB5_6EncodeuE6encodeB9_(i8 %0, ptr noalias noundef align 8 dereferenceable(40) %1, ptr noalias nocapture nonnull readnone align 1 %2) unnamed_addr #0 personality ptr @rust_eh_personality !dbg !18034 { ... }. Definition, no noundef.
  • Inside callPassesUndefToPassingUndefUBParam, Call.getCalledFunction()->dump() is declare void @_RNvXs0_NtNtCs840rfDNPFol_10proc_macro6bridge3rpchINtB5_6EncodeuE6encodeB9_(i8 %0, ptr noalias noundef align 8 dereferenceable(40) %1, ptr noalias noundef nonnull align 1 %2) unnamed_addr #2. Declaration, noundef is back.
TimNN added a comment.EditedNov 13 2022, 11:13 AM

I'm sorry for the noise. Further investigation has shown that this happens when Rust is doing (thin) LTO, and I don't think this patch can be considered in any way "at fault" here, so this is the last you'll hear from me on the topic here.

I don't know whether the fault is with how Rust implements (thin) LTO or something on the LLVM side. Basically, after running the FunctionImporter on a module, we end up with a define available_externally void outer and a declare void inner, presumably coming from different modules. outer contains the call to inner with poison, but the parameter is noundef on the declaration on inner.

edit: Issue in the Rust repo: https://github.com/rust-lang/rust/issues/104377

edit2: LLVM issue: https://github.com/llvm/llvm-project/issues/58976

aeubanks updated this revision to Diff 475017.Nov 13 2022, 2:24 PM

I somehow missed the previous comments

rebased on top of fixing UB in tests changes

I'll hold off on submitting this until that bug is figured out

vitalybuka accepted this revision.Nov 14 2022, 12:48 PM

The ThinLTO related breakage I mentioned above should be fixed as of https://github.com/llvm/llvm-project/commit/451799bb8261bde52bbfef226d019caf1d82aa42.