This is an archive of the discontinued LLVM Phabricator instance.

[IR] Add NoUndef attribute to Intrinsics.td
ClosedPublic

Authored by aqjune on Aug 25 2020, 2:01 PM.

Details

Summary

This patch adds NoUndef to Intrinsics.td.
The attribute is attached to llvm.assume's operand, because llvm.assume(undef)
is UB.
It is attached to pointer operands of several memory accessing intrinsics
as well.

This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check
unnecessary, so it is removed.

Diff Detail

Event Timeline

aqjune created this revision.Aug 25 2020, 2:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 25 2020, 2:01 PM
aqjune requested review of this revision.Aug 25 2020, 2:01 PM
jdoerfert added inline comments.Aug 25 2020, 2:03 PM
llvm/include/llvm/IR/Intrinsics.td
564

This is the old nonnull discussion all over again. I don't think you can do this here. Do it when we know the size is not 0. Same below.

aqjune updated this revision to Diff 287946.Aug 26 2020, 5:52 AM

Exclude memset/memcpy/memmove

Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2020, 5:52 AM
aqjune added inline comments.Aug 26 2020, 5:54 AM
llvm/include/llvm/IR/Intrinsics.td
564

I think we can make it more explicit by updating LangRef and saying that undef or poison can be given when the size is zero. I'll make a patch for that.

aqjune added inline comments.Aug 26 2020, 6:36 AM
llvm/include/llvm/IR/Intrinsics.td
564

I became to think that it has to be UB, because it makes expanding memset/memcpy to a loop unsound. For example:

memset(p, 255, 0)
->
q = p+0
while (p != q) { *p = 255; ++p; }

If p is undef/poison, p != q is also undef/poison, so the target has branch on undef/poison, which is UB.
In order to support this, memset(undef, 255, 0) should be UB too. LowerMemIntrinsics.cpp does this transformation.

This does not happen when p was null or dangling pointer (which is weaker than undef/poison); pointer comparison does not produce poison in these case.
The original thread http://lists.llvm.org/pipermail/llvm-dev/2017-July/115665.html states the cases that are only about null/dangling pointers.

I'll see whether it is okay to be UB by running Alive2.

jdoerfert added inline comments.Aug 26 2020, 9:49 AM
llvm/include/llvm/IR/Intrinsics.td
564

I am hesitant to argue there is an implementation for which this is UB so we make it UB.

memset(p, c, size) {
  while (size > 0) {
    p[size--] = c;
  }
}

is UB free for size 0.

aqjune added inline comments.Aug 26 2020, 10:11 AM
llvm/include/llvm/IR/Intrinsics.td
564

Found that actually LowerMemIntrinsics.cpp was also checking the size rather than comparing & incrementing pointers (llvm/test/CodeGen/NVPTX/lower-aggr-copies.ll ) . So it was UB-safe.
Though, I wonder why it isn't directly emitting the increment-of-pointer version, which seems to be more efficient. Another concern is the safety of LSR that seems to be related with this as well.

I'm fine with removing noundef from memset/cpy/move (which is the current version).

Made a LangRef patch for memset/cpy/move: D86643

jdoerfert accepted this revision.Aug 26 2020, 10:16 AM

LGTM. Maybe add a test for noundef on the assume arg as well.

This revision is now accepted and ready to land.Aug 26 2020, 10:16 AM
This revision was automatically updated to reflect the committed changes.
efriedma added inline comments.Aug 26 2020, 11:21 AM
llvm/include/llvm/IR/Intrinsics.td
1357

Do masked load/store/gather/scatter have the same problem as memcpy/memmove? In particular, it seems a bit weird that a masked store with an all-false mask has undefined behavior if the pointer argument is undef.

aqjune added inline comments.Aug 26 2020, 11:43 AM
llvm/include/llvm/IR/Intrinsics.td
1357

In case of masked store/load, I followed Memory sanitizer's behavior.
MemorySanitizerVisitor's handleMaskedStore and handleMaskedLoad calls insertShadowCheck for the pointer, which checks whether the pointer is undefined IIUC.
The same thing happens for normal load/stores, but not for memcpy/memset.

efriedma added inline comments.Aug 26 2020, 12:15 PM
llvm/include/llvm/IR/Intrinsics.td
1357

I suspect msan's behavior isn't what we want; a completely masked load/store is sort of an edge case, though.

I'm more concerned about gather/scatter; it's much more likely we'll end up with undef vector lanes there.

aqjune added inline comments.Aug 26 2020, 12:46 PM
llvm/include/llvm/IR/Intrinsics.td
1357

For gather/scatter, I agree the pointer vector may contain undef elements.
Maybe it was a bit naive to add noundefs to them. To be conservative, I'll make a patch that removes noundefs from these masked intrinsics.

aqjune added inline comments.Aug 26 2020, 12:53 PM
llvm/include/llvm/IR/Intrinsics.td
1357

(being conservative means that masked_load and masked_store are to be updated as well.)