This is an archive of the discontinued LLVM Phabricator instance.

[SCCP] Do not replace deref'able ptr with un-deref'able one.
ClosedPublic

Authored by fhahn on Aug 5 2020, 11:04 AM.

Details

Summary

Currently IPSCCP (and others like CVP/GVN) blindly propagate pointer
equalities. In certain cases, that leads to dereferenceable pointers
being replaced, as in the example test case.

I think this is not allowed, as it introduces an access of an
un-dereferenceable pointer. Note that the pointer is inbounds, but one
past the last element, so it is valid, but not dereferenceable.

This patch is mostly to highlight the issue and start a discussion.
Currently it only checks for specifically looking
one-past-the-last-element pointers with array typed bases.

This causes the mis-compile outlined in
https://stackoverflow.com/questions/55754313/is-this-gcc-clang-past-one-pointer-comparison-behavior-conforming-or-non-standar

In the test case, if we replace %p with the GEP for the store, we
subsequently determine that the store and the load cannot alias, because
they are to different underlying objects.

Note that Alive2 seems to think that the replacement is valid:
https://alive2.llvm.org/ce/z/2rorhk

Diff Detail

Event Timeline

fhahn created this revision.Aug 5 2020, 11:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2020, 11:04 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
fhahn requested review of this revision.Aug 5 2020, 11:04 AM

Note that Alive2 seems to think that the replacement is valid:
https://alive2.llvm.org/ce/z/2rorhk

This is because Alive2's pointer comparison is approximating its behavior yet; it is right that we cannot replace a pointer with another pointer in general.

LLVM's memory model isn't completely settled with respect to how to determine the "base" of a pointer; see https://bugs.llvm.org/show_bug.cgi?id=34548 etc. But consensus is that the general transform you're describing isn't legal without proving more about the pointer. (e.g. that the two pointers have the same base, or the replaced value isn't used for any memory operations).

llvm/lib/Transforms/Scalar/SCCP.cpp
1352

I'm not sure I completely follow what this code is doing, but using getPointerElementType() like this is probably wrong.

fhahn added a comment.Aug 5 2020, 2:13 PM

LLVM's memory model isn't completely settled with respect to how to determine the "base" of a pointer; see https://bugs.llvm.org/show_bug.cgi?id=34548 etc. But consensus is that the general transform you're describing isn't legal without proving more about the pointer. (e.g. that the two pointers have the same base, or the replaced value isn't used for any memory operations).

Great, thanks for the reference.

This patch is aimed as a first step towards incrementally improving the current situation by avoiding replacements for which there is agreement that they are problematic. The narrow initial case would be replacing a dereferenceable pointer with a clearly un-dereferenceable one.

If there is consensus on the approach, I think it would make sense to adjust the patch roughly as follows:

  1. Add a canReplaceDueToPointerEqualilty(Ptr1, Ptr2) which returns true if we can replace Ptr1 with Ptr2 if they are considered equal or false if not.
  2. The initial implementation will be incomplete and only reject certain cases (rather than only allowing cases we know are fine (e.g. both pointers dereferenceable and from the same underlying object)
  3. Update GVN/SCCP/CPV to use canReplaceDueToPointerEqualilty.

Does that make sense?

llvm/lib/Transforms/Scalar/SCCP.cpp
1352

My intention here is to catch the case where we create a pointer to the 'one-past-the-last-element', which clearly is not dereferenceable.

As implemented above, it probably only works for GEPs with pointers to array types as bases. So this would need further refinement to make sure we actually use the right base when dealing with nested data types. Is that what you were referring to or is there something else I am missing? Do you know if there are any helpful utilities for doing such a thing?

nikic added a subscriber: nikic.Aug 5 2020, 2:28 PM
efriedma added inline comments.Aug 5 2020, 2:43 PM
llvm/lib/Transforms/Scalar/SCCP.cpp
1352

The simplest correct check I can think of is that you could DecomposeGEPExpression both sides of the equality, and check if the "Base" is equal. That should cover everything, barring any weirdness with noalias/argmemonly/etc.

The problem here isn't really restricted to one-past-the-end. Consider, for example, if you call malloc, then free the memory, then call malloc again. The two pointers can be equal, but don't point to the same object.

fhahn updated this revision to Diff 283906.Aug 7 2020, 7:26 AM

Updated to use canReplacePointersIfEqual, only guard the actual replacement, which still allows for simplification of conditionals based on equality.

efriedma accepted this revision.Sep 2 2020, 5:56 PM

LGTM

This revision is now accepted and ready to land.Sep 2 2020, 5:56 PM
This revision was landed with ongoing or failed builds.Sep 3 2020, 2:22 AM
This revision was automatically updated to reflect the committed changes.