This is an archive of the discontinued LLVM Phabricator instance.

Disable MSan-hostile loop unswitching.
ClosedPublic

Authored by eugenis on Jun 8 2016, 6:20 PM.

Details

Summary

Loop unswitching may cause MSan false positive when the unswitch
condition is not guaranteed to execute.

This is very similar to ASan and TSan special case in
llvm::isSafeToSpeculativelyExecute (they don't like speculative loads
and stores), but for branch instructions.

Fixes PR28054.

Diff Detail

Repository
rL LLVM

Event Timeline

eugenis updated this revision to Diff 60128.Jun 8 2016, 6:20 PM
eugenis retitled this revision from to Disable MSan-hostile loop unswitching..
eugenis updated this object.
eugenis added a reviewer: chandlerc.
eugenis set the repository for this revision to rL LLVM.
eugenis added a subscriber: llvm-commits.

It seems strange to me that this is an MSan-specific issue...

Is that because MSan causes a branch on an uninitialized value to immediately become a side-effect, while LLVM doesn't? If that's the case, I'd really like Sanjoy and David to look at this because we keep finding reasons why branching on undef has more effect than you would naively expect, so I'm worried this could actually lead to a problem outside of MSan....

sanjoy edited edge metadata.Jun 8 2016, 6:42 PM

I want to think this through some more, but overall this doesn't sound
too scary to me. Generally, I don't think we _can_ have branches on
undef be side effects, since we'd like to allow transforming:

%v = maybe_undef()
if (some_cond)
  return;
%v0 = select i1 %v, X, Y

to

%v = maybe_undef()
%v0 = select i1 %v, X, Y
if (some_cond)
  return;

to

%v = maybe_undef()
br %v, label %left, label %right

left:
  br merge

right:
  br merge

merge:
  %v0 = phi([X, left], [Y, right])

if (some_cond)
  return;

In other words, since we want selects to be side-effect free, and also
selects to be equivalent to br-phi, we need branches to be side-effect
free as well.

Thanks Sanjoy for thinking about this.

Provided that my reasoning holds (this is only a problem for MSan because it is essentially treating branch-on-uninit as a side-effect, I have one more high level question: is that even remotely reasonable?

Should the MSan report be suppressed until the side-effect bearing instruction is executed? I think that would cause MSan to not have a false positive here. I'd love to hear the thoughts of the MSan people here.

-Chandler

Should the MSan report be suppressed until the side-effect bearing instruction is executed? I think that would cause MSan to not have a false positive here. I'd love to hear the thoughts of the MSan people here.

Take this example:

while()

if (a)
  if (undef)
    doX
  else
    doY

is transformed to

if (undef)

while()
  if (a)
    doX

else

while()
  if (a)
    doY

The only way to see that the new code is fine (for some definition of fine) when a == false is to execute both branches and make sure that they have the same set of side effects. This sounds like an extremely difficult task in the general case.

Of course, the loop may have other side-effect instructions that do not depend on a.
So we need to suppress reporting of "if undef" not just when the "then" BB has no side effects, but when "then" and "else" BBs have exactly the same set of side effects.

Here is another example where it is basically impossible for MSan to propagate initializeness information exactly after the unswitching optimization.

This loop is unswitched on y, then we have 2 copies of the loop calculating "a" and phi + ret at the end. When y is undef, if we just track a single-bit "control flow is poisoned" state, we end up with "a" being undef, too. To conclude that a is defined MSan has to prove that both loops calculate "a" with the same sequence of operations which sounds impossible in the general case.

volatile int sink;
y = undef
z = 1
int f(bool y, unsigned z) {

int a = 42;
for (int i = 0; i < z; ++i) {
  if (i)
    if (y)
      sink = 1;
  a ^= 123;
}
return a;

}

eugenis updated this revision to Diff 60295.Jun 9 2016, 6:25 PM
eugenis edited edge metadata.

I've updated the bug with a better description of the problem and possible solutions.
I've added a comment to LoopUnswitch.

It looks like this CL is the only way of fixing this problem quickly. The sanitizer-x86_64-bootstrap bot is currently broken due to this, so I'd like to go ahead and commit this change.

Thanks for all the work to understand the nature of this discrepancy between MSan's semantics and the LLVM IR semantics. That bug will hopefully let us converge eventually. This seems like a reasonably isolated and minimal workaround in the mean time. Some factoring comments below.

include/llvm/Transforms/Utils/LoopUtils.h
401–403 ↗(On Diff #60295)

This no longer seems like LICM-specific safety?

lib/Transforms/Scalar/LICM.cpp
737 ↗(On Diff #60295)

The definition should probably move out of LICM at this point.

Maybe do this in a separate patch if it is going to touch a bunch of stuff?

eugenis updated this revision to Diff 60382.Jun 10 2016, 11:42 AM
eugenis marked 2 inline comments as done.

Depends on the refactoring in D21234

chandlerc accepted this revision.Jun 10 2016, 11:47 AM
chandlerc edited edge metadata.

LGTM with a minor tweak below.

lib/Transforms/Scalar/LoopUnswitch.cpp
532–533

We have a member variable for optimizing for size, I would make this a member variable. That way you can initialize it once above for the function (with cleaner syntax) and then just re-use it everywhere.

This revision is now accepted and ready to land.Jun 10 2016, 11:47 AM
eugenis updated this revision to Diff 60384.Jun 10 2016, 11:53 AM
eugenis edited edge metadata.
eugenis marked an inline comment as done.
eugenis closed this revision.Jun 10 2016, 1:10 PM

r272421

Somehow I've lost the test in Diff 3.
I'll commit it separately.