This is an archive of the discontinued LLVM Phabricator instance.

MemorySSA Optimizations: Patch 1 of N
ClosedPublic

Authored by george.burgess.iv on Feb 3 2016, 9:21 PM.

Details

Summary

This patch re-adds two of the optimizations that were taken out of MemorySSA due to sketchiness.

Specifically:

  • We now try to take advantage of memory ordering rules on loads before querying AA
  • We recognize and appropriately react to invariant loads, and to loads that AA can prove point to constant memory (see the sketch below).
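
A hypothetical illustration of the second bullet (this example is mine, not code from the patch; all names are made up):

// No store can modify constant memory, so a load from it can always be
// treated as live-on-entry.
const int Table[4] = {1, 2, 3, 4};

int lookup(unsigned I, int *P) {
  *P = 42;              // This store cannot clobber Table...
  return Table[I & 3];  // ...so this load's value is unaffected by it.
}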

Note that the memory ordering check was strengthened to Monotonic, because that's the equivalent of std::memory_order_relaxed, which guarantees no ordering whatsoever, AFAIK.
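
For example (a minimal standalone C++ sketch, not from the patch):

#include <atomic>

std::atomic<int> X{0}, Y{0};

// Each relaxed load is atomic, but the two are unordered with respect to
// each other, so the compiler or hardware may freely swap them.
int sumRelaxed() {
  int A = X.load(std::memory_order_relaxed);
  int B = Y.load(std::memory_order_relaxed);
  return A + B;
}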

Also note that I'm not an expert in memory models. So if anything seems fishy, let me know. :)

Diff Detail

Event Timeline

george.burgess.iv retitled this revision from to MemorySSA Optimizations: Patch 1 of N.
george.burgess.iv updated this object.
george.burgess.iv added reviewers: hfinkel, reames.
majnemer added inline comments.
lib/Transforms/Utils/MemorySSA.cpp
980

TODO(username) is more of a Google-ism; I don't think LLVM does this very often.

982–983

Is this clang-format'd?

george.burgess.iv marked 2 inline comments as done.

Addressed all feedback + made a comment more concise.

lib/Transforms/Utils/MemorySSA.cpp
980

Works for me! Thanks for the heads up

982–983

It is now :)

I'll look when I'm back from getting married :)

davidxl added inline comments.Feb 25 2016, 3:01 PM
lib/Transforms/Utils/MemorySSA.cpp
756

Unfinished sentence here.

758

nit: MaybeDef -> MayDef

759

This needs some explanation --> can the alias query be moved here to make this function more general?

769

Move this comment up and combine with the other volatile related comment.

788–789

This does not look correct:

When canUseBeReorderedAboveDef returns false, it is not OK to unconditionally feed the pair into the AA query -- unless the AA query also honors the ordering constraint (I have not checked).

test/Transforms/Util/MemorySSA/atomic-clobber.ll
17

what is this change about?

42

Is this correct? The atomic load cannot be reordered before the previous acquire load.

45

Is this too conservative?

george.burgess.iv marked 4 inline comments as done.

Addressed feedback

lib/Transforms/Utils/MemorySSA.cpp
759

Do you still think this should happen (given my response to your comment below)?

769

I'm assuming you meant just the first sentence -- done. :)

788–789

Looking at the things we query, it seems that AAResults (lib/Analysis/AliasAnalysis.cpp) *always* hands back MRI_ModRef if there's any kind of ordering (including volatile) involved for loads/stores, and will return MRI_ModRef if we hand it an AtomicCmpXchgInst or AtomicRMWInst, with ordering greater than Monotonic. This seems inconsistent, so I've sent out http://reviews.llvm.org/D17631 to see if that's intentional or not.

If said patch goes in, AAResults should always be conservative in the cases that canUseBeReorderedAboveDef says are no bueno.
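
For reference, a rough paraphrase in code of the conservatism described above (a sketch of the shape of the check, not the actual AAResults implementation):

#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Any load that is volatile or carries real ordering is reported as
// MRI_ModRef, so no client can reorder around it by accident.
static ModRefInfo modRefForLoad(AAResults &AA, const LoadInst *L,
                                const MemoryLocation &Loc) {
  if (!L->isUnordered()) // volatile, or any ordering above Unordered
    return MRI_ModRef;
  if (AA.alias(MemoryLocation::get(L), Loc) == NoAlias)
    return MRI_NoModRef;
  return MRI_Ref; // a plain load only reads memory
}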

If you think it would be better, I can try adding a canUseNeverBeReorderedAboveDef function, or I can have canUseBeReorderedAboveDef return an enum of {Never, IfNoAlias, Always}. Entirely up to you. :)
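
For concreteness, a minimal sketch of that tri-state idea (all names here are hypothetical, not part of the patch):

enum class Reorderability { Never, IfNoAlias, Always };

// Hypothetical caller logic: only consult AA when the ordering rules
// alone cannot decide.
bool isClobber(Reorderability R, bool AAReportsAlias) {
  switch (R) {
  case Reorderability::Never:
    return true;           // ordering forbids reordering: definite clobber
  case Reorderability::IfNoAlias:
    return AAReportsAlias; // defer to the alias query
  case Reorderability::Always:
    return false;          // never a clobber; no AA query needed
  }
  return true;             // unreachable
}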

test/Transforms/Util/MemorySSA/atomic-clobber.ll
17

Random noise. Reverted.

42

I believe it is. Even if the load for %2 were non-atomic, it couldn't be hoisted above an acquire, because that would break cases like

#include <atomic>
#include <cassert>

struct Foo {
  std::atomic<int> a;
  int b;
  Foo(): a(0), b(0) {}
};

Foo f;

void thread1() {
  f.b = 1;
  f.a.store(1, std::memory_order_release);
}

void thread2() {
  if (f.a.load(std::memory_order_acquire)) {
    assert(f.b == 1);
  }
}

Wouldn't it? (specifically, if we allowed this, thread2 would be able to load f.b before f.a)

45

For the same reason as above, I think this is all we can do here.

davidxl added inline comments.Feb 25 2016, 7:25 PM
test/Transforms/Util/MemorySSA/atomic-clobber.ll
42

Yes -- a load acquire prevents all following loads/stores from being moved above it. The problem is that the MemorySSA dump confused me a little:

I thought the ID:1 MemoryDef was for the LiveOnEntry -- but it is actually the MemoryDef associated with the acquire load.

So indeed this is correct.

45

You are right -- I misread the dump.

davidxl added inline comments.Feb 27 2016, 3:43 PM
lib/Transforms/Utils/MemorySSA.cpp
752

MaybeDef --> MayDef

759

The problem with the current interface design is that 'false' does not really mean 'false' -- it means either 'don't know yet' or 'no, it cannot be reordered'. This either leads to a redundant check later (if it means 'no') or to the check being mistakenly skipped later (if it means 'don't know yet'), because the client code cannot tell the difference. A possible solution is to move the check that both accesses are loads outside this function and change the function name to canLoadsBeSafelyReordered(...):

if (isLoad(Def) && isLoad(Use))
  return canLoadsBeSafelyReordered(...);

// Otherwise, use the AA interface to do the check here.
...
773

This is too strict:

if (LoadDef->getOrdering() <= Monotonic &&
    LoadUse->getOrdering() <= Monotonic)
  return true;

// Check other cases. For instance, non-acquire loads before an acquire
// load can be moved after it.
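
For instance (a standalone C++ sketch of that last point, not from the patch -- acquire is a one-way barrier):

#include <atomic>

std::atomic<int> Flag{0};
int Data = 0;

// The acquire load keeps *later* accesses from moving above it, but the
// earlier plain load of Data may legally sink below it.
int reader() {
  int D = Data;                                 // plain load
  int F = Flag.load(std::memory_order_acquire); // acquire load
  return D + F;
}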

788–789

Yes -- there should be a 'handshake' with the AA interfaces; otherwise you will need special treatment here for synchronization-related accesses before the AA query. Adding test cases to cover those would be good.

george.burgess.iv marked 6 inline comments as done.

Addressed all feedback

lib/Transforms/Utils/MemorySSA.cpp
752

Oops. :)

759

Yeah, I like canLoadsBeReordered better.

773

Nice catch.

788–789

Added check_aa_is_sane (which is really tiny) in atomic-clobber.ll. I think it largely covers the behavior that we need, though.

Fix a typo in the tests.

davidxl added inline comments.Mar 4 2016, 11:29 AM
lib/Transforms/Utils/MemorySSA.cpp
757

The overall structure of the patch looks good -- however, this function still needs more scrutiny -- please ping Philip or Hal for more comments.

george.burgess.iv marked an inline comment as done.Mar 8 2016, 3:42 PM

Ping :)

So, I bugged Philip at the social, and he said the patch (as it stood) looks good, with a few comments:

  1. The LLVM spec is ambiguous about whether we can hoist a non-volatile load above a volatile load when the loads alias. It's probably best not to exploit this ambiguity at the moment by unconditionally allowing the motion of non-volatile loads above volatile loads (and vice versa).
  2. It may be good to make pointsToConstantMemory bit check not succeed if the operation is a volatile access
  3. A few style things

Items #3 and #1 have been addressed. #1 required a bit of refactoring of the newly-named getLoadReorderability function, because we now have to care about aliasing in some cases.
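
For illustration, a hedged sketch of the shape that refactoring might take (this uses the plain AtomicOrdering enum of the time and is not the committed getLoadReorderability):

#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

static bool loadsAreReorderable(AAResults &AA, const LoadInst *Use,
                                const LoadInst *Def) {
  // Anything stronger than Monotonic imposes real ordering.
  if (Use->getOrdering() > Monotonic || Def->getOrdering() > Monotonic)
    return false;
  // Don't assume a load can move across a volatile load it may alias;
  // the LangRef is ambiguous there, so stay conservative.
  if (Use->isVolatile() || Def->isVolatile())
    return AA.alias(MemoryLocation::get(Use), MemoryLocation::get(Def)) ==
           NoAlias;
  return true;
}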

Item #2 I've thought about, and I'm no longer sure that I agree. Specifically, I believe that MemorySSA's job is to reason about whether one memory operation can cause another memory operation to produce a different result. If MemorySSA determines that memop A doesn't interfere with memop B, it's the *user's* job to determine where it's actually safe to put memop B. For example, consider:

define void @foo(i8* %a) {
	; 1 = MemoryDef(liveOnEntry)
	store volatile i8 0, i8* %a
	br i1 undef, label %if.then, label %if.end

if.then:
	; 2 = MemoryDef(1)
	load volatile i8, i8* %a
	br label %if.end

if.end:
	ret void
}

...MemorySSA will happily (and correctly) say that the volatile load is clobbered by the volatile store. However, it's clearly incorrect to hoist the volatile load into the entry block here, so passes that use MemorySSA will need to exercise some amount of caution in cases involving volatile ops anyway. For this reason, I think it's fine if we say that loads of constant memory are always live on entry, regardless of whether the load is volatile or not.

It may be good to make pointsToConstantMemory bit check not succeed if the operation is a volatile access

Wow, that was phrased poorly.

It may be good to make the pointsToConstantMemory check (line 1122) not succeed if the operation is a volatile access. That's what I meant. :)

This revision was automatically updated to reflect the committed changes.