This is an archive of the discontinued LLVM Phabricator instance.

AliasAnalysis: Be less conservative about volatile than atomic.
ClosedPublic

Authored by dberlin on Apr 5 2017, 2:50 PM.

Details

Summary

getModRefInfo is meant to answer the question "what impact does this
instruction have on a given memory location?" (the query is against a
memory location, not another instruction).

A long debate about this on IRC came to the conclusion that the answer should be "nothing special".

That is, a noalias volatile store does not affect a memory location
just by being volatile. Note: DSE, GVN, and memdep currently
believe this, because right now memdep just goes behind AA's back after
it answers "modref".

See line 635 of memdep: prior to this patch we would get ModRef there, then check aliasing,
and if it said NoAlias, we would continue.

getModRefInfo *already* performs this same AA check; it just wasn't being reached because volatile
was lumped in with the ordering check.

(I am separately testing whether this code in memdep is now dead except for the invariant load case)
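To make the redundancy concrete, here is a hedged, self-contained model of the pre-patch behavior. This is not the actual LLVM source; the enums and helpers are simplified stand-ins. It models how getModRefInfo answered ModRef for any volatile load, and how memdep then re-did the alias query itself to recover the NoAlias case.

```cpp
#include <cassert>

// Simplified stand-ins for llvm::AliasResult and llvm::ModRefInfo.
enum class AliasResult { NoAlias, MayAlias, MustAlias };
enum class ModRefInfo { NoModRef, Ref, Mod, ModRef };

// Hypothetical query: one load checked against one memory location.
struct LoadQuery {
  bool IsVolatile;
  AliasResult AliasWithLoc; // result a real alias() call would return
};

// Pre-patch: volatility alone forced the conservative ModRef answer,
// so the AA check inside getModRefInfo was never reached.
ModRefInfo getModRefInfoPrePatch(const LoadQuery &Q) {
  if (Q.IsVolatile)
    return ModRefInfo::ModRef;
  return Q.AliasWithLoc == AliasResult::NoAlias ? ModRefInfo::NoModRef
                                                : ModRefInfo::Ref;
}

// What memdep effectively did behind AA's back: on ModRef, ask alias()
// again and keep scanning when the answer was NoAlias.
bool memdepSkipsDependency(const LoadQuery &Q) {
  ModRefInfo MR = getModRefInfoPrePatch(Q);
  if (MR == ModRefInfo::ModRef)
    return Q.AliasWithLoc == AliasResult::NoAlias;
  return MR == ModRefInfo::NoModRef;
}
```

With the patch, that second alias query becomes unnecessary for the volatile-but-noalias case, because getModRefInfo itself falls through to the AA check.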

Diff Detail

Repository
rL LLVM

Event Timeline

dberlin created this revision.Apr 5 2017, 2:50 PM
dberlin updated this revision to Diff 94286.Apr 5 2017, 2:52 PM
  • Update patch for now-passing newgvn test
dberlin updated this revision to Diff 94287.Apr 5 2017, 2:53 PM
  • Fix spurious whitespace change
sanjoy added a subscriber: sanjoy.Apr 5 2017, 2:54 PM
sanjoy added inline comments.
lib/Analysis/AliasAnalysis.cpp
336 ↗(On Diff #94285)

Why not isStrongerThan(L->getOrdering(), AtomicOrdering::Unordered)? (same below)

OTOH I'm surprised that this did not break any tests.
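For readers unfamiliar with the helper sanjoy suggests, here is a rough model of llvm::isStrongerThan (declared in Support/AtomicOrdering.h). The real implementation uses a lookup table in which Acquire and Release are mutually incomparable; this simplified rank-based sketch happens to give the same answers for comparisons against Unordered, which is the case at issue here.

```cpp
#include <cassert>

// Mirrors the shape of llvm::AtomicOrdering (Consume omitted for brevity).
enum class AtomicOrdering {
  NotAtomic, Unordered, Monotonic, Acquire, Release,
  AcquireRelease, SequentiallyConsistent
};

// Simplified model of llvm::isStrongerThan: "A is strictly stronger than B".
bool isStrongerThan(AtomicOrdering A, AtomicOrdering B) {
  auto rank = [](AtomicOrdering O) {
    switch (O) {
    case AtomicOrdering::NotAtomic: return 0;
    case AtomicOrdering::Unordered: return 1;
    case AtomicOrdering::Monotonic: return 2;
    case AtomicOrdering::Acquire:
    case AtomicOrdering::Release:   return 3; // incomparable in the real table
    case AtomicOrdering::AcquireRelease: return 4;
    case AtomicOrdering::SequentiallyConsistent: return 5;
    }
    return 0;
  };
  // Giving Acquire and Release the same rank makes this return false for
  // them in both directions, matching the real table's "incomparable" answer.
  return rank(A) > rank(B);
}
```

So `isStrongerThan(L->getOrdering(), AtomicOrdering::Unordered)` is true exactly for Monotonic and above, i.e., the orderings where AA should stay conservative.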

dberlin added inline comments.Apr 5 2017, 3:06 PM
lib/Analysis/AliasAnalysis.cpp
336 ↗(On Diff #94285)

Honestly:
Because I don't know enough to know whether that's okay ;)
If it is, I'm happy to do it.

It breaks one optimization test that I'm looking at:

memcpy/atomic.ll now optimizes *less*, which makes no sense at all. I suspect it was relying on odd aliasing before; I'm debugging.

Otherwise, it does break some MemorySSA tests, which I'm updating with the correct answers.
(I'm also still awaiting three-stage results on Linux.)

dberlin added inline comments.Apr 5 2017, 3:12 PM
lib/Analysis/AliasAnalysis.cpp
336 ↗(On Diff #94285)

Actually, the memcpyopt failure is exactly the issue you point out, so I'm fixing it as part of the test updates.

sanjoy added inline comments.Apr 5 2017, 3:13 PM
lib/Analysis/AliasAnalysis.cpp
336 ↗(On Diff #94285)

I think it is, but (more importantly) wasn't that the behavior before?

Previously if L was non-volatile AND (NotAtomic or Unordered) then we'd go through the normal alias analysis codepath.

Now you want to remove the "non-volatile AND" bit. This would mean we'd want to go through the normal path if the load was NotAtomic or Unordered.
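A hedged sketch of the change sanjoy is describing (illustrative only, not the AliasAnalysis.cpp source; the function names are made up): "taking the normal path" means falling through to the usual alias-analysis query instead of conservatively returning ModRef.

```cpp
#include <cassert>

enum class AtomicOrdering { NotAtomic, Unordered, Monotonic };

bool isUnorderedOrWeaker(AtomicOrdering O) {
  return O == AtomicOrdering::NotAtomic || O == AtomicOrdering::Unordered;
}

// Before the patch: volatile accesses never took the normal AA path.
bool takesNormalPathBefore(bool IsVolatile, AtomicOrdering O) {
  return !IsVolatile && isUnorderedOrWeaker(O);
}

// After the patch: only the ordering matters; volatility is irrelevant
// to the mod/ref question for a noalias location.
bool takesNormalPathAfter(bool IsVolatile, AtomicOrdering O) {
  (void)IsVolatile;
  return isUnorderedOrWeaker(O);
}
```

The only behavioral difference is the volatile-but-unordered case, which now reaches the alias query instead of being answered ModRef up front.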

dberlin marked 4 inline comments as done.Apr 5 2017, 3:29 PM
dberlin added inline comments.
lib/Analysis/AliasAnalysis.cpp
336 ↗(On Diff #94285)

Yes, you are correct; fixed.

dberlin updated this revision to Diff 94296.Apr 5 2017, 3:29 PM
dberlin marked an inline comment as done.
  • Update to check for orderings stronger than Unordered; update all tests for the fact that volatile loads are no longer ModRef.
dberlin added inline comments.Apr 5 2017, 3:31 PM
test/Transforms/Util/MemorySSA/volatile-clobber.ll
27 ↗(On Diff #94296)

Note: we could leave them (volatile loads) as defs if we wanted, by overriding getModRefInfo, but this is also consistent with what we do for other cases.

(If we left them as defs, they would link together all volatile loads and stores; everything else would bypass them.)

chandlerc added inline comments.Apr 5 2017, 4:42 PM
test/Transforms/Util/MemorySSA/volatile-clobber.ll
27 ↗(On Diff #94296)

I think volatile loads should be defs for other volatile loads, even if the two pointers are known not to alias. I'm not sure how to express this, though.

dberlin added inline comments.Apr 5 2017, 5:33 PM
test/Transforms/Util/MemorySSA/volatile-clobber.ll
27 ↗(On Diff #94296)

Note:

MemorySSA does not pretend in any way to represent ordering constraints (it can't sanely do so as part of the default chain without losing optimization).
That said, what you want is certainly possible; it just adds extra phis, extra defs, extra things that every other load must bypass or prove equal.

What you'd actually get would be stores + volatile loads as defs.

I.e., a volatile load would be a def for a regular store as well.

Calling getClobberingAccess on the store would give a correct answer, but it costs something.

So we could do it as part of the default chain, of course. Like I said, it just blocks optimization. It also tends to confuse things, as there is a real difference between "does this write or read the same memory as that?" and "can I reorder these two memory operations?"

Memdep tries to return clobbers to let things handle the second case, but it misses cases.

So if your goal is to be able to find the next volatile load from a volatile load, for reordering, we should just add that, but my inclination (without data) would be *not* to make them defs, because they aren't defs. They are uses, just ordered ones :)

(Note, of course, that atomics with the same visible effect as writing memory are already defs and stay defs no matter what we do.)

There are a couple ways to go about representing the ordering constraint:

Add an OrderedMemoryUse/OrderedMemoryDef, use it for them, and give it an extra operand representing the ordering chain (i.e., the next thing it can't be moved above).

This would require OrderedMemoryPhis for that chain, but honestly, 99% of the cost of MemorySSA is the use optimization and aliasing, not the initial chain building, and the ordering rules themselves are nowhere near as expensive as aliasing.

So I'd expect this to cost nearly nothing, while being completely precise for loads (we already know what the next thing that *reads or writes that memory* is, and can combine with that to give precise answers).

The downside is that you have two chains. But that basically amounts to: if you use the default chain, you don't touch ordered operations (i.e., don't try to replace them).
If you use both chains at once, you get precise answers about both properties.

Of course, just a thought.
I haven't tried it in practice yet :)
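The two-chain idea above can be sketched with hypothetical types (none of these exist in LLVM; this is only an illustration of the extra ordering operand): an ordered access would carry the usual defining-access operand for the default chain, plus a second operand linking it to the previous ordered access.

```cpp
#include <cassert>

// Hypothetical node layout; not real LLVM types.
struct MemoryAccess {
  MemoryAccess *DefiningAccess = nullptr; // default (aliasing) chain
};

// An ordered use/def carries one extra operand: the previous ordered
// access, forming a second chain that ordinary accesses never appear on.
struct OrderedMemoryAccess : MemoryAccess {
  OrderedMemoryAccess *PrevOrdered = nullptr; // ordering chain
};

// Walking only the ordering chain skips unordered accesses entirely,
// so a volatile load could find the previous volatile access directly.
OrderedMemoryAccess *previousOrdered(const OrderedMemoryAccess &A) {
  return A.PrevOrdered;
}
```

A client wanting reordering answers would walk both chains; a client using only the default chain would simply decline to replace ordered operations, as described above.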

(And I'm happy to make volatiles defs against other volatiles in the meantime if we like.)

dberlin updated this revision to Diff 94312.Apr 5 2017, 6:33 PM
  • Make volatiles defs again
dberlin marked an inline comment as done.Apr 6 2017, 1:40 AM

I've now three-staged this on Darwin and Linux with no regressions.

hfinkel added a subscriber: hfinkel.Apr 6 2017, 9:00 AM
hfinkel added inline comments.
lib/Analysis/AliasAnalysis.cpp
336 ↗(On Diff #94312)

Can you please make sure this part of the AA interface contract is now documented in the header file?

lib/Transforms/Utils/MemorySSA.cpp
1468 ↗(On Diff #94312)

Can you add a comment somewhere noting that we don't need to check atomicrmw or cmpxchg because they're not allowed to have an ordering weaker than monotonic (relaxed)? The LangRef says so, but I think it is worth noting why these are the only two things you need to check here: the others should always get a ModRef return from AA.
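The fact hfinkel wants documented can be modeled in a small hedged sketch (illustrative only; the enum mirrors the LangRef rule, not any LLVM API): because atomicrmw and cmpxchg can never be weaker than monotonic, only loads and stores can be NotAtomic or Unordered, so they are the only instructions for which this check matters.

```cpp
#include <cassert>

enum class InstKind { Load, Store, AtomicRMW, CmpXchg };
enum class AtomicOrdering { NotAtomic, Unordered, Monotonic, Acquire,
                            Release, AcquireRelease, SequentiallyConsistent };

// Weakest ordering the LangRef permits for each instruction kind.
AtomicOrdering minimumAllowedOrdering(InstKind K) {
  switch (K) {
  case InstKind::Load:
  case InstKind::Store:
    return AtomicOrdering::NotAtomic; // may be plain, unordered, or stronger
  case InstKind::AtomicRMW:
  case InstKind::CmpXchg:
    return AtomicOrdering::Monotonic; // LangRef: never weaker than monotonic
  }
  return AtomicOrdering::NotAtomic;
}

// Only instructions that can be NotAtomic or Unordered need an explicit
// check; the rest always look ModRef to AA anyway.
bool needsOrderingCheck(InstKind K) {
  return minimumAllowedOrdering(K) <= AtomicOrdering::Unordered;
}
```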

dberlin added inline comments.Apr 6 2017, 11:59 AM
lib/Analysis/AliasAnalysis.cpp
336 ↗(On Diff #94312)

Gave it a shot.

lib/Transforms/Utils/MemorySSA.cpp
1468 ↗(On Diff #94312)

Did this, and then tried to explain why we do it for the moment.

  • Update for comments
dberlin marked 4 inline comments as done.Apr 6 2017, 11:59 AM
  • This time check it into the right tree
dberlin added inline comments.Apr 6 2017, 12:03 PM
include/llvm/Analysis/AliasAnalysis.h
534 ↗(On Diff #94414)

I already fixed the double->triple slash :)

hfinkel accepted this revision.Apr 6 2017, 5:24 PM

LGTM

include/llvm/Analysis/AliasAnalysis.h
530 ↗(On Diff #94414)

reordere -> reordered

The volatile load also can't be removed.

This revision is now accepted and ready to land.Apr 6 2017, 5:24 PM
This revision was automatically updated to reflect the committed changes.