This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Introduce CanSinkMemoryInstrs to isFixedOrderRecurrence
AbandonedPublic

Authored by peterwaller-arm on Nov 1 2022, 6:33 AM.

Details

Summary

Given a loop of the form:

double w[1024], x[1024], a[1024];
void f() {
  for (unsigned i = 0; i < 1024-1; i++) {
    w[i] += a[i];
    x[i] += a[i+1];
  }
}

Currently GVN PRE creates a phi which feeds a[i+1] into the next
iteration.

Prior to this patch, this phi is reported as being 'unknown':

remark: loop not vectorized: value that could not be identified as
        reduction is used outside the loop [-Rpass-analysis=loop-vectorize]
    w[i] += a[i];
            ^

The reason is that isFixedOrderRecurrence sees the store to w[i]
occurring before the read from a[i+1] and it knows nothing about how
these are related.

This patch introduces a query for the memory dependences of the phi
input. If the memory dependency analysis reports a nonlocal dependency,
then any memory instructions can be sunk past it.

NOTE: I'm putting this patch up for early comments. Some things I'm currently continuing to look into: Does LoopVectorize need the memory dependence analysis, or is there other infrastructure available that can be reused? Can I find any examples of miscompiles and does it have the desired benchmarking impact?

This patch depends on https://reviews.llvm.org/D137158 which adds the MemoryDependance infrastructure.

Diff Detail

Event Timeline

peterwaller-arm created this revision.Nov 1 2022, 6:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2022, 6:33 AM
peterwaller-arm requested review of this revision.Nov 1 2022, 6:33 AM
peterwaller-arm edited the summary of this revision. (Show Details)Nov 1 2022, 6:38 AM
peterwaller-arm added subscribers: Ayal, fhahn, spatel.
georges added a subscriber: georges.Nov 1 2022, 9:10 AM

This is my first gambit, and I'm relatively new to making changes to LoopVectorize. Please don't hold back if this is the wrong approach, I'm happy to put legwork in to finding the right route to making this work.

llvm/lib/Analysis/IVDescriptors.cpp
970

This condition is fairly weak and I think it might be possible to make more specific queries.

At the moment my understanding is that this should allow sinking of memory instructions if it can be shown that the Previous value doesn't depend on anything above it in the block.

Perhaps isFixedOrderRecurrence could handle more cases if TryToPushSinkCandidate could consider specifically whether Previous depends on the SinkCandidate. I'm hesitating because I don't want to accidentally introduce an expensive query.

I'd appreciate input on whether using the memory dependence checker here is appropriate before I consider making this more general.

fhahn added a subscriber: gilr.Dec 20 2022, 3:38 PM

Thanks for sharing the patch! I had a look at some of the patches I had around from a while ago and from that I think there are a couple of potential issues, that make addressing this as extension to current isFixedOrderRecurrence difficult:

  • Sink candidates could be stores that are part of an interleave group, so even if we sink the original store, the actual interleave group may be created earlier and end up in an incorrect position. We don't know about interleave groups in isFixedOrderRecurrence.
  • Sink candidates may move past other memory instructions that may alias them, which I think the current patch doesn't check for.

The current logic of trying to collect a sink map and ensure no constraints are violated later on is quite fragile (there have been several issues with violations in the past) and further extending the logic doesn't seem ideal. One option discussed with @Ayal & @gilr a while ago was to move fixing up re-ordering the instructions to VPlan. At this point we can simplify re-order according to the constraints and bail out if no order can be found. One blocker for this I think is that there's no way to query dominance across regions in VPlan at the moment. I am working on fixing that and hopefully will be able to share a prototype for this approach in a few weeks.

(AFAIK MemoryDependenceAnalysis is considered deprecated and we are working on removing it (most users have already been removed, including DSE & MemCpyOpt).)

This revision is now accepted and ready to land.Dec 21 2022, 12:56 AM
peterwaller-arm abandoned this revision.Dec 21 2022, 12:57 AM

Wrong action selected... :)