This is an archive of the discontinued LLVM Phabricator instance.

LAA: handle GEPs with >2 operands in findForkedSCEVs()
AbandonedPublic

Authored by artagnon on Jul 7 2023, 6:27 AM.

Details

Summary

Consider vectorizing the following case:

struct layer_data {
  unsigned char arr[2048];
} *ld;

void Next_Packet() {
  int l = 0;
  while (l<2048)
  {
    ld->arr[l++] = 3;
    ld->arr[l++] = 4;
  }
}

The GEPs for the assignments have three operands: the ptr ld, a
zero-index into ld, and another indexing depending on the loop induction
variable. However, findForkedSCEVs() suffers from the limitation that it
can only handle GEPs with two operands. Lift this limitation, and
successfully vectorize the above example, albeit with runtime
memory-checks.

Diff Detail

Unit TestsFailed

Event Timeline

artagnon created this revision.Jul 7 2023, 6:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2023, 6:27 AM
artagnon requested review of this revision.Jul 7 2023, 6:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2023, 6:27 AM

To eliminate the runtime check, should we investigate turning on scev-aa for just LAA? Is it possible to do that?

nikic added a comment.Jul 7 2023, 7:00 AM

Not familiar with this code, but this doesn't look correct to me.

llvm/lib/Analysis/LoopAccessAnalysis.cpp
866

This seems to assume that the same scale can be applied to all GEP indices, which is not correct. I think the only reason this happens to work for your example is that one of the indices is zero.

If you want to handle this generically, what you should probably do is either a) use collectOffsets() to convert the GEP into scale multiply representation or b) get the SCEV for the GEP and analyze SCEVUnknowns for forking.

954

If I'm reading this code right, this is claiming that load ptr, ptr %p is the same as %p? That doesn't make sense. load should always be a root instruction, as it was before.

fhahn added inline comments.Jul 7 2023, 7:47 AM
llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
987

I'm not sure if/how this fits into the forked pointer handling. The forked pointer handling is to support cases where a GEP can have multiple base pointers (e.g. due to a select or phi). But in this case there are 2 distinct GEPs each with their separate base pointer.

artagnon added inline comments.Jul 7 2023, 7:58 AM
llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
987

I saw the comment on top of findForkedPointers(), and was initially confused too, but I think findForkedPointers() can be extended to support the use-case at hand: yes, there are two distinct GEPs, with separate base pointers, but really, both of them are loading the same ptr with different offsets: I look inside the load to find the SCEV associated to @ld, and the case at hand does seem to be vectorized successfully. Is the vectorization of the test-case I've provided incorrect?

If I weren't to modify findForkedPointers(), I'm left with SCEV AddExprs, which I have no idea how to handle independently of each other, and vectorization bails out saying that the expressions are not AddRecs.

nikic added inline comments.Jul 7 2023, 8:07 AM
llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
987

For this code to be analyzable the two loads from @ld need to be CSE/GVNed first. Once that has happened, SCEV/LAA will be able to understand the relation between the GEPs.

You cannot simply assume that two loads from the same pointer will return the same value, as there may be clobbers in between.

artagnon added inline comments.Jul 7 2023, 8:11 AM
llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
987

Good point. I'm curious about why CSE/GVN didn't eliminate one of the loads in this example.

nikic added inline comments.Jul 7 2023, 8:14 AM
llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
987

That's because there is a store to %arrayidx in between, and it's possible for %arrayidx and @ld to be the same (if @ld is stored inside @ld).

In some cases TBAA metadata can help avoid this, but as char is involved here, it's not applicable to your case.

artagnon abandoned this revision.Jul 7 2023, 8:22 AM

Thanks for the reviews. The patch is definitely incorrect: abandoning.