This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Allow overlapping vector accesses (WIP).
Needs Review · Public

Authored by fhahn on Aug 18 2021, 12:45 PM.

Details

Summary
NOTE: This is an extremely rough draft intended to start a wider discussion on how to allow overlapping memory accesses.

I would like to extend the SLP vectorizer to support overlapping vector
loads. This allows vectorizing cases where we operate on overlapping
vectors that can be loaded efficiently.

The simplest C example is something like the snippet below, where we add
<s[0], s[1], s[2], s[3]> and <s[1], s[2], s[3], s[4]>. Those vectors can
be directly loaded from &s[0] and &s[1]. The problem is that overlapping
bundles are currently not allowed, so the second vector is gathered
instead, which is not profitable on AArch64.

void test(int *s, int *__restrict__ d) {
    for (int x = 0; x < 4; x++, s++) {
        d[x] = s[0] + s[1];
    }
}
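
For illustration, here is a rough hand-written sketch of the code shape that overlapping loads would enable for the loop above. It is not output from the patch. The names vec4i, load4, add4, store4 and test_overlapping are hypothetical helpers that model a 4 x i32 vector in portable C, and the sketch assumes s points to at least 5 readable ints.

```c
#include <string.h>

#define LANES 4

/* Hypothetical stand-in for a 4 x i32 vector register. */
typedef struct { int lane[LANES]; } vec4i;

/* Models an unaligned vector load of LANES consecutive ints starting at p. */
static vec4i load4(const int *p) {
    vec4i v;
    memcpy(v.lane, p, sizeof v.lane);
    return v;
}

/* Lane-wise addition of two vectors. */
static vec4i add4(vec4i a, vec4i b) {
    vec4i r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Models a vector store of LANES ints to p. */
static void store4(int *p, vec4i v) {
    memcpy(p, v.lane, sizeof v.lane);
}

/* Computes d[x] = s[x] + s[x + 1] for x in 0..3, like test() above,
 * but with two overlapping loads instead of a gather. */
void test_overlapping(const int *s, int *restrict d) {
    vec4i first  = load4(&s[0]); /* <s[0], s[1], s[2], s[3]> */
    vec4i second = load4(&s[1]); /* <s[1], s[2], s[3], s[4]> */
    store4(d, add4(first, second));
}
```

The two loads share three of their four elements, which is exactly the overlap that the current bundle invariant rules out.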

The invariant that bundles should not overlap seems to be relied on and
encoded in multiple places. In this patch, I mostly tried to disable
various checks and assertions. It effectively allows overlapping
bundles, iff the first entry in Scalars is unique.
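
As a toy model of that relaxed rule (not the SLPVectorizer's actual data structures), the check could look roughly like the sketch below. The bundle type, the integer scalar ids, and may_add_bundle are all hypothetical. The sketch also assumes that "unique" means no existing bundle starts with the same first scalar.

```c
#include <stdbool.h>
#include <stddef.h>

#define BUNDLE_WIDTH 4

/* Hypothetical toy bundle: each scalar is identified by an integer id. */
typedef struct { int scalars[BUNDLE_WIDTH]; } bundle;

/* Returns true if candidate may be added alongside existing[0..n).
 * Assumed rule: bundles may share scalars, as long as no existing
 * bundle has the same first scalar as the candidate. */
bool may_add_bundle(const bundle *existing, size_t n, const bundle *candidate) {
    for (size_t i = 0; i < n; i++) {
        if (existing[i].scalars[0] == candidate->scalars[0])
            return false;
    }
    return true;
}
```

With ids 0..4 standing for s[0]..s[4], the bundle {1, 2, 3, 4} is accepted next to {0, 1, 2, 3} even though they share three scalars, while a second bundle starting at id 0 is rejected.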

This clearly is not a proper solution, but I am hoping that sharing the
patch can be the start of a discussion on how to properly address the
limitations. It would be great if you could share your thoughts.

Event Timeline

fhahn created this revision. Aug 18 2021, 12:45 PM
fhahn requested review of this revision. Aug 18 2021, 12:45 PM
Herald added a project: Restricted Project. Aug 18 2021, 12:45 PM
vporpo added a subscriber: vporpo. Nov 11 2021, 8:02 PM