This is an archive of the discontinued LLVM Phabricator instance.

[POC][LoopVectorizer] Allow invariant loads/stores using masked gather/scatter for a scalable VF.

Authored by sdesmalen on Oct 28 2020, 2:09 PM.



This patch is part of a proof of concept for vectorising a loop using
scalable vectors. The patch is shared for reference and there is no
expectation for this patch to land in the current form.

For fixed-width vectors, the LoopVectorizer assumes that certain operations
can be scalarized. For example, unmasked loads/stores through uniform pointers
are scalarized, which is not possible for scalable vectors because the number
of lanes is not known at compile time. For scalable vectors, use
gather/scatter instructions instead until we've found a way to properly widen
these types.

void loop(int N, double *a, double *b) {
  #pragma clang loop vectorize_width(4, scalable)
  for (int i = 0; i < N; i++) {
    a[42] = b[i] + 1.0;   // uses llvm.masked.scatter for the store
  }
}
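
To illustrate why a scatter to a uniform address can stand in for the scalarized store, here is a minimal scalar model in C. It is a sketch, not the code the patch generates: masked_scatter_model, loop_model and MAX_LANES are hypothetical names, the run-time lane count vl stands in for the scalable VF, and the mask is used here to switch off lanes past N, which is an assumption made to keep the model short. The model relies on overlapping scatter lanes being written from the lowest lane to the highest, as the LLVM Language Reference describes for llvm.masked.scatter, so the last active lane determines the final value of a[42], just as in the scalar loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_LANES = 16 }; /* hypothetical bound, only needed by this model */

/* Scalar model of a masked scatter: every active lane stores its value
   through its own pointer, lowest lane first. */
static void masked_scatter_model(const double *vals, double *const *ptrs,
                                 const bool *mask, size_t vl) {
  for (size_t lane = 0; lane < vl; ++lane)
    if (mask[lane])
      *ptrs[lane] = vals[lane];
}

/* Models the loop from the summary, processed vl lanes at a time.
   vl is a run-time value, mirroring a scalable vectorization factor. */
static void loop_model(int n, double *a, const double *b, size_t vl) {
  assert(vl > 0 && vl <= MAX_LANES);
  for (int i = 0; i < n; i += (int)vl) {
    double vals[MAX_LANES];
    double *ptrs[MAX_LANES];
    bool mask[MAX_LANES];
    for (size_t lane = 0; lane < vl; ++lane) {
      bool active = i + (int)lane < n;      /* assumption: mask covers the tail */
      mask[lane] = active;
      vals[lane] = active ? b[i + (int)lane] + 1.0 : 0.0;
      ptrs[lane] = &a[42];                  /* uniform address in every lane */
    }
    masked_scatter_model(vals, ptrs, mask, vl);
  }
}
```

Calling loop_model with any lane count from 1 to MAX_LANES leaves the same value in a[42] as the scalar loop does, because the highest active lane of the final chunk is written last.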

Event Timeline

sdesmalen created this revision. Oct 28 2020, 2:09 PM
Herald added a project: Restricted Project. Oct 28 2020, 2:09 PM
sdesmalen requested review of this revision. Oct 28 2020, 2:09 PM
khchen added a subscriber: khchen. Oct 28 2020, 5:45 PM
dancgr added a subscriber: dancgr. Nov 3 2020, 9:55 AM
ctetreau added inline comments.

I suppose this is why you don't want to actually merge this currently? What happens if it's not aarch64?

sdesmalen added inline comments. Nov 9 2020, 7:13 AM

The REQUIRES: aarch64-registered-target line is actually unnecessary; I'm not sure why I thought it was needed.
This patch is probably simple enough to be reviewed as-is. The other POC patches I've split up into smaller NFC patches, but there is little to simplify here.

I have no objections to this, but you should probably get some more eyes on it.

ctetreau resigned from this revision. Feb 1 2021, 9:54 AM
sdesmalen abandoned this revision. Mar 15 2021, 5:13 AM

This has since been superseded by other patches.