This is an archive of the discontinued LLVM Phabricator instance.

[SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter
ClosedPublic

Authored by kmclaughlin on Dec 11 2020, 10:42 AM.

Details

Summary

This patch extends LowerMGATHER/MSCATTER to make use of the vector + reg/immediate
addressing modes for scalable masked gathers & scatters.

selectGatherScatterAddrMode checks if the base pointer is null, in which case
we can swap the base pointer and the index, e.g.

   getelementptr nullptr, <vscale x N x T> (splat(%offset)) + %indices)
-> getelementptr %offset, <vscale x N x T> %indices

Diff Detail

Unit TestsFailed

Event Timeline

kmclaughlin created this revision.Dec 11 2020, 10:42 AM
kmclaughlin requested review of this revision.Dec 11 2020, 10:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 11 2020, 10:42 AM
sdesmalen added inline comments.Dec 13 2020, 11:15 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3915

This case here (GLD1_MERGE_ZERO/SST1_PRED) seems to be the default if this function doesn't do anything, probably best not to change that.
The only case you want to handle in this function is the vector+imm case, so I think you can rewrite the control flow to something like:

if (!isNullConstant(BasePtr))
  return false;

ConstantSDNode *Offset = nullptr;
if (Index.getOpcode() == ISD::ADD)
  if (auto SplatVal = DAG.getSplatValue(Index.getOperand(1))
    if (isa<ConstantSDNode>(SplatVal))
      Offset = SplatVal;

Opcode = ....;
if (!Offset)
  // <Swap index/opcode and return>

// <Set index/opcode, calculate immediate and return>

Hi @kmclaughlin, I've just got one function left to review, but your patch looks good so far with lots of thorough testing! I just had a few minor comments.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3981

Could this be hoisted out to just below the definition of 'Opcode' on line 3976? This way you might be able to avoid duplicating the code below as well on line 3986?

llvm/test/CodeGen/AArch64/sve-masked-gather.ll
4

In this function and others in the file it doesn't look like %offset is actually used - can this be removed?

43

Don't know if it matters or not, but the scatter equivalents also have tests for bfloat?

llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll
15

Do we need to worry about stuff like

masked_scatter_nxv2half
masked_scatter_nxv2float?

llvm/test/CodeGen/AArch64/sve-masked-scatter-vec-plus-imm.ll
79

I think this looks fine, but I was expecting something like:

mov x8, #32
st1b { z0.d }, p0, [x8, z1.d]

Perhaps there is a technical reason why we add to the z1 register? Anyway, it's only a minor codegen issue and we can look at later!

david-arm added inline comments.Dec 14 2020, 4:15 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3904

nit: Is it worth getting getting a temporary copy of the offset to avoid calling getZExtValue()? i.e.

uint64_t OffsetVal = Offset->getZExtValue()
kmclaughlin marked 4 inline comments as done.
  • Refactored selectGatherScatterAddrMode based on the suggestions from @sdesmalen
  • Added bfloat tests to the new test files added by this patch
  • Removed unused %offset from sve-masked-gather.ll and removed duplicate tests from sve-masked-scatter-legalise.ll
llvm/test/CodeGen/AArch64/sve-masked-gather.ll
43

Good find :) I've added more tests here after posting D93307, which lowers mgathers with bfloat types

llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll
15

These tests were duplicates of those added to sve-masked-scatter.ll, so I've removed them from this file - in the other file there are also tests for nxv2f16, nxv2bf16, nxv2f32 & nxv2f64

david-arm added inline comments.Dec 16 2020, 7:27 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3898

nit: braces on if statement aren't need here

3911

If the Offset is not a constant splat should we be using the IMM form here?

david-arm added inline comments.Dec 16 2020, 8:32 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3911

Ignore my comment above! We use the IMM form here because baseptr is NULL, and when swapping the operands we will use the form of gather/scatter that takes a vector of pointers.

3918

I think we can probably improve this a little here by adding:

BasePtr = SplatVal;
Index = Index->getOperand(0);

just before the return, since we do have a splat that could live in a scalar register.

kmclaughlin marked 2 inline comments as done.
  • Improve codegen where the splat value is a constant, but out of range for the immediate addressing mode, e.g.
mov x8, xzr
add z1.d, z1.d, #32 // =0x20
st1b { z0.d }, p0, [x8, z1.d]
ret

->

mov w8, #32
st1b { z0.d }, p0, [x8, z1.d]
ret
david-arm accepted this revision.Dec 16 2020, 9:06 AM

LGTM!

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3915

nit: I very much doubt it makes any practical difference, but perhaps it's best to use uint64_t instead of unsigned here, since that's what getZExtValue() returns and this is used to create the i64 constant below?

This revision is now accepted and ready to land.Dec 16 2020, 9:06 AM