This is an archive of the discontinued LLVM Phabricator instance.

[ARM][MVE] Enable extending gathers
ClosedPublic

Authored by anwel on Jan 9 2020, 6:50 AM.

Details

Summary

This patch is built on top of the patch introducing non-extending masked gathers from a base and a vector of offsets (to be found at https://reviews.llvm.org/D72330).

It extends the pass that transforms selected gathers into calls to MVE's masked gather intrinsic by extending gathers.

Diff Detail

Event Timeline

anwel created this revision.Jan 9 2020, 6:50 AM
dmgreen added inline comments.Jan 9 2020, 12:33 PM
llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
193–197

There are some tests in mve-gather-ind32-scaled.ll like "zext_signed_scaled_i16_i16" that I think can be converted to a VLDRH.u32 Qd, [base, offs.sext, uxtw #1] if we can get the extend and this scale working together. It's the type of the memory being loaded that's important for the scale, not the type of the result. (Err, I think. Check that sounds right)

201

Should this be Type::getInt32Ty ? Or Builder.getInt32() ?

207

u -> U. Or better yet just check that I->hasOneUse(). The Use of a Instruction will always be an Instruction AFAIU.

Also "Parent" makes me think of the BasicBlock/Function, not the user. Maybe just call it "User" or something like it?

212

Maybe drop the "to avoid reevaluation". The rest is reason enough to remove the gather.

anwel updated this revision to Diff 237600.Jan 13 2020, 2:15 AM
anwel marked 3 inline comments as done.

Rebase on parent revision which now takes a more moduralised approach

llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
201

I assume Type::getInt32Ty(C) is the cleanest. I'll use that instead.

anwel marked 4 inline comments as done.Jan 13 2020, 3:10 AM
anwel added inline comments.
llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
193–197

You're right. The modularisation fixed whatever was keeping those tests from being converted to a gather, they look better now.

207

Okay. I wasn't sure about whether users always have to be instructions. If they are, the Parent variable becomes superfluous.

anwel updated this revision to Diff 237613.Jan 13 2020, 3:16 AM
anwel marked 2 inline comments as done.

Simplified the search for the gather instruction's user

dmgreen added inline comments.Jan 13 2020, 6:18 AM
llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
86

This is a little long. Clang-format would fix it up I think.

128–129

This can be left in I would guess? Or removed entirely if it's too noisy. Maybe reword it to "can't find gep" as opposed to "expanding".

209–215

If you moved this "removing" code into tryCreateMaskedGatherOffset or tryCreateMaskedGatherBase it would remove the need for a Root parameter. Not sure if it's cleaner, but it's only tryCreateMaskedGatherOffset that will have a Root, so tryCreateMaskedGatherBase's version will be simpler.

262

This can just be cast<..>

263

I think we have to check the type of the extend, otherwise it might be transforming to any odd thing.

Make sure there is some testcase for this too, I don't think we have maybe tests for stranger types. i64 are a good candidate. Even they are not legal (for the instruction versions that we are dealing with here), they can quite easily be created by the vectorizer.

288

ResultElemSize -> MemoryElemSize I think would be a better name. The "Result" would be after the extend.

anwel marked 7 inline comments as done.Jan 14 2020, 3:55 AM
anwel added inline comments.
llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
128–129

I think I will remove it completely, to get rid of some noise. With the current implementation, it is not clear anymore whether we will fail (i.e., expand) or not after this point. For types for which we can build masked gathers from a vector of pointers, we are now able to transform this, as demonstrated by the following test:

define arm_aapcs_vfpcc <4 x i32> @qi4(<4 x i32*> %p) {
; CHECK-LABEL: qi4:
; CHECK:       @ %bb.0: @ %entry
; CHECK-NEXT:    vmov.i32 q1, #0x10
; CHECK-NEXT:    vadd.i32 q1, q0, q1
; CHECK-NEXT:    vldrw.u32 q0, [q1]
; CHECK-NEXT:    bx lr
entry:
  %g = getelementptr inbounds i32, <4 x i32*> %p, i32 4
  %gather = call <4 x i32> @llvm.masked.gather.v4i32.v4p0i32(<4 x i32*> %g, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i32> undef)
  ret <4 x i32> %gather
}
209–215

That would make these few lines a bit simpler for tryCreateMaskedGatherBase, right, but I don't think it's worth the code duplication?
Because both kinds of gathers could come with a passthru, thus the 5 lines above this dealing with that that build a select would have to be duplicated or made into a new function to be called in tryCreateMaskedGatherBase as well as in tryCreateMaskedGatherOffset.
That is, whether you just copy or use a new function, around 6 new lines to make it 2 lines simpler in one case?

263

Thanks for spotting that, I wrote a test and discovered that it crashed the compiler. I'll upload a new diff with this fixed.

anwel updated this revision to Diff 237926.Jan 14 2020, 3:59 AM
anwel marked 2 inline comments as done.
dmgreen added inline comments.Jan 15 2020, 4:18 AM
llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
209–215

Ah I see. Because you are updating Load to point to the select. Fair enough. I had missed that.

In that case we should make sure that the Root passed to tryCreateMaskedGatherOffset doesn't change unless the function returns a valid load. At the moment it looks like the function could find an extend (so set Root) but then not find a valid GEP.

anwel marked 4 inline comments as done.Jan 15 2020, 5:35 AM
anwel added inline comments.
llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
209–215

Good point. I'll make sure that can't happen.

anwel updated this revision to Diff 238252.Jan 15 2020, 7:02 AM
anwel marked an inline comment as done.

Small change to make sure that Root will never be changed if no Qd load is built

dmgreen accepted this revision.Jan 16 2020, 2:07 AM

LGTM. Thanks

llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
249

Can you add a comment saying something like "we have already checked the size of the gather in isLegalTypeAndAlignment, and need to check that the extend is to a full vector width."

258

This can just be a cast<Instruction>. The dynamic part should never fail (and the cast<..> will check that).

This revision is now accepted and ready to land.Jan 16 2020, 2:07 AM
anwel marked 3 inline comments as done.Jan 16 2020, 5:31 AM
anwel added inline comments.
llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
249

Yes, I'll rephrase that comment to be more informative

anwel updated this revision to Diff 238466.Jan 16 2020, 5:32 AM
anwel marked an inline comment as done.
This revision was automatically updated to reflect the committed changes.