This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold for masked gather when loading the same value each time.
ClosedPublic

Authored by CarolineConcatto on Dec 14 2021, 6:12 AM.

Details

Summary

This patch checks in the masked gather when the first operand value is a
splat and the mask is all one, because the masked gather is reloading the
same value each time. This patch replaces this pattern of masked gather by
a scalar load of the value and splats it in a vector.

Diff Detail

Event Timeline

CarolineConcatto requested review of this revision.Dec 14 2021, 6:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 14 2021, 6:12 AM

This looks like a nice change @CarolineConcatto! I just have a few comments about the tests.

llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll
4

nit: Could you rename this to @valid_invariant_load_i64 just because the shortened form inv could also mean invalid?

18

nit: Same here - perhaps rename to @invalid_value_invariant_load_i64?

26

This is actually just creating a fully poison vector here because you're splatting the first element, which is poison. Perhaps you can just delete the %broadcast.splat line and pass %broadcast.splatinsert directly to the masked.gather?

32

nit: Rename to something like @invalid_mask_invariant_load_i64?

41

This is equivalent to a fully poison vector because you're splatting the first element, which is poison. Perhaps you can just pass a predicate to the function:

define <vscale x 2 x i64> @invalid_mask_invariant_load_i64(i64* %src, <vscale x 2 x i1> %mask) #0 {
  ...
  %res = call <vscale x 2 x i64> @llvm.masked.gather.nxv2i64(<vscale x 2 x i64*> %broadcast.splat, i32 8, <vscale x 2 x i1> %mask, <vscale x 2 x i64> undef)

?

Alternatively, if you want to test for a constant mask that isn't all zeroes you can do something like:

%res = call <vscale x 2 x i64> @llvm.masked.gather.nxv2i64(<vscale x 2 x i64*> %broadcast.splat, i32 8, <vscale x 2 x i1> shufflevector (<vscale x 2 x i1> insertelement (<vscale x 2 x i1> poison, i1 false, i32 0), <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer), <vscale x 2 x i64> undef)

which is an all-false constant mask?

CarolineConcatto marked 5 inline comments as done.

-address comments in the test

Thank you @david-arm for the suggestions.
I believe I have addressed all of them.

david-arm accepted this revision.Dec 17 2021, 6:33 AM

LGTM! Thanks for making the changes @CarolineConcatto. :)

This revision is now accepted and ready to land.Dec 17 2021, 6:33 AM

Hi @CarolineConcatto, the changes looks good to me, just left a few more nits on the tests.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
370

nit: load.scalar ?

llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll
2

nit: From what I can see, there is nothing really scalable-specific about this transformation, so maybe just add it to llvm/test/Transforms/InstCombine/masked_intrinsics.ll?
(and also add a test for fixed-width vectors)

6

nit: Did you mean uniform?

I'd also avoid using 'valid' and 'invalid', because all IR seems valid, it's just not necessarily the right input to trigger your transformation.

How about:

@gather_nxv2i64_uniform_ptrs_all_active_mask
@gather_nxv2i64_non_uniform_ptrs_all_active_mask
@gather_nxv2i64_uniform_ptrs_no_all_active_mask
19

nit: Vector of pointers is not a splat value.

26

nit: maybe just pass a <vscale x 2 x i64*> as argument?

31

nit: Do you just mean 'Not all active mask' ?

CarolineConcatto marked 4 inline comments as done.
  • Move tests from vscale_masked_intrinsics.ll to masked_intrinsics.ll .
  • Add test when mask is all zero

Thank you Sander!

llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll
2

Ok, I hope that there is no problem that the file you are pointing has no vscale tests in there yet.

31

It is both, because it cannot be all zero, this returns Null

if (!ConstMask)
  return nullptr;
david-arm added inline comments.Jan 4 2022, 7:16 AM
llvm/test/Transforms/InstCombine/masked_intrinsics.ll
375 ↗(On Diff #397274)

Do we actually need this attribute? This test file is in a target-independent location so we should avoid using target-specific attributes here.

  • remove sve attribute from masked_intrinsics.ll
  • remove negative test with vscale
CarolineConcatto marked an inline comment as done.Jan 6 2022, 2:24 AM
sdesmalen added inline comments.Jan 6 2022, 3:37 AM
llvm/test/Transforms/InstCombine/masked_intrinsics.ll
273 ↗(On Diff #397818)

nit: add Test gathers that can be simplified to scalar load + splat

274 ↗(On Diff #397818)

Splat pointer?

297 ↗(On Diff #397818)

just write: <2 x i64*> <i64* %src, i64* %src>

300 ↗(On Diff #397818)

nit: can you add a comment here that these are negative tests?
(and change the name of the functions to @negative_gather_...)

328 ↗(On Diff #397818)

I think you can remove this test, because it's unaffected by your patch.

CarolineConcatto marked 2 inline comments as done.
  • Improve comments in the test file
llvm/test/Transforms/InstCombine/masked_intrinsics.ll
274 ↗(On Diff #397818)

Yes, you are correct. It is pointer. Missed that

297 ↗(On Diff #397818)

I believe this does not work, because src needs to be a constant.
I need to use gep and inserts in each position of the vector. That is the same as doing a splat

328 ↗(On Diff #397818)

That is not true. I added these lines:

auto *ConstMask = dyn_cast<Constant>(II.getArgOperand(2));
if (!ConstMask)
  return nullptr;
sdesmalen added inline comments.Jan 7 2022, 6:34 AM
llvm/test/Transforms/InstCombine/masked_intrinsics.ll
297 ↗(On Diff #397818)

You're right, I didn't realise that this notation doesn't work for constants!

328 ↗(On Diff #397818)

Returning nullptr means that InstCombine is not combining this case, i.e. that your combine function makes no changes.
You can see that in InstCombinerImpl::run where it calls the visit function.

In negative_gather_v2i64_uniform_ptrs_all_inactive_mask you're testing that the code is optimized away entirely if the mask is all-zero, but that's not caused by your code.

Matt added a subscriber: Matt.Jan 13 2022, 10:50 AM
  • Remove test negative_gather_v2i64_uniform_ptrs_no_all_active_mask
sdesmalen accepted this revision.Jan 21 2022, 3:19 AM
This revision was landed with ongoing or failed builds.Jan 21 2022, 6:20 AM
This revision was automatically updated to reflect the committed changes.