This patch checks in the masked gather when the first operand value is a
splat and the mask is all one, because the masked gather is reloading the
same value each time. This patch replaces this pattern of masked gather by
a scalar load of the value and splats it in a vector.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
This looks like a nice change @CarolineConcatto! I just have a few comments about the tests.
llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll | ||
---|---|---|
4 | nit: Could you rename this to @valid_invariant_load_i64 just because the shortened form inv could also mean invalid? | |
18 | nit: Same here - perhaps rename to @invalid_value_invariant_load_i64? | |
26 | This is actually just creating a fully poison vector here because you're splatting the first element, which is poison. Perhaps you can just delete the %broadcast.splat line and pass %broadcast.splatinsert directly to the masked.gather? | |
32 | nit: Rename to something like @invalid_mask_invariant_load_i64? | |
41 | This is equivalent to a fully poison vector because you're splatting the first element, which is poison. Perhaps you can just pass a predicate to the function: define <vscale x 2 x i64> @invalid_mask_invariant_load_i64(i64* %src, <vscale x 2 x i1> %mask) #0 { ... %res = call <vscale x 2 x i64> @llvm.masked.gather.nxv2i64(<vscale x 2 x i64*> %broadcast.splat, i32 8, <vscale x 2 x i1> %mask, <vscale x 2 x i64> undef) ? Alternatively, if you want to test for a constant mask that isn't all zeroes you can do something like: %res = call <vscale x 2 x i64> @llvm.masked.gather.nxv2i64(<vscale x 2 x i64*> %broadcast.splat, i32 8, <vscale x 2 x i1> shufflevector (<vscale x 2 x i1> insertelement (<vscale x 2 x i1> poison, i1 false, i32 0), <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer), <vscale x 2 x i64> undef) which is an all-false constant mask? |
Hi @CarolineConcatto, the changes looks good to me, just left a few more nits on the tests.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
370 | nit: load.scalar ? | |
llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll | ||
2 | nit: From what I can see, there is nothing really scalable-specific about this transformation, so maybe just add it to llvm/test/Transforms/InstCombine/masked_intrinsics.ll? | |
6 | nit: Did you mean uniform? I'd also avoid using 'valid' and 'invalid', because all IR seems valid, it's just not necessarily the right input to trigger your transformation. How about: @gather_nxv2i64_uniform_ptrs_all_active_mask @gather_nxv2i64_non_uniform_ptrs_all_active_mask @gather_nxv2i64_uniform_ptrs_no_all_active_mask | |
19 | nit: Vector of pointers is not a splat value. | |
26 | nit: maybe just pass a <vscale x 2 x i64*> as argument? | |
31 | nit: Do you just mean 'Not all active mask' ? |
- Move tests from vscale_masked_intrinsics.ll to masked_intrinsics.ll .
- Add test when mask is all zero
llvm/test/Transforms/InstCombine/masked_intrinsics.ll | ||
---|---|---|
375 ↗ | (On Diff #397274) | Do we actually need this attribute? This test file is in a target-independent location so we should avoid using target-specific attributes here. |
llvm/test/Transforms/InstCombine/masked_intrinsics.ll | ||
---|---|---|
273 ↗ | (On Diff #397818) | nit: add Test gathers that can be simplified to scalar load + splat |
274 ↗ | (On Diff #397818) | Splat pointer? |
297 ↗ | (On Diff #397818) | just write: <2 x i64*> <i64* %src, i64* %src> |
300 ↗ | (On Diff #397818) | nit: can you add a comment here that these are negative tests? |
328 ↗ | (On Diff #397818) | I think you can remove this test, because it's unaffected by your patch. |
- Improve comments in the test file
llvm/test/Transforms/InstCombine/masked_intrinsics.ll | ||
---|---|---|
274 ↗ | (On Diff #397818) | Yes, you are correct. It is pointer. Missed that |
297 ↗ | (On Diff #397818) | I believe this does not work, because src needs to be a constant. |
328 ↗ | (On Diff #397818) | That is not true. I added these lines: auto *ConstMask = dyn_cast<Constant>(II.getArgOperand(2)); if (!ConstMask) return nullptr; |
llvm/test/Transforms/InstCombine/masked_intrinsics.ll | ||
---|---|---|
297 ↗ | (On Diff #397818) | You're right, I didn't realise that this notation doesn't work for constants! |
328 ↗ | (On Diff #397818) | Returning nullptr means that InstCombine is not combining this case, i.e. that your combine function makes no changes. In negative_gather_v2i64_uniform_ptrs_all_inactive_mask you're testing that the code is optimized away entirely if the mask is all-zero, but that's not caused by your code. |
nit: load.scalar ?