This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Optimize vector shift using a splat value from outside block
ClosedPublic

Authored by YolandaCY on Aug 21 2023, 1:54 AM.

Details

Summary

The vector shift operation in WebAssembly uses an i32 shift amount type,
while the LLVM IR requires binary operator uses the same type of operands.
When the shift amount operand is splated from a different block, the splat source
will not be exported and the vector shift will be unrolled to scalar shifts.
This patch enables the vector shift to identify the splat source value from
the other block, and generate expected WebAssembly bytecode when lowering.

Diff Detail

Event Timeline

YolandaCY created this revision.Aug 21 2023, 1:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2023, 1:54 AM
YolandaCY published this revision for review.Aug 21 2023, 5:18 AM
YolandaCY added reviewers: tlively, craig.topper.

This is to resolve a WebAssembly codegen issue when vector shift is used in a loop, while the shift amount is initialized outside the loop. Could you help take a look? Thanks!

Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2023, 5:18 AM

Thanks for the patch! It looks like this will be a nice improvement.

@craig.topper, it would be great to get your comments as well, as someone more familiar with the target-independent infrastructure here.

llvm/include/llvm/CodeGen/SelectionDAG.h
2459–2460 ↗(On Diff #551933)

It would be good to add a comment describing the contents of this map.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
841

Why allow null instruction pointers here? It would seem simpler to ensure that callers pass a valid instruction and to assume we have a valid instruction here.

845

What is the benefit of including isShiftAmountScalar() here, given that we know it is always true?

llvm/test/CodeGen/WebAssembly/simd-shift-in-loop.ll
2

What do you think about using the auto-update script for this test? The output would be more verbose, but it would also be easy to update if anything changes, and I think it would be helpful to see that the whole function is emitted correctly.

17

Can we add a test where the vshift is a phi, just to show that that still works correctly?

YolandaCY updated this revision to Diff 552298.Aug 22 2023, 3:31 AM

Add test and comments

YolandaCY marked 3 inline comments as done.Aug 22 2023, 6:41 AM

Thank you Thomas for the comments! Please see my updates in the new revision.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
841

This is a fast check on the outside block when visit the splat vector, and don't know yet if the splat vector will be used by a shift op.
To identify the instruction I need to iterate all uses of the splat vector until we find the vector shift. Since this is only needed for WebAssembly target, I add a quick check here to reduce the cost for other platforms. Seems a little confusion, do you think we need to seperate it to two functions?

845

This will be called in SelectionDAGBuilder to skip the optimizaiton for other platforms when visit shift.

llvm/test/CodeGen/WebAssembly/simd-shift-in-loop.ll
2

Sure. Previously I make it simple to avoid mismatch on unrelated changes. But if we have an auto-update script that would be helpful to verify the whole function directly.

17

Sure. I have added one more test.

tlively added inline comments.Aug 22 2023, 12:54 PM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
841

Oh I see, that makes sense.

845

We know that the isShiftAmountScalar() here will be the WebAssemblyTargetLowering version of isShiftAmountScalar(), so it will always be true, right? So this line could be:

return I->isShift() && I->getOperand(1) == Splat;

Could this use the TargetLowering::shouldSinkOperands hook to get CodeGenPrepare to move the splat into the loop. ARM, X86, and RISC-V all do that.

YolandaCY updated this revision to Diff 553057.Aug 24 2023, 3:06 AM
YolandaCY marked 2 inline comments as done.

Use the shouldSinkOperands hook in CodeGenPrepare

Could this use the TargetLowering::shouldSinkOperands hook to get CodeGenPrepare to move the splat into the loop. ARM, X86, and RISC-V all do that.

Thanks for the suggestion! I have revised the code to use this existing hook. Please help take a look again. @craig.topper @tlively Thanks!

tlively accepted this revision.Aug 24 2023, 7:41 AM

Nice, this LGTM. Thanks for the tip, @craig.topper!

This revision is now accepted and ready to land.Aug 24 2023, 7:41 AM

Nice, this LGTM. Thanks for the tip, @craig.topper!

Thanks Thomas. Could you help me commit this change?

Sure thing. I'll use your author email from the other patches in your phabricator profile.

Sure thing. I'll use your author email from the other patches in your phabricator profile.

OK, Thank you!