Page MenuHomePhabricator

[ARM] Convert floating point splats to integer
ClosedPublic

Authored by dmgreen on Apr 23 2020, 9:57 AM.

Details

Summary

Under MVE a vdup will always take a gpr register, not a floating point value. During DAG combine we convert the types to a bitcast to an integer in an attempt to fold the bitcast into other code. This is OK, but only works inside the same basic block. To do the same trick across a basic block boundary we need to convert the type in codegenprepare, before the splat is sunk into the loop.

This adds a convertSplatType function to codegenprepare to do that, putting bitcasts around the splatto force the type to an integer. There is then some adjustment to the code in shouldSinkOperands to handle the extra bitcasts.

Diff Detail

Event Timeline

dmgreen created this revision.Apr 23 2020, 9:57 AM
dmgreen updated this revision to Diff 260568.Apr 28 2020, 2:04 AM

Now also hoists the first bitcast up to it's user, in case it is not in the same block.

Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2020, 2:04 AM
dmgreen updated this revision to Diff 263128.Mon, May 11, 2:57 AM
dmgreen added a reviewer: efriedma.

Rebase the tests on other changes that have gone in recently.

efriedma added inline comments.Mon, May 11, 1:03 PM
llvm/lib/CodeGen/CodeGenPrepare.cpp
6424

I guess this is only loosely related to your patch, but should this transform also be handled by tryToSinkFreeOperands?

6513

Checking whether an insertion point is legal is more complicated than this. In particular, you can't insert code after an invoke/callbr, and you can't insert code into a block terminated by a catchswitch.

I'm not sure if there's a helper for this check anywhere.

llvm/lib/Target/ARM/ARMISelLowering.cpp
15782

Is this specifically a ShuffleVectorInst?

dmgreen updated this revision to Diff 263377.Tue, May 12, 1:29 AM
dmgreen marked 3 inline comments as done.

Instruction -> ShuffleVectorInstruction and tighten the conditions for hoisting bitcasts.

dmgreen added a subscriber: spatel.Tue, May 12, 1:29 AM
dmgreen added inline comments.
llvm/lib/CodeGen/CodeGenPrepare.cpp
6424

Yeah, I think it pre-dates the tryToSinkFreeOperands version but only handles the shifts that X86 cares about. I can have a look at changing it over to see if it can use SinkFreeOperands, but @spatel is updating it in D79718. I'll at least wait until that has been done.

6513

Ah right. I've added Terminators and EHPads (mostly because there is likely little to be gained from moving the bitcast, even if it is safe). That with the phi's should exclude catchswitch too I believe.

There is ScalarEvolutionExpander::findInsertPointAfter which is trying to find an insertion point after Op. Here we don't care as much about giving up, and some of the semantics of following invokes will be different. I didn't spot much else.

I can try and make a function if you think that's better.

spatel added inline comments.Tue, May 12, 5:13 AM
llvm/lib/CodeGen/CodeGenPrepare.cpp
6424

Thanks for cc'ing me. This function seems worse than just awkwardly structured; it can do transforms that were not intended. That's because we are walking the users of the splat shuffle without checking whether the actual use is as a shift's amount operand (operand 1 of a regular shift opcode). That could be a (hopefully minor) perf bug if it ever happens, but that's probably still not what we want to happen.

I think this transform should begin from the shift instruction, not the shuffle. tryToSinkFreeOperands() looks promising to replace this and the TLI hook. I'll take a look at updating this now, but that doesn't need to block this patch.

This revision is now accepted and ready to land.Tue, May 12, 12:58 PM
This revision was automatically updated to reflect the committed changes.