This is an archive of the discontinued LLVM Phabricator instance.

[MachineLICM] Don't treat cross copies as cheap
Needs ReviewPublic

Authored by dmgreen on Mar 13 2020, 8:25 AM.

Details

Summary

This adjusts the cost of cross register bank copy instructions in MachineLICM, no longer treating them as expensive in order to allow them to be pulled out of inner loops. This can be especially important for MVE code where we sink VDUP's into blocks attempting to fold them into register variants of vector instructions. Where this scalar is a float value we are left with a COPY from SPR to GPR which then needs to be hoisted.

This is an alternative to D76024.
The isCrossCopy code was taken from DetectDeadLanes with some adjustment/cleanup. I don't feel like an expert on some of these areas like subreg copies.
I'm not 100% sure all these test diffs are better, but most of them look OK to me.

Diff Detail

Event Timeline

dmgreen created this revision.Mar 13 2020, 8:25 AM
arsenm added inline comments.Mar 13 2020, 9:13 AM
llvm/lib/CodeGen/MachineLICM.cpp
1157–1187

This looks pretty much identical to TargetRegisterInfo::shouldRewriteCopySrc/shareSameRegisterFile

dmgreen updated this revision to Diff 250256.Mar 13 2020, 10:33 AM

Thanks for the suggestion.

I made it call shouldRewriteCopySrc, which allows the target to modify things if it wishes. The AMDGPU/idiv-licm.ll test changed ever so slightly. Let me know if I should change things around to instead call shareSameRegisterFile more directly.

Thanks for the suggestion.

I made it call shouldRewriteCopySrc, which allows the target to modify things if it wishes. The AMDGPU/idiv-licm.ll test changed ever so slightly. Let me know if I should change things around to instead call shareSameRegisterFile more directly.

If DetectDeadLanes also has the same code, then they should probably be merged and directly exposed

asbirlea removed a subscriber: asbirlea.Mar 13 2020, 2:34 PM
dmgreen updated this revision to Diff 250503.Mar 16 2020, 3:33 AM

Now using TargetRegisterInfo::shareSameRegisterFile.

I have altered the way that MVE VDUP are lowered, and we end up with a VMOV now instead of using a COPY. That means I don't need this any more for MVE at least.

@arsenm, let me know whether you want me to continue with this for AMDGPU or not.

arsenm added inline comments.Mar 23 2020, 9:23 AM
llvm/include/llvm/CodeGen/TargetRegisterInfo.h
533 ↗(On Diff #250503)

There's no reason for this to be virtual?

llvm/test/CodeGen/AMDGPU/idiv-licm.ll
276

This probably is an improvement

dmgreen updated this revision to Diff 252263.Mar 24 2020, 4:29 AM

No longer a virtual method.

arsenm added inline comments.Mar 24 2020, 8:54 AM
llvm/lib/CodeGen/MachineLICM.cpp
1191–1193

I'd like to get somebody else's opinion on this too. Based on the isAsCheapAsAMove check, I'm not sure what the intended handling of copies is here. For AMDGPU for example, copies get more expensive with wider register classes, which isn't a property of the register bank

qcolombet added inline comments.May 4 2020, 10:40 AM
llvm/lib/CodeGen/MachineLICM.cpp
1173

I feel we should stick to shouldRewriteCopySrc to take into account the "cheapness" of the copy.

Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2020, 10:40 AM
RKSimon resigned from this revision.Jun 18 2020, 2:08 AM