This patch updates register allocation to enable spilling gprs to vector registers rather than the stack. A new register class is added which is a super class of G8RC and VSFRC, called GPFPRC. The getLargestLegalSuperClass then returns GPFPRC for an input of G8RC. The patch also adds post RA pseudo instructions (VSRSPILL_LD, VSRSPILL_ST) for spilling a register of the new class used in LoadRegFromStackSlot and StoreRegToStackSlot. These are then expanded after register allocation to either a scalar or vector load.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Looks interesting.
This patch potentially increases the number of VSR save/restore in method prologue/epilogue (depending on which VSR is selected for spilling). Is my understanding correct?
lib/Target/PowerPC/PPCInstrInfo.cpp | ||
---|---|---|
52 ↗ | (On Diff #104661) | I feel this name somewhat misleading. MTVSR instruction may be used for other purposes. NumGPRtoVSRSpill or somthing? |
test/CodeGen/PowerPC/gpr-vsr-spill2.ll | ||
25 ↗ | (On Diff #104661) | What's the intention of this complicated test case without spills to VSR? |
test/CodeGen/PowerPC/gpr-vsr-spill.ll | ||
---|---|---|
19 ↗ | (On Diff #104661) | Actually, I cannot catch why we need spill here. |
test/CodeGen/PowerPC/gpr-vsr-spill2.ll | ||
---|---|---|
25 ↗ | (On Diff #104661) | This case shows how a spill of the new reg class is handled. Here we spilled a GPR to GPFPR where the new reg was also a gpr. We then needed to spill the new GPFPR using either a scalar store or vector store depending on the allocated register. |
test/CodeGen/PowerPC/gpr-vsr-spill.ll | ||
---|---|---|
19 ↗ | (On Diff #104661) | Yes, but we need a register to save the result of the add. The result register used for the add is r30 and so one of the input parameters is spilled. |
test/CodeGen/PowerPC/gpr-vsr-spill2.ll | ||
---|---|---|
1 ↗ | (On Diff #104661) | Having this as an IR-level test seems fragile. Could you make this into a (simpler) MIR test that shows the behavior? |
lib/Target/PowerPC/PPCInstrInfo.cpp | ||
---|---|---|
1994 ↗ | (On Diff #104661) | Well, this is a pseudo that requires being expandPostRAPseudo()-ed. Wouldn't we want to say return expandPostRAPseudo(MI) here? |
Overall, I like the patch. Seems quite nice and simple. However, I'm really not a fan of the naming convention. It is not clear to me why someone would be expected to make the connection between "GPFPRC" and "VSRSPILL". I think those should use the same base name to make the connection clear. Furthermore, I don't really think you should convey what registers are in the register class, but what the register class is used for. Perhaps the class and the related artifacts should be something like SPILLTOVSRRC and SPILLTOVSR_LD, etc. Perhaps other reviewers can chime in here as well.
Also, I think an important opportunity is lost since we don't do this for GPRRC. Of course, it doesn't have to be done in this patch, but I think a comment including a FIXME indicating this limitation is in order. Then if this turns out to be a performance win, we can follow this up with a patch that handles the 32-bit registers as well.
Only spill to volatile vsrs as spilling to non-volatile increases prologue/epilogue and leads to performance degradation.
test/CodeGen/PowerPC/gpr-vsr-spill2.ll | ||
---|---|---|
1 ↗ | (On Diff #104661) | I tried to create an MIR case using this, but the limitation with MIR tests identified in https://reviews.llvm.org/D33562 with MachineFunctionInfo not being saved/dumped as part of emitting .mir leads to machine verified errors. I tried to change the global vars to local vars to get around this limitation. However, doing that no longer reproduces the narrowed case so I will leave this as an IR test. |
Other than a few minor inline comments, this LGTM.
Perhaps some of the other reviewers want to chime in on this. Otherwise please address those nits and commit.
lib/Target/PowerPC/PPCInstrInfo.cpp | ||
---|---|---|
934 ↗ | (On Diff #108980) | I think for most (all?) other conditions, we have the source first. Please stick to that convention here as well. |
2036 ↗ | (On Diff #108980) | Just a nit. The register you're spilling is the source and the stack slot you're spilling it to is the target. So calling it TargetReg is a bit misleading when it's a store. :) |
lib/Target/PowerPC/PPCRegisterInfo.cpp | ||
54 ↗ | (On Diff #108980) | Nit: it's not actually called gp8rc but g8rc if I remember correctly. |
341 ↗ | (On Diff #108980) | // For Power9 we allow the user to enable GPR to vector spills. Since we don't currently enable it by default even on Power9. |
344 ↗ | (On Diff #108980) | Please add a check for ELFv2 ABI. We are allowing spills only to the volatile VSR's, so we want to enable this only on the ABI where the VSR's we've selected are actually volatile. |
lib/Target/PowerPC/PPCRegisterInfo.td | ||
308 ↗ | (On Diff #108980) | // Allow spilling GPR's into caller-saved VSR's. |
test/CodeGen/PowerPC/gpr-vsr-spill2.ll | ||
1 ↗ | (On Diff #108980) | As implemented, this test case doesn't really test anything meaningful. It really just tests that there's a reg-to-reg copy (implemented as a move-register) followed by a spill of the target register. The two could be separated by arbitrary amount of code (including redefinition of the register). Unless you can add more meaningful testing to this complicated test case, I would simply get rid of it. |