In r333455 we added a peephole to fix the corner cases that result from separating base + offset lowering of global address.The peephole didn't handle some of the cases because it only has a basic block view instead of a function level view. This patch replaces that logic with a machine function pass. In addition to handling the original cases it handles uses of the global address across blocks in function and folding an offset from LW\SW instruction. This pass won't run for OptNone compilation, so there will be a negative impact overall vs the old approach at O0.
Details
Diff Detail
Event Timeline
Thanks, I'll have to finish stepping through the code tomorrow but this seems to be working great. For the GCC torture suite at O1, ~40 of the test cases see improved codegen (fewer instructions). Two gain a single instruction, but that's just noise due to slight regalloc differences.
One change it would be worth noting in the commit message / patch description: this pass won't run for OptNone compilation, so there will be a negative impact overall vs the old approach at O0. Not a problem - having this skipped at O0 makes sense, but worth mentioning all the same. Arguably the RISCVISelDAGToDAG peepholes should be skipped for OptNone too.
lib/Target/RISCV/RISCVMergeBaseOffset.cpp | ||
---|---|---|
1 | This line is only 79 chars long. Probably nicer to keep the usual format (three =) and have a shorter description. e.g. //== RISCVMergeBaseOffset.cpp - Fold offset into address lowering sequence ==// | |
143 | Given that we know the MI is an Add, is looping over the operands the clearest way of handling this? |
lib/Target/RISCV/RISCVMergeBaseOffset.cpp | ||
---|---|---|
1 | Oops, wrong paste in that comment. Corrected: //===-- RISCVMergeBaseOffset.cpp - Optimise address calculations ---------===// |
Thanks, this looks good to me assuming the minor nits are addressed. The only one requiring code changes is the for loop in matchLargeOffset.
lib/Target/RISCV/RISCVMergeBaseOffset.cpp | ||
---|---|---|
25 | There are a few miscellaneous minor formatting issues in this file that will be fixed with clang-format | |
28–38 | A number of these includes seem totally unused. e.g. STLExtras., RISCVTargetObjctFile.h, TargetLoweringObjectFileIMpl.h |
lib/Target/RISCV/RISCVMergeBaseOffset.cpp | ||
---|---|---|
1 | I actually spent a long time pondering the cleanest way to add this comment, yours is better so I'l use it :) | |
28–38 | Thanks for the comment, yeah I believe I need a second look here | |
143 | I need to find the input argument that is coming from the offset creation; which can be a LUI or ADDI followed by LUI as explained in the comment above the function. Do you think there is a cleaner way to do it? |
lib/Target/RISCV/RISCVMergeBaseOffset.cpp | ||
---|---|---|
143 | I was imagining removing the loop with something like the below: assert((TailAdd.getOpcode() == RISCV::ADD) && "Expected ADD instruction!"); unsigned Rs = TailAdd.getOperand(1).getReg(); unsigned Rt = TailAdd.getOperand(2).getReg(); unsigned Reg = Rs == GAReg ? Rt : Rs; // Can't fold if the register has more than one use. if (!MRI->hasOneUse(Reg)) return false; // This can point to an ADDI or a LUI: MachineInstr &OffsetTail = *MRI->getVRegDef(Reg); if (OffsetTail.getOpcode() == RISCV::ADDI) { // The offset value has non zero bits in both %hi and %lo parts. // Detect an ADDI that feeds from a LUI instruction. MachineOperand &AddiImmOp = OffsetTail.getOperand(2); if (AddiImmOp.getTargetFlags() != RISCVII::MO_None) return false; int64_t OffLo = AddiImmOp.getImm(); MachineInstr &OffsetLui = *MRI->getVRegDef(OffsetTail.getOperand(1).getReg()); MachineOperand &LuiImmOp = OffsetLui.getOperand(1); if (OffsetLui.getOpcode() != RISCV::LUI || LuiImmOp.getTargetFlags() != RISCVII::MO_None || !MRI->hasOneUse(OffsetLui.getOperand(0).getReg())) return false; int64_t OffHi = OffsetLui.getOperand(1).getImm(); Offset = (OffHi << 12) + OffLo; LLVM_DEBUG(dbgs() << " Offset Instrs: " << OffsetTail << " " << OffsetLui); DeadInstrs.insert(&OffsetTail); DeadInstrs.insert(&OffsetLui); return true; } else if (OffsetTail.getOpcode() == RISCV::LUI) { // The offset value has all zero bits in the lower 12 bits. Only LUI // exists. LLVM_DEBUG(dbgs() << " Offset Instr: " << OffsetTail); Offset = OffsetTail.getOperand(1).getImm() << 12; DeadInstrs.insert(&OffsetTail); return true; } return false; |
This line is only 79 chars long. Probably nicer to keep the usual format (three =) and have a shorter description.
e.g. //== RISCVMergeBaseOffset.cpp - Fold offset into address lowering sequence ==//