This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add machine function pass to merge base + offset
ClosedPublic

Authored by sabuasal on Jun 6 2018, 4:55 PM.

Details

Summary
In r333455 we added a peephole to fix the corner cases that result
from separating base + offset lowering of global address.The
peephole didn't handle some of the cases because it only has a basic
block view instead of a function level view.

This patch replaces that logic with a machine function pass. In
addition to handling the original cases it handles uses of the global
address across blocks in function and folding an offset from LW\SW
instruction. This pass won't run for OptNone compilation, so there
will be a negative impact overall vs the old approach at O0.

Diff Detail

Event Timeline

sabuasal created this revision.Jun 6 2018, 4:55 PM
asb added a comment.Jun 13 2018, 5:06 PM

Thanks, I'll have to finish stepping through the code tomorrow but this seems to be working great. For the GCC torture suite at O1, ~40 of the test cases see improved codegen (fewer instructions). Two gain a single instruction, but that's just noise due to slight regalloc differences.

One change it would be worth noting in the commit message / patch description: this pass won't run for OptNone compilation, so there will be a negative impact overall vs the old approach at O0. Not a problem - having this skipped at O0 makes sense, but worth mentioning all the same. Arguably the RISCVISelDAGToDAG peepholes should be skipped for OptNone too.

lib/Target/RISCV/RISCVMergeBaseOffset.cpp
2

This line is only 79 chars long. Probably nicer to keep the usual format (three =) and have a shorter description.

e.g. //== RISCVMergeBaseOffset.cpp - Fold offset into address lowering sequence ==//

144

Given that we know the MI is an Add, is looping over the operands the clearest way of handling this?

asb added inline comments.Jun 13 2018, 5:09 PM
lib/Target/RISCV/RISCVMergeBaseOffset.cpp
2

Oops, wrong paste in that comment. Corrected:

//===-- RISCVMergeBaseOffset.cpp - Optimise address calculations ---------===//

asb accepted this revision.Jun 14 2018, 1:39 AM

Thanks, this looks good to me assuming the minor nits are addressed. The only one requiring code changes is the for loop in matchLargeOffset.

lib/Target/RISCV/RISCVMergeBaseOffset.cpp
26

There are a few miscellaneous minor formatting issues in this file that will be fixed with clang-format

29–39

A number of these includes seem totally unused. e.g. STLExtras., RISCVTargetObjctFile.h, TargetLoweringObjectFileIMpl.h

This revision is now accepted and ready to land.Jun 14 2018, 1:39 AM
sabuasal added inline comments.Jun 15 2018, 1:42 PM
lib/Target/RISCV/RISCVMergeBaseOffset.cpp
2

I actually spent a long time pondering the cleanest way to add this comment, yours is better so I'l use it :)

29–39

Thanks for the comment, yeah I believe I need a second look here

144

I need to find the input argument that is coming from the offset creation; which can be a LUI or ADDI followed by LUI as explained in the comment above the function.

Do you think there is a cleaner way to do it?

asb added inline comments.Jun 20 2018, 1:31 PM
lib/Target/RISCV/RISCVMergeBaseOffset.cpp
144

I was imagining removing the loop with something like the below:

assert((TailAdd.getOpcode() == RISCV::ADD) && "Expected ADD instruction!");
 unsigned Rs = TailAdd.getOperand(1).getReg();
 unsigned Rt = TailAdd.getOperand(2).getReg();
 unsigned Reg = Rs == GAReg ? Rt : Rs;

 // Can't fold if the register has more than one use.
 if (!MRI->hasOneUse(Reg))
   return false;
 // This can point to an ADDI or a LUI:
 MachineInstr &OffsetTail = *MRI->getVRegDef(Reg);
 if (OffsetTail.getOpcode() == RISCV::ADDI) {
   // The offset value has non zero bits in both %hi and %lo parts.
   // Detect an ADDI that feeds from a LUI instruction.
   MachineOperand &AddiImmOp = OffsetTail.getOperand(2);
   if (AddiImmOp.getTargetFlags() != RISCVII::MO_None)
     return false;
   int64_t OffLo = AddiImmOp.getImm();
   MachineInstr &OffsetLui =
       *MRI->getVRegDef(OffsetTail.getOperand(1).getReg());
   MachineOperand &LuiImmOp = OffsetLui.getOperand(1);
   if (OffsetLui.getOpcode() != RISCV::LUI ||
       LuiImmOp.getTargetFlags() != RISCVII::MO_None ||
       !MRI->hasOneUse(OffsetLui.getOperand(0).getReg()))
     return false;
   int64_t OffHi = OffsetLui.getOperand(1).getImm();
   Offset = (OffHi << 12) + OffLo;
   LLVM_DEBUG(dbgs() << "  Offset Instrs: " << OffsetTail
                     << "                 " << OffsetLui);
   DeadInstrs.insert(&OffsetTail);
   DeadInstrs.insert(&OffsetLui);
   return true;
 } else if (OffsetTail.getOpcode() == RISCV::LUI) {
   // The offset value has all zero bits in the lower 12 bits. Only LUI
   // exists.
   LLVM_DEBUG(dbgs() << "  Offset Instr: " << OffsetTail);
   Offset = OffsetTail.getOperand(1).getImm() << 12;
   DeadInstrs.insert(&OffsetTail);
   return true;
 }
 return false;
sabuasal updated this revision to Diff 153166.Jun 27 2018, 1:35 PM

Updated comments by @asb .

sabuasal edited the summary of this revision. (Show Details)Jun 27 2018, 1:42 PM
This revision was automatically updated to reflect the committed changes.