This adds two new target hooks to TargetInstrInfo: the first one maps branch instructions to branches that use a smaller encoding, and the second one checks if the offset is in range of a branch instruction.
Details
Diff Detail
Event Timeline
I think what you're doing overlaps with what I'm doing: https://github.com/arsenm/llvm/commits/long-branch
| include/llvm/Target/TargetInstrInfo.h | ||
|---|---|---|
| 483 | I was debating whether this should provide the complete branch and dest block offsets, or if one single offset. In either case the types should be int64_t | |
Hi Matt, thanks for reviewing. I have changed the offsets to int64_t.
I had a look and there indeed appears to be some overlap.
The motivation for my change is to minimise branch distances by reordering machine blocks, for which I need these helper functions.
I was thinking about trying to optimize the branch relaxation pass by reordering and duplicating blocks to avoid long expansions. Do you think it makes sense as a separate pass, or for BranchRelaxation to try to do both?
| include/llvm/Target/TargetInstrInfo.h | ||
|---|---|---|
| 488 | This should return int because of the failure -1 value. I think the name needs work too. It's unclear to me what shrunkBranchOpcode means, especially with no context of size. Maybe something like optimizeShortBranch(MachineInstr &MI, int64_t Offset)? | |
I have changed the function name and its arguments. For consistency, I changed the other function to have the same arguments.
Regarding the overlap: I understand things a bit better now. You mentioned that you want to "optimize the branch relaxation pass by reordering and duplicating blocks to avoid long expansions", which I think is exactly the same I am trying to achieve. I was focusing on ARM codegen and thus the ARM backend, and was actually not aware of this AArch64 BranchRelaxation backend pass. :-( There are quite a bit of utility functions in there that I was implementing as well, so I will be looking into factoring some of this out. I was making my chances in codegen pass MachineBlockPlacement, which runs before the target specific passes. As far as I understand now, AArch64 BranchRelaxation is making changes for legality reasons: if the displacement of conditional branches is out of range, it creates a conditional branch with the inverse condition, followed by an unconditional branch. If you want to optimise branch distances (create short branches), then MachineBlockPlacement might look like a better place, which is what I was doing and thus we might be working on the same things. Let me know if we need to further synchronise on this.
Changed the 1st argument of isBranchOffsetInRange back to unsigned Opcode (because optimizeShortBranch is returning an integer).
| include/llvm/Target/TargetInstrInfo.h | ||
|---|---|---|
| 492–503 | Why do you need these? I don't think these are the right API, and the isBranchOffsetInRange is better | |
| include/llvm/Target/TargetInstrInfo.h | ||
|---|---|---|
| 492–503 | I think I need both the isBranchOffsetInRange and the getBranchRange (or similar) functions. A check to see if a branch is in range is one thing, but for moving a block in range actual values are needed. Looking at BranchRelaxation and AArch64InstrInfo.cpp, I now see I probably want to have this in the base class: static unsigned getBranchDisplacementBits(unsigned Opc) because that's what I am trying to do. | |
| include/llvm/Target/TargetInstrInfo.h | ||
|---|---|---|
| 492–503 | In my patches I avoid this, because in AMDGPU I need to add an additional offset to the branch distance, because the distance is computed from the address of the next instruction which I think is more awkward to express this way | |
Do you mean that having function isBranchOffsetInRange is enough? I agree, but for a block placement algorithm that would possibly require invoking it a lot of times. Thus I thought exposing the ranges would be convenient.
Yes. How about you keep isBranchOffsetInRange in this patch, and add the others to your placement patch to show what you mean?
Got rid of the recently added getRange functions.
This patch just introduces these 2 new target hooks: optimizeShortBranch and isBranchOffsetInRange.
| include/llvm/Target/TargetInstrInfo.h | ||
|---|---|---|
| 486–488 | I think the design of this one is too restrictive, since it implies the branch can be optimized by simply changing the instruction opcode. I would expect such a simple change be handled by MC. If you still need the MI transformation, it would be better to expect the target to modify the branch (or insert a new instruction sequence to replace it) so that the target is free to change the operand structure or whatever else it needs to do | |
| include/llvm/Target/TargetInstrInfo.h | ||
|---|---|---|
| 486–488 | Changing it so that the target can modify the branch seems indeed to be more in line with other hooks. I will make this change. I am not entirely sure that I am following the part about simple transformations being done in MC. | |
I was debating whether this should provide the complete branch and dest block offsets, or if one single offset. In either case the types should be int64_t