This is an archive of the discontinued LLVM Phabricator instance.

AArch64: Move remaining target specific BranchRelaxation bits to TII
ClosedPublic

Authored by arsenm on Aug 8 2016, 9:30 AM.

Diff Detail

Event Timeline

arsenm updated this revision to Diff 67183.Aug 8 2016, 9:30 AM
arsenm retitled this revision from to AArch64: Move remaining target specific BranchRelaxation bits to TII.
arsenm updated this object.
arsenm added a subscriber: llvm-commits.
t.p.northover added inline comments.
lib/Target/AArch64/AArch64InstrInfo.cpp
131–132

This doesn't look right to me. I see that it's coming from a "return -1" but unconditional branches don't have unlimited range. They've got 26-bit signed range.

lib/Target/AArch64/AArch64InstrInfo.h
204–207

Could these APIs be expressed in terms of InsertBranch? At the moment it seems like we're crafting a parallel set of special-purpose branch-mangling functions but instead we could add something like

SmallVectorImpl<MachineOperand> getUnconditionalCond();
SmallVectorImpl<MachineOperand> getInvertedCond(SmallVectorImpl<MachineOperand> &Cond);

(feel free to bikeshed the interface, I'm not attached) and then these 3 become fairly simple wrappers around the existing functionality.

arsenm added inline comments.Aug 8 2016, 1:27 PM
lib/Target/AArch64/AArch64InstrInfo.cpp
131–132

The current code assumes they don't need to be relaxed, so I just assumed infinite range. I have another patch which adds support for expanding unconditional branches

lib/Target/AArch64/AArch64InstrInfo.h
204–207

I don't think it's quite the same. I thought about trying to use the existing family of analyzeBranch functions, but I think they have a different purpose. A better name would be something like analyzeBlockBranches. I hadn't thougt about trying to produce the cond vector from some other means.

The important features this one has are supporting inserting an additional branch block, passing the branch offset and returning the change in code size. I also may eventually want to introduce the ability to add even more blocks during he unconditional expansion, so at that point I don't think it has much in common with the existing InsertBranch, which just inserts a simple assumed legal branch. I suppose I could change the caller of this for conditional branches to use that instead of using 0 to assume a legal offset

t.p.northover added inline comments.Aug 8 2016, 2:49 PM
lib/Target/AArch64/AArch64InstrInfo.h
204–207

I don't think it's quite the same. I thought about trying to use the existing family of analyzeBranch functions, but I think they have a different purpose. A better name would be something like analyzeBlockBranches.

Sure, but these new functions are actually being used in a very similar way already.

  • The setDestBlock/insertInvertedBranch pair is essentially an insertBranch with inverted condition and the specified blocks.
  • insertInverted/insertUnconditional ditto.
  • The lone insertUnconditional (patching a fallthrough) is just an InsertBranch with only one BB.

The important features this one has are supporting inserting an additional branch block

Not sure I follow this one. Are you planning to make the TargetInstrInfo do part of the relaxation (creating BBs if the unconditional branch is out of range)? That sounds like the wrong place if so.

passing the branch offset

I take it this is going to be used for AMDGPU? It seems unused here, and I can't quite see the implications.

and returning the change in code size.

That is unfortunate. It still seems like it'd be better to extend the InsertBranch interface rather than duplicate it though. Simplest would be returning an std::pair<unsigned, unsigned>, or perhaps adding a helper to convert from NumInstructions to a size.

BTW, I should say that I love the idea of the pass becoming generic in general! It's got to be a really commonly useful bit of functionality.

arsenm added inline comments.Aug 10 2016, 2:02 PM
lib/Target/AArch64/AArch64InstrInfo.h
204–207

I posted D23373 which uses these. The point is unconditional branches need to be expanded, which requires knowing the offsets. The pass now inserts the new block. I have another problem to solve someday which will involve inserting more blocks for weird spilling cases, but I'm not sure how that should work quite yet.

Sorry I dropped this last week. It doesn't seem unreasonable to add KnownOffset arguments to InsertBranch's interface, and the return value really doesn't seem like a blocker to using InsertBranch.

In fact, I've just noticed that ReverseBranchCondition already exists. So I think all that'd be needed is the getUnconditionalCondition.

Would you like me to put together an alternative patch, and we can see if it'll fit your needs too?

Sorry I dropped this last week. It doesn't seem unreasonable to add KnownOffset arguments to InsertBranch's interface, and the return value really doesn't seem like a blocker to using InsertBranch.

In fact, I've just noticed that ReverseBranchCondition already exists. So I think all that'd be needed is the getUnconditionalCondition.

Would you like me to put together an alternative patch, and we can see if it'll fit your needs too?

I think it might be just a question of naming. I think it would make sense to use InsertBranch when expanding unconditional branches. The main use for insertUnconditionalBranch (at least for my purposes) is insertIndirectBranch. I can switch the branch insertion for expanding conditional branches to use the regular InsertBranch, and rename it for expanding unconditionals.

I think it might be just a question of naming.

For me it's the code duplication, though I think we might be in agreement now and just looking at it from different angles. AArch64BranchRelaxation contains quite a bit of code basically copy/pasted from AArch64InstrInfo.cpp just because the Analyze/Insert/Remove branch interface wasn't quite right. This looks even sillier when both instances are in InstrInfo with subtly different semantics.

Adding a convenience insertUnconditionalBranch function (and even insertInvertedConditionBranch) seems reasonable, but one way or the other delegation with InsertBranch should be involved.

Anyway, a very preliminary and largely untested outline of what I have in mind is at https://github.com/TNorthover/llvm/tree/branch-relax (making use of a possibly buggy version of D23379 so we can analyze everything). The diff is +57/-174 at the moment. It seems like we'd just have to make InsertBranch return the bytes as well (if not for IfConversion we could do a straight swap) and accept optional KnownOffset args and it could do everything you need.

I think it might be just a question of naming.

For me it's the code duplication, though I think we might be in agreement now and just looking at it from different angles. AArch64BranchRelaxation contains quite a bit of code basically copy/pasted from AArch64InstrInfo.cpp just because the Analyze/Insert/Remove branch interface wasn't quite right. This looks even sillier when both instances are in InstrInfo with subtly different semantics.

Adding a convenience insertUnconditionalBranch function (and even insertInvertedConditionBranch) seems reasonable, but one way or the other delegation with InsertBranch should be involved.

Anyway, a very preliminary and largely untested outline of what I have in mind is at https://github.com/TNorthover/llvm/tree/branch-relax (making use of a possibly buggy version of D23379 so we can analyze everything). The diff is +57/-174 at the moment. It seems like we'd just have to make InsertBranch return the bytes as well (if not for IfConversion we could do a straight swap) and accept optional KnownOffset args and it could do everything you need.

We could get away with it returning the number of instructions to get the size by calling getInstSizeInBytes on them (though I always thought the return the number of instructions API was weird). Your patch untangles the weird mess with NeedSplit trying to skip analyzeBranch which I was hoping to fix later, so overall I think your patch makes sense. The API changes I really need make more sense in the context of inserting an indirect branch, so I think keeping those bits as a new function separate from InsertBranch still makes sense rather than cluttering up all of the other uses

arsenm updated this revision to Diff 71675.Sep 16 2016, 11:08 AM

Update for trunk

SjoerdMeijer accepted this revision.Oct 6 2016, 8:15 AM
SjoerdMeijer added a reviewer: SjoerdMeijer.
SjoerdMeijer added a subscriber: SjoerdMeijer.

This looks reasonable to me. In fact, we have been discussing this before in D23067, which was also supposed to add hook isBranchOffsetInRange. I had to park this work (hope to pick it up), but anyway, they are more than useful hooks to have and the implementation here makes sense to me.

This revision is now accepted and ready to land.Oct 6 2016, 8:15 AM
arsenm closed this revision.Oct 6 2016, 8:51 AM

r283458