This is an archive of the discontinued LLVM Phabricator instance.

[ARM] WLS/LE Code Generation
ClosedPublic

Authored by samparker on Jun 26 2019, 5:37 AM.

Details

Summary

Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:

  1. Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count.
  2. Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS.
  3. ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart.
  4. Add support in ArmLowOverheadLoops to handle the new pseudo instruction.

Diff Detail

Repository
rL LLVM

Event Timeline

samparker created this revision.Jun 26 2019, 5:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 26 2019, 5:37 AM
SjoerdMeijer added inline comments.Jun 28 2019, 1:09 AM
lib/Target/ARM/ARMISelLowering.cpp
12736 ↗(On Diff #206640)

Initially I was a bit confused by this, the xor in particular. It makes sense though, but you're not checking it here in the patterns?

SjoerdMeijer added inline comments.Jun 28 2019, 1:35 AM
lib/Target/ARM/ARMLowOverheadLoops.cpp
144 ↗(On Diff #206640)

As soon as we found one, can we stop? Or, can we continue looking as a sanity check to see if we find more and bail with a diagnostic?

183 ↗(On Diff #206640)

Who reviewed this? ;-)
These 2 if statements here, can it just be this:

if (!(Start && Dec && End))
dmgreen added inline comments.Jun 28 2019, 1:37 AM
lib/Target/ARM/ARMLowOverheadLoops.cpp
320 ↗(On Diff #206640)

These is something called ARMBaseInstrInfo::analyzeBranch which can help optimise away these unconditional fallthroughs too, if it can be made to work with loloop instructions.

samparker marked 2 inline comments as done.Jun 28 2019, 5:17 AM
samparker added inline comments.
lib/Target/ARM/ARMLowOverheadLoops.cpp
144 ↗(On Diff #206640)

We carry on searching just in case there's multiple predecessors with multiple loop starts. I'll add a TODO here because I think having multiple LoopStarts is fine and we need to support it, but currently this pass is not setup for it.

320 ↗(On Diff #206640)

I'll add a TODO

samparker updated this revision to Diff 207050.Jun 28 2019, 6:23 AM
  • Added some TODOs and FIXMEs.
  • Checking for xor/setcc in the backend while lowering the intrinsic.
  • Added some more tests.
SjoerdMeijer accepted this revision.Jun 28 2019, 7:17 AM

This looks fine as an initial commit to me. I.e., this can be followed up, addressing the TODOs, to produce even more efficient code and handle more cases.

lib/Target/ARM/ARMInstrInfo.td
103 ↗(On Diff #207050)

nit:

// start supporting *TP versions

I think for clarity I would spell this out as the tail-predicated versions

test/CodeGen/Thumb2/LowOverheadLoops/loop-guards.ll
4 ↗(On Diff #207050)

nit:

".. that changes the generic HardwareLoop .."

->

".. that changes in the generic HardwareLoop.. "?

This revision is now accepted and ready to land.Jun 28 2019, 7:17 AM
dmgreen added inline comments.Jun 28 2019, 9:11 AM
lib/Target/ARM/ARMLowOverheadLoops.cpp
320 ↗(On Diff #206640)

Sounds good. Cheers. I'm not sure if those functions will be a little awkward to use with hardware loops, but they may allow some extra BB layout if we teach them how to analyse the branches.

This revision was automatically updated to reflect the committed changes.
RKSimon added a subscriber: RKSimon.Jul 3 2019, 5:54 AM
RKSimon added inline comments.
llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
12911

@samparker This is problematic on several levels... - we have an unused variable warning in NDEBUG builds and the else case only occurs if Const == NULL

if (isa<ConstantSDNode>(CC->getOperand(1)))
  assert(cast<ConstantSDNode>(CC->getOperand(1)).isOne() && "Expected to compare against 1");
samparker marked an inline comment as done.Jul 3 2019, 6:13 AM
samparker added inline comments.
llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
12911

Ah - good point! I'll push up a fix. Thanks.

llvm/trunk/test/Transforms/HardwareLoops/ARM/structure.ll