This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Alter t2DoLoopStart to define lr
ClosedPublic

Authored by dmgreen on Oct 21 2020, 6:52 AM.

Details

Summary

This changes the definition of t2DoLoopStart from

t2DoLoopStart rGPR

to

GPRlr = t2DoLoopStart rGPR

This will hopefully mean that low overhead loops are more tied together, and we can more reliably generate loops without reverting or being at the whims of the register allocator.

This is a fairly simple change in itself, but leads to a number of other required alterations.

  • The hardware loop pass, if UsePhi is set, now generates loops of the form:
     %start = llvm.start.loop.iterations(%N)
   loop:
     %p = phi [%start], [%dec]
	 %dec = llvm.loop.decrement.reg(%p, 1)
	 %c = icmp ne %dec, 0
	 br %c, loop, exit
  • For this a new llvm.start.loop.iterations intrinsic was added, identical to llvm.set.loop.iterations but produces a value as seen above, gluing the loop together more through def-use chains.
  • This new instrinsic conceptually produces the same output as input, which is taught to SCEV so that the checks in MVETailPredication are not effected.
  • Some minor changes are needed to the ARMLowOverheadLoop pass, but it has been left mostly as before. We should now more reliably be able to tell that the t2DoLoopStart is correct without having to prove it, but t2WhileLoopStart and tail-predicated loops will remain the same.
  • And all the tests have been updated. There are a lot of them.

This patch on it's own might cause more trouble that it helps, with more tail-predicated loops being reverted, but some additional patches can hopefully improve upon that to get to something that is better overall.

Diff Detail

Event Timeline

dmgreen created this revision.Oct 21 2020, 6:52 AM
dmgreen requested review of this revision.Oct 21 2020, 6:52 AM

A heroic effort to change all those tests! Are you planning on also updating WLS and the test_set intrinsic too..?

llvm/docs/LangRef.rst
15516

void -> i32, i64

llvm/lib/CodeGen/HardwareLoops.cpp
328

Guess this should have a nice target agnostic name.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1072–1073

I seem to remember that these mov lr searches only affect DoLoopStart, as the WhileLoopStart would usually/always be in the predecessor block, so you might be able to remove the two searches without having to change any other tests.

dmgreen marked 3 inline comments as done.Oct 22 2020, 3:21 AM

A heroic effort to change all those tests! Are you planning on also updating WLS and the test_set intrinsic too..?

Yeah, I was having fun :-/ I think all the tests are still OK (baring one that I removed, as I don't believe we can get into the state it found itself in any more). Some of them may no longer test what they originally tested though.

WLS - maybe. It really needs to be some sort of terminator that produces a value. I will look into it.

This will need an extra patch to not cause regressions with tail predication. Perhaps something like the patch that moved the t2DoLoopStart to the end of the loop, but moving things around at this stage can be difficult to get really right.

llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
1072–1073

I am always worried about what checks we still need and whether there are enough to not cause problems. No tests changed, which is a good sign. I'll try and run some testing too.

dmgreen updated this revision to Diff 299908.Oct 22 2020, 3:23 AM
dmgreen marked an inline comment as done.

Updated.

samparker accepted this revision.Oct 22 2020, 4:03 AM

Many thanks Dave, LGTM

This revision is now accepted and ready to land.Oct 22 2020, 4:03 AM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/Thumb2/LowOverheadLoops/unsafe-cpsr-loop-use.mir