Page MenuHomePhabricator

Add LiveRangeShrink pass to shrink live range within BB.
ClosedPublic

Authored by danielcdh on Apr 26 2017, 2:46 PM.

Details

Summary

LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
MatzeB added inline comments.May 9 2017, 10:58 AM
lib/CodeGen/LiveRangeShrink.cpp
51

You never seem to be using negative numbers, consider using unsigned instead of int.

65

M[Old]?

66

M[New] or better avoid the double lookup with M.count(New) above and use an M.find(New) for both.

85

Why not pass a MachineBasicBlock &MBB as parameter to the function instead of the iterator and doing the Start->getParent()->instr_end() dance? You could have just used for (MachineInstr &I : MBB) {} in that case.

125

I would have expected this outside the loop so it can be reused (considering that you also went through the trouble of clear()ing it inside BuildInstOrderMap.

126

Collecting the SawStore value is unnecessary. The only user appears to be MachineInstr::isSafeToMove() below which writes(!) the variable.

151

Barrier.

212–216

There's a variant of splice that can move a sequence of instructions which should work nicely here.

It is also not clear to me if moving the debug instructions is legal here. They are not considered when checking how early an instruction can move so you are possibly moving the debug instruction above a point where the value its operand is defined.

This revision now requires changes to proceed.May 9 2017, 10:58 AM
danielcdh marked 21 inline comments as done.May 9 2017, 2:39 PM

Thanks for the review.

lib/CodeGen/LiveRangeShrink.cpp
65

M is const map, thus cannot use M[Old].

85

Because Start may not be MBB.begin(), but could also be instruction that has side effect.

The reason I used Start->getParent()->instr_end() instead of passing in MBB and use MBB.instr_end() is because I want to pass one less parameter to make the function clearer.

94–95

Code updated to inline the logic into caller to make the logic clearer.

126

SawStore is both input and output of MachineInstr::isSafeToMove().

212–216

Done

For the Debug Value, it should be legal as the same logic is used by MachineLICM.

danielcdh updated this revision to Diff 98351.May 9 2017, 2:40 PM
danielcdh edited edge metadata.

update

skatkov added inline comments.May 9 2017, 10:27 PM
lib/CodeGen/LiveRangeShrink.cpp
136

Why don't you refresh SawStore here?

andreadb added inline comments.May 10 2017, 12:21 PM
lib/CodeGen/LiveRangeShrink.cpp
95–96

You should early exit if this function must be skipped.

if (skipFunction(*MF.getFunction())
  return false;

We want to be able to 'opt-bisect' this pass.
Also, this pass shouldn't be run on optnone functions.

179

I think that you should also skip debug vaue instructions.

The loop at line 192 works under the assumption that a MI's DBG_VALUE always follows directly after it. This is also suggested by your comment at line 189.

If you don't skip debug values, then you potentially break the above-mentioned assumption.

danielcdh added inline comments.May 10 2017, 1:56 PM
lib/CodeGen/LiveRangeShrink.cpp
95–96

skipFunction added. Thanks for the comment.

This will not be run in optnone as in:

if (getOptLevel() != CodeGenOpt::None)
  addPass(&LiveRangeShrinkID);
136

Not sure if I understand the comment. SawStore is updated in MI.isSafeToMove. Do you mean that I can remove the previous "SawStore = BuildInstOrderMap", and set SawStore's initial value to "false"?

179

Done. Great catch!

skatkov added inline comments.May 10 2017, 8:11 PM
lib/CodeGen/LiveRangeShrink.cpp
136

My understanding is that if you meet that MI has side effect and it is a barrier you will not move later instruction through this one. SawStore is also a kind of barrier and you cannot move instruction which intersects with the store instruction. After you set the new barrier you can reset SawStore. Say if you saw store before new barrier and there is no stores after the barrier you are free to move instruction which do not intersect with stores. However if SawStore has been set to true once I do not see the place where it can be to reset to false. Do I miss anything?

However the overall logic around SawStore is not clear to me. SawStore is set in BuildInstOrderMap. So it detects where there is a store in the tail. It would prevent us re-ordering instruction (we move instruction up always) even if store is located later. Is it expected?

Currently, we set 'SawStore' if there is at least one 'mayStore' instruction in the scheduling region.

However, we could provide a much more accurate information about where 'mayStore' instructions are in the code.
For example, 'BuildInstOrderMap' could populate a (ordered) map '<order_number, MI*>', where MI is a 'mayStore' instruction, and 'order_map' is the instruction number in the sequence.

Later on, you could query that map to check if it is safe to move a load from loc_B to loc_A. If the map contains a store with an order_number in range [loc_A, loc_B], then you know that it is not safe to move it.

This is just an idea for a future improvement.

lib/CodeGen/LiveRangeShrink.cpp
82–86

We can early exit from the loop in BuildInstOrderMap as soon as we encounter an instruction with unmodeled side effects.

An instruction with unmodeled side effects is essentially a scheduling barrier.
We don't need to keep numbering instructions after a scheduling barrier because the algorithm would force a recomputation of OrderMap (see line 138).

136

You are right. When BuildInstrOrderMap() is called, SawStore should be updated too. Otherwise we risk to prevent the reordering of loads in the new scheduling region.

193–201

It would be nice to have a STATISTIC that counts the number of instructions reordered by this pass.

danielcdh updated this revision to Diff 98635.May 11 2017, 8:39 AM
danielcdh marked an inline comment as done.

update

Thanks for the reviews!

lib/CodeGen/LiveRangeShrink.cpp
82–86

That would actually make the compile time slower by adding additional check on the frequent path (BuildInstOrderMap), as instructions with unmodeled side effects is extremely rare in real programs.

136

I've updated the logic to update SawStore on the fly and reset to false when an instruction with unmodeled side effect is met.

MatzeB accepted this revision.May 11 2017, 10:52 AM

Accepting this now as it looks reasonably safe and fast to me (nitpicks below).

It is a conservative/simple heuristic but given the fact that we lack other means of global code motion (in the sense of inter-basic block) in CodeGen today this is fine.

Please wait for some of the other reviewers which showed interest here to accept before committing.

lib/CodeGen/LiveRangeShrink.cpp
11–13

Needs three slashes to be recognized as doxygen. (I know this is wrong in a lot of existing code, but we should get it right in new code)

113

Maybe move UseMap outside the loop and clear() it each iteration instead so there is a chance that memory allocations can be reused.

212–216

I'm not sure why this would work. Maybe you should double check by compiling the llvm-testsuite (or some other set of programs) with a debug compiler and -g...

This revision is now accepted and ready to land.May 11 2017, 10:52 AM
andreadb accepted this revision.May 11 2017, 11:09 AM

Accepting this now as it looks reasonably safe and fast to me (nitpicks below).

It is a conservative/simple heuristic but given the fact that we lack other means of global code motion (in the sense of inter-basic block) in CodeGen today this is fine.

Please wait for some of the other reviewers which showed interest here to accept before committing.

LGTM too.

Thanks!

danielcdh updated this revision to Diff 98677.May 11 2017, 2:25 PM
danielcdh marked 2 inline comments as done.

update

danielcdh added inline comments.May 11 2017, 2:25 PM
lib/CodeGen/LiveRangeShrink.cpp
212–216

glad that I run this with -g and actually caught a bug: when I move instructions together with debug info, I should increment the Next pointer, otherwise the iterator will iterate to instructions that have already been processed. Updated the implementation and added test coverage.

Thanks all for the reviews. Any other comments/concerns? If not, I'll commit the patch tomorrow.

andreadb accepted this revision.May 12 2017, 3:54 AM

Thanks for fixing the issue with debug values.

The patch still looks good to me.
A couple of comments can be slightly improved (see below).

lib/CodeGen/LiveRangeShrink.cpp
167

Please remember to add a period at the end of each sentence.

183

s/shrinked/shrunk
Also, please add a period at the end of this sentence.

186

You should update this comment since we also skip debug values now.

danielcdh updated this revision to Diff 98774.May 12 2017, 8:35 AM
danielcdh marked 3 inline comments as done.

update

I'll submit the patch now to allow rest of today to catch potential buildbot breakage. Thanks again for all the reviews.

Dehao

danielcdh closed this revision.May 12 2017, 12:42 PM

Sorry about the regression. As Quentin pointed out, moving instructions close to phi should not be a problem as the algorithm checks the moved instruction uses >1 single-use defs. Looking at the example of the regression case, the transformation looks reasonable if we only consider shrinking live-ranges. Could you provide an IR that I can try reproduce the issue and see why register allocation is worse?

During the mean time, I can prepare a patch to move this to x86-only, or revert the patch if necessary.

patch to move the pass to x86 is proposed: https://reviews.llvm.org/D33294

Will have a separate patch to fix the REG_SEQUENCE handling once I get a test that I can include in unittesting. I suppose the fix is: ignore instruction whose def register size is no less than the sum of the use register sizes?

Thanks,
Dehao

patch to move the pass to x86 is proposed: https://reviews.llvm.org/D33294

Will have a separate patch to fix the REG_SEQUENCE handling once I get a test that I can include in unittesting. I suppose the fix is: ignore instruction whose def register size is no less than the sum of the use register sizes?

REG_SEQUENCE is typically used to construct tuple register classes. So you get a set of values in consecutively numbered registers (you may watch my llvm dev meeting talk to see some introduction).

In any case I think the safest solution here would be to NOT trigger on REG_SEQUENCE but on register classes; For the REG_SEQUENCES in question a number of small 1-register values are shrunk while extending a larger multi-register value, so the initial assumption that everything is well when you shrink more ranges than you extend is wrong. So a conservative answer here would be to simply bail out if the register classes of the vregs you are shrinking/extended don't match up.

Hi, escha,

Could you send me an IR file or snippet that could trigger this performance issue? I can include it in the patch as unittest.

Thanks,
Dehao

danielcdh reopened this revision.May 18 2017, 6:26 PM

The patch was reverted because it breaks tests at runtime.

This revision is now accepted and ready to land.May 18 2017, 6:26 PM
danielcdh updated this revision to Diff 99641.May 19 2017, 4:00 PM

Updated the implementation and tests:

  1. Do not hoist instructions if its def MO does not share the same register class with the use MO.
  2. Include both instruction order and MI in the use map. Using instruction order alone is not good enough because when instruction get hoisted, its order will be the same as the next instruction. In this case, we need to ensure the insertion point is after the last barrier, i.e. last instruction MO (implicitly defined by MI) has been used.

PTAL, Thanks!

danielcdh updated this revision to Diff 99642.May 19 2017, 4:01 PM

clang-format...

MatzeB accepted this revision.May 26 2017, 1:39 PM

Register class logic looks good to me with the nitpicks below addressed.

lib/CodeGen/LiveRangeShrink.cpp
165–166

This should better be !TargetRegisterInfo::isVirtualRegister(Reg) which is not the same as isPhysicalRegister() because a register operand can also be NoReg (=0) which shouldn't stop movement.

178–179
  • As we got surprised by it in the first try, it would be good to document the necessity for the regClass() check. Something along the lines of:
/// The heuristic does not handle different register classes yet (registers of different sizes, looser/tighter constraints).
  • getRegClass() only works for virtual registers, as far as I understand this needs a check as we could get here with a constant physreg.
danielcdh updated this revision to Diff 100804.May 30 2017, 5:23 PM
danielcdh marked 2 inline comments as done.

update

lib/CodeGen/LiveRangeShrink.cpp
178–179

Updated the logic to make sure that DefMO and MO will always be virtual register.

MatzeB accepted this revision.May 30 2017, 5:31 PM

LGTM, with nitpicks below.

lib/CodeGen/LiveRangeShrink.cpp
166–167

You can probably use this so instructions are still moved even if they use %noreg.

if (!Reg || MRI.isConstantPhysReg(Reg))
  continue;
168–169
  • Remove commented code.
danielcdh updated this revision to Diff 100809.May 30 2017, 5:48 PM
danielcdh marked an inline comment as done.

update

danielcdh marked an inline comment as done.May 30 2017, 5:48 PM
danielcdh updated this revision to Diff 100917.May 31 2017, 1:44 PM

update test

danielcdh closed this revision.May 31 2017, 4:25 PM

For the record: This landed in r304371