This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/CodeGen/
-
CodeGen/
-
AggressiveAntiDepBreaker.cpp

Differential D31908

[AntiDepBreaker] Don't rename callee saved register restore instruction
AcceptedPublic

Authored by Carrot on Apr 10 2017, 3:02 PM.

Download Raw Diff

Details

Reviewers

echristo
MatzeB
hfinkel
kparzysz

Summary

Current AntiDepBreaker implementation can rename callee saved register restore instruction, and then callee saved register contain garbage value after return.

We should not rename callee saved register restore instruction.

Diff Detail

Event Timeline

Carrot created this revision.Apr 10 2017, 3:02 PM

ping

echristo added a reviewer: MatzeB.Apr 26 2017, 4:36 PM

Testcase? Seems reasonable otherwise, but Matthias might have something more here.

This hits on a known problem in llvms CodeGen today: There is no good way to differentiate between a register use/def that is required because of ABI (or other) constraints and just a normal assignment by the register allocator. Only the latter can be renamed/recolored which makes it tricky to it correctly today.

I personally don't care about AggressiveAntiDepBreaker too much. However this "fix" seems pretty pessimistic/conservative to me: As I understand it you just exclude all the callee saves from renaming. I think targets that rely on the anti dep breaker (seems to be used by PowerPC/Hexagon today) should at least do some benchmarking before accepting this.

FWIW this is fixing a wrong code case for Power.

As a possible better design: Last time I had a similar problem in the context of copy propagation I checked whether the uses/defs involved had a register forced in their MCInstrDesc description and I assumed all implicit uses/defs to be constrained as well as they are typically used for calls/asm.

That seems to work for most targets (some had additional constraints not matched by this heuristic but I think the big ones X86, ARM, AArch64, Power lit tests worked when I experimented with that) though none of these heuristics made it into trunk so take this with a grain of salt...

In D31908#738932, @echristo wrote:

FWIW this is fixing a wrong code case for Power.

Sure I guess wrong code beats performance. Though at some point you may just as well disable the AntiDepBreaker pass for improved correctness :)

Anyway won't be me standing in the way of this fix, I leave the decision to powerpc/hexagon folks.

This revision is now accepted and ready to land.Apr 26 2017, 5:16 PM

In D31908#738924, @echristo wrote:

Testcase? Seems reasonable otherwise, but Matthias might have something more here.

Unfortunately the test case in pr32608 is not reliable, the usage of callee saved register is not optimal, future enhancement can and should remove the callee saved register usage in the test case.

In D31908#738929, @MatzeB wrote:

This hits on a known problem in llvms CodeGen today: There is no good way to differentiate between a register use/def that is required because of ABI (or other) constraints and just a normal assignment by the register allocator. Only the latter can be renamed/recolored which makes it tricky to it correctly today.

I personally don't care about AggressiveAntiDepBreaker too much. However this "fix" seems pretty pessimistic/conservative to me: As I understand it you just exclude all the callee saves from renaming. I think targets that rely on the anti dep breaker (seems to be used by PowerPC/Hexagon today) should at least do some benchmarking before accepting this.

The patch also checks (RegRefs.count(AntiDepReg) == 1), with this condition, it almost ensures the instruction is callee saved register restore instruction, if it is not, it should be some other rare cases.

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

In D31908#741341, @kparzysz wrote:

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

All callee saves? That would be new to me. I think we only do that for callee saves that are saved/restored *somewhere* in the function, the rest is considered "pristine" and not explicitely mentioned in the live-in lists.

In D31908#741342, @MatzeB wrote:

All callee saves? That would be new to me. I think we only do that for callee saves that are saved/restored *somewhere* in the function, the rest is considered "pristine" and not explicitely mentioned in the live-in lists.

Yes, that's what I meant---the ones that were actually clobbered in the function.

In D31908#741347, @kparzysz wrote:

In D31908#741342, @MatzeB wrote:

All callee saves? That would be new to me. I think we only do that for callee saves that are saved/restored *somewhere* in the function, the rest is considered "pristine" and not explicitely mentioned in the live-in lists.

Yes, that's what I meant---the ones that were actually clobbered in the function.

In any case as far as I understand it this patch is about not renaming one of the restored registers (so we do not suddenly restore the callee save into some unrelated register that happens to be free). So it's not about liveness but about which registers are legal to rename/recolor.

In D31908#740062, @Carrot wrote:

In D31908#738929, @MatzeB wrote:

This hits on a known problem in llvms CodeGen today: There is no good way to differentiate between a register use/def that is required because of ABI (or other) constraints and just a normal assignment by the register allocator. Only the latter can be renamed/recolored which makes it tricky to it correctly today.

I personally don't care about AggressiveAntiDepBreaker too much. However this "fix" seems pretty pessimistic/conservative to me: As I understand it you just exclude all the callee saves from renaming. I think targets that rely on the anti dep breaker (seems to be used by PowerPC/Hexagon today) should at least do some benchmarking before accepting this.

The patch also checks (RegRefs.count(AntiDepReg) == 1), with this condition, it almost ensures the instruction is callee saved register restore instruction, if it is not, it should be some other rare cases.

Oh indeed that sounds a lot less conservative as I initially assumed. So the effect on performance is probably minimal/negligible.

I don't think this is really the best approach. This will simply disable renaming of callee-saved registers, no matter where they are used.

In D31908#741348, @MatzeB wrote:

In any case as far as I understand it this patch is about not renaming one of the restored registers (so we do not suddenly restore the callee save into some unrelated register that happens to be free). So it's not about liveness but about which registers are legal to rename/recolor.

Then what's the problematic situation? It was exactly my understanding that we renamed a callee-saved register defined by the restore instruction. What else could be the problem?

In D31908#741358, @kparzysz wrote:

In D31908#741348, @MatzeB wrote:

In any case as far as I understand it this patch is about not renaming one of the restored registers (so we do not suddenly restore the callee save into some unrelated register that happens to be free). So it's not about liveness but about which registers are legal to rename/recolor.

Then what's the problematic situation? It was exactly my understanding that we renamed a callee-saved register defined by the restore instruction. What else could be the problem?

Yes that's what I understood as well, but I don't see how that is related to liveness information.

Being live-on-exit will prevent renaming, so by updating the liveness we prevent renaming on the paths where it shouldn't happen. So, the problem that this patch is trying to fix will not happen on Hexagon.

Ah, I remember---it was the reverse problem: renaming other register into a restored callee-saved register.

In any case, the problem that this patch is trying to fix still doesn't apply to Hexagon. We add all of the clobbered callee-saved register as implicit uses to the return instructions.

I think that this patch tries to fix a problem where it manifests itself, not where it actually happens.

In D31908#741341, @kparzysz wrote:

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

The second loop in function AggressiveAntiDepBreaker::StartBlock tries to do the work you described. Unfortunately it doesn't work with shrink wrapping. It says:

Mark live-out callee-saved registers. In a return block this is
all callee-saved registers. In non-return this is any
// callee-saved register that is not saved in the prolog.

1 The function MFI.getPristineRegs(MF) gets all registers actually saved in the function, not necessary at the prolog position if shrink wrapping has occurred.
2 If the register restore instructions are moved to non-return BB, the logic should be same as return block, then the logic in the code is wrong.

Any suggestions on how to fix it?

1 The function MFI.getPristineRegs(MF) gets all registers actually saved in the function, not necessary at the prolog position if shrink wrapping has occurred.

This statement is wrong! getPristineRegs() gives you all ABI callee saved registers that are not save/restored anywhere in the function (because the register was never modified or clobbered, so the save/restore so there was no need for a save/restore).

In D31908#744019, @MatzeB wrote:

1 The function MFI.getPristineRegs(MF) gets all registers actually saved in the function, not necessary at the prolog position if shrink wrapping has occurred.

This statement is wrong! getPristineRegs() gives you all ABI callee saved registers that are not save/restored anywhere in the function (because the register was never modified or clobbered, so the save/restore so there was no need for a save/restore).

Thank you for the correction, actually I want to say

The condition !Pristine.test(Reg) can only check if the specified callee saved register has been saved in the function, but not necessary at the prolog position if shrink wrapping has occurred.

In D31908#743873, @Carrot wrote:

In D31908#741341, @kparzysz wrote:

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

The second loop in function AggressiveAntiDepBreaker::StartBlock tries to do the work you described. Unfortunately it doesn't work with shrink wrapping. It says:

This should be done in PEI, not in the anti-dependency breaker. This is where the CSR registers get the concrete saving/restoring code. Passes running afterwards should not need to know about what register is CSR or not (except pristine registers, but that's general enough).

In D31908#745037, @kparzysz wrote:

In D31908#743873, @Carrot wrote:

In D31908#741341, @kparzysz wrote:

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

The second loop in function AggressiveAntiDepBreaker::StartBlock tries to do the work you described. Unfortunately it doesn't work with shrink wrapping. It says:

This should be done in PEI, not in the anti-dependency breaker. This is where the CSR registers get the concrete saving/restoring code. Passes running afterwards should not need to know about what register is CSR or not (except pristine registers, but that's general enough).

Note that this is true for live-ins, however we don't explicitely safe live-out lists in the basic blocks. So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact).

In D31908#745251, @MatzeB wrote:

So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact).

We (Hexagon) add all the non-pristine CSRs as implicit uses of the return instructions.

In D31908#745271, @kparzysz wrote:

In D31908#745251, @MatzeB wrote:

So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact).

We (Hexagon) add all the non-pristine CSRs as implicit uses of the return instructions.

Yeah I like that way of modeling things explicitely. Unfortunately that is not how most llvm targets work today.

In D31908#745274, @MatzeB wrote:

Yeah I like that way of modeling things explicitely. Unfortunately that is not how most llvm targets work today.

Would it be reasonable to implement that for all targets?

In D31908#745958, @kparzysz wrote:

In D31908#745274, @MatzeB wrote:

Yeah I like that way of modeling things explicitely. Unfortunately that is not how most llvm targets work today.

Would it be reasonable to implement that for all targets?

Could we check this in the verifier? One advantage this might have is that, for blocks ending in unreachable, we can easily realize that we don't need the restoration code.

In D31908#746050, @hfinkel wrote:

Could we check this in the verifier? One advantage this might have is that, for blocks ending in unreachable, we can easily realize that we don't need the restoration code.

Are we adding restoration code to such blocks now? That could be checked in the verifier, but it's not incorrect. Maybe a warning could be issued...

The verifier should be able to check if the callee-saved registers are properly marked as live/used in the relevant places (return instructions, etc.)

In D31908#746071, @kparzysz wrote:

In D31908#746050, @hfinkel wrote:

Could we check this in the verifier? One advantage this might have is that, for blocks ending in unreachable, we can easily realize that we don't need the restoration code.

Are we adding restoration code to such blocks now? That could be checked in the verifier, but it's not incorrect. Maybe a warning could be issued...

I didn't mean checking that in the verifier. I meant checking that the return insts had the necessary implicit operands in the non-unreachable case. Looks like you say yes.

The verifier should be able to check if the callee-saved registers are properly marked as live/used in the relevant places (return instructions, etc.)

gberry added a subscriber: gberry.May 4 2017, 8:11 AM

In D31908#746072, @hfinkel wrote:

In D31908#746071, @kparzysz wrote:

I didn't mean checking that in the verifier. I meant checking that the return insts had the necessary implicit operands in the non-unreachable case. Looks like you say yes.

The verifier should be able to check if the callee-saved registers are properly marked as live/used in the relevant places (return instructions, etc.)

Sorry, yes, that is correct.

In D31908#746050, @hfinkel wrote:

In D31908#745958, @kparzysz wrote:

In D31908#745274, @MatzeB wrote:

Yeah I like that way of modeling things explicitely. Unfortunately that is not how most llvm targets work today.

Would it be reasonable to implement that for all targets?

I think it would work and I think it would make the representation a bit cleaner. If there happened to be a target with hundreds of CSRs that would be a lot of extra operands, on the other hand that target already has hundreds of entries in the CSR list anyway.

Changing this would simplify some code in LivePhysReg, RegScavenger, DepBreaker etc. where people iterate the CSRs today. It's a small thing though I don't expect it to have a huge impact one way or another...

Could we check this in the verifier? One advantage this might have is that, for blocks ending in unreachable, we can easily realize that we don't need the restoration code.

Verifying it today would be easy. Though if we go that route it may be reasonable to just not model an explicit list of CSRs anymore after PrologEpilog inserter (which means we couldn't verify anymore, but I always like it if we can do the same with less concepts/complexity).

It would indeed fix some situations in which we cannot properly differentiate between return vs. unreachable/noreturn call anymore (imagine if-converting a return instructions that ends up in the middle of a basic block, which MachineBasicBlock::isReturnBlock() would miss).

In D31908#745251, @MatzeB wrote:

In D31908#745037, @kparzysz wrote:

In D31908#743873, @Carrot wrote:

In D31908#741341, @kparzysz wrote:

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

The second loop in function AggressiveAntiDepBreaker::StartBlock tries to do the work you described. Unfortunately it doesn't work with shrink wrapping. It says:

This should be done in PEI, not in the anti-dependency breaker. This is where the CSR registers get the concrete saving/restoring code. Passes running afterwards should not need to know about what register is CSR or not (except pristine registers, but that's general enough).

Note that this is true for live-ins, however we don't explicitely safe live-out lists in the basic blocks. So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact).

Can your patch in D32464 also fix this problem?

In D31908#746553, @Carrot wrote:

In D31908#745251, @MatzeB wrote:

In D31908#745037, @kparzysz wrote:

In D31908#743873, @Carrot wrote:

In D31908#741341, @kparzysz wrote:

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

The second loop in function AggressiveAntiDepBreaker::StartBlock tries to do the work you described. Unfortunately it doesn't work with shrink wrapping. It says:

This should be done in PEI, not in the anti-dependency breaker. This is where the CSR registers get the concrete saving/restoring code. Passes running afterwards should not need to know about what register is CSR or not (except pristine registers, but that's general enough).

Note that this is true for live-ins, however we don't explicitely safe live-out lists in the basic blocks. So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact).

Can your patch in D32464 also fix this problem?

Possibly, try it out.

In D31908#746557, @MatzeB wrote:

In D31908#746553, @Carrot wrote:

In D31908#745251, @MatzeB wrote:

In D31908#745037, @kparzysz wrote:

In D31908#743873, @Carrot wrote:

In D31908#741341, @kparzysz wrote:

We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place.

The second loop in function AggressiveAntiDepBreaker::StartBlock tries to do the work you described. Unfortunately it doesn't work with shrink wrapping. It says:

This should be done in PEI, not in the anti-dependency breaker. This is where the CSR registers get the concrete saving/restoring code. Passes running afterwards should not need to know about what register is CSR or not (except pristine registers, but that's general enough).

Note that this is true for live-ins, however we don't explicitely safe live-out lists in the basic blocks. So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact).

Can your patch in D32464 also fix this problem?

Possibly, try it out.

If that is the case then there probably wasn't a bug in AggressiveAntiDepBreaker but you got hit by incomplete live-in lists.

Can your patch in D32464 also fix this problem?

Possibly, try it out.

Will do, thanks a lot.

Note that this is true for live-ins, however we don't explicitely safe live-out lists in the basic blocks. So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact).

Can your patch in D32464 also fix this problem?

Possibly, try it out.

D32464 does fix the problem in pr32608.
Thanks a lot!

I have an initial implementation of the discussed approach in D33003.

Revision Contents

Path

Size

lib/

CodeGen/

AggressiveAntiDepBreaker.cpp

4 lines

Diff 94744

lib/CodeGen/AggressiveAntiDepBreaker.cpp

	Show All 22 Lines
	DEBUG(dbgs() << " (not critical-path)\n");			DEBUG(dbgs() << " (not critical-path)\n");
	continue;			continue;
	} else if (PassthruRegs.count(AntiDepReg) != 0) {			} else if (PassthruRegs.count(AntiDepReg) != 0) {
	// If the anti-dep register liveness "passes-thru", then			// If the anti-dep register liveness "passes-thru", then
	// don't try to change it. It will be changed along with			// don't try to change it. It will be changed along with
	// the use if required to break an earlier antidep.			// the use if required to break an earlier antidep.
	DEBUG(dbgs() << " (passthru)\n");			DEBUG(dbgs() << " (passthru)\n");
	continue;			continue;
				} else if ((RegClassInfo.getLastCalleeSavedAlias(AntiDepReg) != 0) &&
				(RegRefs.count(AntiDepReg) == 1)) {
				// Don't rename callee saved register restore instruction.
				continue;
	} else {			} else {
	// No anti-dep breaking for implicit deps			// No anti-dep breaking for implicit deps
	MachineOperand *AntiDepOp = MI.findRegisterDefOperand(AntiDepReg);			MachineOperand *AntiDepOp = MI.findRegisterDefOperand(AntiDepReg);
	assert(AntiDepOp && "Can't find index for defined register operand");			assert(AntiDepOp && "Can't find index for defined register operand");
	if (!AntiDepOp \|\| AntiDepOp->isImplicit()) {			if (!AntiDepOp \|\| AntiDepOp->isImplicit()) {
	DEBUG(dbgs() << " (implicit)\n");			DEBUG(dbgs() << " (implicit)\n");
	continue;			continue;
	}			}
	Show All 22 Lines