Current AntiDepBreaker implementation can rename callee saved register restore instruction, and then callee saved register contain garbage value after return.
We should not rename callee saved register restore instruction.
Differential D31908
[AntiDepBreaker] Don't rename callee saved register restore instruction Carrot on Apr 10 2017, 3:02 PM. Authored by
Details
Diff Detail Event TimelineComment Actions This hits on a known problem in llvms CodeGen today: There is no good way to differentiate between a register use/def that is required because of ABI (or other) constraints and just a normal assignment by the register allocator. Only the latter can be renamed/recolored which makes it tricky to it correctly today. I personally don't care about AggressiveAntiDepBreaker too much. However this "fix" seems pretty pessimistic/conservative to me: As I understand it you just exclude all the callee saves from renaming. I think targets that rely on the anti dep breaker (seems to be used by PowerPC/Hexagon today) should at least do some benchmarking before accepting this. Comment Actions As a possible better design: Last time I had a similar problem in the context of copy propagation I checked whether the uses/defs involved had a register forced in their MCInstrDesc description and I assumed all implicit uses/defs to be constrained as well as they are typically used for calls/asm. That seems to work for most targets (some had additional constraints not matched by this heuristic but I think the big ones X86, ARM, AArch64, Power lit tests worked when I experimented with that) though none of these heuristics made it into trunk so take this with a grain of salt... Comment Actions Sure I guess wrong code beats performance. Though at some point you may just as well disable the AntiDepBreaker pass for improved correctness :) Comment Actions Anyway won't be me standing in the way of this fix, I leave the decision to powerpc/hexagon folks. Comment Actions Unfortunately the test case in pr32608 is not reliable, the usage of callee saved register is not optimal, future enhancement can and should remove the callee saved register usage in the test case. Comment Actions The patch also checks (RegRefs.count(AntiDepReg) == 1), with this condition, it almost ensures the instruction is callee saved register restore instruction, if it is not, it should be some other rare cases. Comment Actions We add all callee saved registers as live-in on all paths from the restores to the exit, so this problem shouldn't affect us. Actually the anti-dep breaker was the reason why we did it in the first place. Comment Actions All callee saves? That would be new to me. I think we only do that for callee saves that are saved/restored *somewhere* in the function, the rest is considered "pristine" and not explicitely mentioned in the live-in lists. Comment Actions In any case as far as I understand it this patch is about not renaming one of the restored registers (so we do not suddenly restore the callee save into some unrelated register that happens to be free). So it's not about liveness but about which registers are legal to rename/recolor. Comment Actions Oh indeed that sounds a lot less conservative as I initially assumed. So the effect on performance is probably minimal/negligible. Comment Actions I don't think this is really the best approach. This will simply disable renaming of callee-saved registers, no matter where they are used. Comment Actions Then what's the problematic situation? It was exactly my understanding that we renamed a callee-saved register defined by the restore instruction. What else could be the problem? Comment Actions Yes that's what I understood as well, but I don't see how that is related to liveness information. Comment Actions Being live-on-exit will prevent renaming, so by updating the liveness we prevent renaming on the paths where it shouldn't happen. So, the problem that this patch is trying to fix will not happen on Hexagon. Comment Actions Ah, I remember---it was the reverse problem: renaming other register into a restored callee-saved register. In any case, the problem that this patch is trying to fix still doesn't apply to Hexagon. We add all of the clobbered callee-saved register as implicit uses to the return instructions. Comment Actions I think that this patch tries to fix a problem where it manifests itself, not where it actually happens. Comment Actions The second loop in function AggressiveAntiDepBreaker::StartBlock tries to do the work you described. Unfortunately it doesn't work with shrink wrapping. It says: Mark live-out callee-saved registers. In a return block this is 1 The function MFI.getPristineRegs(MF) gets all registers actually saved in the function, not necessary at the prolog position if shrink wrapping has occurred. Any suggestions on how to fix it? Comment Actions
This statement is wrong! getPristineRegs() gives you all ABI callee saved registers that are not save/restored anywhere in the function (because the register was never modified or clobbered, so the save/restore so there was no need for a save/restore). Comment Actions Thank you for the correction, actually I want to say The condition !Pristine.test(Reg) can only check if the specified callee saved register has been saved in the function, but not necessary at the prolog position if shrink wrapping has occurred. Comment Actions This should be done in PEI, not in the anti-dependency breaker. This is where the CSR registers get the concrete saving/restoring code. Passes running afterwards should not need to know about what register is CSR or not (except pristine registers, but that's general enough). Comment Actions Note that this is true for live-ins, however we don't explicitely safe live-out lists in the basic blocks. So for return blocks you have to take all CSR registers (depending on what you do with/without pristines). (See also https://reviews.llvm.org/D32464 where I partially rediscovered that fact). Comment Actions We (Hexagon) add all the non-pristine CSRs as implicit uses of the return instructions. Comment Actions Yeah I like that way of modeling things explicitely. Unfortunately that is not how most llvm targets work today. Comment Actions Could we check this in the verifier? One advantage this might have is that, for blocks ending in unreachable, we can easily realize that we don't need the restoration code. Comment Actions Are we adding restoration code to such blocks now? That could be checked in the verifier, but it's not incorrect. Maybe a warning could be issued... The verifier should be able to check if the callee-saved registers are properly marked as live/used in the relevant places (return instructions, etc.) Comment Actions I didn't mean checking that in the verifier. I meant checking that the return insts had the necessary implicit operands in the non-unreachable case. Looks like you say yes.
Comment Actions I think it would work and I think it would make the representation a bit cleaner. If there happened to be a target with hundreds of CSRs that would be a lot of extra operands, on the other hand that target already has hundreds of entries in the CSR list anyway. Changing this would simplify some code in LivePhysReg, RegScavenger, DepBreaker etc. where people iterate the CSRs today. It's a small thing though I don't expect it to have a huge impact one way or another...
Verifying it today would be easy. Though if we go that route it may be reasonable to just not model an explicit list of CSRs anymore after PrologEpilog inserter (which means we couldn't verify anymore, but I always like it if we can do the same with less concepts/complexity). It would indeed fix some situations in which we cannot properly differentiate between return vs. unreachable/noreturn call anymore (imagine if-converting a return instructions that ends up in the middle of a basic block, which MachineBasicBlock::isReturnBlock() would miss). Comment Actions If that is the case then there probably wasn't a bug in AggressiveAntiDepBreaker but you got hit by incomplete live-in lists. Comment Actions
Will do, thanks a lot. Comment Actions
D32464 does fix the problem in pr32608. |