The iterator over super and sub registers doesn't include both 8-bit
registers in its list. So if both registers are used and only one of
them is live on return, then we need to make sure that the other 8-bit
register is also marked as live and not zeroed out.
Details
- Reviewers
nickdesaulniers craig.topper arsenm pengfei
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
The iterator over super and sub registers doesn't include both 8-bit registers in its list.
This is the use of sub_and_superregs_inclusive in the caller PEI::insertZeroCallUsedRegs() that you're referring to? Does that need to be fixed? Is there another iterator that gives us both? Should there be? Maybe @pengfei , @RKSimon , or @craig.topper knows?
The generated TargetRegisterClass for GR8RegClass and GRH8RegClass BOTH have set CoveredBySubRegs to false.
llvm/lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
527 ↗ | (On Diff #481464) | Is live out of the function itself? Or between basic blocks? MachineRegisterInfo can answer whether a given register is live in MachineRegisterInfo::isLiveIn. I assume that's the same side of the coin for "is a given register live out of my predecessor?" Perhaps that's better to query than looking into every operand of every instruction in the function? MachineBasicBlock::liveouts() also gives you an iterator to a RegisterPairMask. Might be useful here? |
540–541 ↗ | (On Diff #481464) | Is it possible to never set the bits of RegsToZero in PEI::insertZeroCallUsedRegs() for these in the first place? |
llvm/test/CodeGen/X86/zero-call-used-regs-i386.ll | ||
21 | add nounwind to to the attribute list of @test1 to remove these obnoxious .cfi directives which are just noise. Though, I do also wonder if a MIR test (-stop-after=prologepilog) would be less brittle/dependent on %al being allocated. Then we could have 1 function per register and know precisely what happens for which registers. |
llvm/lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
499–501 ↗ | (On Diff #481464) | Is there an example of how to use them? |
527 ↗ | (On Diff #481464) | MachineBasicBlock::liveouts() doesn't appear to work here. |
535 ↗ | (On Diff #481464) | No. We're only checking if it's live on the return instruction. I won't be counted as a def. |
540–541 ↗ | (On Diff #481464) | There doesn't seem to be an easy mechanism for doing that. The iterator doesn't appear to go over both the high and low registers. I'm not sure if that's a bug or by design. |
That might be by design, if by sub register they mean something like %al is a sub register of %ax, because they "overlap."
I think x86 is an oddity in that TargetRegisterInfo::sub_and_superregs_inclusive doesn't return sub_8bit or sub_8bit_hi. I suspect this is because in llvm/lib/Target/X86/X86RegisterInfo.td they're not marked as CoveredBySubRegs. This patch isn't doing that much work; it's looking at each MachineBasicBlock terminator operands if the terminate is a return instruction only.
The offending MachineInstr in this case is RET 0, $al.
I spent some time playing with MCRegisterDesc and TargetRegisterInfo::getNumSubRegIndices (and sub register indices in general) and couldn't quite get what I wanted. getX86SubSuperRegister looks like a hack/lacks symmetry/works around CoveredBySubRegs and I can't help but wonder if that's a bug, or if marking sub_8bit or sub_8bit_hi as CoveredBySubRegs would break other things.
Please update the comment and add nounwind to the test.
llvm/lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
527 ↗ | (On Diff #481464) | Please update the comment to s/live out/live out of the function's return instructions/ |
I didn't know sub_and_superregs_inclusive. But I think I figured it out with a small code https://godbolt.org/z/9q8j66sYr
Unfortunately I cannot figure out how to succefully link it in CE. Here is the command I used locally clang++ Regs.cpp -I llvm/include/ -std=c++17 -Ibuilds/include -lLLVMSupport -lLLVMCore -lLLVMMC -lLLVMMCDisassembler -lLLVMTarget -lLLVMX86Desc -lLLVMX86Disassembler -lLLVMX86Info -lLLVMBinaryFormat -Lbuilds/lib -fuse-ld=lld -lpthread -lz -lzstd -fno-rtti -ltinfo.
As you can see, when pass AH (2) to it, it returns AH, AX, EAX and RAX. But if we pass AX (3) to it, it returns AL, AH, AX, EAX and RAX. I think the API works as expected because AL is not sub of AH but sibling.
Thanks, @pengfei, that confirms what I suspected. Maybe there should be an "and siblings" iterator as well.
llvm/lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
499–501 ↗ | (On Diff #481464) | Simplest looking one looks like DeadMachineInstructionElim. Just addLiveOuts and stepBackward to the point you want. Then check what's live |
llvm/lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
530–531 ↗ | (On Diff #482299) | I think you would be better off operating in terms of regunits and then going backwards to registers. |
llvm/lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
530–531 ↗ | (On Diff #482299) | Akchewally, it looks like using RegUnits in the prologue-epilogue inserter gives us what we want. I'll change this. |
Not sure why phab hasn't closed this as merged, but it's merged as https://github.com/llvm/llvm-project/commit/14d4cddc5506fb0fd3c4ac556b4edd970aa151eb.
add nounwind to to the attribute list of @test1 to remove these obnoxious .cfi directives which are just noise.
Though, I do also wonder if a MIR test (-stop-after=prologepilog) would be less brittle/dependent on %al being allocated. Then we could have 1 function per register and know precisely what happens for which registers.