AAPCS requires i1 argument to be zero-extended to 8-bits by the caller.
Emit a new AArch64ISD::ASSERT_ZEXT_BOOL hint to enable some
optimization opportunities. In particular, when the argument is forwarded
to the callee, we can avoid zero-extension and use it as-is.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
On the SelectionDAG ISel side, I'm not really happy we're working around the usual framework for dealing with known bits. Can we use some sort of target-specific equivalent to AssertZExt to encode the known bits, then use those known bits when we generate the call?
If it's possible to handle this using AssertZExt on the SDAG side, I think it would be nice to handle it similarly on the GISel side using G_ASSERT_ZEXT.
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp | ||
---|---|---|
1079 | I think that you can handle most of the logic in here using MIPatternMatch: Register OrigReg; if (!mi_match(AI.Regs[0], MRI, m_GTrunc(m_Reg(OrigReg)))) return false; | |
1088 | Maybe this loop can be replaced with llvm::any_of? |
Used an AssertZExt to keep information about zero-extended low bits.
There were a number of places where the compiler failed to optimize this correctly (found by LIT tests), so I fixed them as well.
This turned out much messier than I was hoping for...
I guess we have the following possibilities:
- Go back to the original patch.
- Go with something like this approach, maybe with a few small cleanups.
- Try to use target-specific nodes to dodge the legalization issues. Instead of using the target-independent nodes, introduce AArch64ISD::AssertZextBool/AArch64::ZERO_EXTEND_BOOL which have an i32 argument/result, and then DAGCombine them away after type legalization cleans up all the extend/truncate ops which make this harder to analyze. Maybe more code, but easier to understand the implications, I think.
You might want to add a testcase where the call isn't in the entry block; I think maybe the original patch handles this?
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp | ||
---|---|---|
3889 ↗ | (On Diff #368054) | You can't unconditionally truncate to i1 here, I think; this might be after type legalization. |
Added AArch64ISD::ASSERT_ZEXT_BOOL opcode to avoid legalization issues with AssertZExt.
Added a test case with a call not in the entry block.
Thanks a lot! I decided to try the option 3, and it seems to work fine.
You might want to add a testcase where the call isn't in the entry block; I think maybe the original patch handles this?
This case does not work well with SelectionDAG - the hint and the call are in different blocks, and thus not visible.
I noticed that zeroext flag is propagated correctly between blocks, so we might be able to do the same for the assertion.
I guess that ideally we need to propagate known bits instead, but I'm not sure if it is possible to do this.
For cross-block information, we do some limited tracking in target-independent code. See FunctionLoweringInfo::ComputePHILiveOutRegInfo. We could maybe mess with that code to track more information, or add a target hook to SelectionDAG around GetLiveOutRegInfo(). Actually, we could maybe even just feed the info from ComputePHILiveOutRegInfo directly into ComputeKnownBits; we actually keep all the information around already? (Not sure what the history is here...)
But we can leave that for a followup. The current SelectionDAG changes seem fine. I'm not familiar enough with the globalisel bits to approve that.
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp | ||
---|---|---|
553 | This can be LLT::scalar(8) | |
587 | If these are bool args then we can assume OrigTy.getScalarSizeInBits() == 1 right? | |
588–589 | MIRBuilder's builder methods return a MachineInstrBuilder that can be directly passed into most source operands of other builder methods. You can simplify these lines into something like: MIRBuilder.buildTrunc(OrigReg, MIRBuilder.buildAssertZExt(WideTy, WideReg, 1)); |
This can be LLT::scalar(8)