This fixes the case where callees with SVE arguments outside of the z0-z7
range were incorrectly deduced as SVE calling convention functions
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
6752 | (A && B) || A simplifies to A. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
6752 | Oops, didn't see that one. Thanks |
I've simplified the pointless if(A && B || A) check. This was leftover from when I copied and simplified the code from AArch64TargetLowering::LowerFormalArguments which has seperate logic for predicates and scalable vectors like so:
if (VA.isRegLoc()) { // Arguments stored in registers. EVT RegVT = VA.getLocVT(); const TargetRegisterClass *RC; if (RegVT == MVT::i32) RC = &AArch64::GPR32RegClass; else if (RegVT == MVT::i64) RC = &AArch64::GPR64RegClass; else if (RegVT == MVT::f16 || RegVT == MVT::bf16) RC = &AArch64::FPR16RegClass; else if (RegVT == MVT::f32) RC = &AArch64::FPR32RegClass; else if (RegVT == MVT::f64 || RegVT.is64BitVector()) RC = &AArch64::FPR64RegClass; else if (RegVT == MVT::f128 || RegVT.is128BitVector()) RC = &AArch64::FPR128RegClass; else if (RegVT.isScalableVector() && RegVT.getVectorElementType() == MVT::i1) { FuncInfo->setIsSVECC(true); RC = &AArch64::PPRRegClass; } else if (RegVT.isScalableVector()) { FuncInfo->setIsSVECC(true); RC = &AArch64::ZPRRegClass; } else llvm_unreachable("RegVT not supported by FORMAL_ARGUMENTS Lowering");
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
6747–6748 | I can't think of a way that you can get a returned scalable type which doesn't go in a register, but it doesn't seem inconceivable it may be possible in the distant future. Since we're leaving a subtlety here it seems worth a brief comment of this form: // Check the types of the returned values. For the time being it is sufficient to check the VT as the RegVT is not known at this point. | |
6750 | You could also break once the CallConv is set. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
6747–6748 | Can we handle this in the same way as we do for arguments. The necessary work is done within LowerCallResult. It looks like every target does something special with LowerCall our only caller so I think we should hoist the necessary logic to construct RVLocs into LowerCall and pass it into LowerCallResult instead of Ins. Doing this means we can use the same lambda as above to give us a bit of future proofness (I don't expect extra test coverage because I don't believe this is all that well defined given there's no way to write a suitable example in C). | |
6749 | Not sure if there's a coding standard rule against this but it's not the typically style used. Most would write: if (EVT RegVT = ArgLocs[i].getLocVT()) if (RegVT.isScalableVector()) CallConv = CallingConv::AArch64_SVE_VectorCall; That said I think relying on the ValueType is fragile. Given ArgLocs contains actual register assignments would the following also work instead of the original lambda function? if (!Loc.isRegLoc()) return false; return AArch64::ZPRRegClass.contains(Loc.getLocReg()) || AArch64::PPRRegClass.contains(Loc.getLocReg()); | |
llvm/test/CodeGen/AArch64/sve-calling-convention-mixed.ll | ||
478 | Given the context of the patch I think you could do with a few more tests to ensure all the code paths are covered. For example: non_sve_caller_non_sve_high_range_callee non_sve_high_range_caller_non_sve_high_range_callee sve_caller_non_sve_high_range_callee - this is the function you have today sve_ret_caller_non_sve_high_range_callee - as above but with no sve arguments It's also worth adding another aapcs# callee test whereby the callee has an out of range SVE argument but returns an SVE result. | |
575 | It's better for this to be a declared function only so there's no chance of something peeking into the called function. If non_sve_callee_high_range serves no other purpose then you can just delete its definition. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
6747–6748 | Is it possible to hoist this cleanly and eliminate all of the RVLocs assignments in LowerCallResult? The work done in LowerCallResult is dependent on CallConv which is set by the CalleeInSVE/CalleeOutSVE lambdas here. When 1:1 hoisting the RVLocs work in LowerCallResult to replace CalleeInSVE here I'm seeing test failures in arm64_32-fastisel.ll. |
@paulwalker-arm In the interest of moving this ticket along I've implemented most of your feedback with the exception of hoisting the work done inside LowerFormalArguments into LowerCall. This is because LowerFormalArguments expects CallConv to be changed by CalleeInSVE/CalleeOutSVE in LowerCall. Removing the work in LowerFormalArguments causes test failures.
I don't understand what you mean by "exception of hoisting the work done inside LowerFormalArguments into LowerCall". This is not something I requested is it? I mentioned passing RVLocs directly into LowerCallResult in place of the current Ins parameter because we shouldn't need to compute it twice. I've just tried this myself with your patch and see no issues.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
6753–6756 | You don't need the Callee... local variables anymore but I guess they don't hurt. |
@paulwalker-arm I'm referring to this comment, particularly the sentence "I think we should hoist the necessary logic to construct RVLocs into LowerCall and pass it into LowerCallResult instead of Ins" which I assumed was referring to this whole block at the start of LowerCallResult here
SDValue AArch64TargetLowering::LowerCallResult( SDValue Chain, SDValue InFlag, CallingConv::ID CallConv, bool isVarArg, const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL, SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals, bool isThisReturn, SDValue ThisVal) const { CCAssignFn *RetCC = CCAssignFnForReturn(CallConv); // Assign locations to each value returned by this call. SmallVector<CCValAssign, 16> RVLocs; DenseMap<unsigned, SDValue> CopiedRegs; CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), RVLocs, *DAG.getContext()); CCInfo.AnalyzeCallResult(Ins, RetCC);
Is it not the case that CCInfo(CallConv... here is part of the necessary logic to construct RVLocs here and it is dependent on the work done in LowerCall, which can set CallConv to CallingConv::AArch64_SVE_VectorCall and affect this block of code? I do not see all tests passing when I move this entire block. It sounds like we're not on the same page on the word hoist which I assumed meant cut and paste everything related to RVLocs in LowerCallResult to LowerCall. Did you mean to keep the lines containing CCInfo in LowerCallResult but pass RVLocs by a non-const reference?
Hoisted RVLocs work out of LowerFormalArguments
Added further test coverage for SVE/Non-SVE caller-callee situations
I can't think of a way that you can get a returned scalable type which doesn't go in a register, but it doesn't seem inconceivable it may be possible in the distant future. Since we're leaving a subtlety here it seems worth a brief comment of this form: