This is an archive of the discontinued LLVM Phabricator instance.

[SVE][AArch64] Refine hasSVEArgsOrReturn
ClosedPublic

Authored by MattDevereau on Jun 7 2022, 5:28 AM.

Details

Summary

As described in aapcs64 (https://github.com/ARM-software/abi-aa/blob/2022Q1/aapcs64/aapcs64.rst#scalable-vector-registers) AAVPCS is used only when registers z0-z7 take an SVE argument. This fixes the case where floats occupy the lower bits of registers z0-z7 but SVE arguments in registers greater than z7 cause a function to use AAVPCS where it should use AAPCS.

Moving SVE function deduction from AArch64RegisterInfo::hasSVEArgsOrReturn to AArch64TargetLowering::LowerFormalArguments where physical register lowering is more accurate fixes this.

Diff Detail

Event Timeline

MattDevereau created this revision.Jun 7 2022, 5:28 AM
Herald added a project: Restricted Project. · View Herald Transcript
MattDevereau requested review of this revision.Jun 7 2022, 5:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2022, 5:28 AM
efriedma added inline comments.Jun 7 2022, 1:06 PM
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
81

Using isFloatTy() like this isn't going to be accurate. If you want to correctly compute the registers used for a call, you really should just use the result of analyzeCallOperands() from isel. Maybe compute it in isel, then save the result in AArch64FunctionInfo.

efriedma added inline comments.Jun 7 2022, 1:08 PM
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
81

Ignore my reference to analyzeCallOperands(). The actual computation of the registers used happens in AArch64TargetLowering::LowerFormalArguments.

MattDevereau edited the summary of this revision. (Show Details)

Moved SVE function deduction from AArch64RegisterInfo::hasSVEArgsOrReturn to AArch64TargetLowering::LowerFormalArguments
Added IsSVE bool to AArch64FunctionInfo which describes if an AArch64Function has SVE args or return type in physical registers

paulwalker-arm added inline comments.Jun 10 2022, 4:13 AM
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
69–85

Does AArch64FunctionInfo contain all the information we need? It looks like it contains a MachineFunction *. I ask because I'm wondering if we can just get rid of AArch64RegisterInfo::hasSVEArgsOrReturn entirely.

MattDevereau added inline comments.Jun 10 2022, 4:25 AM
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
69–85

AArch64TargetLowering::isEligibleForTailCallOptimization and AArch64AsmPrinter::emitFunctionEntryLabel() also use AArch64RegisterInfo::hasSVEArgsOrReturn. As AArch64FunctionInfo has a reference to a machine function, it should be possible to move isa<ScalableVectorType>(MF->getFunction().getReturnType()) into the new AArch64FunctionInfo::IsSVE method.

So yes, I do think AArch64FunctionInfo has all the information we need

MattDevereau updated this revision to Diff 435880.EditedJun 10 2022, 5:29 AM

Removed hasSVEArgsOrReturn

MattDevereau edited the summary of this revision. (Show Details)Jun 10 2022, 6:06 AM
This revision is now accepted and ready to land.Jun 10 2022, 11:53 AM
efriedma requested changes to this revision.Jun 10 2022, 12:05 PM
efriedma added inline comments.
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

Hmm... actually, looking at this again, I'm a little concerned about checking the IR return type directly, instead of using AnalyzeReturn. I'm not sure if this actually comes up in practice, but calling convention lowering does various transforms on the return type. Maybe if you write something silly like "<vscale x 1000 x double>".

Maybe we can add a bit of code to LowerFormalArguments to also check the return type?

This revision now requires changes to proceed.Jun 10 2022, 12:05 PM
Matt added a subscriber: Matt.Jun 10 2022, 3:14 PM
MattDevereau added inline comments.Jun 15 2022, 5:51 AM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

Modifying test @sve_caller_sve_callee() in sve-tailcall.ll to use return type <vscale x 1000 x double> caused the output "LLVM ERROR: Invalid size request on a scalable vector.".

In similar fashion to the changes in LowerFormalArguments, I checked the types of the return values in AArch64TargetLowering::LowerReturn instead of checking the IR return type directly. This ends up passing on the rest of the tests except sve_caller_sve_callee and sve_caller_sve_callee_fastcc, where the return type is not eligible for tail optimization and the frame-pointer is incorrectly unpreserved. Is directly checking the IR return type in AArch64TargetLowering::LowerReturn after AnalyzeReturn still not sufficient?

efriedma added inline comments.Jun 15 2022, 2:59 PM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

I don't think we call AArch64TargetLowering::LowerReturn for all functions, in general, only functions that return. We need to do the computation somewhere that's called for every function.

Besides my <vscale x 1000 x double> example, I don't have any specific cases where just checking isa<ScalableVectorType> doesn't work, but I'd like to avoid falling out of sync if we do ever add support for, for example, returning fixed-width vectors in SVE registers.

Modifying test @sve_caller_sve_callee() in sve-tailcall.ll to use return type <vscale x 1000 x double> caused the output "LLVM ERROR: Invalid size request on a scalable vector.".

That sounds like a bug we should fix. Although maybe not a high priority one.

sdesmalen added inline comments.
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

I don't think we call AArch64TargetLowering::LowerReturn for all functions, in general, only functions that return. We need to do the computation somewhere that's called for every function.

I'm not sure I understand this point.

For arguments this check is done in LowerFormalArguments, so that case is all covered. For the return value, I'd expect that LowerReturn is sufficient, because if the function doesn't return (and thus doesn't call LowerReturn), it also can't return a value in an SVE register.

201 ↗(On Diff #435880)

nit: Should this be something like UsesSVERegsForArgsOrReturnValue ?

efriedma added inline comments.Jun 16 2022, 10:12 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6196 ↗(On Diff #437524)

I don't understand what you're doing here; FuncInfo describes the caller, not the callee.

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

A function that doesn't return can still throw an exception.

MattDevereau added inline comments.Jun 16 2022, 10:36 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6196 ↗(On Diff #437524)

Not doing this causes sve_caller_sve_callee and sve_caller_sve_callee_fastcc in sve-tailcall.ll to fail.
From ISD::InputArg:

/// InputArg - This struct carries flags and type information about a
/// single incoming (formal) argument or incoming (from the perspective
/// of the caller) return value virtual register.
///
efriedma added inline comments.Jun 16 2022, 10:53 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6196 ↗(On Diff #437524)

In @sve_caller_sve_callee, the return type is an SVE type, so we should be calling setIsSVECC(true) elsewhere.

I think this maybe another case of the issue that LowerReturn doesn't run/runs too late?

sdesmalen added inline comments.Jun 16 2022, 3:25 PM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

If a function never returns because the only 'return' is an exceptional return, then it doesn't have to preserve the entire contents of z8-z23, which I think is what this patch tries to ascertain. Perhaps I was just reading too much into your statement, which confused me. I would have expected the function that analyses/lowers the return to be the right place to handle this, if it is called at the right time of course.

efriedma added inline comments.Jun 16 2022, 4:15 PM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

If we throw an exception, the unwinder has to restore z8-z23, I think? Or is there some carveout for unwinding that lets us get away without preserving those registers somehow? Is that documented somewhere?

There's also the issue that tail calls don't go through LowerReturn.

In any case, given that we're calling isSVECC() inside isel, we should ensure that all the setIsSVECC() happen before we start calling isSVECC(); anything else is confusing at best...

sdesmalen added inline comments.Jun 17 2022, 1:37 AM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

If we throw an exception, the unwinder has to restore z8-z23, I think? Or is there some carveout for unwinding that lets us get away without preserving those registers somehow? Is that documented somewhere?

I'm not entirely sure where this is officially documented (or whether this is just common practice of existing unwinders), but D84737 states that unwinders may only preserve the lower 64bits.

There's also the issue that tail calls don't go through LowerReturn.

In any case, given that we're calling isSVECC() inside isel, we should ensure that all the setIsSVECC() happen before we start calling isSVECC(); anything else is confusing at best...

Yes, I agree.

MattDevereau added inline comments.Jun 20 2022, 5:26 AM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

Is there any particular case where isSVECC() is likely to be called before setIsSVECC()? I've not observed this so far while working on this

efriedma added inline comments.Jun 20 2022, 2:08 PM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
199 ↗(On Diff #435880)

isEligibleForTailCallOptimization() calls isSVECC(); I expect that runs before LowerReturn().

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6186–6192 ↗(On Diff #437524)

Is this logic not the logic that this patch is trying to fix? This needs to be changed to make use of the isSVECC flag.

I see here that the calling convention of the Callee is based on this, but doesn't it need to account for the rules as described in the summary of the patch?

Thinking aloud, my interpretation of what's going on is that the underlying CallLoweringInfo::CallConv is C/Fast, because that's what the 'user' specified, but this is 'magically' implied to be a AArch64_SVE_VectorCall only when it matters during the lowering of call/return.

Architecturally this is a little confusing since it means we have these bits of code which 'patch up' the calling convention at the last moment. Is it possible we can fix the CallingConv earlier? I take it not easily, because the details of which registers are used is decided by e.g. AnalyzeReturn, and this only runs during the lowering.

MattDevereau planned changes to this revision.Jun 28 2022, 3:39 AM

Added logic to LowerFormalArguments to check the return type for scalable vector types. This is more reliable than checking the return types in LowerReturn as LowerReturn does not always run when checking calls, i.e. in the tail call case

This revision is now accepted and ready to land.Jun 29 2022, 1:05 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6202 ↗(On Diff #441023)

Note: "Ins" contains the return values of the call, and "Outs" contains the parameters to the call. I think these are confusingly named, at best.

This update of CallConv is a reference to CLI.CallConv, so it's setting the CallConv in the CLI, which gets passed around for various decision making.

I note that the logic for setting CalleeOutSVE is incorrect, because it does not account for the position of the arguments. If I put a scalable vector in any parameter it will update the calling convention, even though we know that's not the correct logic.

I didn't yet find if/where this matters. If it does not matter I think it may be best removed to save future confusion, since it's incorrect.

peterwaller-arm accepted this revision.Jun 30 2022, 3:30 AM
peterwaller-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6202 ↗(On Diff #441023)

With reflection, I'm changing my position on this for this patch, so please go ahead.

This revision was landed with ongoing or failed builds.Jul 1 2022, 6:26 AM
This revision was automatically updated to reflect the committed changes.