
[AArch64] Implement initial SVE calling convention support
Closed, Public

Authored by c-rhodes on Jul 30 2019, 6:48 AM.

Details

Summary

This patch adds initial support for the SVE calling convention such that
SVE types can be passed as arguments and return values to/from a
subroutine.

The SVE AAPCS states [1]:

z0-z7 are used to pass scalable vector arguments to a subroutine,
and to return scalable vector results from a function. If a
subroutine takes arguments in scalable vector or predicate
registers, or if it is a function that returns results in such
registers, it must ensure that the entire contents of z8-z23 are
preserved across the call. In other cases it need only preserve the
low 64 bits of z8-z15, as described in §5.1.2.

p0-p3 are used to pass scalable predicate arguments to a subroutine
and to return scalable predicate results from a function. If a
subroutine takes arguments in scalable vector or predicate
registers, or if it is a function that returns results in these
registers, it must ensure that p4-p15 are preserved across the call.
In other cases it need not preserve any scalable predicate register
contents.
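As a rough illustration of the preservation rule quoted above, the sketch below encodes which SVE state a callee has to keep intact, depending on whether it takes or returns values in scalable vector or predicate registers. The struct and function names are hypothetical and are not part of LLVM or of this patch.

```cpp
#include <cassert>

// Hypothetical summary type (not an LLVM class): the SVE state a callee
// must preserve across a call.
struct PreservedSveState {
  int firstZ, lastZ; // Z registers that carry a preservation requirement
  bool fullZ;        // true: entire contents; false: only the low 64 bits
  int firstP, lastP; // callee-saved P registers; -1/-1 means none
};

// usesSveArgsOrReturns: the callee takes arguments in, or returns results
// in, scalable vector or predicate registers.
constexpr PreservedSveState preservedState(bool usesSveArgsOrReturns) {
  return usesSveArgsOrReturns
             ? PreservedSveState{8, 23, true, 4, 15}    // z8-z23, p4-p15
             : PreservedSveState{8, 15, false, -1, -1}; // low 64 bits of z8-z15
}

// Compile-time checks of the two cases described in the quoted text.
static_assert(preservedState(true).lastZ == 23, "SVE callee preserves z8-z23");
static_assert(preservedState(false).lastP == -1, "no P registers preserved");
```

The second case is the pre-existing base AAPCS rule, which is why code compiled without SVE remains compliant.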

SVE predicate and data registers are passed indirectly (i.e. spilled to the
stack, with the address passed instead) once the argument registers defined by
the PCS referenced above are exhausted. SVE stack support has not been merged
yet, so SVE registers cannot currently be spilled to the stack; for now an
llvm_unreachable marks the place where this case will eventually be handled.
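A minimal sketch of the assignment described above, assuming a simple in-order allocator: data vectors take z0-z7, predicates take p0-p3, and anything left over is marked as passed indirectly. The types and the function assignSveArgs are hypothetical names for illustration and are not how the patch implements it in LLVM. The sketch does not model where the address of an indirect argument goes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative model only; these names do not correspond to LLVM classes.
enum class SveArgKind { DataVector, Predicate };

struct ArgLocation {
  bool indirect;   // true: spilled to memory and passed by address
  std::string reg; // "z0".."z7" or "p0".."p3"; empty when indirect
};

// Assigns each SVE argument, in order, to the next free register of its
// class, falling back to indirect passing once that class is exhausted.
std::vector<ArgLocation> assignSveArgs(const std::vector<SveArgKind> &args) {
  unsigned nextZ = 0, nextP = 0;
  std::vector<ArgLocation> locs;
  for (SveArgKind kind : args) {
    if (kind == SveArgKind::DataVector && nextZ < 8)
      locs.push_back({false, "z" + std::to_string(nextZ++)});
    else if (kind == SveArgKind::Predicate && nextP < 4)
      locs.push_back({false, "p" + std::to_string(nextP++)});
    else
      locs.push_back({true, ""}); // the case the patch leaves unreachable
  }
  return locs;
}
```

With nine data-vector arguments, for example, the first eight land in z0-z7 and the ninth takes the indirect path, which is the case this revision does not yet handle.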

[1] https://static.docs.arm.com/100986/0000/100986_0000.pdf

Diff Detail

Repository
rL LLVM

Event Timeline

c-rhodes created this revision. Jul 30 2019, 6:48 AM

Out of interest, why does the ABI allow functions which don't have SVE args/returns to clobber the P registers? For Z registers, we've got to be compatible with old code which only needed to save the bottom half of v8-v15, but there should be no existing code which uses P registers, so we could enforce a mixture of callee- and caller-saved P registers for all code. Existing code is already compliant with this, because it doesn't touch the P regs.

lib/Target/AArch64/AArch64ISelLowering.cpp
3968 ↗(On Diff #212324)

Comments should be complete sentences.

test/CodeGen/AArch64/sve-calling-convention.ll
9 ↗(On Diff #212324)

These must be in this order, so shouldn't be CHECK-DAG.

32 ↗(On Diff #212324)

These should use a regex for the vreg number on the right.

34 ↗(On Diff #212324)

Can we check that the return value is pulled out of the right register?

> Out of interest, why does the ABI allow functions which don't have SVE args/returns to clobber the P registers? For Z registers, we've got to be compatible with old code which only needed to save the bottom half of v8-v15, but there should be no existing code which uses P registers, so we could enforce a mixture of callee- and caller-saved P registers for all code. Existing code is already compliant with this, because it doesn't touch the P regs.

One reason is that syscalls do not need to preserve any state that they wouldn't preserve without SVE, so all the predicate registers are clobbered by syscalls (see https://www.kernel.org/doc/Documentation/arm64/sve.txt). This seemed better than forcing the kernel to preserve the registers when they are only intended for short-term working data.

There's also the setjmp/longjmp problem: you could longjmp from a normal function to a normal function in a way that effectively unwinds through an SVE function. The SVE function's epilogue never runs, so the predicate registers it was supposed to restore are not restored unless extra state is added to jmp_buf, which is something we wanted to avoid (and would be difficult to do in a backward-compatible, length-agnostic way).
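The control flow behind the setjmp/longjmp concern can be sketched in portable C++ without any SVE code. In the sketch below, middle() merely stands in for an SVE function that would restore callee-saved predicate registers in its epilogue; all function names are hypothetical, and the flag is only a marker for "the epilogue ran".

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf env;
static bool middleEpilogueRan = false;

// Stands in for a normal (non-SVE) function deep in the call chain.
static void inner() { std::longjmp(env, 1); }

// Stands in for an SVE function. Anything after the call to inner()
// models its epilogue, where saved predicate registers would be restored.
static void middle() {
  inner();
  middleEpilogueRan = true; // never reached: longjmp skips this frame
}

// Stands in for the normal function that called setjmp. Returns true if
// control came back here without middle()'s epilogue having run.
bool outerSkipsMiddleEpilogue() {
  middleEpilogueRan = false;
  if (setjmp(env) == 0)
    middle();
  return !middleEpilogueRan;
}
```

Since jmp_buf is filled in by the normal function, it has no room for SVE state, so nothing in this path could put the predicate registers back.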

c-rhodes updated this revision to Diff 212585. Jul 31 2019, 8:15 AM
  • Improved tests
  • Fixed up comments

ostannard accepted this revision. Aug 1 2019, 9:14 AM

LGTM, thanks

This revision is now accepted and ready to land. Aug 1 2019, 9:14 AM

@ostannard Thanks for the review!

I'll merge when the patch adding mappings between scalable MVT types and IR types lands (D47770).

This revision was automatically updated to reflect the committed changes.