Previously, when a packed struct containing vector data types such as
uint16x8_t was passed as a function argument, the alignment the caller
used for the struct did not match the alignment the callee used to load
the argument from the stack.
This patch implements section 6.8.2, stage C.4 of the Procedure Call
Standard for the Arm 64-bit Architecture (AAPCS64): "If the argument is
an HFA, an HVA, a Quad-precision Floating-point or short vector type
then the NSAA is rounded up to the next multiple of 8 if its natural
alignment is ≤ 8 or the next multiple of 16 if its natural alignment
is ≥ 16." This ensures that the caller and the callee both use the
alignment the AAPCS64 prescribes for packed structs passed as function
arguments.
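As a rough illustration of the quoted rule only, the NSAA rounding can be sketched as below. This is not the code in the patch: the helper names alignTo and roundNSAAForVectorArg are hypothetical, and the sketch assumes natural alignments are powers of two, so every value is either ≤ 8 or ≥ 16.

```cpp
#include <cstdint>

// Hypothetical helper (not Clang's API): round `value` up to the next
// multiple of `align`. Assumes `align` is non-zero.
constexpr std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Sketch of AAPCS64 stage C.4 as quoted above, for an HFA, HVA,
// quad-precision floating-point, or short vector argument: the NSAA is
// rounded up to a multiple of 8 when the natural alignment is <= 8, and
// to a multiple of 16 when it is >= 16.
constexpr std::uint64_t roundNSAAForVectorArg(std::uint64_t nsaa,
                                              std::uint64_t naturalAlign) {
  return alignTo(nsaa, naturalAlign <= 8 ? 8 : 16);
}
```

Under these assumptions, an argument with natural alignment 1 (as a packed struct may have) placed at NSAA 4 is rounded to 8 rather than kept at 4, while one with natural alignment 16 at NSAA 8 is moved to 16.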
Reference:
AAPCS64 (https://github.com/ARM-software/abi-aa/blob/latest-release/aapcs64/aapcs64.rst)
I've always felt the data flow in this function was excessively convoluted. Let's work through it to figure out what's going on. Ignoring the AIX stuff, which I assume can't coincide with AArch64, we've got:
where MaxAlignmentInChars is the highest value of all the alignment attributes on the field and MaxFieldAlignment is the value of #pragma pack that was active at the time of the struct definition.
Note that this gives us PackedFieldAlign <= FieldAlign <= UnpackedFieldAlign.
So:
Also, AAPCS64 seems to define UnadjustedAlignment as the "natural alignment", and there's a doc comment saying it's the max of the type alignments. That makes me wonder if we should really be considering either the aligned attribute or #pragma pack in this computation at all; maybe we should just be looking at the type alignment.