This is an attempt to fix issue: https://github.com/llvm/llvm-project/issues/55900
PPC64_SVR4_ABIInfo passes by-value aggregates that fit in one register using a coerced integer type:
https://github.com/llvm/llvm-project/blob/51d33afcbe0a81bb8508d5685f38dc9fdb2b60c9/clang/lib/CodeGen/TargetInfo.cpp#L5351
In the issue, the aggregate is passed as an i8 parameter. On big-endian targets, once the register content is stored to memory, the char ends up at byte offset 7 of the 8-byte slot. However, the current PPC64_SVR4_ABIInfo::EmitVAArg() generates the argument access using the original type, so there is a type mismatch between caller and callee.
This patch teaches PPC64_SVR4_ABIInfo::EmitVAArg() about the type coercion. I'm not sure whether this should be fixed in clang or in the backend, but clang seems more likely, since it already has logic that handles arguments smaller than a slot:
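To make the mismatch concrete, here is a minimal sketch in plain C. It is not clang code: it only models an 8-byte argument slot, and every name in it (store_slot_be, pass_coerced_i8, and so on) is hypothetical. It assumes the coerced i8 value sits in the low-order bits of the register, as described above.

```c
#include <assert.h>
#include <stdint.h>

enum { SLOT_SIZE = 8 };  /* one 64-bit argument slot */

/* Store a 64-bit register value to memory in big-endian byte order,
   independent of the host's own endianness. */
void store_slot_be(uint64_t reg, unsigned char slot[SLOT_SIZE]) {
    for (int i = 0; i < SLOT_SIZE; ++i)
        slot[i] = (unsigned char)(reg >> (8 * (SLOT_SIZE - 1 - i)));
}

/* Caller side (assumed model): the one-char aggregate is coerced to i8,
   so the value occupies the low-order bits of the register. */
uint64_t pass_coerced_i8(char c) {
    return (uint64_t)(unsigned char)c;
}

/* Callee side, original-type access: read the char at the start of the slot. */
char read_at_slot_start(const unsigned char slot[SLOT_SIZE]) {
    return (char)slot[0];
}

/* Callee side, coercion-aware access: right-adjust by SLOT_SIZE - sizeof(char). */
char read_right_adjusted(const unsigned char slot[SLOT_SIZE]) {
    return (char)slot[SLOT_SIZE - sizeof(char)];
}

/* Walk through the mismatch for the char 'A'. */
void demo_mismatch(void) {
    unsigned char slot[SLOT_SIZE];
    store_slot_be(pass_coerced_i8('A'), slot);
    assert(slot[7] == 'A');                    /* value lands in the last byte */
    assert(read_at_slot_start(slot) == 0);     /* original-type read misses it */
    assert(read_right_adjusted(slot) == 'A');  /* adjusted read finds it */
}
```

Calling demo_mismatch() from a main function exercises all three checks; the middle one is the caller/callee disagreement that the patch is meant to remove.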
https://github.com/llvm/llvm-project/blob/51d33afcbe0a81bb8508d5685f38dc9fdb2b60c9/clang/lib/CodeGen/TargetInfo.cpp#L356
Please help review and let me know if you have any comments. Thank you!
Argh, Phabricator dropped one of my comments, and it's the one that explains why I CC'ed Tim Northover.
I'm a little worried about the existing uses of this function because this function is sensitive to the type produced by ConvertTypeForMem. ConvertTypeForMem *mostly* only generates IR struct types for C structs and unions, but there are a few places where it generates an IR struct for some fundamental type that stores multiple values. Most of those types are at least as large as an argument slot (e.g. they contain pointers), unless there's some weird target with huge slots. However, some of them are not; I think the most important example is _Complex T, which of course gets translated into a struct containing two Ts. So if T is smaller than half an argument slot, we're not going to right-align _Complex T on big-endian targets other than PPC64, and I don't know if that's right.
That would affect _Complex _Float16 on 64-bit targets; on 32-bit targets, I think you'd need something obscure like _Complex char to exercise it.
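As a rough illustration of the concern, the sketch below models the right-alignment decision as I read it: an argument is right-adjusted only when it is smaller than a slot, the target is big-endian, and its memory type is not an IR struct. This is an assumption about the check, written as standalone C with hypothetical names, not the actual clang function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte offset of an argument within its va_list slot under the assumed rule.
   All parameter names are hypothetical. */
size_t va_arg_slot_offset(size_t direct_size, size_t slot_size,
                          bool big_endian, bool is_ir_struct) {
    if (direct_size < slot_size && big_endian && !is_ir_struct)
        return slot_size - direct_size;  /* right-adjusted in the slot */
    return 0;                            /* left at the start of the slot */
}

/* The cases mentioned in the comment above. */
void demo_complex_cases(void) {
    /* A 4-byte scalar in an 8-byte slot, big-endian: right-adjusted. */
    assert(va_arg_slot_offset(4, 8, true, false) == 4);
    /* _Complex _Float16 (2 x 2 bytes, IR struct) in an 8-byte slot: not adjusted. */
    assert(va_arg_slot_offset(4, 8, true, true) == 0);
    /* _Complex char (2 x 1 byte, IR struct) in a 4-byte slot: not adjusted. */
    assert(va_arg_slot_offset(2, 4, true, true) == 0);
}
```

Under this model, a 4-byte scalar and a 4-byte _Complex _Float16 land at different offsets within the same-sized slot, which is the discrepancy in question.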
Now, if Clang generates arguments for one of these types using a single value that's also of IR struct type, and the backend considers that when deciding whether to right-align arguments, then maybe those two decisions cancel out and we've at least got call/va_arg compatibility, even if it's not necessarily what's formally specified by the appropriate psABI. But DirectTy is definitely not necessarily the type that call-argument lowering will use, so I'm a little worried.