This is a fix to the code in clang which inserts padding arguments to ensure that the ARM backend can emit AAPCS-VFP compliant code. This code needs to track the number of registers which have been allocated in order to do this. When passing a very large struct (>64 bytes) by value, clang emits IR which takes a pointer to the struct, but the backend converts this back to passing the struct in registers and on the stack. The bug was that this was being considered by clang to only use one register, meaning that there were situations in which padding arguments were incorrectly emitted by clang.
Details
Diff Detail
Event Timeline
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
4206 | Couldn't this be: NumRegs = (getContext().getTypeSize(Ty) + 63) / 32; markAllocatedGPRs(2, NumRegs); and avoid the extra multiplication? |
Inline comment
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
4206 | The division by 64 is part of rounding up to the nearest multiple of 64 [0], and this simplification would give different results if the size is, for example, 64 bits. That said, this code is only used when the size is greater than 64 bytes, meaning it will always use all available GPRs, so I only left the calculations in to make the intention clear. I wouldn't be averse to replacing both arms of the "if" with markAllocatedGPRs(1, 4) and a comment, if you would prefer that? [0] This is because all types in the AAPCS have a size which is a multiple of their alignment. |
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
4206 | Of course, silly me. I think making it 4 and explaining why would make more sense. |
Simplify, with comment explaining why this can be done.
This patch also fixes the (textual) alignment of the NumRequired parameter in the definition of ARMABIInfo::markAllocatedGPRs, which is present in the patch I uploaded but not showing up in Phab. This may be because it is a whitespace-only change?
Couldn't this be:
and avoid the extra multiplication?