With the +crbits feature (enabled at -O2 and above), the current
implementation passes i1 stack arguments as follows: the caller writes a
single byte at the offset of the stack object, and the callee reads that
single byte. Say we pass two boolean (i1) `true` values as stack arguments:
    xx xx xx xx <- r1 (the pointer to the previous frame, big-endian)
    01 ?? ?? ?? +8 (stored 0x01 to r1+0x8)
    01 ?? ?? ?? +c (stored 0x01 to r1+0xC)
According to the _PowerPC Processor ABI Supplement (September 1995)_, any type
smaller than i32 must first be extended to i32. This was implemented correctly
before the +crbits feature was added:
    xx xx xx xx <- r1
    00 00 00 01 +8 (0x01 extended to 0x00000001 and stored in big-endian)
    00 00 00 01 +c (0x01 extended to 0x00000001 and stored in big-endian)
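The +crbits feature (introduced in r202451) makes i1 a special case: the value
fills only 8 bits of its 4-byte slot in the stack frame, and the implementation
did not handle the alignment of that byte within the slot properly. On a
big-endian architecture, storing and loading the same address with different
sizes gives an unpredictable result, because a 1-byte store writes the most
significant byte of the word that a 4-byte load reads back.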
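The size mismatch can be reproduced on any host with a small simulation. The following is only a sketch, not code from the patch: `Slot`, `storeByte`, `storeWordBE`, `loadByte` and `loadWordBE` are hypothetical helpers that model a 4-byte big-endian stack slot, and the 0xAA fill stands in for the `??` bytes above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one 4-byte stack slot; index 0 is the lowest address.
using Slot = std::array<std::uint8_t, 4>;

// 1-byte store at a byte offset inside the slot.
void storeByte(Slot &S, std::size_t Off, std::uint8_t V) { S[Off] = V; }

// 1-byte load from a byte offset inside the slot.
std::uint8_t loadByte(const Slot &S, std::size_t Off) { return S[Off]; }

// 4-byte big-endian store: the most significant byte goes to the lowest address.
void storeWordBE(Slot &S, std::uint32_t V) {
  S[0] = static_cast<std::uint8_t>(V >> 24);
  S[1] = static_cast<std::uint8_t>(V >> 16);
  S[2] = static_cast<std::uint8_t>(V >> 8);
  S[3] = static_cast<std::uint8_t>(V);
}

// 4-byte big-endian load, as a 32-bit word load would see the slot.
std::uint32_t loadWordBE(const Slot &S) {
  return (std::uint32_t(S[0]) << 24) | (std::uint32_t(S[1]) << 16) |
         (std::uint32_t(S[2]) << 8) | std::uint32_t(S[3]);
}
```

In this model, a byte store of 0x01 at offset 0 into a slot filled with 0xAA reads back as 0x01AAAAAA through `loadWordBE`, not 1. Conversely, after `storeWordBE(S, 1)` the value 1 is found by `loadByte` only at offset 3.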
This patch fixes the bug by:
- Callee: adding an offset (stack object size (4) - actual size (1) = 3 bytes) when calling MFI.CreateFixedObject() in LowerFormalArguments_32SVR4();
- Caller: extending the i1 to i32 before passing it in LowerCall_32SVR4(), regardless of whether VA.isRegLoc or VA.isMemLoc holds.
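The arithmetic behind the two items above can be sketched as follows. This is an illustration under assumptions, not the patch itself: `adjustedOffsetBE` and `zextI1ToI32` are hypothetical names, and the real change lives in LowerFormalArguments_32SVR4() and LowerCall_32SVR4().

```cpp
#include <cstdint>

// Hypothetical helper for the callee side: on a big-endian target, a value of
// ArgSize bytes that was extended to a SlotSize-byte slot starting at
// SlotOffset sits in the last ArgSize bytes of that slot.
constexpr unsigned adjustedOffsetBE(unsigned SlotOffset, unsigned SlotSize,
                                    unsigned ArgSize) {
  return SlotOffset + (SlotSize - ArgSize);
}

// Hypothetical helper for the caller side: zero-extend an i1 to a full
// 32-bit word before it is passed.
constexpr std::uint32_t zextI1ToI32(bool B) { return B ? 1u : 0u; }

// For the example frame above: the i1 slots at r1+0x8 and r1+0xC are read
// as single bytes at r1+0xB and r1+0xF.
static_assert(adjustedOffsetBE(0x8, 4, 1) == 0xB, "slot offset + 3 bytes");
static_assert(adjustedOffsetBE(0xC, 4, 1) == 0xF, "slot offset + 3 bytes");
static_assert(zextI1ToI32(true) == 0x00000001u, "i1 true extends to 1");
```

With both sides agreeing on this layout, a 1-byte load by the callee and a 4-byte store by the caller refer to the same least significant byte.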
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=38661
In your test case, the arguments have the zeroext attribute. That seems right. However, that being the case, should this be checking for i1, or should we check for Flags.isZExt()?