With the +crbits feature (enabled at >= O2), the current implementation
passes i1 stack arguments as follows: the caller writes a single byte at the
offset of the stack object, and the callee reads that single byte. Suppose two
boolean (i1) true values are passed as stack arguments:
  xx xx xx xx  <- r1 (the pointer to the previous frame, big-endian)
  01 ?? ?? ??  +8 (stored 0x01 to r1+0x8)
  01 ?? ?? ??  +c (stored 0x01 to r1+0xC)
According to the _PowerPC Processor ABI Supplement (September 1995)_, any type
smaller than i32 must be extended to i32 first. This was implemented correctly
before the +crbits feature was added:
  xx xx xx xx  <- r1
  00 00 00 01  +8 (0x01 extended to 0x00000001 and stored in big-endian)
  00 00 00 01  +c (0x01 extended to 0x00000001 and stored in big-endian)
The +crbits feature (introduced in r202451) makes i1 a special case that
stores only 8 bits into the 4-byte stack slot, and it did not handle the
alignment properly. On a big-endian architecture, the result is unpredictable
when a store and a load of different sizes are performed at the same address.
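The size mismatch can be illustrated with a small host-independent model of one
4-byte stack slot. This is only a sketch: `Slot`, `storeByte`, `storeWordBE`
and `loadWordBE` are hypothetical names invented for illustration and are not
part of LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one 4-byte stack slot, lowest address first.
using Slot = std::array<std::uint8_t, 4>;

// Store a single byte at the given offset inside the slot.
inline void storeByte(Slot &S, std::size_t Offset, std::uint8_t V) {
  assert(Offset < S.size() && "offset outside the slot");
  S[Offset] = V;
}

// Store a 32-bit word in big-endian order (most significant byte first).
inline void storeWordBE(Slot &S, std::uint32_t V) {
  S[0] = static_cast<std::uint8_t>(V >> 24);
  S[1] = static_cast<std::uint8_t>(V >> 16);
  S[2] = static_cast<std::uint8_t>(V >> 8);
  S[3] = static_cast<std::uint8_t>(V);
}

// Load a 32-bit word, interpreting the slot bytes in big-endian order.
inline std::uint32_t loadWordBE(const Slot &S) {
  return (static_cast<std::uint32_t>(S[0]) << 24) |
         (static_cast<std::uint32_t>(S[1]) << 16) |
         (static_cast<std::uint32_t>(S[2]) << 8) |
         static_cast<std::uint32_t>(S[3]);
}
```

In this model, a one-byte store at offset 0 followed by a word load yields a
value whose low three bytes are whatever was in the slot before, whereas a
full word store of the extended value reads back as 1 with either access size
at the matching offset.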
This patch fixes the bug by:
- Callee: adding an offset of 3 bytes (stack object size (4) - actual size (1)) when calling MFI.CreateFixedObject() in LowerFormalArguments_32SVR4();
- Caller: extending the i1 to i32 before passing it in LowerCall_32SVR4(), regardless of whether VA.isRegLoc or VA.isMemLoc holds.
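The arithmetic behind the two steps can be sketched as below. The helper names
`adjustedArgOffset` and `extendI1ToI32` are hypothetical and do not appear in
LLVM, and zero extension is an assumption made for this sketch; the actual
patch may choose the extension kind differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): on a big-endian target, a value
// narrower than its stack slot lives in the highest-addressed bytes of the
// slot, so the byte offset is moved forward by (slot size - value size).
inline std::int64_t adjustedArgOffset(std::int64_t SlotOffset,
                                      unsigned SlotSize, unsigned ArgSize) {
  assert(ArgSize <= SlotSize && "argument larger than its slot");
  return SlotOffset + static_cast<std::int64_t>(SlotSize - ArgSize);
}

// Hypothetical helper (not an LLVM API): widen an i1 to a full 32-bit word
// before it is passed. Zero extension is assumed here.
inline std::uint32_t extendI1ToI32(bool B) { return B ? 1u : 0u; }
```

For the two arguments in the example above, the slots at r1+0x8 and r1+0xC
give byte offsets 0xB and 0xF for a one-byte read, which is where the low byte
of the extended 0x00000001 is stored.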
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=38661