This is an archive of the discontinued LLVM Phabricator instance.

Fix i386 stack alignment for parameter type with breakdowns
Needs ReviewPublic

Authored by wxiao3 on Apr 27 2019, 8:22 AM.



Parameters are broken down to register types for value passing. E.g,
float128 is broken to 4Xi32. But current implementation doesn't
take the alignment of original type into account when allocating
stack space for the first piece of the broken down parameter. E.g,
float128 parameter is passed as 4 byte-aligned instead of 16
byte-aligned which is inconsistent with i386 ABI. This will cause
runtime failure if generated code calling to libraries built by
other compiler such as GCC which follows i386 ABI.

This patch fixes the bug by taking original alignment into account.

Diff Detail

Event Timeline

wxiao3 created this revision.Apr 27 2019, 8:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2019, 8:22 AM
wxiao3 edited the summary of this revision. (Show Details)Apr 27 2019, 8:24 AM

Hi all,

Any comments?

rnk added a comment.May 22 2019, 2:10 PM

IIUC, this bug is only present when SSE2 is disabled, and your change only has any effect on tests that don't enable SSE2. The normal calling convention rules for vectors pass them in XMM0-2, and then in 16 byte aligned stack memory. So, in most normal operation, LLVM does the right thing.

The fix itself is pretty obscure. You're making a change to the way all i32 and f32 arguments passed in memory are aligned, but I think you probably want the same fix for v2f64, which is probably broken down into two doubles, and handled by this 32-bit f64 rule:

// Doubles get 8-byte slots that are 4-byte aligned.
CCIfType<[f64], CCAssignToStack<8, 4>>,

No, in most normal operation for x86_32, LLVM does the wrong thing. A simple example to show the ABI bug:

$ cat test.c
void callee(int, __float128);
void caller() {
  callee(1, 2.q);
$ $ clang -m32 -S test.c -o -
caller:                                 # @caller
# %bb.0:                                # %entry
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        movl    %esp, %eax
        movl    $1073741824, 16(%eax)   # imm = 0x40000000
        movl    $0, 12(%eax)
        movl    $0, 8(%eax)
        movl    $0, 4(%eax)
        movl    $1, (%eax)
        calll   callee
        addl    $24, %esp
        popl    %ebp

$ gcc -m32 -S test.c -o -
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        subl    $8, %esp
        subl    $16, %esp
        movl    $0, (%esp)
        movl    $0, 4(%esp)
        movl    $0, 8(%esp)
        movl    $1073741824, 12(%esp)
        subl    $12, %esp
        pushl   $1
        call    callee
        addl    $32, %esp

For 32bit, float128 is always passed by memory even with sse/avx2/avx512 enabled.
And float128 is broken to 4xi32 for x86_32.