This is an archive of the discontinued LLVM Phabricator instance.

[ppc64] Don't apply sibling call optimization if callee has any byval arg
ClosedPublic

Authored by cycheng on Aug 12 2016, 2:17 AM.

Details

Summary

This is only a quick workaround: in some situations, e.g. when the caller's stack size is larger than the callee's, we would still be able to apply sibling call optimization even if the callee has a byval arg.

This patch fixes: https://llvm.org/bugs/show_bug.cgi?id=28328
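
The rule described by the title can be modeled as a simple predicate. This is a minimal sketch and not the code from the patch: struct arg_info and sibcall_allowed are hypothetical names introduced only for illustration, and a real eligibility check has to consider more conditions than this one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one outgoing call argument. */
struct arg_info {
  size_t size;   /* size of the argument in bytes */
  bool is_byval; /* true if an aggregate is passed by value */
};

/* Conservative rule: refuse sibling call optimization as soon as any
   argument is byval, without looking at stack sizes at all. */
static bool sibcall_allowed(const struct arg_info *args, size_t nargs) {
  for (size_t i = 0; i < nargs; ++i) {
    if (args[i].is_byval)
      return false;
  }
  return true;
}
```

Under this model a call that passes a 64-byte struct by value is always rejected, even in cases where the caller has enough stack space for the optimization to be safe. That lost opportunity is why the change is described as a workaround.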

Diff Detail

Event Timeline

cycheng updated this revision to Diff 67803.Aug 12 2016, 2:17 AM
cycheng retitled this revision from to [ppc64] Don't apply sibling call optimization if callee has any byval arg.
cycheng updated this object.
cycheng added reviewers: hfinkel, kbarton, nemanjai, amehsan.
cycheng added subscribers: llvm-commits, tjablin.

Note that gcc is able to do SCO when the caller uses more stack space than the callee. Consider this example:

#define noinline __attribute__((noinline))

struct test {
  long int a;
  char ary[56];
};
struct test gTest;

noinline int callee(struct test v, struct test *b) { 
  b->a = v.a; 
  return 0; 
}

void caller1(struct test a, struct test c, struct test *b) { callee(gTest, b); }
void caller2(struct test *b) { callee(gTest, b); }

Generated by gcc:

caller1:                                caller2:
0:  addis 2,12,.TOC.-0b@ha              0:  addis 2,12,.TOC.-0b@ha
    addi 2,2,.TOC.-0b@l                     addi 2,2,.TOC.-0b@l
    .localentry	caller1,.-caller1           .localentry	caller2,.-caller2
    std 3,32(1)                             mflr 0
    ld 3,160(1)                             addis 10,2,.LC1@toc@ha		# gpr load fusion, type long
    addis 11,2,.LC0@toc@ha                  ld 10,.LC1@toc@l(10)
    ld 11,.LC0@toc@l(11)                    std 0,16(1)
    std 31,-8(1)                            stdu 1,-112(1)
    std 4,40(1)                             std 3,96(1)
    std 5,48(1)                             ld 4,8(10)
    std 6,56(1)                             ld 3,0(10)
    std 3,96(1)                             ld 5,16(10)
    std 7,64(1)                             ld 6,24(10)
    std 8,72(1)                             ld 7,32(10)
    std 9,80(1)                             ld 8,40(10)
    std 10,88(1)                            ld 9,48(10)
    ld 31,-8(1)                             ld 10,56(10)
    ld 3,0(11)                              bl callee
    ld 4,8(11)                              addi 1,1,112
    ld 5,16(11)                             ld 0,16(1)
    ld 6,24(11)                             mtlr 0
    ld 7,32(11)                             blr
    ld 8,40(11)
    ld 9,48(11)
    ld 10,56(11)
    b callee

We can see that caller1 can sibcall callee (it ends with b callee), but caller2 cannot (it sets up its own frame and uses bl callee).
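
One plausible way to read the difference is to compare how much parameter space each function receives with how much callee needs. The sketch below is a simplified model under stated assumptions, not gcc's actual logic: the helper names are hypothetical, every parameter is assumed to occupy whole 8-byte doublewords laid out back to back, and ABI details such as the minimum size of the parameter save area are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Round a size up to a whole 8-byte doubleword. */
static size_t round_up_dw(size_t bytes) { return (bytes + 7) / 8 * 8; }

/* Bytes occupied by a parameter list when each parameter is placed
   in consecutive doubleword slots (simplifying assumption). */
static size_t param_area_bytes(const size_t *sizes, size_t n) {
  size_t total = 0;
  for (size_t i = 0; i < n; ++i)
    total += round_up_dw(sizes[i]);
  return total;
}

/* Simplified test: the callee's parameters must fit into the space
   in which the caller received its own parameters. */
static bool callee_args_fit(size_t caller_area, size_t callee_area) {
  return callee_area <= caller_area;
}
```

With a 64-byte struct test and an 8-byte pointer, callee needs 64 + 8 = 72 bytes, caller1 receives 64 + 64 + 8 = 136 bytes, and caller2 receives only 8 bytes. In this model caller1 can reuse its incoming parameter space, which is consistent with std 3,96(1) writing b into the slot right after the 64-byte struct that starts at 32(1). caller2 has no such space and allocates a 112-byte frame with stdu 1,-112(1) instead.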

hans added a subscriber: hans.Aug 16 2016, 11:19 AM

This is for a bug marked as 3.9 release blocker.

This looks like a reasonable defensive fix to me.

Hal: do you have a moment to take a look?

hfinkel accepted this revision.Aug 16 2016, 11:24 AM
hfinkel edited edge metadata.

Sure, LGTM.

This revision is now accepted and ready to land.Aug 16 2016, 11:24 AM
cycheng closed this revision.Aug 16 2016, 8:29 PM

Committed: r278900