This is an archive of the discontinued LLVM Phabricator instance.

[AIX] Enable frame pointer for AIX and add related test suite
ClosedPublic

Authored by Xiangling_L on Jan 9 2020, 7:13 AM.

Details

Summary

This patch:

  • enable frame pointer for AIX
  • update some of red zone comments[we may need an extra NFC patch to address all red zone related comments]
  • add testcases

Diff Detail

Event Timeline

Xiangling_L created this revision.Jan 9 2020, 7:13 AM
ZarkoCA added inline comments.Jan 10 2020, 7:00 AM
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
62–63

The comment talks about the Darwin ABI but it looks that part of the code is removed. Should the comment be edited or am I misunderstanding?

llvm/lib/Target/PowerPC/PPCSubtarget.h
298

Can this be rewritten to:
On the AIX ABI, the scratch area is called the stack floor as opposed to the red zone. The stack floor is 220 bytes lower than the stack pointer in 32-bit mode and 288 bytes lower in 64-bit mode? Does this capture the same meaning?

llvm/test/CodeGen/PowerPC/Frames-alloca.ll
1

Is this testing the 64-bit ELFV1 ABI or the 64-bit ELFV2 ABI? Basically I'm asking whether this should be powerpc64le-unknown-linux-gnu or is this ok?

15

typo: should be referred, this happens in other test cases below

llvm/test/CodeGen/PowerPC/aix-frames-alloca-with-func-call.ll
47 ↗(On Diff #237075)

Thanks for including this test in this form. We can update it once we enable this.

Xiangling_L marked 10 inline comments as done.Jan 10 2020, 12:44 PM
Xiangling_L added inline comments.
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
62–63

It is because we have the same logic for SVR4, AIX and Darwin, so I skip the isDarwinABI() check. Maybe I can adjust the structure of the comments like:

Darwin comments...;
SVR4 ABI/AIX ABI:...;

return STI.isPPC64() ? -8U : -4U;

Please let me know if you think it would help.

llvm/lib/Target/PowerPC/PPCSubtarget.h
298

Yes, I think so. I will adjust the comment based on your suggestion. Thanks.

llvm/test/CodeGen/PowerPC/Frames-alloca.ll
1

The default ABI for 64bit BE on Linux platform is ELFV1 ABI, the which AIX ABI is similar to. So it should be fine to leave it as it is.

15

Thanks. updated it.

llvm/test/CodeGen/PowerPC/aix-frames-alloca-with-func-call.ll
47 ↗(On Diff #237075)

No problem.

Xiangling_L marked 5 inline comments as done.

Update comments and fix typo in the testcase;

ZarkoCA added inline comments.Jan 13 2020, 10:48 AM
llvm/test/CodeGen/PowerPC/Frames-alloca.ll
1

Thanks for the explanation.

15

Sorry, I wasn't clear the same typo happens in the other test case files and not only in this one. I've marked the typos I caught.

llvm/test/CodeGen/PowerPC/Frames-large.ll
26

s/referrd/referred/

llvm/test/CodeGen/PowerPC/aix-frames-alloc-char.ll
18 ↗(On Diff #237421)

s/referrd/referred/

llvm/test/CodeGen/PowerPC/aix-frames-alloca-with-func-call.ll
19 ↗(On Diff #237421)

s/referrd/referred/

26 ↗(On Diff #237421)

Can the #0 be removed?

llvm/test/CodeGen/PowerPC/aix-frames-dyn-multi-alloc.ll
17 ↗(On Diff #237421)

s/referrd/reffered/

llvm/test/CodeGen/PowerPC/aix-frames-small.ll
20 ↗(On Diff #237421)

s/referrd/referred/

llvm/test/CodeGen/PowerPC/aix32-frames-stack-floor.ll
10 ↗(On Diff #237421)

s/referrd/referred/

llvm/test/CodeGen/PowerPC/aix64-frames-stack-floor.ll
10 ↗(On Diff #237421)

s/referrd/referred/

I think this patch is really close, just needs some typo fixes. I'm hesitant to approve since no one else has had a chance to look at it.

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
62–63

Yes, I think that's better thanks.

llvm/test/CodeGen/PowerPC/Frames-alloca.ll
13

I find the comments in the tests really helpful in understanding what is happening and what to expect in the assembly, thanks for including these.

llvm/test/CodeGen/PowerPC/aix32-frames-stack-floor.ll
1 ↗(On Diff #237421)

Nit: can you combine this and aix43-frames-stack-floor.ll in one file?

Xiangling_L marked 13 inline comments as done.Jan 19 2020, 3:25 PM
Xiangling_L added inline comments.
llvm/test/CodeGen/PowerPC/Frames-alloca.ll
15

Thank you.

llvm/test/CodeGen/PowerPC/aix32-frames-stack-floor.ll
1 ↗(On Diff #237421)

The testcase for 32bit and 64bit are different in terms of how much bytes allocated will pass the stack floor limit. So that's why I separate them out.

Xiangling_L marked 2 inline comments as done.

Fix typos

Preliminary comments. I still need to go through the test changes more thoroughly.

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
62–63

I think it's ok to start removing darwin logic. I think the comment block should be discarded.

llvm/lib/Target/PowerPC/PPCSubtarget.h
294–295

Suggestion to pull the comments into the logic:

unsigned getRedZoneSize() const {
  if (isDarwinABI())
    return 224;
  if (isPPC64())
    // 288 bytes = 18*8 (FPRs) + 18*8 (GPRs, GPR13 reserved)
    return 288;
  // AIX PPC32: 220 bytes = 18*8 (FPRs) + 19*4 (GPRs)
  // PPC32 SVR4ABI has no redzone.
  return isAIXABI() ? 220 : 0;
}
298

I think this comment is unnecessary because it does not add anything beyond what is readily discerned from the code. What is useful is how the stack floor / redzone is calculated.

llvm/test/CodeGen/PowerPC/Frames-alloca.ll
2

Why did you drop the -NOFP from the prefix? Is the output expected to be the same between default and -frame-pointer=all?

4

isn't this redudant to line 1?

llvm/test/CodeGen/PowerPC/aix64-frames-stack-floor.ll
1 ↗(On Diff #239012)

Why isn't there a PPC32 variant of this test?

Xiangling_L marked 8 inline comments as done.Jan 22 2020, 7:00 PM
Xiangling_L added inline comments.
llvm/test/CodeGen/PowerPC/Frames-alloca.ll
2
  1. Because -frame-pointer=all means disable frame pointer optimization elimination. I kinda feel NOFP here is misleading. And also it's not consistent with other LINUX frame pointer tests naming convention.
  1. And yes, the output is expected to be same.
4

This is to check if the default and -frame-pointer=all behave the same, the answer of which is they should be in this case.

llvm/test/CodeGen/PowerPC/aix64-frames-stack-floor.ll
1 ↗(On Diff #239012)

They are in another file named aix32-frames-stack-floor.ll.

cebowleratibm added inline comments.Jan 23 2020, 5:20 AM
llvm/test/CodeGen/PowerPC/Frames-alloca.ll
2

I'm confused. Because the llc default is opt enabled, I would expect frame pointer elimination to be enabled.

Xiangling_L marked 8 inline comments as done.Jan 26 2020, 4:32 PM
Xiangling_L added inline comments.
llvm/test/CodeGen/PowerPC/Frames-alloca.ll
2

Yes, the frame pointer elimination is enabled. But frame pointer elimination really means that only when frame pointer is not totally necessary, llc can do optimization and eliminate the frame pointer. In this testcase, because we have variable length argument n for alloca instruction, the frame pointer becomes necessary and also an efficient way to calculate the original SP value.

Xiangling_L marked an inline comment as done.

Address comments and update testcases.

sfertile added inline comments.Jan 28 2020, 10:50 AM
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
63

If its the first spill slot in the gpr save area for both ELF and XCOFF we should remove SVR4 ABI.

Xiangling_L marked an inline comment as done.

Fix the comment

Some early issues. I need more time to go through the AIX assembly in detail to ensure it's correct.

llvm/test/CodeGen/PowerPC/Frames-dyn-alloca-with-func-call.ll
8

There is trailing whitespace after the \ in the patch. 2 occurrences detected when I applied the patch.

14

I expect there is a dependency on D73209, which adds the stack arg work. Perhaps it's cleaner if we make this patch dependent on that so that we can avoid churn.

70

are these stwux deliberately omitting the operands?

llvm/test/CodeGen/PowerPC/Frames-dyn-alloca.ll
36

I find it useful to insert whitespace to line up the assembly text.

53

whitespace

71

whitespace

86

whitespace

Xiangling_L marked 10 inline comments as done.Jan 30 2020, 8:15 AM
Xiangling_L added inline comments.
llvm/test/CodeGen/PowerPC/Frames-dyn-alloca-with-func-call.ll
14

I think if we want this patch have a dependency on stack arg work depends on which patch is likely to land first? If D73209 is close to land, then I think it makes sense to have this dependency?

70

Yes, cuz the rest of stwux is doing almost the same thing to update the SP and store allocated area address. And I focus more on We actually have 9 instances of dynamic stack allocation, identified by the stdux used to update the back-chain link.

llvm/test/CodeGen/PowerPC/Frames-dyn-alloca.ll
36

Thanks for mentioning this, I will update it.

sfertile added inline comments.Jan 30 2020, 10:34 AM
llvm/test/CodeGen/PowerPC/Frames-dyn-alloca-with-func-call.ll
30

I have a couple questions about this test.

We actually have 9 instances of dynamic stack allocation, identified by the stdux used to update the back-chain link.

Why 9 allocas? What is that exercising that say 2 or maybe 3 allocas wouldn't? The way this is worded it might imply that 9 dynamic allocations in the IR *must* produce 9 stdux instructions. Is that intended?

Allocated area is 4 bytes aligned in the case of int.

Where do we verify the stack adjustments are 4 byte aligned? Would it be more appropriate to say 'is at least 4 byte aligned' instead?

llvm/test/CodeGen/PowerPC/Frames-large.ll
29

Does this test warrant such a detailed description? To me it seems obvious what we are testing: updating the stack frame when the adjustment is too large to be encoded in an immediate.

llvm/test/CodeGen/PowerPC/Frames-small.ll
30

Ditto on the description being much too verbose for the test.

llvm/test/CodeGen/PowerPC/aix-frames-stack-floor.ll
13 ↗(On Diff #240977)

Ditt on on the description being overly verbose. This test is simple enough I think its completely self describing.

Xiangling_L marked 7 inline comments as done.Jan 30 2020, 1:45 PM
Xiangling_L added inline comments.
llvm/test/CodeGen/PowerPC/Frames-dyn-alloca-with-func-call.ll
30
  1. Why 9 allocas?

Because we want to test when we pass more than 8 parameters to callee, how the stack pointer rather than frame pointer comes into play to save the 9th on the stack, which is parameter save area are referred by stack pointer with correct offset.

2.The way this is worded it might imply that 9 dynamic allocations in the IR *must* produce 9 stdux instructions. Is that intended?
Yes, it's intended.

3.Would it be more appropriate to say 'is at least 4 byte aligned' instead?
Yes, I agree, and I will update it.

llvm/test/CodeGen/PowerPC/Frames-large.ll
29

Thanks for mentioning this, one reason I added this verbose description is there are some disturbing instructions like subf, addic in the CHECK content. Those content is to add more integrity for the CHECK. But I would expect viewers to not focus too much on those by skimming through the description first.

Besides, I guess it depends on reviewers to think if this description is useful or not. AFAIK, @ZarkoCA and @cebowleratibm may prefer this.

Anyway, I will leave it open for a while to see if there is some other input. Personally, I think it does no harm to have this description.

Xiangling_L marked 2 inline comments as done.

Fix formatting issue of testcases

Besides, I guess it depends on reviewers to think if this description is useful or not. AFAIK, @ZarkoCA and @cebowleratibm may prefer this.

I am not against them in general even though I let alot of feedback on why we should remove most of them :) .If you think they are useful I am willing to work with you to refine them into something we are both happy with. I think the same kind of goals we have when writing code comments apply to test comments:

  • Anything that can reasonably be made self documenting should be.
  • There shouldn't be any redundant comments.
  • Any comments we do add need to be completely clear and accurate.

I think the 3rd point is already covered, Its 1 and 2 that I'm pushing for.

llvm/test/CodeGen/PowerPC/Frames-large.ll
29

Thanks for mentioning this, one reason I added this verbose description is there are some disturbing instructions like subf, addic in the CHECK content.

But the comment doesn't address that. For example if I look at the ELF 32-bit no-frame-pointer test, I see th e subf instruction but the test description doesn't help me understand why. Adding the CHECK for it makes it look like it is significant to the test as well, but as far as I can tell its completely redundant.

# Build up the stack frame adjustment in register $r0
        lis 0, -1
        ori 0, 0, 32752

# Allocate new stack frame.
        stwux 1, 1, 0

# Why do we have this instruction? It doesn't set any condition register bits, and the result is killed next instruction. The test description doesn't help me understand this.
        subf 0, 0, 1

# Spill $r31 so we can use it to restore the original stack pointer. If $r0 is available to spill to ... why not just use it as the scratch register though ...
        mr 0, 31

# Get original stack pointer value from the backchain.
        lwz 31, 0(1)

# Calculate the address of the stack allocated object. Why is it 20 bytes off the stack-pointer? I'm not very familiar with PPC32, this is soemthing I'd like commented becuase its non-obvious.
        addi 3, 1, 20

# deallocate stack frame.
        mr 1, 31

# Restore non-volatile register we clobbered.
        mr 31, 0

# Return, end of test.
        blr
31

I think you already did a good job of documenting this through your filecheck prefixes 'FP' vs 'NO-FP'.

33

I think instead of these 2 lines it would better to have white space before and after the add instruction that calculates the address of the allocated object. You could add a comment at that instruction if you want to draw extra attention but if I don't already know what that instructions importance is I don't think this comment helps.

36

There is useful information in here that I don't think we could 'self-document' as part of the test.

Only one instance of dynamic stack allocation in this case

Documented through the check prefixes in every test except PPC32-LINUX-NOFP becuase you check-next everything no other st[dw]ux could sneak in before the return.

identified by
; the stdux used to update the back-chain link when allocated frame is large
; that we can not address it by a 16-bit signed integer.

That stdux/stwux is the register + register equivalent form of the register + immediate stores we would normally use is helpful to the test. How it adjusts the stack pointer and maintains the backchain might be useful as well, and that the limit where we have to switch to r+r from r+imm is when it overflows a signed 16-bit value.

hubert.reinterpretcast edited reviewers, added: ZarkoCA; removed: zarko.Feb 3 2020, 9:03 AM

Only minor test questions / requests. The change looks ok. I think we should explain the stack adj differences with the XL compiler before committing, specifically: do we need to allocate the PSA for the current frame even if there are no calls?

llvm/test/CodeGen/PowerPC/Frames-dyn-alloca.ll
80

Why are we adding 32 to the r3 return reg?

llvm/test/CodeGen/PowerPC/Frames-small.ll
65

XL compiler stack bump was 448 for this case. I suspect they are allocating 32 bytes for the min PSA even though there are no calls.

llvm/test/CodeGen/PowerPC/aix-frames-stack-floor.ll
1 ↗(On Diff #241739)

I like test consistency across the PPC subtargets. I'd like to see Linux 32/64 as part of this test for consistency and documenting the behaviour differences of the PPC subtargets. If it makes sense to have an AIX stack floor test, then it makes sense to have a Linux stack floor test, even if it's not part of the test suite yet.

Xiangling_L marked 5 inline comments as done.

Update the testcases to make the description of each less verbose;
Add some comments to make the offset clearer with which the address of allocated area is referred by stack pointer;

Xiangling_L added inline comments.Feb 5 2020, 7:45 AM
llvm/test/CodeGen/PowerPC/Frames-dyn-alloca.ll
80

32 is the linkage 16-byte aligned linkage area size since the linkage area is always at the top of the stack even after we alloca some space during runtime.

Update the testcase after putting arguments on stack patch land;

sfertile accepted this revision.Feb 7 2020, 11:25 AM

LGTM.

This revision is now accepted and ready to land.Feb 7 2020, 11:25 AM
This revision was automatically updated to reflect the committed changes.