This is an archive of the discontinued LLVM Phabricator instance.

RegCall - Handling long double arguments
ClosedPublic

Authored by oren_ben_simhon on Oct 31 2016, 8:03 AM.

Details

Summary

The change is part of RegCall calling convention support for LLVM.
Long double (f80) requires special treatment as the first f80 parameter is saved in FP0 (floating point stack).
This review present the change and the corresponding tests.

Diff Detail

Repository
rL LLVM

Event Timeline

oren_ben_simhon retitled this revision from to RegCall - Handling long double arguments.
oren_ben_simhon updated this object.
oren_ben_simhon set the repository for this revision to rL LLVM.
oren_ben_simhon added a subscriber: llvm-commits.

A kindly reminder. I will appreciate your inputs on the review.

ahatanak added inline comments.Nov 7 2016, 3:44 PM
lib/Target/X86/X86FloatingPoint.cpp
485 ↗(On Diff #76408)

I guess this is needed only for the entry basic block? If so, perhaps you can move this to runOnMachineFunction so that we don't have to check whether the calling convention is regcall every time a basic block is visited.

956 ↗(On Diff #76408)

You can move this outside the loop since the calling convention and StackTop don't change inside the loop.

973 ↗(On Diff #76408)

I'm not sure what this comment means.

It seems like you are clearing StackTop because the first FP register can be used to pass an argument when the calling convention is regcall. Is that correct?

Implementing comments posted until 11/07

oren_ben_simhon marked 2 inline comments as done.Nov 8 2016, 6:27 AM
oren_ben_simhon added inline comments.
lib/Target/X86/X86FloatingPoint.cpp
973 ↗(On Diff #76408)

The problem occurs when FP0 was passed as argument and we try to return a parameter in FP0 as well.
In that case the StackTop points to 1 and we get to the following loop and we call pushreg(0) when FP0 is already assigned and StackTop is 1.

So what actually happened before pushreg:

Stack[] = { 0, ...}
RegMap[] = { 0, ...}

While after pushreg[0] (No stacktop reset):

Stack[] = { 0, 1..}
RegMap[] = { 1...}

The issue causes debug assertion in line 169 to fail because then regmap[stack[0]] != 0.

I changed the way I handle the issue.
Instead of reseting the StackTop to zero, I do not push registers if they were already pushed into the stack.

ahatanak added inline comments.Nov 8 2016, 11:41 PM
lib/Target/X86/X86FloatingPoint.cpp
348 ↗(On Diff #77181)

If this is just assigning 0 (FP0), maybe you can just explicitly do so rather than assigning Stack[0] (which is OK because it's initialized to 0 using memset) to make your intent clear?

973 ↗(On Diff #76408)

Wouldn't this push an extra register when a function taking a long double argument and returning complex long double were called (in which case N==2 because it's returned in st0 and st1)?

oren_ben_simhon added inline comments.Nov 9 2016, 1:38 AM
lib/Target/X86/X86FloatingPoint.cpp
973 ↗(On Diff #76408)

AFAIK, Long doubles are saved inside a single ST register (size of 80 bit).
So in the case of one FP argument and one FP retruned value, I expect N to be 1.
Meaning, The value will be returned in ST0.

Implementing comments posted until 11/08

oren_ben_simhon marked an inline comment as done.Nov 9 2016, 1:52 AM
ahatanak added inline comments.Nov 9 2016, 6:10 PM
lib/Target/X86/X86FloatingPoint.cpp
973 ↗(On Diff #76408)

When the following code is compiled with clang, the complex long double return value is returned in two registers. N is 2 and StackTop is 1 when the call to foo3 is visited in FPS::handleCall:

typedef long double _Complex CLD;
long double d0;

CLD __regcall foo3(long double a);

CLD __regcall  foo2(void) {
  return foo3(d0);
}

$ clang -cc1 -ffreestanding -triple=x86_64-pc-linux-gnu -o - -S test1.c

Do you know why this is happening?

lib/Target/X86/X86FloatingPoint.cpp
973 ↗(On Diff #76408)

You are correct. It acts according the default x64 calling convention, which returns COMPLEX_X87 in two registers.
The updated condition below will not break this behaviour and still two registers will be assigned.

ahatanak added inline comments.Nov 16 2016, 7:06 PM
lib/Target/X86/X86CallingConv.td
160 ↗(On Diff #77320)

Since RC.FP includes only FP0, this indicates that only one register is used to return a value even though two registers are used when returning complex long double. I think this currently works only because it falls backs on RetCC_X86Common to find the return register for the latter half of the complex long double return value.

Wouldn't it be better to include both FP0 and FP1 in RC.FP?

lib/Target/X86/X86FloatingPoint.cpp
973 ↗(On Diff #76408)

If StackTop == 1 and N ==2, this pushes FP0 and FP1 to the stack in the reverse order (FP0 is on the stack already and then pushReg(1) is called). I think you have to reset StackTop and clear Stack and RegMap before pushing all return registers to the stack.

ahatanak added inline comments.Nov 16 2016, 7:09 PM
lib/Target/X86/X86CallingConv.td
160 ↗(On Diff #77320)

I mean you can use separate register lists for the argument and return registers.

oren_ben_simhon marked 3 inline comments as done.

Implemented comments posted until 11/17

ahatanak accepted this revision.Nov 17 2016, 12:34 PM
ahatanak edited edge metadata.

LGTM

This revision is now accepted and ready to land.Nov 17 2016, 12:34 PM
This revision was automatically updated to reflect the committed changes.