The change is part of RegCall calling convention support for LLVM.
Long double (f80) requires special treatment as the first f80 parameter is saved in FP0 (floating point stack).
This review present the change and the corresponding tests.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Target/X86/X86FloatingPoint.cpp | ||
---|---|---|
485 ↗ | (On Diff #76408) | I guess this is needed only for the entry basic block? If so, perhaps you can move this to runOnMachineFunction so that we don't have to check whether the calling convention is regcall every time a basic block is visited. |
956 ↗ | (On Diff #76408) | You can move this outside the loop since the calling convention and StackTop don't change inside the loop. |
973 ↗ | (On Diff #76408) | I'm not sure what this comment means. It seems like you are clearing StackTop because the first FP register can be used to pass an argument when the calling convention is regcall. Is that correct? |
lib/Target/X86/X86FloatingPoint.cpp | ||
---|---|---|
973 ↗ | (On Diff #76408) | The problem occurs when FP0 was passed as argument and we try to return a parameter in FP0 as well. So what actually happened before pushreg: Stack[] = { 0, ...} RegMap[] = { 0, ...} While after pushreg[0] (No stacktop reset): Stack[] = { 0, 1..} RegMap[] = { 1...} The issue causes debug assertion in line 169 to fail because then regmap[stack[0]] != 0. I changed the way I handle the issue. |
lib/Target/X86/X86FloatingPoint.cpp | ||
---|---|---|
348 ↗ | (On Diff #77181) | If this is just assigning 0 (FP0), maybe you can just explicitly do so rather than assigning Stack[0] (which is OK because it's initialized to 0 using memset) to make your intent clear? |
973 ↗ | (On Diff #76408) | Wouldn't this push an extra register when a function taking a long double argument and returning complex long double were called (in which case N==2 because it's returned in st0 and st1)? |
lib/Target/X86/X86FloatingPoint.cpp | ||
---|---|---|
973 ↗ | (On Diff #76408) | AFAIK, Long doubles are saved inside a single ST register (size of 80 bit). |
lib/Target/X86/X86FloatingPoint.cpp | ||
---|---|---|
973 ↗ | (On Diff #76408) | When the following code is compiled with clang, the complex long double return value is returned in two registers. N is 2 and StackTop is 1 when the call to foo3 is visited in FPS::handleCall: typedef long double _Complex CLD; long double d0; CLD __regcall foo3(long double a); CLD __regcall foo2(void) { return foo3(d0); } $ clang -cc1 -ffreestanding -triple=x86_64-pc-linux-gnu -o - -S test1.c Do you know why this is happening? |
lib/Target/X86/X86FloatingPoint.cpp | ||
---|---|---|
973 ↗ | (On Diff #76408) | You are correct. It acts according the default x64 calling convention, which returns COMPLEX_X87 in two registers. |
lib/Target/X86/X86CallingConv.td | ||
---|---|---|
160 ↗ | (On Diff #77320) | Since RC.FP includes only FP0, this indicates that only one register is used to return a value even though two registers are used when returning complex long double. I think this currently works only because it falls backs on RetCC_X86Common to find the return register for the latter half of the complex long double return value. Wouldn't it be better to include both FP0 and FP1 in RC.FP? |
lib/Target/X86/X86FloatingPoint.cpp | ||
973 ↗ | (On Diff #76408) | If StackTop == 1 and N ==2, this pushes FP0 and FP1 to the stack in the reverse order (FP0 is on the stack already and then pushReg(1) is called). I think you have to reset StackTop and clear Stack and RegMap before pushing all return registers to the stack. |
lib/Target/X86/X86CallingConv.td | ||
---|---|---|
160 ↗ | (On Diff #77320) | I mean you can use separate register lists for the argument and return registers. |