Page MenuHomePhabricator

[WIP][RISCV] Implement ilp32e ABI
Needs ReviewPublic

Authored by lenary on Nov 18 2019, 8:34 AM.



This commit includes the necessary changes to clang and LLVM to support
the ilp32e ABI.

The central changes in this ABI, compared to ilp32 are:

  • Only 6 integer argument registers (rather than 8);
  • Only 2 callee-saved registers (rather than 12); and
  • A Stack Alignment of 32bits (rather than 128bits).

This means there are a variety of changes including to the datalayout
string, to ABI lowering in Clang, to calling convention handling, and
most complex, to stack management.

The difficulty with stack management is compiling for ilp32e with the
standard RISC-V D extension enabled. This is because with the
D-extension, the floating point registers are 8 bytes and have a spill
alignment of 8 bytes.

This becomes a problem if you have assigned an FPR64 virtual register,
as during a call this is not preserved, and will need to be spilled.
This spill is 8-byte aligned, so the stack needs realigning to
accomodate it. With sufficient register pressure (which is not
difficult, as there are only two callee-saved GPRs), fp might have
already been allocated, and so realigning the stack is not possible. To
prevent this, when the D-extension is enabled with ilp32e, and a
function contains any calls, we conservatively force the frame pointer
to be reserved, in case we need it later to realign the stack.

(For background: we have to choose registers to reserve before register
allocation; during register allocation spills may be created; and then
we insert the prolog/epilog (several passes) after register allocation).

One thing this patch fixes, compared to previous ilp32e patches, is that
the ABI does not influence the registers saved during an interrupt. This
choice is made solely based on the architecture extensions available.

This is based upon D59592 by Daliang Xu (xudaliang.pku). I have fixed
the issue with interrupt CSRs, and excessive duplication of the GPR

Diff Detail

Event Timeline

lenary created this revision.Nov 18 2019, 8:34 AM
lenary marked 2 inline comments as done.Nov 18 2019, 8:39 AM

Some notes and queries about this patch.


In the epilog, return values have to be in a0 and a1, which we might as well be explicit about.


@shiva0217 I think this test is failing because of the base pointer patch, but I'm not sure. Can you look at the issue? It thinks that x8 gets killed by a store (which I don't think should be using x8), and therefore x8 is not live when we come to the epilog. It's a super confusing issue.

luismarques added inline comments.Nov 18 2019, 10:26 AM

Consider extracting this logic into its own function and calling that function in the various places in this patch where that logic is used.


There are many renames like these. Consider making ArgGPRs the ArrayRef (instead of AllocatableArgGPRs) and renaming the original array to something less generic (ArgGPRs_Standard? _Full? _NonE?)

shiva0217 added inline comments.Nov 19 2019, 12:01 AM

The variadic argument for ilp32e doesn't need to align to even register. We could also add a test line in vararg.ll.


Hi @lenary, it seems that hasBP() return false in this case, the issue trigger by register allocation allocating x8 which should be preserved. I'm not sure why it will happen, I try to write a simple C code to reproduce the case but fail to do that. Could you obtain the C code for the test case?

Jim marked an inline comment as done.Nov 19 2019, 10:22 PM
Jim added inline comments.

It seems that RISCVRegisterInfo::getReservedRegs doesn't add x8(fp) into reserved registers (TFI->hasFP(MF) return false), then x8 is a candidate register for register allocation. After register allocation, some of fpr64 splitted into stack that makes stack need to be realign (MaxAlignment(8) > StackAlignment(4)), therefore x8 should be used as frame pointer (TFI->hasFP(MF) return true). In emitting epilogue, instructions for fp adjustment is inserted.

shiva0217 added inline comments.Nov 20 2019, 12:15 AM

With the investigation from @Jim, here is the simple C could reproduce the case.

extern double var;
extern void callee();
void test(){
  double val = var;
  var = val;

Thanks, @Jim

shiva0217 added inline comments.Nov 20 2019, 6:21 PM

There're might be few ways to fix the issue:

  1. hasFP() return true for ilp32e ABI with feature D
  2. hasFP() return true for ilp32e ABI with feature D and there's is a virtual register with f64 type.
  3. Not allow ilp32e ABI with feature D.

Given that most of the targets supported double float instructions have stack alignment at least eight bytes to avoid frequently realignment. Would it more reasonable to have a new embedded ABI with stack alignment at least eight bytes to support feature D?

lenary marked an inline comment as done.Nov 21 2019, 11:20 AM
lenary added inline comments.

@Jim, @shiva0217, thank you very much for tracking down this bug, and providing a small testcase, that's very helpful.

We talked about this on the call this week, and I indicated I was going to go with a solution as close to 2 as I could.

I have since started an investigation (which I hoped would be quicker than it is) of what happens if we implement canRealignStackFrame to check if FP is unused, and this also seems to solve the problem. I'm doing some deeper checks (which require implementing parts of the backend around MIR that I haven't looked at before), but I think this might be a better solution? I'll keep this patch updated on when I upload the fix for stack realignment to cover this case. In the case that this fix isn't enough, I'll look to implement solution 2.

In any case, it's evident that allocating a spill slot for a register that has higher spill alignment than the stack slot, is the kernel of the problem, and this may arise again depending on how we choose to implement other extensions.

lenary retitled this revision from [RISCV] Implement ilp32e ABI to [WIP][RISCV] Implement ilp32e ABI.Nov 21 2019, 11:20 AM
lenary updated this revision to Diff 233089.Dec 10 2019, 6:53 AM
  • Address ILP32E / rv32d frame pointer issue. I didn't manage to find a reasonable way to query for defined fpr64s, so I'm just enabling the FP if the D extension is enabled.
  • Update callee-saved-fpr64s.ll to clearly document expectations about the test
lenary updated this revision to Diff 233092.Dec 10 2019, 7:09 AM
  • Test Updates
lenary updated this revision to Diff 233103.Dec 10 2019, 7:42 AM
  • Update CSR definition comments
lenary updated this revision to Diff 233106.Dec 10 2019, 7:56 AM
  • Clang support for ilp32e
Herald added a project: Restricted Project. · View Herald TranscriptDec 10 2019, 7:56 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
lenary updated this revision to Diff 233353.Dec 11 2019, 6:39 AM
lenary marked 7 inline comments as done.
  • Update vararg.ll test to add ILP32E tests
  • Address code duplication review comments.
lenary added inline comments.Dec 11 2019, 6:40 AM

I'm not sure I agree with this interpretation of the psABI. The ILP32E Section makes no exception for variadic arguments, and the base calling convention is only defined in relation to XLEN, not in terms of stack alignment.

I will add a test to vararg.ll so the behaviour is at least tested.


I couldn't find a reasonable way to check for a virtual (or physical) register of type fp64, without iterating over all the instructions in a function, which I'd prefer not to do.

So Instead I have implemented option 1 in hasFP.

lenary updated this revision to Diff 233356.Dec 11 2019, 6:44 AM
  • Correct typo
lenary updated this revision to Diff 233369.Dec 11 2019, 7:26 AM
  • Ran clang-format
luismarques added inline comments.Dec 11 2019, 12:03 PM

Nitpick: "the interrupt happens" -> "an interrupt happens" (or, even better, "is serviced").


It would be good to describe in a comment what the bug is.

shiva0217 added inline comments.Dec 12 2019, 5:14 AM

It seems to be the current GCC behavior and the following case could observe that double will not align to even pair.

#include <stdarg.h>
void va_double (int n, ...) {
  va_list args;
  va_start (args, n);
  if (va_arg (args, double) != 2.0)
    abort ();
  va_end (args);
 int main (int a) {
  va_double (1, 2.0);
  return a;

In a second thought, it seems that non-fixed double arguments may generate incorrect code, even with align even pair.
For ilp32 or lp64 ABI with feature D, stack alignment will be 16, so even pair can make sure when pushing/popping the non-fixed double to/from the stack, it will be 8-byte alignment. For ilp32e with 4-byte alignment, even pair can not guarantee the double will be pushed to stack with 8-byte alignment.


I think option 1 could be a reasonable way to fix the issue.

lenary updated this revision to Diff 234070.Dec 16 2019, 8:09 AM
  • Update Comments
lenary marked 3 inline comments as done.Dec 16 2019, 8:32 AM
lenary added inline comments.

Ah, I see the issue.

It's not clear that choosing to spill to a register pair where the first register is a multiple of 4 would solve the problem either, right? The problem is that we actually need to realign the spill slots for these register pairs.

I'm not sure how we achieve this. I will investigate further.

Jim added inline comments.Dec 24 2019, 7:33 PM

No FileCheck line here.

khchen added a subscriber: khchen.Aug 3 2020, 2:29 AM
lenary updated this revision to Diff 316800.Jan 14 2021, 3:54 PM
  • Provide correct datalayout
  • Add clang support and tests
  • Modify fp reservation code so we don't always use a frame pointer, even if we conservatively reserve it when using ilp32e + D extension.
jrtc27 added inline comments.Jan 14 2021, 4:08 PM

I think it'd be better to have a NumArgGPRs(EAABI ? 6 : 8) here as having a default value that gets overwritten is more error-prone (and harder to follow).


Underscores with camel-case isn't great. Maybe ArgIGPRs and ArgEGPRs or similar?


Shouldn't this all be done by the generic stack realignment code like any other allocation? Or is the issue because it's _register spills_ not explicit allocas?


Multiple prefixes is a bad idea with, and why is this one done differently from the rest?

lenary updated this revision to Diff 316806.Jan 14 2021, 4:21 PM
lenary marked an inline comment as done.
  • Add Variadic Testcase

I missed that I need to cover this case.

I'm going to upload a testcase based on your example, but I'm not quite convinced it's correct. It does seem to align the stack correctly for the fp64, but that's maybe not the right thing to be doing here?

I haven't managed to execute the assembly in the testcase, but I thought adding the testcase was important.


I went back and thinking about this, we just need to make sure fp is reserved for later, rather than overriding hasFP, so we don't need to reserve FP unnecessarily.

Iterating over used registers to find FP64 registers didn't fill me with joy, and if you override canRealignStackFrame, it seems you just get very incorrect stack management (where the code just… doesn't bother to realign the stack before saving/restoring).

rkruppe removed a subscriber: rkruppe.Jan 15 2021, 12:56 AM
lenary edited the summary of this revision. (Show Details)Jan 15 2021, 1:24 AM
lenary added a reviewer: jrtc27.
lenary updated this revision to Diff 316875.Jan 15 2021, 1:43 AM
lenary marked an inline comment as done.

Address @jrtc27's Feedback:

  • Cleaned up Clang's RISC-V ABI Lowering Code
  • Cleaned up tests
  • Cleaned up leftover method implementation
  • Updated fp reservation comment
  • Cleaned up names
lenary marked 3 inline comments as done.Jan 15 2021, 1:51 AM
lenary added inline comments.

Yeah the issue is because it’s register spills. I have a nice long commit message I wrote that I should update the summary with.

Comment updated nonetheless

40 ↗(On Diff #316806)

I forgot this was left in after some experimentation I did. Will remove it in the next update.


These check lines are left over from before. will remove


It also doesn’t help to avoid duplication here.