This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add support for using -msave-restore with tailcalls
Needs ReviewPublic

Authored by edward-jones on Nov 18 2020, 8:34 AM.

Details

Summary

This introduces the use of a new compiler runtime library function which behaves the same as riscv_restore_N, however instead of returning to the callee these functions do a tailcall to the function whose pointer is stored in t1. This allows the -msave-restore optimization to be applied even in a tailcall context. The new runtime library functions are called riscv_restore_tailcall_X where X is the number of callee-saved registers which will be restored.

Original patch by Simon Cook.

Diff Detail

Event Timeline

edward-jones created this revision.Nov 18 2020, 8:34 AM
edward-jones requested review of this revision.Nov 18 2020, 8:34 AM

Where is this specified? I only see the non-tailcall versions in GCC. I don't think we should merge anything until it's at least a draft specification in an official place.

jrtc27 added inline comments.Nov 18 2020, 8:43 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
1133

We should not need to duplicate all this code surely? Certainly these need to honour the code model and not assume medlow.

lenary resigned from this revision.Jan 14 2021, 9:43 AM

I have updated this so that the generation of the address passed to the restore functions is done in a separate method RISCVFrameLowering::buildAddr. This handles both the code models and mirrors RISCVISelLowering::getAddr - it would be nice if these functions could share code, or at least be in a common place.

I've also opened a pull request to document the existing -msave-restore behaviour in riscv-toolchain-conventions, and I have a follow up patch to document the tail call versions of the entry points too.

I've expanded the test to cover the different code models, and pic mode too.

lenary removed a subscriber: lenary.Apr 19 2021, 11:29 AM

Documentation for -msave-restore has been committed to riscv-toolchain-conventions, I have an additional pull request to document this proposed tail-call version here

On the RISC-V sync-up call I said I would try and provide some code size numbers for these changes. First I mentioned that I saw a code size regression when compiled Embench, here's per-benchmark numbers:

-Oz -msave-restore (no tailcall version)

 Benchmark            size
---------            ----
aha-mont64          1,964
crc32               1,140
cubic              55,942
edn                 2,364
huffbench           2,456
matmult-int         1,420
minver             11,160
nbody              10,748
nettle-aes          3,852
nettle-sha256       8,552
nsichneu           14,268
picojpeg            9,476
qrduino             7,256
sglib-combined      3,528
slre                3,652
st                 11,056
statemate           4,908
ud                  1,620
wikisort           15,060
---------           -----
Geometric mean      5,347
Geometric SD            2.65
Geometric range  12,143.345300622888

-Oz -msave-restore (with tailcall version)

Benchmark            size
---------            ----
aha-mont64          1,968
crc32               1,340
cubic              55,942
edn                 2,372
huffbench           2,656
matmult-int         1,476
minver             11,164
nbody              10,752
nettle-aes          3,896
nettle-sha256       8,588
nsichneu           14,272
picojpeg            9,476
qrduino             7,428
sglib-combined      6,512
slre                3,656
st                 11,116
statemate           4,912
ud                  1,620
wikisort           15,060
---------           -----
Geometric mean      5,620
Geometric SD            2.59
Geometric range  12,365.236826034434

The Embench benchmarks are very small, so the constant cost of the additional __riscv_restore_tailcall entry points exceeds the benefit of the optimization.

To try something larger I also grabbed sqlite 3.35.5 and tried building that (again -Oz -msave-restore). Below are results for that experiment.

  text    data     bss     dec     hex filename
652816    5604     903  659323   a0f7b sqlite3-save-restore.o
645328    5604     903  651835   9f23b sqlite3-save-restore-tailcall.o

So for this larger codebase where the constant cost of the additional functions is minor, we see approximately a 1% reduction in code size (note that this only compiling to an object file, so it doesn't include the the cost of the save-restore compiler-rt functions)

I've rebased and added an option so that the tailcall version of save/restore is only used if -mllvm -riscv-save-restore-tailcall is provided.

asb added a comment.Jul 22 2021, 3:40 AM

I've rebased and added an option so that the tailcall version of save/restore is only used if -mllvm -riscv-save-restore-tailcall is provided.

Thanks Ed, I think making it opt-in like this makes the decision to merge or not slightly easier, but ideally we can get at least some confirmation in https://github.com/riscv/riscv-toolchain-conventions/pull/10 as to whether anybody has major concerns about going in this direction.

I'm not entirely sure if there's anything that needs to be agreed on still for this. It feels like this has soft agreement, but I don't know whether I would get push back if I get this committed

To summarize where I think this stands. We don't want to implement something in LLVM which is undocumented or not likely to be implemented for other tools, so I also opened a pull request to document this change (https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/10) in the hope that it could be later implemented for GCC. From the discussion on that pull request there's an understandable reluctance to introduce a feature which would stop libgcc/compiler-rt from being drop in replacements for each other.

So I think to unstick this a patch to libgcc needs to be submitted.

The documentation changes to the riscv-toolchain-conventions repository have been merged now, (https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/10). In addition, there's a pending patch to implement this in GCC (https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592467.html). I think this means there's nothing blocking this except a rebase.

asb added a comment.May 11 2022, 5:26 AM

The documentation changes to the riscv-toolchain-conventions repository have been merged now, (https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/10). In addition, there's a pending patch to implement this in GCC (https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592467.html). I think this means there's nothing blocking this except a rebase.

I think that's right - thanks for your perseverance with this Ed.