Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[RISCV][LLD] Add RISCV zcmt optimise in linker relaxation
Needs ReviewPublic

Authored by VincentWu on Sep 25 2022, 4:27 AM.



This patch implements optimizations for the zcmt extension in lld.

A new TableJumpSectio has been added.

Scans each R_RISCV_CALL/R_RISCV_CALL_PLT relocType in each section before the linker relaxation, recording the name of the symbol.

In finalizeContents the recorded symbol names are sorted in descending order by the number of jumps.

Optimise and insert a new cm.jt/cm.jalt during the relax process. in the process, we reused the`R_RISCV_JAL` relocType

co-author: @ScottEgerton

Diff Detail

Event Timeline

VincentWu created this revision.Sep 25 2022, 4:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2022, 4:27 AM
VincentWu requested review of this revision.Sep 25 2022, 4:27 AM
kito-cheng added a comment.EditedSep 26 2022, 4:14 AM

I think this should really write a psABI spec proposal first.


This relaxation type remove 6 byte, so this should has higher priority than R_RISCV_CALL/R_RISCV_CALL_PLT to jal?

1165 ↗(On Diff #462714)

Should be SHT_PROGBITS, SHT_RISCV_ATTRIBUTES is only used for .riscv.attribute.

1188–1192 ↗(On Diff #462714)

This should be enough.

1273–1287 ↗(On Diff #462714)

Hmm, how do you deal with more than one local symbol with same name?

Following example has 2 different bar:


static int bar (){return 1;}

int foo1(){return bar();}


static int bar (){return 2;}

int foo2(){return bar();}


int foo1();
int foo2();
int main(){return foo1()+foo2();}

addCMJTEntryCandidate/getaddCMJTEntryIndex, and same for addEntryZero/getEntryZero


entriesZero -> CMJTEntryCandidates
finalizedEntriesZero -> finalizedCMJTEntries
and same for entriesRa/finalizedEntriesRa.

We only need to record symbol name for finalEntries, so std::vector<std::string> is enough?


CMJTEntry/CMJALTEntry rather than Zero/Ra, e.g. maxCMJTEntrySize/maxCMJALTEntrySize, startCMJTEntryIdx/startCMJALTEntryIdx.


newlib's implementation using __tbljalvec_base$? and maybe use STB_GLOBAL like __global_pointer$?

VincentWu updated this revision to Diff 462882.Sep 26 2022, 6:29 AM
VincentWu marked 6 inline comments as done.

address part of comments

1237 ↗(On Diff #462714)

We only need to record symbol name for finalEntries, so std::vector<std::string> is enough?

yep but we need sort it in descending order by the number of jumps.
it will be spand more linking time to copy again.

so I chose memory space instead of time.

did you mean it is acceptable if i spend more time to copy it again?

sinan added a subscriber: sinan.Sep 27 2022, 1:37 AM
sinan added inline comments.
1165 ↗(On Diff #462882)
craig.topper added inline comments.Sep 30 2022, 10:31 AM


MaskRay added inline comments.Oct 1 2022, 12:03 AM

FF: we require that new options must start with --, not -

1164 ↗(On Diff #462882)

The implementation should be moved to Arch/RISCV.cpp to avoid redundant virtual functions in TargetInfo.


This is inefficient. If DenseMap<CachedStringRef, ...> works, prefer it.

We have the RISC-V attributes section, you should use that (and --(no-)relax) to determine whether or not to use Zcmt relaxation, not invent some new flag, especially since compiler drivers can't use it unless they're sure the linker supports that new flag (think Clang + binutils as the common case where it's hard to know for sure; at least with LLD the likelihood is it's not older than the Clang version, especially in the embedded space) or you require users to opt-in at link time to the relaxation.

jrtc27 added inline comments.Oct 1 2022, 11:46 AM

This seems wrong


This seems like poor design, you shouldn't be using R_RISCV_JAL once it's no longer a JAL instruction there

1213 ↗(On Diff #462882)

When moving this to RISCV.cpp please use extractBits


Why isn't this done when we read in the relocations the first time?


Huh? This doesn't make any sense? You're just writing the jump table section on top of every SHF_ALLOC output section? Make it act like every single other synthetic section and end up in an output section.

VincentWu updated this revision to Diff 487031.Jan 6 2023, 7:52 PM
VincentWu marked 12 inline comments as done.

Refactoring the implementation of lld from psABI.
rebase & address comments

VincentWu added inline comments.Jan 6 2023, 7:54 PM

Use R_RISCV_32 or R_RISCV_64 instead


IMHO, we can calculate the gain value directly if we scan them when all relaxation have been done.

VincentWu updated this revision to Diff 510671.Apr 3 2023, 7:59 PM

Reduce the time complexity of the algorithm by storing symbol pointers instead of symbol names

Try to compile *Vim* with zcmt opt, the code size of result is about 2.5M. get 0.1M (3.8%) code size decrease when it compare to the code size of *Vim* without zcmt (2.6M).

VincentWu updated this revision to Diff 516050.Apr 22 2023, 3:15 AM

extend the size of InputSection

extend the size of InputSection

Why is this change? InputSection may have millions of instances for a large link. Increasing a word may increase the peak memory usage by 1%. We should strive to avoid it.

Revert commit "extend sizeof InputSection"

jrtc27 added inline comments.Apr 24 2023, 12:40 AM

These are extremely dodgy


Blank line doesn't match style


This naming doesn't fit


"Entrys" is not an English word


We call finalizeContents more than once?

42 ↗(On Diff #516290)


kito-cheng added inline comments.May 24 2023, 12:13 AM
evandro removed a subscriber: evandro.May 24 2023, 1:47 PM
VincentWu updated this revision to Diff 534880.Jun 27 2023, 2:29 AM
VincentWu marked 6 inline comments as done.

address comments

VincentWu marked an inline comment as not done.Jun 27 2023, 2:30 AM

This method will be called at the end of the last round of relaxation. the changed in Writer<ELFT>::finalizeAddressDependentContent() might be set as true, which will cause the linker to do an additional relaxation. the code here is just try to avoiding it being called again in an additional relaxation.

jrtc27 added inline comments.Tue, Aug 29, 2:14 PM

Uh, what is this diff now? That's not what Kito said to do, and doesn't make it less dodgy. All you've done is given it a name, but that doesn't change the fact that the whole approach is dodgy.

VincentWu updated this revision to Diff 556100.Wed, Sep 6, 5:56 PM

use INTERNAL_ relocation type for Zcmt

VincentWu marked an inline comment as done.Wed, Sep 6, 5:56 PM