Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[lld][ELF] Support LoongArch
ClosedPublic

Authored by xen0n on Nov 16 2022, 7:24 AM.

Details

Summary

This adds support for the LoongArch ELF psABI v2.00 [1] relocation
model to LLD. The deprecated stack-machine-based psABI v1 relocs are not
supported.

The code is tested by successfully bootstrapping a Gentoo/LoongArch
stage3, complete with common GNU userland tools and both the LLVM and
GNU toolchains (GNU toolchain is present only for building glibc,
LLVM+Clang+LLD are used for the rest). Large programs like QEMU are
tested to work as well.

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
xen0n added a comment.Jun 23 2023, 6:01 AM

Don't check e_flags compatibility for input files that does not contain code.

xen0n updated this revision to Diff 534962.Jun 27 2023, 6:52 AM

Add support for the new R_LARCH_64_PCREL reloc that's only added after LoongArch ELF psABI v2.00, but trivial to support and helpful to the ClangBuiltLinux effort.

MaskRay added inline comments.Jul 7 2023, 2:20 PM
lld/ELF/Arch/LoongArch.cpp
23

This can be rephrase to mention what a page is first, then mention that behaviorally equivalent to the AArch64 adrp. Ideally understanding an arch doesn't require understanding another independent one.

29

" (The official ISA manual gives no such "full names".)" this probably should be dropped. The official ISA manual can improve and I don't hope that we push another commit to fix this.

30

It'd be better to define the helpers after class LoongArch.
The AArch64 style in this particular place is something we'd not do.

38

"The higher bits need ..." appears to describe result -= 0x100000000 but is not clear. An example will work better

156

write32le

195

delete braces

225

mention toString(ctx->objectFiles[0]) . See D134198

395

This link gives "This repository has been archived by the owner on Apr 20, 2023. It is now read-only.". Is it stale?

Is R_LARCH_JIRL_LO12 still planned?

438

It seems that https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html is wrong about GR[rd] = PC + SignExtend({si20, 12'b0}, GRLEN).

The low 12-bits of GR[rd] is zeroed.

452

https://github.com/loongson/LoongArch-Documentation/pull/69 is now a stale link as the repo is archived and may be deleted in the future.

lld/test/ELF/loongarch-pc-aligned.s
9

You can use something like --section-start .rodata=0x11000 --section-start=.text=0x11ffc to replace the linker script.

21

I hope that we can reduce the number of ld.lld invocations.

Does it help to increase the number of instructions in .text?

pcalau12i $a0, %pc_hi20(x)
pcalau12i $a0, %pc_hi20(x)
ld.w      $a0, $a0, %pc_lo12(x)
ld.w      $a0, $a0, %pc_lo12(x)
131

This is complex. This needs a comment to explain how these immediate mean.

170

You can define a macro to repeat the code sequence 3 times so that we don't need many extreme*.t

MaskRay added inline comments.Jul 7 2023, 2:41 PM
lld/ELF/Arch/LoongArch.cpp
361

This wording may sound irony for a public project. Also, the paragraph is too long.

We could just R_ABS, but the JIRL instruction reuses the relocation type for a different purpose.
This questionable use case is part of glibc 2.37 Scrt1.o (is it true?), which is linked into user programs, so we have to work around it for a while, even if a new relocation type may be introduced in the future.

I wonder why the medium code model designer decides to use pcalau12i+jirl. Range extension thunks are much better.

371

The long description about "* R_LARCH_PCALA_LO12 just places its computed value unmodified" can be simplified to just mention that the computation is different. How it is different is the job of the comment in relocate.

MaskRay added inline comments.Jul 7 2023, 2:46 PM
lld/ELF/Arch/LoongArch.cpp
407

These R_LARCH_ADD* and R_LARCH_SUB* are untested.

You can use .reloc ., R_LARCH_ADD8, sym to force generating a relocation in a test.

xen0n updated this revision to Diff 538576.Jul 10 2023, 4:04 AM
xen0n marked 19 inline comments as done.

Addressed all review comments:

  • Moved the helpers around the class LoongArch definition
  • Adjusted various comments
    • Toned down some rants
    • Replaced links to archived repo with archive.org snapshots or new refreshed links
    • Other fixes
  • Renamed getLoongArchPageOffset -> getLoongArchPageDelta, fixed its hi32 adjustment logic and added test cases
  • Improve diagnostics message for e_flags mismatch
  • Added tests for R_LARCH_MARK_LA and R_LARCH_{ADD,SUB}{8,16,32,64}
  • write32le the break insn
  • Set defaultCommonPageSize to 16KiB
lld/ELF/Arch/LoongArch.cpp
38

Thanks, I've discovered even the BFD implementation is wrong for some of the edge cases. (This implementation needs a fix as well.) I've added an example and fixed the logic in the latest revision.

342

I've now added R_LARCH_MARK_LA's in the loongarch-abs64.s cases. I can't seem to find any usage of R_LARCH_MARK_PCREL in the wild, not even in the old world binaries, so I'm not sure if removing support for it (instead of treating as no-op per common sense) would be better.

361

I added the rant following R_386_GOT32 and R_386_GOT32X's precedent. I agree though that such wording would better belong somewhere else (e.g. in an issue raised in the ABI specs repo) and I'll use your suggested wording instead.

395

No, AFAIK the repo's information is still current and the repo isn't going away. They archived it for non-technical reasons (maybe not giving the impression that people outside of Loongson can change their official docs at will, before their CLA automation is ready).

I've nevertheless archived the page with archive.org and replaced with a new PR to the document's current home. I expect the discussion to continue but only after they get their CLA mechanism finished.

438

You may be looking at pcaddu12i? That's the same as RISCV auipc. The PCALA relocs all use pcalau12i that indeed has the lower bits zeroing behavior mentioned. See the corresponding documentation source.

452

The repo isn't going anywhere, as explained in another reply, but I've replaced with a new refreshed PR on the currently maintained LA ABI specs repo nevertheless (I've replicated all info there so no context is lost even if the original repo is deleted).

lld/test/ELF/loongarch-pc-aligned.s
9

You can use something like --section-start .rodata=0x11000 --section-start=.text=0x11ffc to replace the linker script.

Sorry, but as I've previously explained as soon as I replace the linker script with the --section-start arguments that really should be equivalent, I instead get errors like ld.lld: error: output file too large: 18446744073709494880 bytes. This doesn't happen with the BFD linker so I suspect something is wrong with the common logic, although I haven't debugged yet.

For now though I've discovered your suggestion somehow works (by ensuring the base address to be always greater than defaultMaxPageSize i.e. 0x10000), and I've amended the cases.

21

Probably no, I have to ensure sufficiently different PC and target values for each case, in order to be able to cover the edge cases. Adding a trivial number of instructions won't help (i.e. I can't put the target value 4GiB away without utilizing --section-start or linker scripts anyway).

Do you mean I should just leave the PC alone and use --defsym for testing multiple offsets at once?

131

I've (programmatically) generated more cases for complete coverage of getLoongArchPageDelta (previously getLoongArchPageOffset), complete with explanations for every immediate there.

170

The linker script inputs are all gone in the latest revision. Can't seem to reduce lld invocations though, happy to see any suggestion.

MaskRay added inline comments.Jul 13 2023, 8:19 PM
lld/ELF/Arch/LoongArch.cpp
112

This example doesn't trigger the interesting case signLo12 != signHi20. We need a different example.

126

excessively indented

220

Delete nullness check.

We probably should change the parameter to a const reference.

223

drop braces

lld/ELF/InputSection.cpp
655

Delete this comment. This is not fallthrough, which usually means [[fallthrough]].

lld/ELF/Relocations.h
110

... from other architectures.

lld/test/ELF/loongarch-branch-b16.s
22 ↗(On Diff #538576)

For error messages, we check the error: prefix.

Fix this elsewhere.

23 ↗(On Diff #538576)

Add -NEXT whenever applicable.

lld/test/ELF/loongarch-branch-b21.s
22 ↗(On Diff #538576)

For error messages, we check the error: prefix.

lld/test/ELF/loongarch-branch-b26.s
1 ↗(On Diff #538576)

I think it helps readability if loongarch-branch-b26.s is merged into loongarch-branch-b21.s.

27 ↗(On Diff #538576)

For error messages, we check the error: prefix.

lld/test/ELF/loongarch-interlink.test
77

Indent instructions

lld/test/ELF/loongarch-plt.s
84

The column number of 20040: should not be less than the column number of the colon in DIS32-NEXT:.
We may need more indentation for instructions.

xen0n updated this revision to Diff 540853.Jul 16 2023, 9:06 PM
xen0n marked 13 inline comments as done.

Address all review comments.

xen0n added inline comments.Jul 16 2023, 9:08 PM
lld/ELF/Arch/LoongArch.cpp
112

Ah, I copied a wrong fixture from my experiments, sorry for that. I've made new examples to cover both directions of adjustment.

lld/test/ELF/loongarch-branch-b16.s
23 ↗(On Diff #538576)

Okay. Here it's not applicable though, as the actual diagnostics have >>> defined in --defsym:1 and an empty line in between, which is irrelevant to our intended assertion.

xen0n added a comment.Jul 19 2023, 8:39 PM

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

This is on my radar and I will take a look again soon. There are also random other LoongArch things I need to take a look and I did. This is totally fine to be cherry picked into the release/17.x branch even after it is created.
I am sorry that I did not respond in time for the initial period of the patch (when it changed from the "WIP" state to the normal in-review state), but I think the request after 3 days may be a bit too quick.
Honestly the linker is probably I care about the most and I don't want to lower the bar. (There are other things that I am concerned with as well, but I may not have too much energy to prevent something less appropriate happen; but if I notice, I'll make a comment).

xen0n added a comment.Jul 19 2023, 9:16 PM

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

This is on my radar and I will take a look again soon. There are also random other LoongArch things I need to take a look and I did. This is totally fine to be cherry picked into the release/17.x branch even after it is created.
I am sorry that I did not respond in time for the initial period of the patch (when it changed from the "WIP" state to the normal in-review state), but I think the request after 3 days may be a bit too quick.
Honestly the linker is probably I care about the most and I don't want to lower the bar. (There are other things that I am concerned with as well, but I may not have too much energy to prevent something less appropriate happen; but if I notice, I'll make a comment).

Thanks for the clarification. I didn't know if the release branch cherry-pick could be done in this case so I was a little anxious, sorry for the less-than-a-week ping.

I fully understand your passion and care for the linker and I agree with not lowering the bar; I'll do my best to improve the code together.

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

This is on my radar and I will take a look again soon. There are also random other LoongArch things I need to take a look and I did. This is totally fine to be cherry picked into the release/17.x branch even after it is created.
I am sorry that I did not respond in time for the initial period of the patch (when it changed from the "WIP" state to the normal in-review state), but I think the request after 3 days may be a bit too quick.
Honestly the linker is probably I care about the most and I don't want to lower the bar. (There are other things that I am concerned with as well, but I may not have too much energy to prevent something less appropriate happen; but if I notice, I'll make a comment).

Thanks for the clarification. I didn't know if the release branch cherry-pick could be done in this case so I was a little anxious, sorry for the less-than-a-week ping.

I fully understand your passion and care for the linker and I agree with not lowering the bar; I'll do my best to improve the code together.

After being disgusted by the code of [another linker] multiple times I now totally agree we must be careful when we write a linker...

MaskRay added inline comments.Jul 21 2023, 12:36 AM
lld/ELF/Arch/LoongArch.cpp
89

I was quite confused by the description, then I realized that I probably should just figure out the formula myself:)

I wonder whether it's clear if we remove these textual descriptions and focus on the deduction:

// We have to check if the higher 32 bits need adjustment too, due to
// the large code model access pattern:
//
//     pcalau12i       A, %foo_hi20(sym)        // b in [0,0xfff]
//     addi.d          T, zero, %foo_lo12(sym)  // pageoff in [0,0xfff]
//     lu32i.d         T, %foo64_lo20(sym)      // c in [0,0xfffff]
//     lu52i.d         T, T, %foo64_hi12(sym)   // d in [0,0xfff]
//     {ldx,stx,add}.* A, A, T
//
// If we treat the immediates in the operands as unsigned, the adjusted page
// delta (return value) is b<<12 | c<<32 | d<<52.
// We have (pageoff<0x800?pageoff:pageoff-0x1000+0x100000000) + ((b<0x80000?b:b-0x100000)<<12) + (c<<32) + (d<<52) = dest - page(pc).
//
// When pageoff is 0xfff, its contribution to dest-page(pc) is effectively 0xfff-0x1000 + 0x100000000 as the high 32-bits are
// set by lu32i.d+lu52i.d, e.g.
//
//     addi.d    T, zero, -1  ; T = 0xfff'fffff'fffff'fff
//     lu32i.d   T, 1         ; T = 0x000'00001'fffff'fff
//     lu52i.d   T, T, -1     ; T = 0xfff'00001'fffff'fff
229

We probably should change the parameter to a const reference.

Not done?

Reminder: update lld/docs/ReleaseNotes.rst and lld/docs/index.rst when we finish this.

xen0n updated this revision to Diff 543850.Jul 25 2023, 12:05 AM

Address review comments from @MaskRay and @SixWeining:

  • Rewritten the docs for getLoongArchPageDelta, detailing the calculations;
  • Added a missing const qualifier (I apparently misunderstood the review comment);
  • Updated the docs.
MaskRay accepted this revision.Jul 25 2023, 12:41 AM
MaskRay added inline comments.
lld/ELF/Arch/LoongArch.cpp
24

delete blank line

42

delete blank line

115

<< needs some parentheses due to operator precedence :)

lld/test/ELF/loongarch-abs64.s
66

delete trailing blank line

lld/test/ELF/loongarch-pcala-lo12-jirl-shared.s
18

Delete addresses for instructions that are not referenced. Fewer addresses makes test updating easier.

This revision is now accepted and ready to land.Jul 25 2023, 12:41 AM
xen0n updated this revision to Diff 543861.Jul 25 2023, 12:55 AM

(hopefully) final cosmetic fixes:

  • Removed several extra blank lines
  • Slightly tweaked getLoongArchPageDelta doc wording again and added parens so the operators in the pseudo-code follows C precedence
  • Removed several extraneous disassembly addresses from the loongarch-pcala-lo12-jirl-shared.s test case
xen0n marked 7 inline comments as done.Jul 25 2023, 1:02 AM

Thanks for all the careful reviews that helped tremendously in improving the code quality (and forcing me to improve my ELF wrangling skills).

I've gone through the especially hard parts (getLoongArchPageDelta in particular; I had to JIT some funclets to ensure the operation is 100% correct) and I believe they should be good to go... And if things go awry I would surely be around to fix. Thanks go to everyone who has helped reviewing this code.

A question: does LLD automatically supports pack-relative-relocs once a new port is added? I'm trying to implement it for LoongArch BFD linker but it seems I don't have enough skill.

xen0n added a comment.Jul 25 2023, 1:37 AM

A question: does LLD automatically supports pack-relative-relocs once a new port is added? I'm trying to implement it for LoongArch BFD linker but it seems I don't have enough skill.

It seems to be the case. Last time I did the Gentoo LLVM bootstrap I saw binaries full with DT_RELR (I didn't remember if I manually forced the flag or not; Gentoo seems to enable it in some base profile anyway) and the glibc rtld consumed them just fine.

This revision was landed with ongoing or failed builds.Jul 25 2023, 2:11 AM
This revision was automatically updated to reflect the committed changes.
xen0n added a comment.Jul 25 2023, 2:12 AM

A question: does LLD automatically supports pack-relative-relocs once a new port is added? I'm trying to implement it for LoongArch BFD linker but it seems I don't have enough skill.

It seems to be the case. Last time I did the Gentoo LLVM bootstrap I saw binaries full with DT_RELR (I didn't remember if I manually forced the flag or not; Gentoo seems to enable it in some base profile anyway) and the glibc rtld consumed them just fine.

I've just confirmed with a fresh QEMU build that the support is indeed working end-to-end.

It looks like there's a test failure happening in post-commit from one of the newly added tests: https://lab.llvm.org/buildbot/#/builders/42/builds/10617 Can you investigate (and revert if that's going to take a while to resolve)?

It looks like there's a test failure happening in post-commit from one of the newly added tests: https://lab.llvm.org/buildbot/#/builders/42/builds/10617 Can you investigate (and revert if that's going to take a while to resolve)?

I guess it's D156293.

ruiu added a subscriber: ruiu.Aug 8 2023, 1:16 AM
ruiu added inline comments.
lld/ELF/Arch/LoongArch.cpp
162–163

Don't you need to recompute negativeA after adding 0x1000 to result? That yields a different result if the original result` value happened to be 0x7fff'f000.