This is an archive of the discontinued LLVM Phabricator instance.

[lld][ELF] Support LoongArch
ClosedPublic

Authored by xen0n on Nov 16 2022, 7:24 AM.

Details

Summary

This adds support for the LoongArch ELF psABI v2.00 [1] relocation
model to LLD. The deprecated stack-machine-based psABI v1 relocs are not
supported.

The code is tested by successfully bootstrapping a Gentoo/LoongArch
stage3, complete with common GNU userland tools and both the LLVM and
GNU toolchains (GNU toolchain is present only for building glibc,
LLVM+Clang+LLD are used for the rest). Large programs like QEMU are
tested to work as well.

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
xen0n added a comment.Dec 25 2022, 8:09 AM

revert the broken TLS IE offset change

xen0n added inline comments.Dec 25 2022, 8:14 AM
lld/ELF/InputSection.cpp
653

I've located the TLS IE offset logic in the BFD code (IE symbol = offset by 2 GOT entries) and tried to do the same in LLD, but unfortunately the output was broken so I have to revert most of it. Sadly I won't have enough time to investigate before end of year.

brad added a subscriber: brad.Jan 13 2023, 9:26 AM

With the backend enabled, it would be good to work on this further.

xen0n added a comment.Jan 15 2023, 2:33 AM

With the backend enabled, it would be good to work on this further.

Hi, thanks for the comment! Indeed the target's been enabled by default for a few days now, and no I haven't abandoned the work. It's simply hard to find time to work on this, though.

At this point it's still the test cases that need more work. I've successfully bootstrapped a Gentoo stage3 with this code and a recent LLVM snapshot a while ago, so the code is indeed working, but of course mainlining requires more specific / cheaper tests than distro bootstrapping. I'm not entirely sure that can be done before the LLVM 16 branching, though, however I'll try.

According to https://llvm.org/docs/HowToReleaseLLVM.html#release-patch-rules there might be 10 days (before RC1) for us. Hope we can finish it.

lld/test/ELF/loongarch-branch-b16.s
4

Two dashes. :)

4

Ditto.

xen0n updated this revision to Diff 491057.Jan 21 2023, 2:24 AM

Add a couple of tests, implement the R_LARCH_PCALA_LO12 on JIRL kludge

xen0n retitled this revision from [WIP][ELF] Support LoongArch to [ELF] Support LoongArch.Jan 21 2023, 2:25 AM
xen0n edited the summary of this revision. (Show Details)

Maybe should mention LLD in the title?

I just verified the latest change, lld still fail to link llvm and report the error:

ld.lld: error: relocation R_LARCH_PCALA_HI20 cannot be used against symbol '__cxa_atexit'; recompile with -fPIC                                                                              
>>> defined in /usr/lib64/libc.so.6
>>> referenced by atexit.c:46 (../stdlib/atexit.c:46)
>>>               atexit.oS:(atexit) in archive /usr/lib64/libc_nonshared.a
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
xen0n added a comment.Jan 21 2023, 7:41 AM

I just verified the latest change, lld still fail to link llvm and report the error:

ld.lld: error: relocation R_LARCH_PCALA_HI20 cannot be used against symbol '__cxa_atexit'; recompile with -fPIC                                                                              
>>> defined in /usr/lib64/libc.so.6
>>> referenced by atexit.c:46 (../stdlib/atexit.c:46)
>>>               atexit.oS:(atexit) in archive /usr/lib64/libc_nonshared.a
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

Hmm, let me check after some sleep... I probably should write more tests but unfortunately time's not enough. I'll try to push a new revision tomorrow morning.

xen0n updated this revision to Diff 491119.Jan 21 2023, 7:49 PM
xen0n edited the summary of this revision. (Show Details)

fix PCALA_HI20 behavior in case of PCALA_LO12 on JIRL, and cleanups

xen0n retitled this revision from [ELF] Support LoongArch to [lld][ELF] Support LoongArch.Jan 21 2023, 7:51 PM
xen0n marked 5 inline comments as done.Jan 21 2023, 8:15 PM
xen0n updated this revision to Diff 491121.Jan 21 2023, 8:28 PM

amend comments

xen0n updated this revision to Diff 491122.Jan 21 2023, 8:36 PM

fix a switch arm and tweak comments

FWIW, BFD linker handles R_LARCH_PCALA_HI20 by creating a PLT for every R_LARCH_PCALA_HI20 against an undefined symbol. This can be really annoying if someone happens to make a simple typo in a symbol name (causing an runtime crash which is difficult to be diagnosed). And I guess using such an approach in LLD will annoy LLD maintainers.

We are out here to contribute, to make contact with the FLOSS project maintainers, to establish friendly relations but not to cause trouble, and absolutely not to annoy people. And now, look at what we are doing. ref.

xen0n updated this revision to Diff 491132.Jan 21 2023, 11:05 PM

add test cases for PCALA relocs, fix getLoongArchPageOffset's adjustment to the higher half as uncovered by the tests

Verification result: llvm is successfully built/linked by clang/lld with this patch applied.

lld/ELF/InputSection.cpp
653

Ping for not getting reply about here for a long time.

File: test.s

la.tls.ie $r4, y
la.tls.gd $r5, y

.section .tbss
.globl y
y:
.word   0
.size   y, 4

Pre-cmd:

$ llvm-mc -filetype=obj --triple=loongarch64 test.s -o test.o
$ ld.lld test.o -shared -o test.lld
$ ld test.o -shared -o test.ld

Compare1:

$ readelf -Wr test.{ld, lld}
test.ld:
0000000000008008  0000000300000007 R_LARCH_TLS_DTPMOD64   0000000000000000 y + 0
0000000000008010  0000000300000009 R_LARCH_TLS_DTPREL64   0000000000000000 y + 0
0000000000008018  000000030000000b R_LARCH_TLS_TPREL64    0000000000000000 y + 0

test.lld:
0000000000020398  0000000100000002 R_LARCH_64             0000000000000000 y + 0
00000000000203a0  0000000100000007 R_LARCH_TLS_DTPMOD64   0000000000000000 y + 0
00000000000203a8  0000000100000009 R_LARCH_TLS_DTPREL64   0000000000000000 y + 0
00000000000203b0  000000010000000b R_LARCH_TLS_TPREL64    0000000000000000 y + 0

Compare2:

$ objdump -d test.{ld, lld}
test.ld:
0000000000000250 <.text>:
 250:	1a000104 	pcalau12i   	$a0, 8(0x8)
 254:	28c06084 	ld.d        	$a0, $a0, 24(0x18)
 258:	1a000105 	pcalau12i   	$a1, 8(0x8)
 25c:	02c020a5 	addi.d      	$a1, $a1, 8(0x8)

test.lld:
00000000000102d0 <.text>:
   102d0:	1a000204 	pcalau12i   	$a0, 16(0x10)
   102d4:	28ce8084 	ld.d        	$a0, $a0, 928(0x3a0)
   102d8:	1a000205 	pcalau12i   	$a1, 16(0x10)
   102dc:	02ce80a5 	addi.d      	$a1, $a1, 928(0x3a0)
xen0n added a comment.Feb 13 2023, 5:59 AM

Sorry for the late reply, recently I've been too busy to resume work on this patch as much as I'd want to.

Last time I tried out your suggestion but the resulting output binary was invariably broken. I think I'd have to write some more tests to figure out whether the current behavior would be correct despite deviating from BFD behavior.

xen0n added inline comments.Apr 2 2023, 7:42 PM
lld/ELF/InputSection.cpp
653

Actually I'm not sure if the behavior difference is mission-critical, as things fully work (or at least seems to) when I bootstrapped Gentoo with LLVM/Clang. Do you want me to take care of this before merging the whole thing?

MQ-mengqing added inline comments.Apr 3 2023, 12:37 AM
lld/ELF/InputSection.cpp
653

It not matters, it is hard to happen, maybe. It is a hidden danger
and developers should notice something may wrong if that occurs.
But, I also think it is not mission-critical. It can be documented and
fixed later.

xen0n marked 3 inline comments as done.Apr 3 2023, 12:50 AM
xen0n added inline comments.
lld/ELF/InputSection.cpp
653

Okay, I'll soon add the test case but note the current behavioral difference so we can get to fix it later.

Sorry for not handling this as promptly as I'd like to; real life and other reasons kick in.

zjiaz added a subscriber: zjiaz.May 3 2023, 6:55 PM
hev added a subscriber: hev.May 25 2023, 12:06 AM
xen0n updated this revision to Diff 526322.May 28 2023, 4:28 AM
  • Rebased
  • Fixed TLS GD/IE bug pointed out by @MQ-mengqing
  • Minor cleanups and test additions (ported from RISCV)
xen0n updated this revision to Diff 526385.May 28 2023, 11:06 PM

Use double dashes for llvm-mc invocations in test cases, and add a comment to the tls-gd-and-ie case documenting its edge-case-ness.

MQ-mengqing added inline comments.May 29 2023, 6:35 PM
lld/ELF/Arch/LoongArch.cpp
419

Since TLS_GD also uses GOT64_PC_{HI20, LO12}, it may have something mistake.

xen0n updated this revision to Diff 526494.May 29 2023, 7:44 PM

more cleanups to the TLS logic

xen0n marked an inline comment as done.May 29 2023, 7:47 PM
xen0n added inline comments.
lld/ELF/Arch/LoongArch.cpp
419

Thanks for pointing out that; seems I've heavily underestimated the amount of reuse happening! I've just unified about as much TLS logic as possible into the existing GOT case (although I don't have time to write the tests until late night).

xen0n added a comment.Jun 7 2023, 1:48 AM

@MQ-mengqing: gentle ping; I believe all issues reported by you are now fixed. Please test if that's really the case. Thanks!

I've tested the TLS-relative tests, this patch works well! Thanks for your works, patience and time!

SixWeining accepted this revision.Jun 8 2023, 12:04 AM

This patch has been reviewed and tested for more than 6 months and all comments are well addressed. It looks good to me from the LoongArch side, so I will accept.
@MaskRay Could you help to review this patch once more? Thanks.

MaskRay requested changes to this revision.Jun 8 2023, 9:27 PM

Thank you for the contribution. This is very well-written. I've put up some comments and will review some technical corners more carefully.

lld/ELF/Arch/LoongArch.cpp
32

It seems that our codebase is not consistent with 0x[a-f] and 0x[A-F], but [a-f] is more common and probably should be preferred for new code.

84

Delete this variable and adjust use sites.

228

This is untested. You can reuse loongarch-elf-flags.s

261

glibc rtld does not use this field. We can delete this override.

342

What are R_LARCH_MARK_LA and R_LARCH_MARK_PCREL? They are untested.

383

https://www.gnu.org/software/libc/ glibc 2.37 has been release. This sentence should be revised.

488

These cases are probably not useful.
Reporting unknown relocation for deprecated/essentially-unused relocation types should be fine.

lld/ELF/Driver.cpp
175

elf32loongarch is untested. Perhaps add RUN lines to emulation-loongarch.s. loongarch-elf-flags.s is possibly not needed.

lld/ELF/InputSection.cpp
682

Is The R_LARCH_GOT64_PC_* relocs are also reused for TLS GD. accurate? Is this just R_LOONGARCH_GOT_PAGE_PC?

684

R_LOONGARCH_GOT_TLSGD_PC does not exist.

lld/ELF/Relocations.cpp
1065

This can be simplified as else if (!sym.isTls() || config->emachine != EM_LOONGARCH)

lld/ELF/ScriptParser.cpp
442

Test this with OUTPUT_FORMAT linker script command.

lld/test/ELF/loongarch-branch-b16.s
11

The encoding is unneeded detail and may change if someone decides to remove spaces between hex pairs. Run llvm-objdump with --no-show-raw-insn

lld/test/ELF/loongarch-pc-aligned.s
16

Consider --section-start

lld/test/ELF/loongarch-pcala-lo12-jirl-shared-error.s
1 ↗(On Diff #526494)

foo-error.s can usually be added into foo.s by leveraging split-file.

lld/test/ELF/loongarch-tls-gd-and-ie.s
1 ↗(On Diff #526494)

What does loongarch-tls-gd-and-ie.s mean?

This revision now requires changes to proceed.Jun 8 2023, 9:27 PM
MaskRay added inline comments.Jun 8 2023, 9:30 PM
lld/test/ELF/loongarch-tls-gd.s
133

Avoid LLVM assembler-specific .tbss not recognized by GNU as.
.section .tbss,"awT",@nobits

xen0n updated this revision to Diff 532600.Jun 19 2023, 4:18 AM
xen0n marked 16 inline comments as done.

rebased and addressed most review comments

xen0n added inline comments.Jun 19 2023, 4:19 AM
lld/ELF/Arch/LoongArch.cpp
342

AFAIK they serve as markers for external consumers only: R_LARCH_MARK_LA is emitted by binutils for each la.abs, and used by (old) Linux and grub to perform their own relocation; R_LARCH_MARK_PCREL is unused so far. Are test cases still needed in this case?

lld/test/ELF/loongarch-pc-aligned.s
16

Strangely I get error: output file too large for some of the cases (like this one; but not with others). Can't see a reason for this...

lld/test/ELF/loongarch-tls-gd-and-ie.s
1 ↗(On Diff #526494)

It's an edge case discovered by @MQ-mengqing and explained in a comment below. I've renamed the test case and moved the comment to top for better visibility.

xen0n updated this revision to Diff 533938.Jun 23 2023, 6:01 AM
xen0n added a comment.Jun 23 2023, 6:01 AM

Don't check e_flags compatibility for input files that does not contain code.

xen0n updated this revision to Diff 534962.Jun 27 2023, 6:52 AM

Add support for the new R_LARCH_64_PCREL reloc that's only added after LoongArch ELF psABI v2.00, but trivial to support and helpful to the ClangBuiltLinux effort.

MaskRay added inline comments.Jul 7 2023, 2:20 PM
lld/ELF/Arch/LoongArch.cpp
22

This can be rephrase to mention what a page is first, then mention that behaviorally equivalent to the AArch64 adrp. Ideally understanding an arch doesn't require understanding another independent one.

28

" (The official ISA manual gives no such "full names".)" this probably should be dropped. The official ISA manual can improve and I don't hope that we push another commit to fix this.

29

It'd be better to define the helpers after class LoongArch.
The AArch64 style in this particular place is something we'd not do.

37

"The higher bits need ..." appears to describe result -= 0x100000000 but is not clear. An example will work better

155

write32le

194

delete braces

224

mention toString(ctx->objectFiles[0]) . See D134198

394

This link gives "This repository has been archived by the owner on Apr 20, 2023. It is now read-only.". Is it stale?

Is R_LARCH_JIRL_LO12 still planned?

437

It seems that https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html is wrong about GR[rd] = PC + SignExtend({si20, 12'b0}, GRLEN).

The low 12-bits of GR[rd] is zeroed.

451

https://github.com/loongson/LoongArch-Documentation/pull/69 is now a stale link as the repo is archived and may be deleted in the future.

lld/test/ELF/loongarch-pc-aligned.s
8

You can use something like --section-start .rodata=0x11000 --section-start=.text=0x11ffc to replace the linker script.

20

I hope that we can reduce the number of ld.lld invocations.

Does it help to increase the number of instructions in .text?

pcalau12i $a0, %pc_hi20(x)
pcalau12i $a0, %pc_hi20(x)
ld.w      $a0, $a0, %pc_lo12(x)
ld.w      $a0, $a0, %pc_lo12(x)
130

This is complex. This needs a comment to explain how these immediate mean.

169

You can define a macro to repeat the code sequence 3 times so that we don't need many extreme*.t

MaskRay added inline comments.Jul 7 2023, 2:41 PM
lld/ELF/Arch/LoongArch.cpp
360

This wording may sound irony for a public project. Also, the paragraph is too long.

We could just R_ABS, but the JIRL instruction reuses the relocation type for a different purpose.
This questionable use case is part of glibc 2.37 Scrt1.o (is it true?), which is linked into user programs, so we have to work around it for a while, even if a new relocation type may be introduced in the future.

I wonder why the medium code model designer decides to use pcalau12i+jirl. Range extension thunks are much better.

370

The long description about "* R_LARCH_PCALA_LO12 just places its computed value unmodified" can be simplified to just mention that the computation is different. How it is different is the job of the comment in relocate.

MaskRay added inline comments.Jul 7 2023, 2:46 PM
lld/ELF/Arch/LoongArch.cpp
406

These R_LARCH_ADD* and R_LARCH_SUB* are untested.

You can use .reloc ., R_LARCH_ADD8, sym to force generating a relocation in a test.

xen0n updated this revision to Diff 538576.Jul 10 2023, 4:04 AM
xen0n marked 19 inline comments as done.

Addressed all review comments:

  • Moved the helpers around the class LoongArch definition
  • Adjusted various comments
    • Toned down some rants
    • Replaced links to archived repo with archive.org snapshots or new refreshed links
    • Other fixes
  • Renamed getLoongArchPageOffset -> getLoongArchPageDelta, fixed its hi32 adjustment logic and added test cases
  • Improve diagnostics message for e_flags mismatch
  • Added tests for R_LARCH_MARK_LA and R_LARCH_{ADD,SUB}{8,16,32,64}
  • write32le the break insn
  • Set defaultCommonPageSize to 16KiB
lld/ELF/Arch/LoongArch.cpp
37

Thanks, I've discovered even the BFD implementation is wrong for some of the edge cases. (This implementation needs a fix as well.) I've added an example and fixed the logic in the latest revision.

342

I've now added R_LARCH_MARK_LA's in the loongarch-abs64.s cases. I can't seem to find any usage of R_LARCH_MARK_PCREL in the wild, not even in the old world binaries, so I'm not sure if removing support for it (instead of treating as no-op per common sense) would be better.

360

I added the rant following R_386_GOT32 and R_386_GOT32X's precedent. I agree though that such wording would better belong somewhere else (e.g. in an issue raised in the ABI specs repo) and I'll use your suggested wording instead.

394

No, AFAIK the repo's information is still current and the repo isn't going away. They archived it for non-technical reasons (maybe not giving the impression that people outside of Loongson can change their official docs at will, before their CLA automation is ready).

I've nevertheless archived the page with archive.org and replaced with a new PR to the document's current home. I expect the discussion to continue but only after they get their CLA mechanism finished.

437

You may be looking at pcaddu12i? That's the same as RISCV auipc. The PCALA relocs all use pcalau12i that indeed has the lower bits zeroing behavior mentioned. See the corresponding documentation source.

451

The repo isn't going anywhere, as explained in another reply, but I've replaced with a new refreshed PR on the currently maintained LA ABI specs repo nevertheless (I've replicated all info there so no context is lost even if the original repo is deleted).

lld/test/ELF/loongarch-pc-aligned.s
8

You can use something like --section-start .rodata=0x11000 --section-start=.text=0x11ffc to replace the linker script.

Sorry, but as I've previously explained as soon as I replace the linker script with the --section-start arguments that really should be equivalent, I instead get errors like ld.lld: error: output file too large: 18446744073709494880 bytes. This doesn't happen with the BFD linker so I suspect something is wrong with the common logic, although I haven't debugged yet.

For now though I've discovered your suggestion somehow works (by ensuring the base address to be always greater than defaultMaxPageSize i.e. 0x10000), and I've amended the cases.

20

Probably no, I have to ensure sufficiently different PC and target values for each case, in order to be able to cover the edge cases. Adding a trivial number of instructions won't help (i.e. I can't put the target value 4GiB away without utilizing --section-start or linker scripts anyway).

Do you mean I should just leave the PC alone and use --defsym for testing multiple offsets at once?

130

I've (programmatically) generated more cases for complete coverage of getLoongArchPageDelta (previously getLoongArchPageOffset), complete with explanations for every immediate there.

169

The linker script inputs are all gone in the latest revision. Can't seem to reduce lld invocations though, happy to see any suggestion.

MaskRay added inline comments.Jul 13 2023, 8:19 PM
lld/ELF/Arch/LoongArch.cpp
112

This example doesn't trigger the interesting case signLo12 != signHi20. We need a different example.

126

excessively indented

220

Delete nullness check.

We probably should change the parameter to a const reference.

223

drop braces

lld/ELF/InputSection.cpp
655

Delete this comment. This is not fallthrough, which usually means [[fallthrough]].

lld/ELF/Relocations.h
110

... from other architectures.

lld/test/ELF/loongarch-branch-b16.s
23

For error messages, we check the error: prefix.

Fix this elsewhere.

24

Add -NEXT whenever applicable.

lld/test/ELF/loongarch-branch-b21.s
23

For error messages, we check the error: prefix.

lld/test/ELF/loongarch-branch-b26.s
2

I think it helps readability if loongarch-branch-b26.s is merged into loongarch-branch-b21.s.

28

For error messages, we check the error: prefix.

lld/test/ELF/loongarch-interlink.test
77

Indent instructions

lld/test/ELF/loongarch-plt.s
84

The column number of 20040: should not be less than the column number of the colon in DIS32-NEXT:.
We may need more indentation for instructions.

xen0n updated this revision to Diff 540853.Jul 16 2023, 9:06 PM
xen0n marked 13 inline comments as done.

Address all review comments.

xen0n added inline comments.Jul 16 2023, 9:08 PM
lld/ELF/Arch/LoongArch.cpp
112

Ah, I copied a wrong fixture from my experiments, sorry for that. I've made new examples to cover both directions of adjustment.

lld/test/ELF/loongarch-branch-b16.s
24

Okay. Here it's not applicable though, as the actual diagnostics have >>> defined in --defsym:1 and an empty line in between, which is irrelevant to our intended assertion.

xen0n added a comment.Jul 19 2023, 8:39 PM

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

This is on my radar and I will take a look again soon. There are also random other LoongArch things I need to take a look and I did. This is totally fine to be cherry picked into the release/17.x branch even after it is created.
I am sorry that I did not respond in time for the initial period of the patch (when it changed from the "WIP" state to the normal in-review state), but I think the request after 3 days may be a bit too quick.
Honestly the linker is probably I care about the most and I don't want to lower the bar. (There are other things that I am concerned with as well, but I may not have too much energy to prevent something less appropriate happen; but if I notice, I'll make a comment).

xen0n added a comment.Jul 19 2023, 9:16 PM

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

This is on my radar and I will take a look again soon. There are also random other LoongArch things I need to take a look and I did. This is totally fine to be cherry picked into the release/17.x branch even after it is created.
I am sorry that I did not respond in time for the initial period of the patch (when it changed from the "WIP" state to the normal in-review state), but I think the request after 3 days may be a bit too quick.
Honestly the linker is probably I care about the most and I don't want to lower the bar. (There are other things that I am concerned with as well, but I may not have too much energy to prevent something less appropriate happen; but if I notice, I'll make a comment).

Thanks for the clarification. I didn't know if the release branch cherry-pick could be done in this case so I was a little anxious, sorry for the less-than-a-week ping.

I fully understand your passion and care for the linker and I agree with not lowering the bar; I'll do my best to improve the code together.

Friendly ping; ideally I'd like this port to get included in LLVM 17, which will vastly simplify things ecosystem-wide, such as ClangBuiltLinux testing, Gentoo LLVM stagebuilding, and possibly many more large projects depending on lld.

This is on my radar and I will take a look again soon. There are also random other LoongArch things I need to take a look and I did. This is totally fine to be cherry picked into the release/17.x branch even after it is created.
I am sorry that I did not respond in time for the initial period of the patch (when it changed from the "WIP" state to the normal in-review state), but I think the request after 3 days may be a bit too quick.
Honestly the linker is probably I care about the most and I don't want to lower the bar. (There are other things that I am concerned with as well, but I may not have too much energy to prevent something less appropriate happen; but if I notice, I'll make a comment).

Thanks for the clarification. I didn't know if the release branch cherry-pick could be done in this case so I was a little anxious, sorry for the less-than-a-week ping.

I fully understand your passion and care for the linker and I agree with not lowering the bar; I'll do my best to improve the code together.

After being disgusted by the code of [another linker] multiple times I now totally agree we must be careful when we write a linker...

MaskRay added inline comments.Jul 21 2023, 12:36 AM
lld/ELF/Arch/LoongArch.cpp
89

I was quite confused by the description, then I realized that I probably should just figure out the formula myself:)

I wonder whether it's clear if we remove these textual descriptions and focus on the deduction:

// We have to check if the higher 32 bits need adjustment too, due to
// the large code model access pattern:
//
//     pcalau12i       A, %foo_hi20(sym)        // b in [0,0xfff]
//     addi.d          T, zero, %foo_lo12(sym)  // pageoff in [0,0xfff]
//     lu32i.d         T, %foo64_lo20(sym)      // c in [0,0xfffff]
//     lu52i.d         T, T, %foo64_hi12(sym)   // d in [0,0xfff]
//     {ldx,stx,add}.* A, A, T
//
// If we treat the immediates in the operands as unsigned, the adjusted page
// delta (return value) is b<<12 | c<<32 | d<<52.
// We have (pageoff<0x800?pageoff:pageoff-0x1000+0x100000000) + ((b<0x80000?b:b-0x100000)<<12) + (c<<32) + (d<<52) = dest - page(pc).
//
// When pageoff is 0xfff, its contribution to dest-page(pc) is effectively 0xfff-0x1000 + 0x100000000 as the high 32-bits are
// set by lu32i.d+lu52i.d, e.g.
//
//     addi.d    T, zero, -1  ; T = 0xfff'fffff'fffff'fff
//     lu32i.d   T, 1         ; T = 0x000'00001'fffff'fff
//     lu52i.d   T, T, -1     ; T = 0xfff'00001'fffff'fff
229

We probably should change the parameter to a const reference.

Not done?

Reminder: update lld/docs/ReleaseNotes.rst and lld/docs/index.rst when we finish this.

xen0n updated this revision to Diff 543850.Jul 25 2023, 12:05 AM

Address review comments from @MaskRay and @SixWeining:

  • Rewritten the docs for getLoongArchPageDelta, detailing the calculations;
  • Added a missing const qualifier (I apparently misunderstood the review comment);
  • Updated the docs.
MaskRay accepted this revision.Jul 25 2023, 12:41 AM
MaskRay added inline comments.
lld/ELF/Arch/LoongArch.cpp
24

delete blank line

42

delete blank line

115

<< needs some parentheses due to operator precedence :)

lld/test/ELF/loongarch-abs64.s
66

delete trailing blank line

lld/test/ELF/loongarch-pcala-lo12-jirl-shared.s
18

Delete addresses for instructions that are not referenced. Fewer addresses makes test updating easier.

This revision is now accepted and ready to land.Jul 25 2023, 12:41 AM
xen0n updated this revision to Diff 543861.Jul 25 2023, 12:55 AM

(hopefully) final cosmetic fixes:

  • Removed several extra blank lines
  • Slightly tweaked getLoongArchPageDelta doc wording again and added parens so the operators in the pseudo-code follows C precedence
  • Removed several extraneous disassembly addresses from the loongarch-pcala-lo12-jirl-shared.s test case
xen0n marked 7 inline comments as done.Jul 25 2023, 1:02 AM

Thanks for all the careful reviews that helped tremendously in improving the code quality (and forcing me to improve my ELF wrangling skills).

I've gone through the especially hard parts (getLoongArchPageDelta in particular; I had to JIT some funclets to ensure the operation is 100% correct) and I believe they should be good to go... And if things go awry I would surely be around to fix. Thanks go to everyone who has helped reviewing this code.

A question: does LLD automatically supports pack-relative-relocs once a new port is added? I'm trying to implement it for LoongArch BFD linker but it seems I don't have enough skill.

xen0n added a comment.Jul 25 2023, 1:37 AM

A question: does LLD automatically supports pack-relative-relocs once a new port is added? I'm trying to implement it for LoongArch BFD linker but it seems I don't have enough skill.

It seems to be the case. Last time I did the Gentoo LLVM bootstrap I saw binaries full with DT_RELR (I didn't remember if I manually forced the flag or not; Gentoo seems to enable it in some base profile anyway) and the glibc rtld consumed them just fine.

This revision was landed with ongoing or failed builds.Jul 25 2023, 2:11 AM
This revision was automatically updated to reflect the committed changes.
xen0n added a comment.Jul 25 2023, 2:12 AM

A question: does LLD automatically supports pack-relative-relocs once a new port is added? I'm trying to implement it for LoongArch BFD linker but it seems I don't have enough skill.

It seems to be the case. Last time I did the Gentoo LLVM bootstrap I saw binaries full with DT_RELR (I didn't remember if I manually forced the flag or not; Gentoo seems to enable it in some base profile anyway) and the glibc rtld consumed them just fine.

I've just confirmed with a fresh QEMU build that the support is indeed working end-to-end.

It looks like there's a test failure happening in post-commit from one of the newly added tests: https://lab.llvm.org/buildbot/#/builders/42/builds/10617 Can you investigate (and revert if that's going to take a while to resolve)?

It looks like there's a test failure happening in post-commit from one of the newly added tests: https://lab.llvm.org/buildbot/#/builders/42/builds/10617 Can you investigate (and revert if that's going to take a while to resolve)?

I guess it's D156293.

ruiu added a subscriber: ruiu.Aug 8 2023, 1:16 AM
ruiu added inline comments.
lld/ELF/Arch/LoongArch.cpp
163–164

Don't you need to recompute negativeA after adding 0x1000 to result? That yields a different result if the original result` value happened to be 0x7fff'f000.