This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Support R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 for .uleb128 directives
ClosedPublic

Authored by MaskRay on Aug 10 2023, 2:46 PM.

Details

Summary

For a label difference like .uleb128 A-B, MC folds A-B even if A and B
are separated by a RISC-V linker-relaxable instruction. This incorrect
behavior is currently abused by DWARF v5 .debug_loclists/.debug_rnglists
(DW_LLE_offset_pair/DW_RLE_offset_pair entry kinds) implemented in
Clang/LLVM (see https://github.com/ClangBuiltLinux/linux/issues/1719 for
an instance).

https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/96d6e190e9fc04a8517f9ff7fb9aed3e9876cbd6
defined R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128. This patch generates such
a pair of relocations to represent A-B that should not be folded.
GNU assembler computes the directive size by ignoring shrinkable section
content, therefore after linking the value of A-B cannot use more bytes
than the reserved number (final size of uleb128 value at offset ... exceeds available space).
We make the same assumption.

w1:
  call foo
w2:
  .space 120
w3:
.uleb128 w2-w1  # 1 byte, 0x08
.uleb128 w3-w1  # 2 bytes, 0x80 0x01

We do not conservatively reserve 10 bytes (maximum size of an uleb128
for uint64_t) as that would pessimize DWARF v5
DW_LLE_offset_pair/DW_RLE_offset_pair, nullifying the benefits of
introducing R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 relocations.

The supported expressions are limited. For example,

  • non-subtraction .uleb128 A is not allowed
  • .uleb128 A-B: report an error unless A and B are both defined and in the same section

The new cl::opt -riscv-uleb128-reloc can be used to suppress the
relocations.

Diff Detail

Event Timeline

MaskRay created this revision.Aug 10 2023, 2:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2023, 2:46 PM
MaskRay requested review of this revision.Aug 10 2023, 2:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2023, 2:46 PM

We do not conservatively reserve 10 bytes (maximum size of an uleb128
for uint64_t) as that would pessimize DWARF v5
DW_LLE_offset_pair/DW_RLE_offset_pair, nullifying the benefits of
introducing R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 relocations.

That fixed one of major issue in my version (D142880), it will always reserve 10 bytes instead of just reserve enough byte, and apparently I didn't handle handle debug info properly.

RISC-V part LGTM, thanks :)

jrtc27 added inline comments.Aug 16 2023, 4:13 PM
llvm/include/llvm/MC/MCFixup.h
48

I guess the sign is relevant here but not for the others because we have to know how to pad it? (Though what about FK_Data_8 on a 32-bit target?..)

llvm/test/MC/ELF/RISCV/gen-dwarf.s
53

Is this actually the null symbol or has this just not read in the symbol table properly?

llvm/test/MC/RISCV/leb128.s
33–34

Double spaces here

Also why is this one CHECK but w2-w1 is RELAX?

35

Make this CHECK and move it down as it's shared?

MaskRay updated this revision to Diff 550990.Aug 16 2023, 9:28 PM
MaskRay marked 4 inline comments as done.

address comments

llvm/include/llvm/MC/MCFixup.h
48

I agree. Changed to FK_Data_leb128.

(Though what about FK_Data_8 on a 32-bit target?..)

They usually lead to errors.

llvm/test/MC/ELF/RISCV/gen-dwarf.s
53

This is a symbol with an empty name, not symbol index 0.

llvm-readobj since 82b4368f7f89e2b16b9eb09787b7a7af07e3b7cd (Jul 2023) prints <null>, matching GNU readelf.

vpalatin added inline comments.
llvm/lib/MC/MCAssembler.cpp
1020–1022

I'm seeing a failure on 'lld :: MachO/compact-unwind-lsda-folding.s' with this part of the change

******************** TEST 'lld :: MachO/compact-unwind-lsda-folding.s' FAILED ********************
Script:
--
: 'RUN: at line 6';   rm -rf build.ASSERT/tools/lld/test/MachO/Output/compact-unwind-lsda-folding.s.tmp; mkdir build.ASSERT/tools/lld/test/MachO/Output/compact-unwind-lsda-folding.s.tmp
: 'RUN: at line 7';   build.ASSERT/bin/llvm-mc -filetype=obj -triple=x86_64-apple-macos11.0 -emit-compact-unwind-non-canonical=true -o build.ASSERT/tools/lld/test/MachO/Output/compact-unwind-lsda-folding.s.tmp/lsda.o lld/test/MachO/compact-unwind-lsda-folding.s
: 'RUN: at line 8';   ld64.lld -arch x86_64 -platform_version macos 11.0 11.0 -syslibroot lld/test/MachO/Inputs/MacOSX.sdk -lSystem -fatal_warnings -dylib --icf=all -lSystem -lc++ -o build.ASSERT/tools/lld/test/MachO/Output/compact-unwind-lsda-folding.s.tmp/liblsda.dylib build.ASSERT/tools/lld/test/MachO/Output/compact-unwind-lsda-folding.s.tmp/lsda.o
: 'RUN: at line 9';   build.ASSERT/bin/llvm-objdump --macho --syms --unwind-info build.ASSERT/tools/lld/test/MachO/Output/compact-unwind-lsda-folding.s.tmp/liblsda.dylib | build.ASSERT/bin/FileCheck lld/test/MachO/compact-unwind-lsda-folding.s
--
Exit Code: 1

Command Output (stderr):
--
lld/test/MachO/compact-unwind-lsda-folding.s:87:21: error: .uleb128 expression is not absolute
        .uleb128 Lfunc_end0-Ltmp1               ##   Call between Ltmp1 and Lfunc_end0
MaskRay updated this revision to Diff 553848.Aug 27 2023, 11:49 PM
MaskRay marked an inline comment as done.

work around Mach-O .subsections_via_symbols

llvm/lib/MC/MCAssembler.cpp
1020–1022

Thanks. I added a llvm/test/MC test in 04c1e521b9547944333379528e8519f14e4a665c as this hasn't been tested by llvm/ before. Thanks to smeenai for creating this test!

Ready for re-review:)

Ping:)

(The lld patch may take some time. I'll be out of town for about 10 days.)

asb accepted this revision.Nov 8 2023, 3:05 AM

LGTM. I think Kito and Jessica are best placed to review this and it already got a LGTM from Kito. I think we should go ahead and merge this.

This revision is now accepted and ready to land.Nov 8 2023, 3:05 AM

LGTM. I think Kito and Jessica are best placed to review this and it already got a LGTM from Kito. I think we should go ahead and merge this.

Thanks! I'll merge this tomorrow if @jrtc27 does not have blocking comments.

This revision was landed with ongoing or failed builds.Nov 9 2023, 9:27 AM
This revision was automatically updated to reflect the committed changes.
nikic added a subscriber: nikic.Nov 18 2023, 3:17 AM

It looks like this change has substantially increased the memory usage of ReleaseLTO-g builds on X86. Here are the max-rss stats for the linking step (data extracted from here):

kc.link 	232MiB 	238MiB (+2.58%)
sqlite3.link 	212MiB 	221MiB (+3.98%)
consumer-typeset.link 	158MiB 	161MiB (+1.99%)
bullet.link 	199MiB 	202MiB (+1.62%)
tramp3d-v4.link 	683MiB 	713MiB (+4.41%)
clamscan.link 	231MiB 	238MiB (+2.97%)
lencod.link 	274MiB 	281MiB (+2.75%)
7zip-benchmark.link 	457MiB 	473MiB (+3.60%)

So this increases memory usage by about 2-5% for builds using optimization and debug info.

Can something be done about this? It seems pretty bad that a RISCV specific change pessimizes other targets to this degree.

craig.topper added inline comments.
llvm/include/llvm/MC/MCFragment.h
438

Should this Contents field be removed?

MaskRay marked an inline comment as done.Mon, Nov 20, 10:50 PM
MaskRay added inline comments.
llvm/include/llvm/MC/MCFragment.h
438

Thanks for 3b916ad6733b04a86ca0aec57be647daf4647d5b [MC] Remove duplicate Contents field from MCLEBFragment.

Another thing we can tune is the template arguments in MCEncodedFragmentWithFixups<10, 1>, e.g. <8, 0> <8, 1> <10, 1>.

nikic added inline comments.Mon, Dec 11, 2:44 AM
llvm/include/llvm/MC/MCFragment.h
438