This is an archive of the discontinued LLVM Phabricator instance.


Differential D100835

[WIP][LLD][RISCV] Linker Relaxation
AbandonedPublic

Authored by MaskRay on Apr 20 2021, 2:54 AM.


Details

Reviewers

ruiu
jrtc27
luismarques
gkm

Summary

This patch fully implements linker relaxation for RISC-V including relaxation for R_RISCV_CALL, R_RISCV_HI20/LO12, R_RISCV_PCREL_HI20/LO12 and handling for R_RISCV_ALIGN. For reference, there were some previous efforts/discussions to implement linker relaxation in D77694 and D79105.

As linker relaxation is highly specific to RISC-V at the moment, most of the work is done in the Target::finalizeContents() function and isolated from generic code. For now I'm avoiding trying to come up with a common relaxation framework for multiple targets as their needs may be greatly different.

The relaxation process is split into several passes:

For each executable input section, search its relocation vector for R_RISCV_RELAX and determine how to relax the previous relocation.
If the relocation is R_RISCV_CALL (auipc+jalr pair), try to relax to jal or c.jal if the jump target is in range. This assumes that the PC-relative offset can only become smaller during the relaxation process.
If the relocation is R_RISCV_HI20/LO12 (absolute addressing) and the target symbol can be addressed from __global_pointer$ (defaulted to .sdata+0x800), delete the lui and rewrite the lo part to use gp as source register.
If the relocation is R_RISCV_PCREL_HI20/LO12, relaxation requires two passes, as PCREL_LO12 links to its PCREL_HI20 for address calculation and they may appear in arbitrary order in an input section. To preserve the lookup from PCREL_LO12, the first pass relaxes/deletes PCREL_LO12 and the second pass handles PCREL_HI20. The implementation is simpler than that in bfd because lld doesn't allow addends in PCREL_LO12 (https://github.com/riscv/riscv-elf-psabi-doc/issues/184), and so both parts can be relaxed independently.
The ranges of bytes are deleted after processing relaxation for a whole input section. This requires adjusting section content, symbol addresses/sizes and relocation offsets, which is handled in InputSectionBase::deleteRanges. Compared to other approaches, we use an algorithm that is not quadratic, sorting symbols, relocations and bytes to be deleted by offset.
After relaxation, handle alignment, as symbol addresses are now fixed modulo section alignment. This is always enabled regardless of the --relax option as this is required for correctness.
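
As an illustration of the R_RISCV_CALL step described above, here is a minimal standalone sketch of how the replacement encoding could be chosen from the PC-relative offset. It mirrors the checks visible in the patch's relaxCall (12-bit range for c.j/c.jal, 21-bit range for jal, c.jal only on RV32), but `CallForm`, `fitsSigned` and `pickCallForm` are hypothetical names, not lld APIs, and the offset is assumed to already include any worst-case alignment slack.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for illustration; not an lld enum.
enum class CallForm { AuipcJalr, Jal, CJ, CJal };

// True if v fits in an N-bit signed immediate (same idea as llvm::isInt<N>).
template <unsigned N> constexpr bool fitsSigned(int64_t v) {
  return v >= -(int64_t(1) << (N - 1)) && v < (int64_t(1) << (N - 1));
}

// rd is the link register of the jalr: 0 (x0) for a tail call, 1 (ra) for a call.
inline CallForm pickCallForm(int64_t offset, unsigned rd, bool rvc, bool is64) {
  assert(rd < 32 && "rd is a 5-bit register number");
  if (rvc && fitsSigned<12>(offset)) {
    if (rd == 0)
      return CallForm::CJ;   // c.j: saves 6 of the 8 bytes
    if (rd == 1 && !is64)
      return CallForm::CJal; // c.jal exists only on RV32
  }
  if (fitsSigned<21>(offset))
    return CallForm::Jal;    // jal rd: saves 4 of the 8 bytes
  return CallForm::AuipcJalr; // out of range, leave the pair alone
}
```

The compressed forms are tried first because they delete more bytes; any other link register falls through to the plain jal check.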

There are still some issues to solve:

--emit-relocs will be broken on relaxed sections, as it currently just copies the corresponding .rela section verbatim from the input. It needs to be fixed to build relocation entries from the section's relocation vector.
The patch adds two additional RelExpr (R_RISCV_GPREL and R_RELAX_HINT) which unfortunately makes it unable to fit into a 64-bit mask, so it is now changed back to sequential checking. I think it should be somehow split into target-independent (R_ABS, R_PC, ...) exprs and target-dependent (R_RISCV_..., R_MIPS_...) exprs which can overlap in numeric range, but it is probably out of the scope of this patch.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

PkmX created this revision.Apr 20 2021, 2:54 AM

Herald added subscribers: vkmr, frasercrmck, dang and 28 others. · View Herald TranscriptApr 20 2021, 2:54 AM

PkmX requested review of this revision.Apr 20 2021, 2:54 AM

Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2021, 2:54 AM

Herald added a subscriber: llvm-commits. · View Herald Transcript

Why write your own deleteRanges? I took great care in writing mine to ensure there were no hidden quadratic complexities, but your simpler version has introduced them. Just take mine, unless there is a problem with it nobody's mentioned. Same goes for mutableData being a less efficient version of mine.

I also don't like making this overly target-specific. We should take the time to get the interface right, not just punt on it and shove it all into target-specific code.

lld/ELF/Arch/RISCV.cpp
643	Can we not just keep two iterators around? That feels nicer to me than careful +1s everywhere (e.g. it's not immediately obvious this won't crash on the first element until you look at the start point of the loop).
663–664	This condition seems unnecessary
712	When would the addend ever be 0? That seems like invalid input.
759–760	Why a limit of 5? It is guaranteed to converge, I don't see why there needs to be a limit, and it'll probably take just 1 or 2 almost all of the time.
766	Not for relocatable
lld/ELF/InputSection.cpp
180–181	This looks quadratic
201–202	As does this
212–213	And this
lld/ELF/InputSection.h
167	Make the relevant fields mutable and mark this const, like data()
178	std::map seems like a high-overhead way to store what is just a sorted list of intervals.
256	Why do we need a whole new pointer when rawData is also being updated?
lld/ELF/Relocations.cpp
129	(a) this is unrelated to the diff (b) this completely removes the whole point of oneof
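
For the "two iterators" suggestion on RISCV.cpp:643, a minimal standalone sketch of what such a loop could look like follows. `Reloc`, `RelType` and `forEachRelaxable` are simplified, hypothetical stand-ins for lld's Relocation type and the patch's processRelaxations; only the iteration pattern is the point.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified, hypothetical stand-ins for lld's relocation types.
enum RelType : uint32_t { R_OTHER, R_CALL, R_RELAX, R_NONE };
struct Reloc {
  RelType type;
  uint64_t offset;
};

// Calls f on every relocation that is immediately followed by an R_RELAX at
// the same offset. If f reports success, the R_RELAX marker is neutralized.
template <typename F> void forEachRelaxable(std::vector<Reloc> &rels, F f) {
  if (rels.empty())
    return;
  // prev and cur advance together, so no "+1"/"-1" arithmetic is needed.
  for (auto prev = rels.begin(), cur = prev + 1; cur != rels.end();
       ++prev, ++cur) {
    if (cur->type != R_RELAX || cur->offset != prev->offset)
      continue;
    if (f(*prev))
      cur->type = R_NONE;
  }
}

// Usage example: only the first pair qualifies; the last R_RELAX sits at a
// different offset from its predecessor and is skipped.
inline int exampleRelaxCount() {
  std::vector<Reloc> rels = {
      {R_CALL, 0}, {R_RELAX, 0}, {R_CALL, 8}, {R_OTHER, 12}, {R_RELAX, 16}};
  int visited = 0;
  forEachRelaxable(rels, [&](Reloc &) {
    ++visited;
    return true;
  });
  assert(rels[1].type == R_NONE && rels[4].type == R_RELAX);
  return visited;
}
```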

Harbormaster completed remote builds in B99665: Diff 338775.Apr 20 2021, 5:02 AM

PkmX added inline comments.Apr 20 2021, 8:36 PM

lld/ELF/Arch/RISCV.cpp
663–664	It's necessary as `processRelaxations` needs to access two iterators which would be invalid if the container is empty. I will move this into `processRelaxations` which is where it should be in the next patch.
759–760	Will be removed in the next patch.
766	Will do in the next patch.
lld/ELF/InputSection.cpp
180–181	This isn't quadratic. We are really looping over the symbols in this section while simultaneously accumulating the number of bytes deleted. Since symbols and delete ranges are sorted by offset, if we know X bytes are deleted before offset A, and B > A, we only have to add up delete ranges between A and B to know the total number of bytes deleted before B. In essence every symbol and delete range is only visited once in this loop.
lld/ELF/InputSection.h
178	I agree. `std::vector<std::pair<uint64_t, uint64_t>>` should be sufficient, although it needs to be sorted for `deleteRanges`, as the vector may not be sorted with the two-pass handling for `PCREL_HI20`.
256	Will change it to use a `copiedData` bool as in your patch.
lld/ELF/Relocations.cpp
129	As noted in the summary, adding more `RelExpr`s makes it unable to fit in a 64-bit mask, so this is a temporary change to make this patch work for now. There were already 64 `RelExpr`s before this patch, so there is no room for growth and I don't think it is workable in the long run. The only solution I can think of is to split them into target-specific `RelExpr`s, with some common exprs like `R_ABS` being shared, so as long as a target doesn't define more than ~20 `RelExpr`s it should be fine. However, this probably belongs in another patch.
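
The single-pass argument made above for InputSection.cpp:180–181 can be sketched in isolation as follows. This is not the patch's deleteRanges; `DeleteRange` and `adjustOffsets` are hypothetical names, and clamping an offset that falls inside a deleted range to the start of that range is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DeleteRange {
  uint64_t offset;
  uint64_t size;
};

// Both inputs must be sorted by offset and the ranges must not overlap.
// Returns, for each input offset, its offset after the ranges are removed.
// Every offset and every range is visited once: O(offsets + ranges).
inline std::vector<uint64_t>
adjustOffsets(const std::vector<uint64_t> &offsets,
              const std::vector<DeleteRange> &ranges) {
  std::vector<uint64_t> out;
  out.reserve(offsets.size());
  uint64_t deleted = 0; // bytes deleted before the current offset
  uint64_t last = 0;
  size_t i = 0;         // first range not yet fully passed
  for (uint64_t off : offsets) {
    assert(off >= last && "offsets must be sorted");
    last = off;
    // Account for ranges that end at or before this offset.
    while (i < ranges.size() && ranges[i].offset + ranges[i].size <= off) {
      deleted += ranges[i].size;
      ++i;
    }
    uint64_t adjusted = off - deleted;
    // Assumption: an offset inside a deleted range moves to the range start.
    if (i < ranges.size() && ranges[i].offset < off)
      adjusted = ranges[i].offset - deleted;
    out.push_back(adjusted);
  }
  return out;
}
```

The running `deleted` total is what avoids rescanning earlier ranges for each symbol or relocation.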

PkmX added inline comments.Apr 21 2021, 1:01 AM

lld/ELF/Arch/RISCV.cpp
712	Yes, valid addends should only be 2^n-2 for n >= 2. With that said I don't think lld is really checking this anywhere, so malicious input could cause lld to crash.
lld/ELF/Relocations.cpp
129	Alternatively I think we can use two 64-bit masks, so this should work up to 128 `RelExpr`s.
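
To make the R_RISCV_ALIGN arithmetic behind the addend discussion concrete, here is a small standalone sketch. It follows the computation visible in the patch's relaxAlign (alignment is the next power of two at or above addend + 2; the padding actually needed is the distance from pc to the next aligned address), but `planAlign`, `AlignPlan` and `powerOf2Ceil` are hypothetical helpers, and pc is assumed to already account for bytes deleted earlier in the section.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two >= v; a stand-in for llvm::PowerOf2Ceil, intended
// for the small values that appear as alignment addends.
inline uint64_t powerOf2Ceil(uint64_t v) {
  uint64_t p = 1;
  while (p < v)
    p <<= 1;
  return p;
}

struct AlignPlan {
  uint64_t nopBytes = 0;      // padding that must remain
  uint64_t bytesToDelete = 0; // excess padding that can be removed
};

// Returns false when the addend is zero or the required padding cannot be
// expressed with the available nop sizes or exceeds the reserved bytes.
inline bool planAlign(uint64_t pc, uint64_t addend, bool rvc, AlignPlan &plan) {
  assert(pc % 2 == 0 && "RISC-V instructions are at least 2-byte aligned");
  if (addend == 0)
    return false;
  uint64_t alignment = powerOf2Ceil(addend + 2);
  uint64_t nopBytes = (alignment - pc % alignment) % alignment;
  if (nopBytes % (rvc ? 2 : 4) != 0 || nopBytes > addend)
    return false;
  plan.nopBytes = nopBytes;
  plan.bytesToDelete = addend - nopBytes;
  return true;
}
```

For example, with an addend of 14 (a 16-byte alignment request) at pc 0x1004, 12 bytes of nops stay and 2 bytes are deleted.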

This addresses some of the comments by @jrtc27:

Check for empty relocation vector is moved into processRelaxations
Remove the iteration limit for relaxation
Don't perform alignment handling for relocatable link
No more mutData in InputSectionBase, now it uses copiedData to check if a mutable copy has been made.
DeleteRanges is now a std::vector of offset and size pairs.
oneof now tests for two 64-bit masks depending on which subset (<64 or >= 64) the expr is in.
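
A rough standalone sketch of the two-mask idea from the last item follows. The `RelExpr` enum here is a shortened, hypothetical stand-in with made-up numeric values (the real enum and its values live in lld/ELF/Relocations.h), and `buildMask`/`oneof` are written for illustration rather than copied from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, shortened stand-in for lld's RelExpr; values are made up.
enum RelExpr : unsigned {
  R_ABS = 0,
  R_PC = 1,
  R_GOT = 2,
  R_RISCV_GPREL = 64,
  R_RELAX_HINT = 65,
};

// Mask of the listed exprs that fall in the given half:
// Half 0 covers values 0..63, Half 1 covers values 64..127.
template <unsigned Half> constexpr uint64_t buildMask() { return 0; }

template <unsigned Half, RelExpr Head, RelExpr... Tail>
constexpr uint64_t buildMask() {
  return (Head / 64 == Half ? uint64_t(1) << (Head % 64) : uint64_t(0)) |
         buildMask<Half, Tail...>();
}

// True if expr is one of Exprs. Both masks are compile-time constants, so
// the runtime cost is one branch, one shift and one AND.
template <RelExpr... Exprs> bool oneof(RelExpr expr) {
  assert(expr < 128 && "RelExpr is too large for a 128-bit mask");
  if (expr >= 64)
    return ((uint64_t(1) << (expr - 64)) & buildMask<1, Exprs...>()) != 0;
  return ((uint64_t(1) << expr) & buildMask<0, Exprs...>()) != 0;
}
```

Keeping the halves separate matters because values such as 1 and 65 share a bit position; each is only ever tested against the mask for its own half.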

arichardson added inline comments.Apr 21 2021, 4:46 AM

lld/ELF/Relocations.cpp
132–135	This change should be a separate review. I would very much like support for > 64 RelExpr values to land upstream since we also had to make that change for our CHERI LLD.

Harbormaster completed remote builds in B99946: Diff 339181.Apr 21 2021, 5:27 AM

In D100835#2701062, @jrtc27 wrote:

Why write your own deleteRanges? I took great care in writing mine to ensure there were no hidden quadratic complexities, but your simpler version has introduced them. Just take mine, unless there is a problem with it nobody's mentioned. Same goes for mutableData being a less efficient version of mine.

I also don't like making this overly target-specific. We should take the time to get the interface right, not just punt on it and shove it all into target-specific code.

+1 for a low time complexity implementation. The quadratic time complexity of the binutils approach may run into scalability problems.

I haven't closely looked into this yet, but it seems that R_RISCV_GPREL_* support can be contributed separately.

In D100835#2707537, @MaskRay wrote:

+1 for a low time complexity implementation. The quadratic time complexity of the binutils approach may run into scalability problems.

I haven't closely looked into this yet, but it seems that R_RISCV_GPREL_* support can be contributed separately.

Do you mean the whole gp-relaxation support? As only implementing relocations for R_RISCV_GPREL seems pointless as they are only generated and consumed by the linker.

lld/ELF/Relocations.cpp
132–135	That's okay for me. I will split this out into a new patch.

Ping. What is the plan for this revision, @PkmX?

VincentWu added a subscriber: VincentWu.Aug 30 2021, 7:53 AM

MaskRay mentioned this in D112385: [ELF] Support 128-bit bitmask in oneof(RelExpr).Oct 24 2021, 12:04 PM

MaskRay mentioned this in rGa14ccaf5098a: [ELF] Support 128-bit bitmask in oneof(RelExpr).Oct 25 2021, 1:05 PM

liaolucy added a subscriber: liaolucy.Dec 20 2021, 12:48 AM

Herald added subscribers: luke957, achieveartificialintelligence. · View Herald TranscriptDec 20 2021, 12:48 AM

abukharmeh added a subscriber: abukharmeh.Mar 3 2022, 10:18 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 10:18 AM

Herald added subscribers: • pcwang-thead, eopXD. · View Herald Transcript

hudson-ayers added a subscriber: hudson-ayers.Mar 9 2022, 10:50 AM

I rebuilt LLVM with this patch and used it to build a non-public RISC-V binary, compiled with -Os + LTO. I found that it reduced code size of a ~450 kB application by over 28 kB (6.2%). That application is a mixture of Rust and C code; the savings from rustc generated code and Clang generated code were similar. There are also accompanying performance improvements thanks to the removed unnecessary instructions.

It would be really nice to see this patch merged given the magnitude of the improvements.

gkm added a subscriber: gkm.Mar 28 2022, 11:01 AM

Herald added subscribers: • s, StephenFan. · View Herald TranscriptMar 28 2022, 11:01 AM

alistair23 added a subscriber: alistair23.Apr 4 2022, 5:19 PM

Herald added a subscriber: sunshaoce. · View Herald TranscriptApr 4 2022, 5:19 PM

Thanks for the patch. It would be great to finally have linker relaxations for RISC-V in LLD.

This patch fully implements linker relaxation for RISC-V

Well, it doesn't implement things like:

  lui a0, %hi(x)
  addi a0, a0, %lo(x)
->
  addi a0, zero, %lo(x)

For a %lo(x) that fits in the 12-bit addi immediate. BFD implements that.

Likewise for auipc+addi with %pcrel_*. It's only implemented for gp / __global_pointer$, not for x0.

The patched LLD also doesn't seem to support relaxations for an explicitly defined __global_pointer$. Whether it's supposed to or not, I think that case should be included in the tests.
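
To spell out the distinction raised in this comment, here is a minimal standalone sketch of the decision between keeping lui+addi, using x0 as the base register, and using gp as the base register. None of this is from the patch: `AbsForm`, `fitsSImm12`, `encodeImm12` and `pickAbsForm` are hypothetical names, addresses are treated as 64-bit values, and trying x0 before gp is an arbitrary choice for the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for illustration; not an lld enum.
enum class AbsForm { LuiAddi, ZeroBase, GpBase };

// True if v fits in the 12-bit signed immediate of addi/load/store.
constexpr bool fitsSImm12(int64_t v) { return v >= -2048 && v <= 2047; }

// The 12-bit field value that would be written into the instruction.
inline uint32_t encodeImm12(int64_t v) {
  assert(fitsSImm12(v));
  return static_cast<uint32_t>(v) & 0xfff;
}

// target: absolute address of the symbol plus addend.
// haveGp/gp: whether __global_pointer$ is defined, and its address.
inline AbsForm pickAbsForm(uint64_t target, bool haveGp, uint64_t gp) {
  // x0-relative: the address itself fits, so the lui can go and the low
  // part can use the zero register as its base.
  if (fitsSImm12(static_cast<int64_t>(target)))
    return AbsForm::ZeroBase;
  // gp-relative: the distance from gp fits, so the low part can use gp.
  if (haveGp && fitsSImm12(static_cast<int64_t>(target - gp)))
    return AbsForm::GpBase;
  return AbsForm::LuiAddi;
}
```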

luismarques added inline comments.Apr 12 2022, 7:11 AM

lld/ELF/Relocations.cpp
132–135	"we also had to make that change for our CHERI LLD." / "That's okay for me. I will split this out into a new patch." I ran into the same problem in another context. It would be good to have some guidance about the best long-term way forward to address this (e.g. two 64-bit masks?). @ruiu? @PkmX Are you still planning to submit your separate patch for this mask issue? And, more broadly, to update this LLD RISC-V relaxation patch?

This was discussed in the RISC-V sync-up call. As suggested by Alex, I'm sharing in writing here some of the things I mentioned or were discussed in the call:

I found a linking failure issue with this patch. I hadn't yet reported that here because I am still in the process of reducing and analyzing the problematic case.
We touched on the topic of relaxation performance and I shared a graph of my benchmarking results of LLD with this patch compared with BFD (the graph is not preserved in this archive).
@gkm reported that he was also analyzing/working on this. If anybody else is planning to work on this it would be good to coordinate with him and the rest of the community, to avoid duplicate work.

lld/ELF/Arch/RISCV.cpp
502	This is also called for R_RISCV_CALL_PLT

I have also planned to pick up the work. I think binutils is trying to improve their time complexity, and we should just aim for the best time complexity initially.

In D100835#3452256, @MaskRay wrote:

I have also planned to pick up the work. I think binutils is trying to improve their time complexity, and we should just aim for the best time complexity initially.

Is this patch lacking in some way in terms of performance (absolute or asymptotic)?

In D100835#3452269, @luismarques wrote:

Is this patch lacking in some way in terms of performance (absolute or asymptotic)?

Asymptotic. See https://reviews.llvm.org/D100835#2707537
It has been a long time since I looked at this patch, I'll need to re-read :)

In D100835#3452280, @MaskRay wrote:

Asymptotic. See https://reviews.llvm.org/D100835#2707537
It has been a long time since I looked at this patch, I'll need to re-read :)

I think that’s been addressed?

In D100835#3452066, @luismarques wrote:

I found a linking failure issue with this patch. I hadn't yet reported that here because I am still in the process of reducing and analyzing the problematic case.

It was a big pain to do the initial steps of the reduction (I had to start with the object files) but, eventually, I managed to do it and from there I reduced it further. Here's a simpler version of the problematic case:

$ cat test.s
.global a
a:
tail a
.rept 2008
.byte 0
.endr

$ clang --target=riscv64 -c test.s
$ ld.lld -shared test.o
ld.lld: error: test.o:(.text+0x0): relocation R_RISCV_RVC_JUMP out of range: 1028 is not in [-1024, 1023]; references a
>>> defined in test.o

In D100835#3456472, @luismarques wrote:

It was a big pain to do the initial steps of the reduction (I had to start with the object files) but, eventually, I managed to do it and from there I reduced it further. Here's a simpler version of the problematic case:

BTW, the issue seems to be that the call is going through the PLT but the relaxation offset check assumes that it's jumping directly to the symbol so it incorrectly concludes that it's within bounds to relax to a compressed jump. (It seems like we could avoid going through the PLT anyway. If so, it would be good to optimize that, as that's a preexisting behavior).

a could be preempted, so no, it has to go via the PLT for that specific example.

Incidentally, this issue of relaxation correctness is why I want the optimisation relaxations separated out from the alignment support that’s required; I can see the optimisations otherwise holding up supporting but not optimising -mrelax code.
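
The range-check problem discussed above can be illustrated with a small standalone sketch: the offset has to be measured to wherever the call will actually land, which is the PLT entry when the symbol is routed through the PLT. `CallTarget`, `callDestination` and `canRelaxToCJ` are hypothetical names for illustration, not lld code, and the addresses in any usage are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of a call target for illustration.
struct CallTarget {
  uint64_t symVA; // address of the symbol's definition
  uint64_t pltVA; // address of its PLT entry, if any
  bool inPlt;     // true if the call must go through the PLT
                  // (e.g. a preemptible symbol in a -shared link)
};

// The address the jump will really be relocated against.
inline uint64_t callDestination(const CallTarget &t) {
  assert((!t.inPlt || t.pltVA != 0) && "a PLT-routed call needs a PLT entry");
  return t.inPlt ? t.pltVA : t.symVA;
}

// c.j reaches a 12-bit signed byte offset from pc.
inline bool canRelaxToCJ(uint64_t pc, const CallTarget &t) {
  int64_t offset = static_cast<int64_t>(callDestination(t) - pc);
  return offset >= -2048 && offset <= 2047;
}
```

Checking against `symVA` alone would accept a jump whose PLT entry lies just beyond the compressed range, which is the shape of the failure reported above.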

In D100835#3456809, @jrtc27 wrote:

Incidentally, this issue of relaxation correctness is why I want the optimisation relaxations separated out from the alignment support that’s required;

That sounds sensible.

Original author @PkmX has been inactive here for a year, and did not respond to email.

gkm edited reviewers, added: luismarques; removed: PkmX.Apr 21 2022, 3:34 PM

Herald added a subscriber: PkmX. · View Herald TranscriptApr 21 2022, 3:34 PM

Rebase to get 1 year of LLVM changes

Harbormaster completed remote builds in B160736: Diff 424315.Apr 21 2022, 3:56 PM

reames added a subscriber: reames.Apr 25 2022, 9:29 AM

gkm mentioned this in D125036: [RISCV] Alignment relaxation.May 5 2022, 12:53 PM

luyahan added a subscriber: luyahan.May 11 2022, 7:13 PM

luismarques mentioned this in D125497: [RISCV] Call relaxation.May 25 2022, 4:03 AM

joshua-arch1 added a subscriber: joshua-arch1.May 31 2022, 1:16 AM

irichter added a subscriber: irichter.Jun 9 2022, 11:44 AM

IdanHo added a subscriber: IdanHo.Jun 21 2022, 6:21 AM

pierce added a subscriber: pierce.Jul 1 2022, 10:16 PM

For everyone following this, https://reviews.llvm.org/D127611 and https://reviews.llvm.org/D125036 have been merged, and seem to obsolete this!

Pretty-box added a subscriber: Pretty-box.Oct 11 2022, 3:42 AM

Pretty-box added inline comments.

lld/ELF/InputSection.h
178	I would like to ask why a copy operation is performed here. The data in it does not seem to have changed. If I do not perform this operation and use data().data() directly in relaxHi20Lo12, there will be a link error. Why is this?

fanghuaqi added a subscriber: fanghuaqi.Feb 7 2023, 1:27 AM

Herald added a subscriber: luke. · View Herald TranscriptFeb 7 2023, 1:27 AM

MaskRay commandeered this revision.Feb 7 2023, 1:32 PM

MaskRay edited reviewers, added: gkm; removed: MaskRay.

Herald added a subscriber: jobnoorman. · View Herald TranscriptFeb 7 2023, 1:32 PM

MaskRay abandoned this revision.Feb 7 2023, 1:32 PM

craig.topper mentioned this in D143673: [lld][RISCV] Implement GP relaxation for R_RISCV_HI20/R_RISCV_LO12_I/R_RISCV_LO12_S..Feb 9 2023, 12:38 PM


Diff 424315

lld/ELF/Arch/RISCV.cpp

//===- RISCV.cpp ----------------------------------------------------------===//		//===- RISCV.cpp ----------------------------------------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "InputFiles.h"		#include "InputFiles.h"
		#include "OutputSections.h"
#include "Symbols.h"		#include "Symbols.h"
#include "SyntheticSections.h"		#include "SyntheticSections.h"
#include "Target.h"		#include "Target.h"

using namespace llvm;		using namespace llvm;
using namespace llvm::object;		using namespace llvm::object;
using namespace llvm::support::endian;		using namespace llvm::support::endian;
using namespace llvm::ELF;		using namespace llvm::ELF;
[13 unchanged lines not shown]
		public:
void writePltHeader(uint8_t *buf) const override;		void writePltHeader(uint8_t *buf) const override;
void writePlt(uint8_t *buf, const Symbol &sym,		void writePlt(uint8_t *buf, const Symbol &sym,
uint64_t pltEntryAddr) const override;		uint64_t pltEntryAddr) const override;
RelType getDynRel(RelType type) const override;		RelType getDynRel(RelType type) const override;
RelExpr getRelExpr(RelType type, const Symbol &s,		RelExpr getRelExpr(RelType type, const Symbol &s,
const uint8_t *loc) const override;		const uint8_t *loc) const override;
void relocate(uint8_t *loc, const Relocation &rel,		void relocate(uint8_t *loc, const Relocation &rel,
uint64_t val) const override;		uint64_t val) const override;
		void finalizeSections() const override;
};		};

} // end anonymous namespace		} // end anonymous namespace

const uint64_t dtpOffset = 0x800;		const uint64_t dtpOffset = 0x800;

enum Op {		enum Op {
ADDI = 0x13,		ADDI = 0x13,
AUIPC = 0x17,		AUIPC = 0x17,
JALR = 0x67,		JALR = 0x67,
LD = 0x3003,		LD = 0x3003,
LW = 0x2003,		LW = 0x2003,
SRLI = 0x5013,		SRLI = 0x5013,
SUB = 0x40000033,		SUB = 0x40000033,
};		};

enum Reg {		enum Reg {
X_RA = 1,		X_RA = 1,
		X_GP = 3,
X_T0 = 5,		X_T0 = 5,
X_T1 = 6,		X_T1 = 6,
X_T2 = 7,		X_T2 = 7,
X_T3 = 28,		X_T3 = 28,
};		};

static uint32_t hi20(uint32_t val) { return (val + 0x800) >> 12; }		static uint32_t hi20(uint32_t val) { return (val + 0x800) >> 12; }
static uint32_t lo12(uint32_t val) { return val & 4095; }		static uint32_t lo12(uint32_t val) { return val & 4095; }
[197 unchanged lines not shown]
		case R_RISCV_TLS_GD_HI20:
return R_TLSGD_PC;		return R_TLSGD_PC;
case R_RISCV_TLS_GOT_HI20:		case R_RISCV_TLS_GOT_HI20:
config->hasTlsIe = true;		config->hasTlsIe = true;
return R_GOT_PC;		return R_GOT_PC;
case R_RISCV_TPREL_HI20:		case R_RISCV_TPREL_HI20:
case R_RISCV_TPREL_LO12_I:		case R_RISCV_TPREL_LO12_I:
case R_RISCV_TPREL_LO12_S:		case R_RISCV_TPREL_LO12_S:
return R_TPREL;		return R_TPREL;
case R_RISCV_RELAX:
case R_RISCV_TPREL_ADD:		case R_RISCV_TPREL_ADD:
return R_NONE;		return R_NONE;
		case R_RISCV_RELAX:
case R_RISCV_ALIGN:		case R_RISCV_ALIGN:
// Not just a hint; always padded to the worst-case number of NOPs, so may		return R_RELAX_HINT;
// not currently be aligned, and without linker relaxation support we can't
// delete NOPs to realign.
errorOrWarn(getErrorLocation(loc) + "relocation R_RISCV_ALIGN requires "
"unimplemented linker relaxation; recompile with -mno-relax");
return R_NONE;
default:		default:
error(getErrorLocation(loc) + "unknown relocation (" + Twine(type) +		error(getErrorLocation(loc) + "unknown relocation (" + Twine(type) +
") against symbol " + toString(s));		") against symbol " + toString(s));
return R_NONE;		return R_NONE;
}		}
}		}

// Extract bits V[Begin:End], where range is inclusive, and Begin must be < 63.		// Extract bits V[Begin:End], where range is inclusive, and Begin must be < 63.
[176 unchanged lines not shown]
		void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
case R_RISCV_TLS_DTPREL32:		case R_RISCV_TLS_DTPREL32:
write32le(loc, val - dtpOffset);		write32le(loc, val - dtpOffset);
break;		break;
case R_RISCV_TLS_DTPREL64:		case R_RISCV_TLS_DTPREL64:
write64le(loc, val - dtpOffset);		write64le(loc, val - dtpOffset);
break;		break;

case R_RISCV_RELAX:		case R_RISCV_RELAX:
		case R_RISCV_TPREL_ADD:
return; // Ignored (for now)		return; // Ignored (for now)

default:		default:
llvm_unreachable("unknown relocation");		llvm_unreachable("unknown relocation");
}		}
}		}

		static uint64_t maxOutputSectionAlignment() {
		uint64_t maxAlign = 1;
		for (auto *os : outputSections) {
		maxAlign = std::max<uint64_t>(maxAlign, os->alignment);
		}

		return maxAlign;
		}

		static void setRs1(uint8_t *buf, int rs1) {
		write32le(buf, (read32le(buf) & 0xfff07fff) | rs1 << 15);
		}

		static int64_t addWorstCaseAlignment(int64_t offset, uint64_t alignment) {
		return offset >= 0 ? offset + alignment : offset - alignment;
		}

		using DeleteRanges = std::vector<InputSectionBase::DeleteRange>;

		static void addDeleteRange(DeleteRanges &ranges, uint64_t offset,
		uint64_t size) {
		ranges.push_back({offset, size});
		}

		// Relax R_RISCV_CALL to jal or c.jal.
		//
		// We always assume during relaxation the symbols can only come closer modulo
		// the effects of alignment.
		static bool relaxCall(InputSection *is, Relocation &rel,
		DeleteRanges &deleteRanges) {
		auto *sym = dyn_cast_or_null<Defined>(rel.sym);
		if (!sym || !sym->section)
		return false;

		uint64_t pc = is->getVA(rel.offset);
		uint64_t target = sym->getVA(rel.addend);
		int64_t offset = target - pc;

		// As the call site and callee may reside in different sections, we need to
		// consider the worst case possible offset caused by alignment.
		if (is->getOutputSection() != sym->getOutputSection()) {
		offset = addWorstCaseAlignment(offset, maxOutputSectionAlignment());
		} else if (is != sym->section) {
		offset = addWorstCaseAlignment(offset, is->getOutputSection()->alignment);
		}

		bool rvc = config->eflags & EF_RISCV_RVC;
		unsigned rd =
		(read32le(is->data().data() + rel.offset + 4) & 0x00000fe0) >> 7;

		// Convert to c.j or c.jal (RV32-only) if offset fits in 12 bits.
		if (rvc && isInt<12>(offset) && rd == 0) {
		write16le(is->mutableData().data() + rel.offset, 0xa001); // c.j 0
		addDeleteRange(deleteRanges, rel.offset + 2, 6);
		rel.type = R_RISCV_RVC_JUMP;
		return true;
		}

		if (!config->is64 && rvc && isInt<12>(offset) && rd == 1) {
		write16le(is->mutableData().data() + rel.offset, 0x2001); // c.jal 0
		addDeleteRange(deleteRanges, rel.offset + 2, 6);
		rel.type = R_RISCV_RVC_JUMP;
		return true;
		}

		// Convert to jal if offset fits in 21 bits.
		if (isInt<21>(offset)) {
		write32le(is->mutableData().data() + rel.offset,
		0x0000006f | rd << 7); // jal rd, 0
		addDeleteRange(deleteRanges, rel.offset + 4, 4);
		rel.type = R_RISCV_JAL;
		return true;
		}

		return false;
		}

		// For R_RISCV_HI20 and R_RISCV_LO12_[IS], only relax to GP-relative form if
		// __global_pointer$ symbol is defined and the target symbol is within the
		// same section as gp. This assumes the offset between gp and the target
		// symbol is static during relaxation.
		static bool relaxHi20Lo12(InputSection *is, Relocation &rel,
		DeleteRanges &deleteRanges) {
		bool rvc = config->eflags & EF_RISCV_RVC;
		uint64_t target = rel.sym->getVA(rel.addend);

		Defined *gp = ElfSym::riscvGlobalPointer;

		auto relaxToCLui = [&]() -> bool {
		unsigned rd = (read32le(is->data().data() + rel.offset) & 0x00000fe0) >> 7;
		if (rvc &&
		isInt<6>(SignExtend64(target + 0x800, config->wordsize * 8) >> 12) &&
		rd != 0 && rd != 2 && target != 0) {
		write16le(is->mutableData().data() + rel.offset,
		0x6001 | rd << 7); // c.lui rd, 0
		addDeleteRange(deleteRanges, rel.offset + 2, 2);
		rel.type = R_RISCV_RVC_LUI;
		return true;
		}
		return false;
		};

		if (!gp || rel.sym->getOutputSection() != gp->section->getOutputSection())
		return rel.type == R_RISCV_HI20 ? relaxToCLui() : false;

		uint64_t offset = target - gp->getVA();

		if (isInt<12>(offset)) {
		if (rel.type == R_RISCV_HI20) {
		addDeleteRange(deleteRanges, rel.offset, 4);
		rel.type = R_RISCV_NONE;
		rel.expr = R_NONE;
		} else { // R_RISCV_LO12_[IS]
		setRs1(is->mutableData().data() + rel.offset, X_GP);
		rel.expr = R_RISCV_GPREL;
		}
		return true;
		}

		return false;
		}

		// Relaxing PCREL relocations requires two passes due to the linkage from
		// LO12 to HI20. The first pass only relaxes PCREL_LO12 and the second one
		// relaxes PCREL_HI20.
		static bool relaxPcrel(InputSection *is, Relocation &rel,
		DeleteRanges &deleteRanges) {
		Defined *gp = ElfSym::riscvGlobalPointer;
		if (!gp)
		return false;

		const Relocation *hi20 = &rel;
		if (rel.type == R_RISCV_PCREL_LO12_I || rel.type == R_RISCV_PCREL_LO12_S) {
		hi20 = getRISCVPCRelHi20(rel.sym, rel.addend);
		if (!hi20)
		return false;
		}

		if (hi20->sym->getOutputSection() != gp->section->getOutputSection())
		return false;

		uint64_t target = hi20->sym->getVA(hi20->addend);
		uint64_t offset = target - gp->getVA();

		if (isInt<12>(offset)) {
		if (rel.type == R_RISCV_PCREL_HI20) {
		addDeleteRange(deleteRanges, rel.offset, 4);
		rel.type = R_RISCV_NONE;
		rel.expr = R_NONE;
		} else {
		setRs1(is->mutableData().data() + rel.offset, X_GP);
		rel.sym = hi20->sym;
		rel.addend = hi20->addend + rel.addend;
		rel.type =
		rel.type == R_RISCV_PCREL_LO12_I ? R_RISCV_LO12_I : R_RISCV_LO12_S;
		rel.expr = R_RISCV_GPREL;
		}
		return true;
		}

		return false;
		}

		template <typename F>
		void processRelaxations(MutableArrayRef<Relocation> rels, F f) {
		if (rels.empty())
		return;

		for (auto r = rels.begin() + 1, e = rels.end(); r != e; ++r) {
		if (r->type != R_RISCV_RELAX)
		continue;

		Relocation *rel = std::prev(r);
		if (r->offset != rel->offset)
		continue;

		if (f(*rel)) {
		r->type = R_RISCV_NONE;
		r->expr = R_NONE;
		}
		}
		}

		static bool relax() {
		bool changed = false;

		for (OutputSection *os : outputSections) {
		for (InputSection *is : getInputSections(os)) {
		if (!(is->flags & SHF_EXECINSTR))
		continue;

		DeleteRanges deleteRanges;
		processRelaxations(is->relocations, [&](Relocation &rel) {
		switch (rel.type) {
		case R_RISCV_CALL:
		case R_RISCV_CALL_PLT:
		return relaxCall(is, rel, deleteRanges);
		case R_RISCV_HI20:
		case R_RISCV_LO12_I:
		case R_RISCV_LO12_S:
		return relaxHi20Lo12(is, rel, deleteRanges);
		case R_RISCV_PCREL_LO12_I:
		case R_RISCV_PCREL_LO12_S:
		return relaxPcrel(is, rel, deleteRanges);
		}

		return false;
		});

		// The second-pass for PCREL_HI20 relaxation.
		processRelaxations(is->relocations, [&](Relocation &rel) {
		if (rel.type != R_RISCV_PCREL_HI20)
		return false;

		return relaxPcrel(is, rel, deleteRanges);
		});

		using DeleteRange = InputSectionBase::DeleteRange;
		llvm::sort(deleteRanges,
		[](const DeleteRange &lhs, const DeleteRange &rhs) {
		return lhs.offset < rhs.offset;
		});

		is->deleteRanges(deleteRanges);
		script->assignAddresses();
		changed |= !deleteRanges.empty();
		}
		}

		return changed;
		}

		static void relaxAlign() {
		bool rvc = config->eflags & EF_RISCV_RVC;

		for (OutputSection *os : outputSections) {
		for (InputSection is : getInputSections(os)) {
		if (!(is->flags & SHF_EXECINSTR))
		continue;

		uint64_t bytesDeleted = 0;
		DeleteRanges deleteRanges;
		for (auto &rel : is->relocations) {
		if (rel.type == R_RISCV_ALIGN && rel.addend > 0) {
		uint64_t pc = is->getVA(rel.offset) - bytesDeleted;
		uint64_t alignment = PowerOf2Ceil(rel.addend + 2);
		uint64_t nopBytes = alignTo(pc, alignment) - pc;

		if (nopBytes % 2 != 0 || (!rvc && nopBytes % 4 != 0)) {
		errorOrWarn(is->getObjMsg(rel.offset) + ": alignment requires " +
		Twine(nopBytes) + " of nop");
		break;
		}

		if (nopBytes > (uint64_t)rel.addend) {
		errorOrWarn(is->getObjMsg(rel.offset) + ": alignment requires " +
		Twine(nopBytes) + " of nop, but only " +
		Twine(rel.addend) + " bytes are available");
		break;
		}
		uint64_t bytesToDelete = rel.addend - nopBytes;

		if (bytesToDelete > 0) {
		addDeleteRange(deleteRanges, rel.offset + nopBytes, bytesToDelete);
		bytesDeleted += bytesToDelete;
		}

		uint8_t *buf = is->mutableData().data() + rel.offset;
		while (nopBytes != 0) {
		if (nopBytes >= 4) {
		write32le(buf, 0x00000013); // nop
		nopBytes -= 4;
		buf += 4;
		} else if (rvc && nopBytes == 2) {
		write16le(buf, 0x0001); // c.nop
		nopBytes -= 2;
		buf += 2;
		}
		}
		}
		}

		is->deleteRanges(deleteRanges);
		jrtc27 (Not Done): Why a limit of 5? It is guaranteed to converge, I don't see why there needs to be a limit, and it'll probably take just 1 or 2 almost all of the time.
		PkmX (Done): Will be removed in the next patch.
		script->assignAddresses();
		}
		}
		}
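
The padding arithmetic in `relaxAlign()` can be tried out in isolation. The following is a minimal standalone sketch, not lld code: `AlignPlan`, `alignUp`, `powerOf2Ceil` and `planAlign` are hypothetical names standing in for the `alignTo`/`PowerOf2Ceil` calls above, and it assumes the addend has the `2^n - 2` form mentioned in the review thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch; not part of lld.
struct AlignPlan {
  uint64_t nopBytes;      // padding that must stay at the R_RISCV_ALIGN site
  uint64_t bytesToDelete; // surplus padding that can be removed
};

// Round value up to the next multiple of align (align is a power of two).
inline uint64_t alignUp(uint64_t value, uint64_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Smallest power of two that is >= value, for value >= 1.
inline uint64_t powerOf2Ceil(uint64_t value) {
  uint64_t p = 1;
  while (p < value)
    p <<= 1;
  return p;
}

// pc:     address of the padding, already adjusted for earlier deletions.
// addend: number of padding bytes the assembler reserved.
inline AlignPlan planAlign(uint64_t pc, uint64_t addend) {
  uint64_t alignment = powerOf2Ceil(addend + 2);
  uint64_t nopBytes = alignUp(pc, alignment) - pc;
  assert(nopBytes <= addend && "not enough padding reserved");
  return {nopBytes, addend - nopBytes};
}
```

For example, with an addend of 14 (a 16-byte alignment request) and the padding starting at `0x1008`, the sketch keeps 8 bytes of nops and marks the remaining 6 bytes for deletion, which mirrors the `addDeleteRange(deleteRanges, rel.offset + nopBytes, bytesToDelete)` call above.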

		void RISCV::finalizeSections() const {
		jrtc27 (Not Done): Not for relocatable
		PkmX (Done): Will do in the next patch.
		// Can't perform relaxation if it is not a final link.
		if (config->relocatable)
		return;

		if (config->relax)
		while (relax())
		;

		relaxAlign();
		}
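
The `while (relax())` loop above runs passes until one of them deletes nothing. A toy model of that fixed-point iteration is sketched below. It is an illustration under stated assumptions, not the patch's algorithm: the `Call` struct, the 8-to-4 byte shrink and the single `range` parameter are invented for the example, and all call sites and targets are assumed to live in one section.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Toy model for this sketch only: every call site starts as an 8-byte
// sequence and may shrink to 4 bytes once its target is within `range`.
struct Call {
  int64_t offset; // position of the call in the section
  int64_t target; // position of the callee in the same section
  int size;       // current encoded size in bytes (8 or 4)
};

// One pass: shrink every call that is currently in range, then shift all
// offsets and targets located after a shrunk call. Returns true on change.
inline bool relaxOnce(std::vector<Call> &calls, int64_t range) {
  assert(range >= 0);
  std::vector<int64_t> shrunkAt;
  for (Call &c : calls) {
    if (c.size == 8 && std::llabs(c.target - c.offset) <= range) {
      c.size = 4;
      shrunkAt.push_back(c.offset);
    }
  }
  for (Call &c : calls) {
    int64_t offShift = 0, tgtShift = 0;
    for (int64_t at : shrunkAt) {
      if (at < c.offset)
        offShift += 4;
      if (at < c.target)
        tgtShift += 4;
    }
    c.offset -= offShift;
    c.target -= tgtShift;
  }
  return !shrunkAt.empty();
}

// Repeat passes until nothing changes; returns the number of passes that
// changed something. Each changing pass shrinks at least one call, and a
// call never grows back, so the loop terminates.
inline int relaxToFixedPoint(std::vector<Call> &calls, int64_t range) {
  int passes = 0;
  while (relaxOnce(calls, range))
    ++passes;
  return passes;
}
```

In this model a call that is just out of range in the first pass can come into range after a neighbouring call shrinks, which is why more than one pass may be needed even though the process always converges.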

TargetInfo *elf::getRISCVTargetInfo() {		TargetInfo *elf::getRISCVTargetInfo() {
static RISCV target;		static RISCV target;
return &target;		return &target;
}		}

lld/ELF/InputSection.h

//===- InputSection.h -------------------------------------------- C++ --===//		//===- InputSection.h -------------------------------------------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLD_ELF_INPUT_SECTION_H		#ifndef LLD_ELF_INPUT_SECTION_H
#define LLD_ELF_INPUT_SECTION_H		#define LLD_ELF_INPUT_SECTION_H

#include "Relocations.h"		#include "Relocations.h"
		#include "lld/Common/CommonLinkerContext.h"
#include "lld/Common/LLVM.h"		#include "lld/Common/LLVM.h"
		#include "lld/Common/Memory.h"
#include "llvm/ADT/CachedHashString.h"		#include "llvm/ADT/CachedHashString.h"
#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/TinyPtrVector.h"		#include "llvm/ADT/TinyPtrVector.h"
#include "llvm/Object/ELF.h"		#include "llvm/Object/ELF.h"

namespace lld {		namespace lld {
namespace elf {		namespace elf {

[... 135 lines not shown ...]	public:
}		}

ArrayRef<uint8_t> data() const {		ArrayRef<uint8_t> data() const {
if (uncompressedSize >= 0)		if (uncompressedSize >= 0)
uncompress();		uncompress();
return rawData;		return rawData;
}		}

		MutableArrayRef<uint8_t> mutableData() const {
		jrtc27 (Done): Make the relevant fields mutable and mark this const, like data()
		if (!copiedData) {
		size_t size = data().size();
		uint8_t *mutData = context().bAlloc.Allocate<uint8_t>(size);
		memcpy(mutData, data().data(), size);
		rawData = llvm::makeArrayRef(mutData, size);
		copiedData = true;
		}

		return llvm::makeMutableArrayRef(const_cast<uint8_t *>(rawData.data()),
		rawData.size());
		}
		jrtc27 (Not Done): std::map seems like a high-overhead way to store what is just a sorted list of intervals.
		PkmX (Done): I agree. `std::vector<std::pair<uint64_t, uint64_t>>` should be sufficient, although it needs to be sorted for `deleteRanges`, as the vector may not be sorted with the two-pass handling for `PCREL_HI20`.
		Pretty-box (Not Done): I would like to ask why a copy operation is performed here. The data in it does not seem to have changed. If I do not perform this operation and use data().data() directly in relaxHi20Lo12, there will be a link error. Why is this?
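
The copy-on-first-write pattern used by `mutableData()` can be sketched outside lld as follows. `SectionBytes` is a hypothetical stand-in class, and `std::unique_ptr` replaces the bump allocator used in the patch. One plausible reason for the copy, not confirmed anywhere in this thread, is that the section bytes initially alias the input file's buffer, which the section does not own for writing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical stand-in for an input section; not the lld class.
class SectionBytes {
public:
  SectionBytes(const uint8_t *fileData, size_t size)
      : data_(fileData), size_(size) {
    assert(fileData != nullptr || size == 0);
  }

  const uint8_t *data() const { return data_; }
  size_t size() const { return size_; }

  // Returns a writable pointer. The bytes are copied on the first call
  // only; later calls return the same private buffer.
  uint8_t *mutableData() {
    if (!owned_) {
      owned_.reset(new uint8_t[size_]);
      if (size_ != 0)
        std::memcpy(owned_.get(), data_, size_);
      data_ = owned_.get(); // readers now see the private copy
    }
    return owned_.get();
  }

private:
  const uint8_t *data_; // initially aliases the caller's read-only bytes
  size_t size_;
  std::unique_ptr<uint8_t[]> owned_; // set once a private copy exists
};
```

After the first `mutableData()` call, writes go to the private copy and the original bytes are left untouched.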

		// A pair of range to delete in (offset, size)
		struct DeleteRange {
		uint64_t offset;
		uint64_t size;
		};

		// Delete ranges and adjust section content, symbols and relocations.
		// The deleteRanges must be sorted by offset and must not overlap.
		void deleteRanges(ArrayRef<DeleteRange> deleteRanges);

// The next member in the section group if this section is in a group. This is		// The next member in the section group if this section is in a group. This is
// used by --gc-sections.		// used by --gc-sections.
InputSectionBase *nextInSectionGroup = nullptr;		InputSectionBase *nextInSectionGroup = nullptr;

template <class ELFT> RelsOrRelas<ELFT> relsOrRelas() const;		template <class ELFT> RelsOrRelas<ELFT> relsOrRelas() const;

// InputSections that are dependent on us (reverse dependency for GC)		// InputSections that are dependent on us (reverse dependency for GC)
llvm::TinyPtrVector<InputSection *> dependentSections;		llvm::TinyPtrVector<InputSection *> dependentSections;
[... 43 lines not shown ...]	public:

template <typename T> llvm::ArrayRef<T> getDataAs() const {		template <typename T> llvm::ArrayRef<T> getDataAs() const {
size_t s = rawData.size();		size_t s = rawData.size();
assert(s % sizeof(T) == 0);		assert(s % sizeof(T) == 0);
return llvm::makeArrayRef<T>((const T *)rawData.data(), s / sizeof(T));		return llvm::makeArrayRef<T>((const T *)rawData.data(), s / sizeof(T));
}		}

mutable ArrayRef<uint8_t> rawData;		mutable ArrayRef<uint8_t> rawData;
		mutable bool copiedData;

protected:		protected:
template <typename ELFT>		template <typename ELFT>
void parseCompressedHeader();		void parseCompressedHeader();
void uncompress() const;		void uncompress() const;

// This field stores the uncompressed size of the compressed data in rawData,		// This field stores the uncompressed size of the compressed data in rawData,
		jrtc27 (Not Done): Why do we need a whole new pointer when rawData is also being updated?
		PkmX (Done): Will change it to use a `copiedData` bool as in your patch.
// or -1 if rawData is not compressed (either because the section wasn't		// or -1 if rawData is not compressed (either because the section wasn't
// compressed in the first place, or because we ended up uncompressing it).		// compressed in the first place, or because we ended up uncompressing it).
// Since the feature is not used often, this is usually -1.		// Since the feature is not used often, this is usually -1.
mutable int64_t uncompressedSize = -1;		mutable int64_t uncompressedSize = -1;
};		};

// SectionPiece represents a piece of splittable section contents.		// SectionPiece represents a piece of splittable section contents.
// We allocate a lot of these and binary search on them. This means that they		// We allocate a lot of these and binary search on them. This means that they
[... 139 lines not shown ...]

private:		private:
template <class ELFT, class RelTy>		template <class ELFT, class RelTy>
void copyRelocations(uint8_t *buf, llvm::ArrayRef<RelTy> rels);		void copyRelocations(uint8_t *buf, llvm::ArrayRef<RelTy> rels);

template <class ELFT> void copyShtGroup(uint8_t *buf);		template <class ELFT> void copyShtGroup(uint8_t *buf);
};		};

static_assert(sizeof(InputSection) <= 160, "InputSection is too big");		static_assert(sizeof(InputSection) <= 168, "InputSection is too big");

inline bool isDebugSection(const InputSectionBase &sec) {		inline bool isDebugSection(const InputSectionBase &sec) {
return (sec.flags & llvm::ELF::SHF_ALLOC) == 0 &&		return (sec.flags & llvm::ELF::SHF_ALLOC) == 0 &&
(sec.name.startswith(".debug") || sec.name.startswith(".zdebug"));		(sec.name.startswith(".debug") || sec.name.startswith(".zdebug"));
}		}

// The list of all input sections.		// The list of all input sections.
extern SmallVector<InputSectionBase *, 0> inputSections;		extern SmallVector<InputSectionBase *, 0> inputSections;

// The set of TOC entries (.toc + addend) for which we should not apply		// The set of TOC entries (.toc + addend) for which we should not apply
// toc-indirect to toc-relative relaxation. const Symbol * refers to the		// toc-indirect to toc-relative relaxation. const Symbol * refers to the
// STT_SECTION symbol associated to the .toc input section.		// STT_SECTION symbol associated to the .toc input section.
extern llvm::DenseSet<std::pair<const Symbol *, uint64_t>> ppc64noTocRelax;		extern llvm::DenseSet<std::pair<const Symbol *, uint64_t>> ppc64noTocRelax;

		Relocation *getRISCVPCRelHi20(const Symbol *sym, uint64_t addend);
} // namespace elf		} // namespace elf

std::string toString(const elf::InputSectionBase *);		std::string toString(const elf::InputSectionBase *);
} // namespace lld		} // namespace lld

#endif		#endif

lld/ELF/InputSection.cpp

[... 50 lines not shown ...]

InputSectionBase::InputSectionBase(InputFile *file, uint64_t flags,		InputSectionBase::InputSectionBase(InputFile *file, uint64_t flags,
uint32_t type, uint64_t entsize,		uint32_t type, uint64_t entsize,
uint32_t link, uint32_t info,		uint32_t link, uint32_t info,
uint32_t alignment, ArrayRef<uint8_t> data,		uint32_t alignment, ArrayRef<uint8_t> data,
StringRef name, Kind sectionKind)		StringRef name, Kind sectionKind)
: SectionBase(sectionKind, name, flags, entsize, alignment, type, info,		: SectionBase(sectionKind, name, flags, entsize, alignment, type, info,
link),		link),
file(file), rawData(data) {		file(file), rawData(data), copiedData(false) {
// In order to reduce memory allocation, we assume that mergeable		// In order to reduce memory allocation, we assume that mergeable
// sections are smaller than 4 GiB, which is not an unreasonable		// sections are smaller than 4 GiB, which is not an unreasonable
// assumption as of 2017.		// assumption as of 2017.
if (sectionKind == SectionBase::Merge && rawData.size() > UINT32_MAX)		if (sectionKind == SectionBase::Merge && rawData.size() > UINT32_MAX)
error(toString(this) + ": section too large");		error(toString(this) + ": section too large");

// The ELF spec states that a value of 0 means the section has		// The ELF spec states that a value of 0 means the section has
// no alignment constraints.		// no alignment constraints.
[... 77 lines not shown ...]	if (shdr.sh_type == SHT_REL) {
assert(shdr.sh_type == SHT_RELA);		assert(shdr.sh_type == SHT_RELA);
ret.relas = makeArrayRef(reinterpret_cast<const typename ELFT::Rela *>(		ret.relas = makeArrayRef(reinterpret_cast<const typename ELFT::Rela *>(
file->mb.getBufferStart() + shdr.sh_offset),		file->mb.getBufferStart() + shdr.sh_offset),
shdr.sh_size / sizeof(typename ELFT::Rela));		shdr.sh_size / sizeof(typename ELFT::Rela));
}		}
return ret;		return ret;
}		}

		void InputSectionBase::deleteRanges(ArrayRef<DeleteRange> ranges) {
		if (ranges.empty())
		return;

		// Adjust all symbol offsets and sizes within the InputSection, using the
		// following algorithm to avoid quadratic behavior.

		// Gather all symbols within the section.
		SmallVector<Defined *> symbols;
		for (auto &sym : file->getSymbols()) {
		auto *dr = dyn_cast<Defined>(sym);
		if (!dr || dr->section != this)
		continue;

		symbols.push_back(dr);
		}

		// Sort symbols by their starting address.
		llvm::sort(symbols, [](const Defined *a, const Defined *b) {
		return a->value < b->value;
		});

		// Adjust each symbol's address by bytes deleted and also enlarge the symbol's
		// size to keep its "end" fixed.
		{
		uint64_t removedBytes = 0;
		const auto *r = ranges.begin(), *rend = ranges.end();
		for (auto *dr : symbols) {
		for (; r != rend && r->offset < dr->value; ++r)
		jrtc27 (Not Done): This looks quadratic
		PkmX (Done): This isn't quadratic. We loop over the symbols in this section while simultaneously accumulating the number of bytes deleted. Since symbols and delete ranges are both sorted by offset, if we know X bytes are deleted before offset A, and B > A, we only have to add up the delete ranges between A and B to know the total number of bytes deleted before B. In essence, every symbol and every delete range is visited only once in this loop.
		removedBytes += r->size;

		dr->value -= removedBytes;
		dr->size += removedBytes;
		}
		}

		const auto endOff = [](const Defined *dr) { return dr->value + dr->size; };

		// Sort symbols by their "end" address before relaxation.
		llvm::sort(symbols, [&](const Defined *a, const Defined *b) {
		return endOff(a) < endOff(b);
		});

		// Adjust each symbol's end address to their actual end by reducing size.
		{
		uint64_t removedBytes = 0;
		const auto *r = ranges.begin(), *rend = ranges.end();
		for (auto *dr : symbols) {
		for (; r != rend && r->offset < endOff(dr); ++r)
		removedBytes += r->size;
		jrtc27 (Not Done): As does this

		dr->size -= removedBytes;
		}
		}

		// Adjust relocation offsets within the section.
		uint64_t removedBytes = 0;
		const auto *r = ranges.begin(), *rend = ranges.end();
		for (auto &rel : relocations) {
		for (; r != rend && r->offset < rel.offset; ++r)
		removedBytes += r->size;
		jrtc27 (Not Done): And this

		rel.offset -= removedBytes;
		}

		// Adjust section content piece-wise and resize the section.
		MutableArrayRef<uint8_t> buf = this->mutableData();
		auto *dst = buf.begin() + ranges.begin()->offset;
		for (auto it = ranges.begin(), e = ranges.end(); it != e; ++it) {
		auto *from = buf.begin() + it->offset + it->size;
		auto *to = std::next(it) != ranges.end()
		? (buf.begin() + std::next(it)->offset)
		: buf.end();
		dst = std::copy(from, to, dst);
		}

		// Resize the section
		rawData = makeArrayRef(data().data(), dst);
		}
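
The two ideas in `deleteRanges()`, the single forward sweep over sorted offsets and the piece-wise compaction of the section bytes, can be sketched with plain standard containers. This is a simplified illustration, not the patch itself: `adjustOffsets` and `compact` are hypothetical helper names, and `std::vector` replaces `ArrayRef`/`MutableArrayRef`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DeleteRange {
  uint64_t offset;
  uint64_t size;
};

// Shift each offset down by the number of bytes deleted before it.
// `offsets` and `ranges` must both be sorted ascending. The range index
// `r` only ever moves forward, so the total work is
// O(offsets.size() + ranges.size()), not their product.
inline void adjustOffsets(std::vector<uint64_t> &offsets,
                          const std::vector<DeleteRange> &ranges) {
  uint64_t removed = 0;
  size_t r = 0;
  for (uint64_t &off : offsets) {
    for (; r != ranges.size() && ranges[r].offset < off; ++r)
      removed += ranges[r].size;
    off -= removed;
  }
}

// Remove the given ranges from `bytes` in place. The ranges must be sorted
// by offset, must not overlap, and must lie inside the buffer.
inline void compact(std::vector<uint8_t> &bytes,
                    const std::vector<DeleteRange> &ranges) {
  if (ranges.empty())
    return;
  assert(ranges.back().offset + ranges.back().size <= bytes.size());
  size_t dst = static_cast<size_t>(ranges.front().offset);
  for (size_t i = 0; i != ranges.size(); ++i) {
    // Copy the surviving bytes between this range and the next one.
    size_t from = static_cast<size_t>(ranges[i].offset + ranges[i].size);
    size_t to = i + 1 != ranges.size()
                    ? static_cast<size_t>(ranges[i + 1].offset)
                    : bytes.size();
    std::copy(bytes.begin() + static_cast<std::ptrdiff_t>(from),
              bytes.begin() + static_cast<std::ptrdiff_t>(to),
              bytes.begin() + static_cast<std::ptrdiff_t>(dst));
    dst += to - from;
  }
  bytes.resize(dst);
}
```

The forward `std::copy` is safe here because the destination always starts before the source within the same buffer.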

uint64_t SectionBase::getOffset(uint64_t offset) const {		uint64_t SectionBase::getOffset(uint64_t offset) const {
switch (kind()) {		switch (kind()) {
case Output: {		case Output: {
auto *os = cast<OutputSection>(this);		auto *os = cast<OutputSection>(this);
// For output sections we treat offset -1 as the end of the section.		// For output sections we treat offset -1 as the end of the section.
return offset == uint64_t(-1) ? os->size : offset;		return offset == uint64_t(-1) ? os->size : offset;
}		}
case Regular:		case Regular:
[... 388 lines not shown ...]
}		}

// For R_RISCV_PC_INDIRECT (R_RISCV_PCREL_LO12_{I,S}), the symbol actually		// For R_RISCV_PC_INDIRECT (R_RISCV_PCREL_LO12_{I,S}), the symbol actually
// points the corresponding R_RISCV_PCREL_HI20 relocation, and the target VA		// points the corresponding R_RISCV_PCREL_HI20 relocation, and the target VA
// is calculated using PCREL_HI20's symbol.		// is calculated using PCREL_HI20's symbol.
//		//
// This function returns the R_RISCV_PCREL_HI20 relocation from		// This function returns the R_RISCV_PCREL_HI20 relocation from
// R_RISCV_PCREL_LO12's symbol and addend.		// R_RISCV_PCREL_LO12's symbol and addend.
static Relocation *getRISCVPCRelHi20(const Symbol *sym, uint64_t addend) {		Relocation *lld::elf::getRISCVPCRelHi20(const Symbol *sym, uint64_t addend) {
const Defined *d = cast<Defined>(sym);		const Defined *d = cast<Defined>(sym);
if (!d->section) {		if (!d->section) {
error("R_RISCV_PCREL_LO12 relocation points to an absolute symbol: " +		error("R_RISCV_PCREL_LO12 relocation points to an absolute symbol: " +
sym->getName());		sym->getName());
return nullptr;		return nullptr;
}		}
InputSection *isec = cast<InputSection>(d->section);		InputSection *isec = cast<InputSection>(d->section);

[... 146 lines not shown ...]	return in.mipsGot->getVA() + in.mipsGot->getGlobalDynOffset(file, sym) -
in.mipsGot->getGp(file);		in.mipsGot->getGp(file);
case R_MIPS_TLSLD:		case R_MIPS_TLSLD:
return in.mipsGot->getVA() + in.mipsGot->getTlsIndexOffset(file) -		return in.mipsGot->getVA() + in.mipsGot->getTlsIndexOffset(file) -
in.mipsGot->getGp(file);		in.mipsGot->getGp(file);
case R_AARCH64_PAGE_PC: {		case R_AARCH64_PAGE_PC: {
uint64_t val = sym.isUndefWeak() ? p + a : sym.getVA(a);		uint64_t val = sym.isUndefWeak() ? p + a : sym.getVA(a);
return getAArch64Page(val) - getAArch64Page(p);		return getAArch64Page(val) - getAArch64Page(p);
}		}
		case R_RISCV_GPREL: {
		if (!ElfSym::riscvGlobalPointer)
		llvm_unreachable(
		"Cannot compute R_RISCV_GPREL if __global_pointer$ is not set");

		return sym.getVA(a) - ElfSym::riscvGlobalPointer->getVA();
		}
case R_RISCV_PC_INDIRECT: {		case R_RISCV_PC_INDIRECT: {
if (const Relocation *hiRel = getRISCVPCRelHi20(&sym, a))		if (const Relocation *hiRel = getRISCVPCRelHi20(&sym, a))
return getRelocTargetVA(file, hiRel->type, hiRel->addend, sym.getVA(),		return getRelocTargetVA(file, hiRel->type, hiRel->addend, sym.getVA(),
*hiRel->sym, hiRel->expr);		*hiRel->sym, hiRel->expr);
return 0;		return 0;
}		}
case R_PC:		case R_PC:
case R_ARM_PCA: {		case R_ARM_PCA: {
[... 264 lines not shown ...]
void InputSectionBase::relocateAlloc(uint8_t *buf, uint8_t *bufEnd) {		void InputSectionBase::relocateAlloc(uint8_t *buf, uint8_t *bufEnd) {
assert(flags & SHF_ALLOC);		assert(flags & SHF_ALLOC);
const unsigned bits = config->wordsize * 8;		const unsigned bits = config->wordsize * 8;
const TargetInfo &target = *elf::target;		const TargetInfo &target = *elf::target;
uint64_t lastPPCRelaxedRelocOff = UINT64_C(-1);		uint64_t lastPPCRelaxedRelocOff = UINT64_C(-1);
AArch64Relaxer aarch64relaxer(relocations);		AArch64Relaxer aarch64relaxer(relocations);
for (size_t i = 0, size = relocations.size(); i != size; ++i) {		for (size_t i = 0, size = relocations.size(); i != size; ++i) {
const Relocation &rel = relocations[i];		const Relocation &rel = relocations[i];
if (rel.expr == R_NONE)		if (rel.expr == R_NONE || rel.expr == R_RELAX_HINT)
continue;		continue;
uint64_t offset = rel.offset;		uint64_t offset = rel.offset;
uint8_t *bufLoc = buf + offset;		uint8_t *bufLoc = buf + offset;

uint64_t secAddr = getOutputSection()->addr;		uint64_t secAddr = getOutputSection()->addr;
if (auto *sec = dyn_cast<InputSection>(this))		if (auto *sec = dyn_cast<InputSection>(this))
secAddr += sec->outSecOff;		secAddr += sec->outSecOff;
const uint64_t addrLoc = secAddr + offset;		const uint64_t addrLoc = secAddr + offset;
[... 477 lines not shown ...]

lld/ELF/Relocations.h

[... 40 lines not shown ...]	enum RelExpr {
R_GOTPLT,		R_GOTPLT,
R_GOTPLTREL,		R_GOTPLTREL,
R_GOTREL,		R_GOTREL,
R_NONE,		R_NONE,
R_PC,		R_PC,
R_PLT,		R_PLT,
R_PLT_PC,		R_PLT_PC,
R_PLT_GOTPLT,		R_PLT_GOTPLT,
		R_RELAX_HINT,
R_RELAX_GOT_PC,		R_RELAX_GOT_PC,
R_RELAX_GOT_PC_NOPIC,		R_RELAX_GOT_PC_NOPIC,
R_RELAX_TLS_GD_TO_IE,		R_RELAX_TLS_GD_TO_IE,
R_RELAX_TLS_GD_TO_IE_ABS,		R_RELAX_TLS_GD_TO_IE_ABS,
R_RELAX_TLS_GD_TO_IE_GOT_OFF,		R_RELAX_TLS_GD_TO_IE_GOT_OFF,
R_RELAX_TLS_GD_TO_IE_GOTPLT,		R_RELAX_TLS_GD_TO_IE_GOTPLT,
R_RELAX_TLS_GD_TO_LE,		R_RELAX_TLS_GD_TO_LE,
R_RELAX_TLS_GD_TO_LE_NEG,		R_RELAX_TLS_GD_TO_LE_NEG,
[... 39 lines not shown ...]	enum RelExpr {
R_MIPS_TLSLD,		R_MIPS_TLSLD,
R_PPC32_PLTREL,		R_PPC32_PLTREL,
R_PPC64_CALL,		R_PPC64_CALL,
R_PPC64_CALL_PLT,		R_PPC64_CALL_PLT,
R_PPC64_RELAX_TOC,		R_PPC64_RELAX_TOC,
R_PPC64_TOCBASE,		R_PPC64_TOCBASE,
R_PPC64_RELAX_GOT_PC,		R_PPC64_RELAX_GOT_PC,
R_RISCV_ADD,		R_RISCV_ADD,
		R_RISCV_GPREL,
R_RISCV_PC_INDIRECT,		R_RISCV_PC_INDIRECT,
};		};

// Architecture-neutral representation of relocation.		// Architecture-neutral representation of relocation.
struct Relocation {		struct Relocation {
RelExpr expr;		RelExpr expr;
RelType type;		RelType type;
uint64_t offset;		uint64_t offset;
[... 106 lines not shown ...]

lld/ELF/Relocations.cpp

[... 120 lines not shown ...]	void elf::reportRangeError(uint8_t *loc, int64_t v, int n, const Symbol &sym,
std::string hint;		std::string hint;
if (!sym.getName().empty())		if (!sym.getName().empty())
hint = "; references " + lld::toString(sym) + getDefinedLocation(sym);		hint = "; references " + lld::toString(sym) + getDefinedLocation(sym);
errorOrWarn(errPlace.loc + msg + " is out of range: " + Twine(v) +		errorOrWarn(errPlace.loc + msg + " is out of range: " + Twine(v) +
" is not in [" + Twine(llvm::minIntN(n)) + ", " +		" is not in [" + Twine(llvm::minIntN(n)) + ", " +
Twine(llvm::maxIntN(n)) + "]" + hint);		Twine(llvm::maxIntN(n)) + "]" + hint);
}		}

// Build a bitmask with one bit set for each 64 subset of RelExpr.		// Build a bitmask with one bit set for each 64 subset of RelExpr.
		jrtc27 (Not Done): (a) this is unrelated to the diff (b) this completely removes the whole point of oneof
		PkmX (Done): As noted in the summary, adding more `RelExpr`s means they no longer fit in a 64-bit mask, so this is a temporary change to make this patch work for now. There were already 64 `RelExpr`s before this patch, so there is no room to grow and I don't think it is workable in the long run. The only solution I can think of is to split them into target-specific `RelExpr`s, with some common exprs like `R_ABS` being shared, so as long as a target doesn't define more than ~20 `RelExpr`s it should be fine. However, this probably belongs in another patch.
		PkmX (Done): Alternatively I think we can use two 64-bit masks, so this should work up to 128 `RelExpr`s.
static constexpr uint64_t buildMask() { return 0; }		static constexpr uint64_t buildMask() { return 0; }

template <typename... Tails>		template <typename... Tails>
static constexpr uint64_t buildMask(int head, Tails... tails) {		static constexpr uint64_t buildMask(int head, Tails... tails) {
return (0 <= head && head < 64 ? uint64_t(1) << head : 0) |		return (0 <= head && head < 64 ? uint64_t(1) << head : 0) |
buildMask(tails...);		buildMask(tails...);
		arichardson (Not Done): This change should be a separate review. I would very much like support for > 64 RelExpr values to land upstream since we also had to make that change for our CHERI LLD.
		PkmX (Done): That's okay for me. I will split this out into a new patch.
		luismarques (Not Done):
		>> we also had to make that change for our CHERI LLD.
		> That's okay for me. I will split this out into a new patch.
		I ran into the same problem in another context. It would be good to have some guidance about the best way forward to address this in the long term (e.g. two 64-bit masks?). @ruiu?
		@PkmX Are you still planning to submit your separate patch for this mask issue? And, more broadly, to update this LLD RISC-V relaxation patch?
}		}
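
The two-mask idea raised in the comments above can be sketched on its own. This is a minimal illustration with a hypothetical `Expr` enum whose values were picked to straddle the 64 boundary. It follows the scheme described in the thread (one constant mask for [0, 64), one for [64, 128)) and is not claimed to match the code that was eventually landed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum for illustration; values chosen to straddle 64.
enum Expr : int { E_A = 3, E_B = 63, E_C = 64, E_D = 100 };

// Base case: no values, empty mask.
constexpr uint64_t buildMask() { return 0; }

// Set one bit per value in [0, 64); values outside that window add nothing.
template <typename... Tails>
constexpr uint64_t buildMask(int head, Tails... tails) {
  return (0 <= head && head < 64 ? uint64_t(1) << head : 0) |
         buildMask(tails...);
}

// Return true if `e` is one of `Exprs`. Values in [64, 128) are tested
// against a second mask built from the same list shifted down by 64.
template <Expr... Exprs> bool oneof(Expr e) {
  assert(0 <= e && e < 128 && "value does not fit in two 64-bit masks");
  if (e >= 64)
    return ((uint64_t(1) << (e - 64)) & buildMask((Exprs - 64)...)) != 0;
  return ((uint64_t(1) << e) & buildMask(Exprs...)) != 0;
}
```

Both masks are compile-time constants for a given template argument list, so the membership test stays a compare, a shift and a bitwise AND.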

// Return true if `Expr` is one of `Exprs`.		// Return true if `Expr` is one of `Exprs`.
// There are more than 64 but less than 128 RelExprs, so we divide the set of		// There are more than 64 but less than 128 RelExprs, so we divide the set of
// exprs into [0, 64) and [64, 128) and represent each range as a constant		// exprs into [0, 64) and [64, 128) and represent each range as a constant
// 64-bit mask. Then we decide which mask to test depending on the value of		// 64-bit mask. Then we decide which mask to test depending on the value of
// expr and use a simple shift and bitwise-and to test for membership.		// expr and use a simple shift and bitwise-and to test for membership.
template <RelExpr... Exprs> static bool oneof(RelExpr expr) {		template <RelExpr... Exprs> static bool oneof(RelExpr expr) {
[... 811 lines not shown ...]
bool RelocationScanner::isStaticLinkTimeConstant(RelExpr e, RelType type,		bool RelocationScanner::isStaticLinkTimeConstant(RelExpr e, RelType type,
const Symbol &sym,		const Symbol &sym,
uint64_t relOff) const {		uint64_t relOff) const {
// These expressions always compute a constant		// These expressions always compute a constant
if (oneof<R_GOTPLT, R_GOT_OFF, R_MIPS_GOT_LOCAL_PAGE, R_MIPS_GOTREL,		if (oneof<R_GOTPLT, R_GOT_OFF, R_MIPS_GOT_LOCAL_PAGE, R_MIPS_GOTREL,
R_MIPS_GOT_OFF, R_MIPS_GOT_OFF32, R_MIPS_GOT_GP_PC,		R_MIPS_GOT_OFF, R_MIPS_GOT_OFF32, R_MIPS_GOT_GP_PC,
R_AARCH64_GOT_PAGE_PC, R_GOT_PC, R_GOTONLY_PC, R_GOTPLTONLY_PC,		R_AARCH64_GOT_PAGE_PC, R_GOT_PC, R_GOTONLY_PC, R_GOTPLTONLY_PC,
R_PLT_PC, R_PLT_GOTPLT, R_PPC32_PLTREL, R_PPC64_CALL_PLT,		R_PLT_PC, R_PLT_GOTPLT, R_PPC32_PLTREL, R_PPC64_CALL_PLT,
R_PPC64_RELAX_TOC, R_RISCV_ADD, R_AARCH64_GOT_PAGE>(e))		R_PPC64_RELAX_TOC, R_RISCV_ADD, R_RELAX_HINT, R_AARCH64_GOT_PAGE>(
		e))
return true;		return true;

// These never do, except if the entire file is position dependent or if		// These never do, except if the entire file is position dependent or if
// only the low bits are used.		// only the low bits are used.
if (e == R_GOT || e == R_PLT)		if (e == R_GOT || e == R_PLT)
return target.usesOnlyLowPageBits(type) || !config->isPic;		return target.usesOnlyLowPageBits(type) || !config->isPic;

if (sym.isPreemptible)		if (sym.isPreemptible)
[... 1,254 lines not shown ...]

lld/ELF/Target.h

[... 86 lines not shown ...]	virtual void relocate(uint8_t *loc, const Relocation &rel,
uint64_t val) const = 0;		uint64_t val) const = 0;
void relocateNoSym(uint8_t *loc, RelType type, uint64_t val) const {		void relocateNoSym(uint8_t *loc, RelType type, uint64_t val) const {
relocate(loc, Relocation{R_NONE, type, 0, 0, nullptr}, val);		relocate(loc, Relocation{R_NONE, type, 0, 0, nullptr}, val);
}		}

virtual void applyJumpInstrMod(uint8_t *loc, JumpModType type,		virtual void applyJumpInstrMod(uint8_t *loc, JumpModType type,
JumpModType val) const {}		JumpModType val) const {}

		virtual void finalizeSections() const {}

virtual ~TargetInfo();		virtual ~TargetInfo();

// This deletes a jump insn at the end of the section if it is a fall thru to		// This deletes a jump insn at the end of the section if it is a fall thru to
// the next section. Further, if there is a conditional jump and a direct		// the next section. Further, if there is a conditional jump and a direct
// jump consecutively, it tries to flip the conditional jump to convert the		// jump consecutively, it tries to flip the conditional jump to convert the
// direct jump into a fall thru and delete it. Returns true if a jump		// direct jump into a fall thru and delete it. Returns true if a jump
// instruction can be deleted.		// instruction can be deleted.
virtual bool deleteFallThruJmpInsn(InputSection &is, InputFile *file,		virtual bool deleteFallThruJmpInsn(InputSection &is, InputFile *file,
[... 223 lines not shown ...]

lld/ELF/Writer.cpp

[... 1,622 lines not shown ...]	template <class ELFT> void Writer<ELFT>::finalizeAddressDependentContent() {
for (Partition &part : partitions)		for (Partition &part : partitions)
finalizeSynthetic(part.armExidx.get());		finalizeSynthetic(part.armExidx.get());
resolveShfLinkOrder();		resolveShfLinkOrder();

// Converts call x@GDPLT to call __tls_get_addr		// Converts call x@GDPLT to call __tls_get_addr
if (config->emachine == EM_HEXAGON)		if (config->emachine == EM_HEXAGON)
hexagonTLSSymbolUpdate(outputSections);		hexagonTLSSymbolUpdate(outputSections);

		target->finalizeSections();

int assignPasses = 0;		int assignPasses = 0;
for (;;) {		for (;;) {
bool changed = target->needsThunks && tc.createThunks(outputSections);		bool changed = target->needsThunks && tc.createThunks(outputSections);

// With Thunk Size much smaller than branch range we expect to		// With Thunk Size much smaller than branch range we expect to
// converge quickly; if we get to 15 something has gone wrong.		// converge quickly; if we get to 15 something has gone wrong.
if (changed && tc.pass >= 15) {		if (changed && tc.pass >= 15) {
error("thunk creation not converged");		error("thunk creation not converged");
[... 1,343 lines not shown ...]

lld/test/ELF/riscv-gp.s

	# REQUIRES: riscv			# REQUIRES: riscv
	# RUN: llvm-mc -filetype=obj -triple=riscv32 %s -o %t.32.o			# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax %s -o %t.32.o
	# RUN: ld.lld -pie %t.32.o -o %t.32			# RUN: ld.lld -pie %t.32.o -o %t.32
	# RUN: llvm-readelf -s %t.32 | FileCheck --check-prefix=SYM32 %s			# RUN: llvm-readelf -s %t.32 | FileCheck --check-prefix=SYM32 %s
	# RUN: llvm-readelf -S %t.32 | FileCheck --check-prefix=SEC32 %s			# RUN: llvm-readelf -S %t.32 | FileCheck --check-prefix=SEC32 %s
				# RUN: llvm-objdump -d --print-imm-hex %t.32 | FileCheck --check-prefix=DIS32 %s
	# RUN: not ld.lld -shared %t.32.o -o /dev/null 2>&1 | FileCheck --check-prefix=ERR %s			# RUN: not ld.lld -shared %t.32.o -o /dev/null 2>&1 | FileCheck --check-prefix=ERR %s

	# RUN: llvm-mc -filetype=obj -triple=riscv64 %s -o %t.64.o			# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+relax %s -o %t.64.o
	# RUN: ld.lld -pie %t.64.o -o %t.64			# RUN: ld.lld -pie %t.64.o -o %t.64
	# RUN: llvm-readelf -s %t.64 | FileCheck --check-prefix=SYM64 %s			# RUN: llvm-readelf -s %t.64 | FileCheck --check-prefix=SYM64 %s
	# RUN: llvm-readelf -S %t.64 | FileCheck --check-prefix=SEC64 %s			# RUN: llvm-readelf -S %t.64 | FileCheck --check-prefix=SEC64 %s
				# RUN: llvm-objdump -d %t.64 | FileCheck --check-prefix=DIS64 %s
	# RUN: not ld.lld -shared %t.64.o -o /dev/null 2>&1 | FileCheck --check-prefix=ERR %s			# RUN: not ld.lld -shared %t.64.o -o /dev/null 2>&1 | FileCheck --check-prefix=ERR %s

	## __global_pointer$ = .sdata+0x800 = 0x39c0			## __global_pointer$ = .sdata+0x800 = 0x39c0
	# SEC32: [ 7] .sdata PROGBITS {{0*}}000031c0			# SEC32: [ 7] .sdata PROGBITS {{0*}}000031c0
	# SYM32: {{0*}}000039c0 0 NOTYPE GLOBAL DEFAULT 7 __global_pointer$			# SYM32: {{0*}}000039c0 0 NOTYPE GLOBAL DEFAULT 7 __global_pointer$

	# SEC64: [ 7] .sdata PROGBITS {{0*}}000032e0			# SEC64: [ 7] .sdata PROGBITS {{0*}}000032e0
	# SYM64: {{0*}}00003ae0 0 NOTYPE GLOBAL DEFAULT 7 __global_pointer$			# SYM64: {{0*}}00003ae0 0 NOTYPE GLOBAL DEFAULT 7 __global_pointer$

	## __global_pointer$ - 0x1000 = 4096*3-2048			# DIS32: auipc gp, 3
	# DIS: 1000: auipc gp, 3			# DIS32-NEXT: addi gp, gp, -1968
	# DIS-NEXT: addi gp, gp, -2048
				# DIS64: auipc gp, 3
				# DIS64-NEXT: addi gp, gp, -1896

	# ERR: error: relocation R_RISCV_PCREL_HI20 cannot be used against symbol '__global_pointer$'; recompile with -fPIC			# ERR: error: relocation R_RISCV_PCREL_HI20 cannot be used against symbol '__global_pointer$'; recompile with -fPIC

				.option norelax
	lla gp, __global_pointer$			lla gp, __global_pointer$

	.section .sdata,"aw"			.section .sdata,"aw"
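The `auipc`/`addi` immediates checked in riscv-gp.s follow from splitting a PC-relative offset into a high and a low part. The sketch below is illustrative only and is not code from this patch: `splitPcRel` and `HiLo` are hypothetical names, and the offsets 0x2850 and 0x2898 are assumptions back-computed from the expected immediates (the test does not state the address of the `lla`). The offset 0x2800 is the `4096*3-2048` from the removed comment.

```cpp
#include <cstdint>

// Immediates of an auipc+addi pair that materialises `pc + offset`.
// Hypothetical helper type, not part of lld.
struct HiLo {
  int64_t hi20; // auipc immediate, in units of 4 KiB
  int64_t lo12; // addi immediate, in [-2048, 2047]
};

// addi sign-extends its 12-bit immediate, so 0x800 is added before the
// shift to round the high part to the nearest multiple of 4 KiB.
// Only non-negative offsets are exercised here.
constexpr HiLo splitPcRel(int64_t offset) {
  return HiLo{(offset + 0x800) >> 12,
              offset - ((offset + 0x800) >> 12) * 4096};
}

// Removed comment in the test: gp - pc = 4096*3 - 2048.
static_assert(splitPcRel(0x2800).hi20 == 3, "auipc gp, 3");
static_assert(splitPcRel(0x2800).lo12 == -2048, "addi gp, gp, -2048");

// Assumed offsets that reproduce the DIS32 and DIS64 expectations.
static_assert(splitPcRel(0x2850).lo12 == -1968, "DIS32 low part");
static_assert(splitPcRel(0x2898).lo12 == -1896, "DIS64 low part");
```

The `static_assert` lines double as a compile-time check, so the block needs no `main` to validate the arithmetic.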

lld/test/ELF/riscv-relax-align-rvc.s

This file was added.

				# REQUIRES: riscv

				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+c,+relax %s -o %t.rv32.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+c,+relax %s -o %t.rv64.o

				# RUN: ld.lld %t.rv32.o -o %t.rv32
				# RUN: ld.lld %t.rv64.o -o %t.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32 | FileCheck %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64 | FileCheck %s

				# CHECK: c.add a0, a1
				# CHECK-NEXT: addi zero, zero, 0
				# CHECK-NEXT: addi zero, zero, 0
				# CHECK-NEXT: addi zero, zero, 0
				# CHECK-NEXT: c.nop
				# CHECK-NEXT: c.add s0, s1
				# CHECK-NEXT: c.add s2, s3
				# CHECK-NEXT: c.add s4, s5
				# CHECK-NEXT: c.nop
				# CHECK-NEXT: c.add t0, t1

				.global _start
				_start:
				.balign 4
				c.add a0, a1
				.balign 16
				c.add s0, s1
				c.add s2, s3
				c.add s4, s5
				.balign 8
				c.add t0, t1

lld/test/ELF/riscv-relax-align.s

This file was added.

				# REQUIRES: riscv

				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax %s -o %t.rv32.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax %s -o %t.rv64.o

				# RUN: ld.lld %t.rv32.o -o %t.rv32
				# RUN: ld.lld %t.rv64.o -o %t.rv64
				# RUN: llvm-objdump -d --no-show-raw-insn %t.rv32 | FileCheck %s
				# RUN: llvm-objdump -d --no-show-raw-insn %t.rv64 | FileCheck %s

				# Check that alignment is always handled regardless of --relax option
				# RUN: ld.lld --no-relax %t.rv32.o -o %t-no-relax.rv32
				# RUN: ld.lld --no-relax %t.rv64.o -o %t-no-relax.rv64
				# RUN: llvm-objdump -d --no-show-raw-insn %t-no-relax.rv32 | FileCheck %s
				# RUN: llvm-objdump -d --no-show-raw-insn %t-no-relax.rv64 | FileCheck %s

				# CHECK: add a0, a1, a2
				# CHECK-NEXT: add a3, a4, a5
				# CHECK-NEXT: nop
				# CHECK-NEXT: nop
				# CHECK-NEXT: add s0, s1, s2
				# CHECK-NEXT: add t0, t1, t2

				.global _start
				_start:
				.balign 4
				add a0, a1, a2
				add a3, a4, a5
				.balign 16
				add s0, s1, s2
				.balign 4
				.balign 4
				add t0, t1, t2
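The padding expected by the two alignment tests above can be reproduced with a little arithmetic. This is a minimal sketch, not the patch's implementation. It assumes the psABI convention that the `R_RISCV_ALIGN` addend is the number of nop bytes the assembler emitted and that the requested alignment is the next power of two above that addend. It also assumes that offsets are measured from a section start that is itself 16-byte aligned. `relaxAlign`, `alignmentOf` and `Padding` are hypothetical names.

```cpp
#include <cstdint>

// Result of resolving one R_RISCV_ALIGN (hypothetical helper type).
struct Padding {
  uint64_t keep;   // nop bytes that stay in the output
  uint64_t remove; // nop bytes the linker deletes
  uint64_t nops4;  // number of 4-byte nops (addi zero, zero, 0)
  uint64_t nops2;  // number of 2-byte c.nop (0 or 1)
};

// Assumed rule: alignment is the smallest power of two greater than addend.
constexpr uint64_t alignmentOf(uint64_t addend) {
  uint64_t a = 1;
  while (a <= addend)
    a <<= 1;
  return a;
}

// addr: offset at which the padding starts after earlier deletions.
// addend: bytes of nops reserved by the assembler.
constexpr Padding relaxAlign(uint64_t addr, uint64_t addend) {
  uint64_t align = alignmentOf(addend);
  uint64_t keep = (align - addr % align) % align;
  return Padding{keep, addend - keep, keep / 4, (keep % 4) / 2};
}

// riscv-relax-align.s: two 4-byte adds, then .balign 16 (12 nop bytes).
static_assert(relaxAlign(8, 12).nops4 == 2, "two nops remain");
// riscv-relax-align-rvc.s: one c.add, then .balign 16 (14 nop bytes).
static_assert(relaxAlign(2, 14).nops4 == 3, "three 4-byte nops");
static_assert(relaxAlign(2, 14).nops2 == 1, "plus one c.nop");
// Three more c.add put us at offset 22, then .balign 8 (6 nop bytes).
static_assert(relaxAlign(22, 6).keep == 2, "a single c.nop remains");
```

The sketch does not handle the error case where the required padding exceeds the addend.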

lld/test/ELF/riscv-relax-call.s

This file was added.

				# REQUIRES: riscv

				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax %s -o %t.rv32.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax %s -o %t.rv64.o
				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+c,+relax %s -o %t.rv32c.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+c,+relax %s -o %t.rv64c.o

				# jal relaxation
				# RUN: ld.lld %t.rv32.o --defsym foo=_start+20 -o %t.rv32
				# RUN: ld.lld %t.rv64.o --defsym foo=_start+20 -o %t.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32 | FileCheck --check-prefix=JAL %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64 | FileCheck --check-prefix=JAL %s
				# JAL: jal ra, {{.*}} <foo>
				# JAL-NEXT: jal zero, {{.*}} <foo>

				# c.j and c.jal (RV32C-only) relaxation
				# RUN: ld.lld %t.rv32c.o --defsym foo=_start+20 -o %t.rv32c
				# RUN: ld.lld %t.rv64c.o --defsym foo=_start+20 -o %t.rv64c
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32c | FileCheck --check-prefix=RV32C %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64c | FileCheck --check-prefix=RV64C %s
				# RV32C: c.jal {{.*}} <foo>
				# RV32C-NEXT: c.j {{.*}} <foo>
				# RV64C: jal ra, {{.*}} <foo>
				# RV64C-NEXT: c.j {{.*}} <foo>

				# Don't relax to c.j/c.jal if out of range
				# RUN: ld.lld %t.rv32c.o --defsym foo=_start+0x1004 -o %t.rv32c
				# RUN: ld.lld %t.rv64c.o --defsym foo=_start+0x1004 -o %t.rv64c
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32c | FileCheck --check-prefix=JAL %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64c | FileCheck --check-prefix=JAL %s

				# Don't relax if out of range (for the first call)
				# RUN: ld.lld %t.rv32c.o --defsym foo=_start+0x100000 -o %t-boundary.rv32
				# RUN: ld.lld %t.rv64c.o --defsym foo=_start+0x100000 -o %t-boundary.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-boundary.rv32 | FileCheck --check-prefix=BOUNDARY %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-boundary.rv64 | FileCheck --check-prefix=BOUNDARY %s
				# BOUNDARY: auipc ra, 256
				# BOUNDARY-NEXT: jalr ra, 0(ra)
				# BOUNDARY-NEXT: jal zero, {{.*}} <foo>

				# Check relaxation works across output sections
				# RUN: echo 'SECTIONS { .text 0x100000 : { *(.text) } .foo : ALIGN(8) { foo = .; } }' > %t-cross-section.lds
				# RUN: ld.lld %t.rv32c.o %t-cross-section.lds -o %t-cross-section.rv32
				# RUN: ld.lld %t.rv64c.o %t-cross-section.lds -o %t-cross-section.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-cross-section.rv32 | FileCheck --check-prefix=RV32C %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-cross-section.rv64 | FileCheck --check-prefix=RV64C %s

				# Test for output section alignment checking during relaxation. The .foo section
				# cannot be moved closer due to alignment so lld must not relax the call, even
				# though it seems it may be in range before relaxation.

				# RUN: echo 'SECTIONS { .text 0x100000 : { *(.text); } .foo : ALIGN(0x100000) { foo = .; } }' > %t-cross-section-out-of-range.lds
				# RUN: ld.lld %t.rv32c.o %t-cross-section-out-of-range.lds -o %t-cross-section-out-of-range.rv32
				# RUN: ld.lld %t.rv64c.o %t-cross-section-out-of-range.lds -o %t-cross-section-out-of-range.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-cross-section-out-of-range.rv32 | FileCheck --check-prefix=NORELAX %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-cross-section-out-of-range.rv64 | FileCheck --check-prefix=NORELAX %s
				# NORELAX: auipc ra, {{.*}}
				# NORELAX-NEXT: jalr ra, {{.*}}(ra)
				# NORELAX: auipc t1, {{.*}}
				# NORELAX-NEXT: jalr zero, {{.*}}(t1)

				# Don't relax to absolute symbols
				# RUN: ld.lld %t.rv32c.o -Ttext=0x100000 --defsym foo=0x100000 -o %t-abs.rv32
				# RUN: ld.lld %t.rv64c.o -Ttext=0x100000 --defsym foo=0x100000 -o %t-abs.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-abs.rv32 | FileCheck --check-prefix=NORELAX %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-abs.rv64 | FileCheck --check-prefix=NORELAX %s

				.global _start
				.p2align 3
				_start:
				call foo
				tail foo
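The range decisions exercised by riscv-relax-call.s can be summarised as a small classifier. This is a hedged sketch under stated assumptions, not the logic in RISCV.cpp. It uses the ISA encoding ranges: `jal` reaches [-2^20, 2^20 - 2], `c.j` and `c.jal` reach [-2048, 2046], and `c.jal` exists only on RV32C. `relaxCall`, `CallForm` and `inRange` are hypothetical names. Absolute symbols and output-section alignment, which the test also covers, are not modelled.

```cpp
#include <cstdint>

enum class CallForm { AuipcJalr, Jal, CJal, CJ };

constexpr bool inRange(int64_t v, int64_t lo, int64_t hi) {
  return lo <= v && v <= hi;
}

// disp: target address minus the address of the call site.
// rd: destination register of the jalr (1 = ra for `call`, 0 = zero for `tail`).
constexpr CallForm relaxCall(int64_t disp, unsigned rd, bool rvc, bool rv32) {
  if (rvc && inRange(disp, -2048, 2046)) {
    if (rd == 0)
      return CallForm::CJ;
    if (rd == 1 && rv32)
      return CallForm::CJal; // c.jal is RV32C-only
  }
  if (inRange(disp, -(1 << 20), (1 << 20) - 2))
    return CallForm::Jal;
  return CallForm::AuipcJalr;
}

// BOUNDARY case: foo = _start + 0x100000. The call at _start is one byte
// pair past jal's reach; the tail at _start + 8 is within it.
static_assert(relaxCall(0x100000, 1, true, true) == CallForm::AuipcJalr,
              "first call stays auipc+jalr");
static_assert(relaxCall(0x100000 - 8, 0, true, true) == CallForm::Jal,
              "tail becomes jal zero");
```

Because the first call in the BOUNDARY case is not shortened, the tail stays at `_start + 8`, which is why its displacement is `0x100000 - 8`.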

lld/test/ELF/riscv-relax-hi20-lo12.s

This file was added.

				# REQUIRES: riscv

				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax %s -o %t.rv32.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax %s -o %t.rv64.o
				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+c,+relax %s -o %t.rv32c.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+c,+relax %s -o %t.rv64c.o

				# RUN: echo 'SECTIONS { .text : { *(.text) } .sdata 0x200000 : { foo = .; } }' > %t.lds
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv32.o %t.lds -o %t.rv32
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv64.o %t.lds -o %t.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32 | FileCheck --check-prefix=GP %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64 | FileCheck --check-prefix=GP %s
				# GP-NOT: lui
				# GP: addi a0, gp, -2048
				# GP-NEXT: lw a0, -2048(gp)
				# GP-NEXT: sw a0, -2048(gp)

				# RUN: echo 'SECTIONS { .text : { *(.text) } .sdata 0x200000 : { foo = . + 4096; } }' > %t-out-of-range.lds
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv32.o %t-out-of-range.lds -o %t.rv32-out-of-range
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv64.o %t-out-of-range.lds -o %t.rv64-out-of-range
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32-out-of-range | FileCheck --check-prefix=NORELAX %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64-out-of-range | FileCheck --check-prefix=NORELAX %s
				# NORELAX: lui a0, 513
				# NORELAX-NEXT: addi a0, a0, 0
				# NORELAX-NEXT: lw a0, 0(a0)
				# NORELAX-NEXT: sw a0, 0(a0)

				# RUN: ld.lld --defsym=foo=0x1000 %t.rv32c.o -o %t.rv32-clui
				# RUN: ld.lld --defsym=foo=0x1000 %t.rv64c.o -o %t.rv64-clui
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32-clui | FileCheck --check-prefix=CLUI %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64-clui | FileCheck --check-prefix=CLUI %s
				# CLUI: c.lui a0, 1
				# CLUI-NEXT: addi a0, a0, 0
				# CLUI-NEXT: lw a0, 0(a0)
				# CLUI-NEXT: sw a0, 0(a0)

				# RUN: ld.lld --defsym=foo=0x10 %t.rv32c.o -o %t.rv32-cli
				# RUN: ld.lld --defsym=foo=0x10 %t.rv64c.o -o %t.rv64-cli
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32-cli | FileCheck --check-prefix=CLI %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64-cli | FileCheck --check-prefix=CLI %s
				# CLI: c.li a0, 0
				# CLI-NEXT: addi a0, a0, 16
				# CLI-NEXT: lw a0, 16(a0)
				# CLI-NEXT: sw a0, 16(a0)

				.global _start
				_start:
				lui a0, %hi(foo)
				addi a0, a0, %lo(foo)
				lw a0, %lo(foo)(a0)
				sw a0, %lo(foo)(a0)
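The four outcomes checked in riscv-relax-hi20-lo12.s (GP, NORELAX, CLUI, CLI) can be modelled as a decision on the symbol address. The following is only a sketch: `relaxHi20` and `LuiForm` are hypothetical names, and the priority of GP-relative addressing over the compressed forms is an assumption, since the test never makes both available at once. The `c.lui` range used here is the ISA's non-zero 6-bit signed immediate, and the register restriction (rd must not be x0 or x2) is ignored because the test uses `a0`.

```cpp
#include <cstdint>

enum class LuiForm { GpRelative, CLi, CLui, Lui };

// sym: absolute address of the target symbol (non-negative in this sketch).
// haveGp/gp: whether __global_pointer$ is defined, and its value.
// rvc: whether the object was built with the C extension.
constexpr LuiForm relaxHi20(int64_t sym, bool haveGp, int64_t gp, bool rvc) {
  // lui can be deleted if the low part alone reaches sym from gp.
  if (haveGp && sym - gp >= -2048 && sym - gp <= 2047)
    return LuiForm::GpRelative;
  if (rvc) {
    int64_t hi = (sym + 0x800) >> 12; // value that lui would load, >> 12
    if (hi == 0)
      return LuiForm::CLi;  // c.li rd, 0
    if (hi >= -32 && hi <= 31)
      return LuiForm::CLui; // c.lui rd, hi
  }
  return LuiForm::Lui;
}

// .sdata at 0x200000, so gp = .sdata + 0x800 = 0x200800.
static_assert(relaxHi20(0x200000, true, 0x200800, false) == LuiForm::GpRelative,
              "foo is at gp - 2048");
static_assert(relaxHi20(0x201000, true, 0x200800, false) == LuiForm::Lui,
              "foo is at gp + 2048, one past the reach");
```

The out-of-range case sits exactly one byte beyond the signed 12-bit limit, which matches the `lui a0, 513` (0x201) expected by the NORELAX checks.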

lld/test/ELF/riscv-relax-pcrel.s

This file was added.

				# REQUIRES: riscv

				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax %s -o %t.rv32.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax %s -o %t.rv64.o

				# RUN: echo 'SECTIONS { .text 0x100000 : { *(.text) } .sdata 0x200000 : { foo = .; } }' > %t.lds
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv32.o %t.lds -o %t.rv32
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv64.o %t.lds -o %t.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv32 | FileCheck --check-prefix=GP %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t.rv64 | FileCheck --check-prefix=GP %s
				# GP-NOT: auipc
				# GP: addi a0, gp, -2048
				# GP-NEXT: sw a0, -2048(gp)

				# RUN: echo 'SECTIONS { .text 0x100000 : { *(.text) } .sdata 0x200000 : { foo = . + 4096; } }' > %t-norelax.lds
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv32.o %t-norelax.lds -o %t-norelax.rv32
				# RUN: ld.lld --undefined=__global_pointer$ %t.rv64.o %t-norelax.lds -o %t-norelax.rv64
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-norelax.rv32 | FileCheck --check-prefix=NORELAX %s
				# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t-norelax.rv64 | FileCheck --check-prefix=NORELAX %s
				# NORELAX: auipc a0, 257
				# NORELAX-NEXT: addi a0, a0, 0
				# NORELAX-NEXT: sw a0, 0(a0)

				.global _start
				_start:
				auipc a0, %pcrel_hi(foo)
				addi a0, a0, %pcrel_lo(_start)
				sw a0, %pcrel_lo(_start)(a0)

lld/test/ELF/riscv-relax-syms.s

This file was added.

				# REQUIRES: riscv

				# Check that relaxation correctly adjusts symbol addresses and sizes.

				# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax %s -o %t.rv32.o
				# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax %s -o %t.rv64.o
				# RUN: ld.lld -Ttext=0x100000 %t.rv32.o -o %t.rv32
				# RUN: ld.lld -Ttext=0x100000 %t.rv64.o -o %t.rv64

				# RUN: llvm-readelf -s %t.rv32 | FileCheck %s
				# RUN: llvm-readelf -s %t.rv64 | FileCheck %s

				# CHECK: 100000 4 NOTYPE LOCAL DEFAULT 1 a
				# CHECK: 100000 8 NOTYPE LOCAL DEFAULT 1 b
				# CHECK: 100004 4 NOTYPE LOCAL DEFAULT 1 c
				# CHECK: 100004 8 NOTYPE LOCAL DEFAULT 1 d
				# CHECK: 100000 12 NOTYPE GLOBAL DEFAULT 1 _start

				.global _start
				_start:
				a:
				b:
				add a0, a1, a2
				.size a, . - a
				c:
				d:
				call _start
				.size b, . - b
				.size c, . - c
				add a0, a1, a2
				.size d, . - d
				.size _start, . - _start
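The symbol values and sizes expected by riscv-relax-syms.s follow from deleting 4 bytes when the 8-byte `call` shrinks to a 4-byte `jal`. Here is a minimal sketch of that bookkeeping, assuming the shortened instruction stays at section offset 4 and the bytes in [8, 12) are removed. `adjust`, `deletedBefore` and `Sym` are hypothetical names, and offsets are relative to the section start (0x100000 in the test).

```cpp
#include <cstdint>

// Section-relative symbol (hypothetical helper type).
struct Sym {
  uint64_t value;
  uint64_t size;
};

// Number of bytes of the deleted range [delOff, delOff + delLen) that lie
// before position pos.
constexpr uint64_t deletedBefore(uint64_t pos, uint64_t delOff,
                                 uint64_t delLen) {
  return pos <= delOff ? 0 : (pos >= delOff + delLen ? delLen : pos - delOff);
}

// Move both ends of the symbol; the new size is the distance between them.
constexpr Sym adjust(Sym s, uint64_t delOff, uint64_t delLen) {
  uint64_t start = s.value - deletedBefore(s.value, delOff, delLen);
  uint64_t end =
      s.value + s.size - deletedBefore(s.value + s.size, delOff, delLen);
  return Sym{start, end - start};
}

// Pre-relaxation sizes: a = 4, b = 12, c = 8, d = 12, _start = 16.
static_assert(adjust(Sym{0, 4}, 8, 4).size == 4, "a is untouched");
static_assert(adjust(Sym{0, 12}, 8, 4).size == 8, "b shrinks to 8");
static_assert(adjust(Sym{4, 8}, 8, 4).size == 4, "c shrinks to 4");
static_assert(adjust(Sym{4, 12}, 8, 4).size == 8, "d shrinks to 8");
static_assert(adjust(Sym{0, 16}, 8, 4).size == 12, "_start shrinks to 12");
```

Adjusting the start and end independently keeps the sketch correct for symbols that begin before, inside, or after the deleted range.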

lld/test/ELF/riscv-reloc-align.s

This file was deleted.

	# REQUIRES: riscv

	# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+relax %s -o %t.o
	# RUN: not ld.lld %t.o -o /dev/null 2>&1 | FileCheck %s

	# CHECK: relocation R_RISCV_ALIGN requires unimplemented linker relaxation

	.global _start
	_start:
	nop
	.balign 8
	nop

Revision Contents

Diff 424315