This is an archive of the discontinued LLVM Phabricator instance.

Differential D77694

[WIP][RISCV][ELF] Linker relaxation support
Abandoned · Public

Authored by jrtc27 on Apr 7 2020, 4:52 PM.

Details

Reviewers

• espindola
MaskRay

Summary

This is a proof-of-concept implementation for R_RISCV_ALIGN that I've had lying
around for a while. I just rebased it, but it has not been build tested, let
alone run. It was previously able to link a working FreeBSD kernel, though, and
compared bit-for-bit identical to an -mno-relax build. I'm posting this here as
it came up on IRC and I figured it would be best to share it and avoid
duplicating work (and perhaps also promoting a bit more discussion about how to
implement this correctly in LLD; the mutableData/makeMutableDataCopy is a bit
of a gross hack that could potentially go wrong if consumers of data hang on to
now-stale pointers, and it would be nice if we could instead just
mprotect/mremap/etc the mapping and let the OS CoW as needed whilst retaining
the same address, although given I work on CHERI and CHERI has minor issues
with the mprotect/mremap interface _increasing_ permissions, perhaps I should
not be pushing for that).
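
To make the R_RISCV_ALIGN arithmetic concrete, here is a minimal standalone sketch of the padding calculation that relaxAlign in this diff performs. The names nextPowerOf2, alignUp, AlignResult and computeAlignPadding are illustrative only and are not part of LLD; nextPowerOf2 is a local stand-in with the same "strictly greater" contract as llvm::NextPowerOf2. It assumes, as the diff's own comment does, that the addend is the number of NOP bytes currently present.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two strictly greater than n, or 0 on overflow.
// Local stand-in for llvm::NextPowerOf2 so the sketch has no LLVM dependency.
inline uint64_t nextPowerOf2(uint64_t n) {
  uint64_t p = 1;
  while (p != 0 && p <= n)
    p <<= 1;
  return p;
}

// Round offset up to a multiple of align (align must be a power of two).
inline uint64_t alignUp(uint64_t offset, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

struct AlignResult {
  uint64_t keep;   // NOP bytes that must stay to reach the boundary
  uint64_t remove; // surplus NOP bytes that can be deleted
  bool ok;         // false if the padding present is too small
};

// offset: where the padding starts, already adjusted for earlier deletions.
// addend: number of NOP bytes the assembler emitted at that point.
inline AlignResult computeAlignPadding(uint64_t offset, uint64_t addend) {
  uint64_t align = nextPowerOf2(addend);
  uint64_t keep = alignUp(offset, align) - offset;
  if (keep > addend)
    return {keep, 0, false};
  return {keep, addend - keep, true};
}
```

For example, with 6 bytes of padding starting at offset 0x1006 the implied alignment is 8, so 2 bytes stay and 4 can be deleted; the same padding at offset 0x1000 can be deleted entirely.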

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jrtc27 created this revision. · Apr 7 2020, 4:52 PM

Herald added a reviewer: • espindola. · Apr 7 2020, 4:52 PM

Herald added a reviewer: MaskRay.

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, evandro, luismarques and 30 others.

Harbormaster failed remote builds in B52270: Diff 255860! · Apr 7 2020, 5:28 PM

Interestingly, I was thinking about the same thing on Saturday! I wanted to
add optimizeBasicBlockJumps() in a proper place (D68065; basic block
sections). The current place
is wrong for a thunk target (e.g. AArch64). I did not write code because I am
still unsure how to properly do linker relaxations.

Title: It should mention R_RISCV_ALIGN.

I believe the current framework can only handle the BFD counterpart of
relax_pass==2 (_bfd_riscv_relax_align). Other instruction rewriting may require
intertwined scanRelocations and finalizeAddressDependentContent. The relocation
scanning pass may have to be split, but I don't know where the boundary is.

It has occurred to me that splitting the relocation scanning pass may be good for:

- copy relocations and canonical PLT entries
- non-preemptible IFUNC
- .plt.got (https://bugs.llvm.org/show_bug.cgi?id=32938)
- RISC-V linker relaxations

The above are basically limitations of a linker with one-pass relocation scanning. To have complete support we may have to bite the bullet. More bookkeeping may
be needed. InputSectionBase::relocations will not be sufficient.
finalizeSynthetic() may be called more than once.

An incomplete list of passes we do after scanRelocations:

forEachRelSec(scanRelocations<ELFT>);
add symbols to in.symTab and partitions[*].dynSymTab
removeUnusedSyntheticSections()
sortSections()
finalizeSynthetic(in.*)
fixSectionAlignments()
finalizeAddressDependentContent()
finalizeSynthetic(in.symTab)
finalizeSynthetic(in.ppc64LongBranchTarget) // conceptually it should be done after thunks are finalized

We need to move scanRelocations() as late as possible and move some passes (including finalizeSynthetic()) into finalizeAddressDependentContent(). These passes need to be refactored to work if called more than once.

The finalizeAddressDependentContent() should be changed to several rounds of iterations (relax_pass). The last round handles R_RISCV_ALIGN.
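
The "several rounds of iterations" idea can be sketched as a driver that runs each pass to a fixed point before moving on, with the alignment pass last. This is a toy model, not LLD code: PASS_OPT and PASS_ALIGN mirror the enum in this diff, but Section, its shrinkable and excessPad fields, and runRelaxPasses are hypothetical names introduced only for illustration. Termination here relies on each pass only ever shrinking sections.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

enum RelaxPass { PASS_OPT, PASS_ALIGN };

// Toy section: only sizes are modelled, not contents or addresses.
struct Section {
  uint64_t size;
  uint64_t shrinkable; // bytes PASS_OPT may still remove (model only)
  uint64_t excessPad;  // surplus alignment NOPs PASS_ALIGN may remove
};

// Toy stand-in for a per-target relaxSection hook.
// Returns true if the section changed in this round.
bool relaxSection(Section &s, int pass) {
  switch (pass) {
  case PASS_OPT: {
    // Model one instruction rewrite per round, saving up to 4 bytes.
    uint64_t n = std::min<uint64_t>(s.shrinkable, 4);
    s.shrinkable -= n;
    s.size -= n;
    return n != 0;
  }
  case PASS_ALIGN: {
    uint64_t n = s.excessPad;
    s.excessPad = 0;
    s.size -= n;
    return n != 0;
  }
  default:
    assert(false && "unknown relaxation pass");
    return false;
  }
}

// Runs the passes in order, repeating each until no section changes.
// Returns the total number of rounds executed.
int runRelaxPasses(std::vector<Section> &secs, const std::vector<int> &passes) {
  int rounds = 0;
  for (int pass : passes) {
    bool changed;
    do {
      changed = false;
      for (Section &s : secs)
        changed |= relaxSection(s, pass);
      ++rounds;
      // A real linker would reassign addresses here before the next round.
    } while (changed);
  }
  return rounds;
}
```

Because every round that reports a change strictly reduces a non-negative counter, the loop cannot oscillate; a design that also grows sections (for example by adding thunks) would need an extra guard.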

A bit off-topic. For RISC-V's (ab)use of linker relaxations, my feelings are still mixed. It is indeed a very convenient approach toward a good balance of code size/speed/convenience, but if we want to achieve more, some post-link time optimization frameworks may be more suitable. I don't really have enough experience with link time optimization, but my understanding is that we currently use the term link time optimization (especially in the LLVM context) for optimizations performed on the LLVM IR level. Optimizations on low-level machine representations are not categorized as LTO.

lld/ELF/Arch/RISCV.cpp
551	llvm::erase_if
lld/ELF/Writer.cpp
1674	This loop should be merged with the previous for loop.
1679	Check `SHF_EXECINSTR`

+@grimar @psmith @ruiu

I know Peter is super busy and has very little time expendable on LLVM... Still, I am eager to hear whether we should move from one-pass relocation scanning to multi-pass relocation scanning. Maybe simple evidence from other linkers may give me more confidence to think more in this area.

In D77694#1968464, @MaskRay wrote:

Still, I am eager to hear whether we should move from one-pass relocation scanning to multi-pass relocation scanning.

FTR, relocation scanning is an expensive path and in the past the general direction was to avoid doing it multiple times. I.e. my concern is performance.

Will have to have a think about this in more detail over the Easter Weekend.

I do have some experience with relocations being scanned multiple times. As George mentions, performance, particularly for very large programs with millions of relocations is a concern, especially for a linker that is most attractive to its user base for high performance. The performance impact can be mitigated by only doing what you need when scanning the relocations early. For example in a debug build the number of relocations vastly outweighs the number of non-debug relocations yet we are unlikely to need to scan them early. The downside of this approach is that we have a large non-local dependency between the relocation scans which can make the implementation fragile. For example it is easy to forget that something needs to be done in scan pass 1, only to see some problem come up with scan pass 2 that depended on it. Overall we'll only know by measuring on several large programs.

In D77694#1969073, @psmith wrote:

Will have to have a think about this in more detail over the Easter Weekend.

I do have some experience with relocations being scanned multiple times. As George mentions, performance, particularly for very large programs with millions of relocations is a concern, especially for a linker that is most attractive to its user base for high performance. The performance impact can be mitigated by only doing what you need when scanning the relocations early. For example in a debug build the number of relocations vastly outweighs the number of non-debug relocations yet we are unlikely to need to scan them early. The downside of this approach is that we have a large non-local dependency between the relocation scans which can make the implementation fragile. For example it is easy to forget that something needs to be done in scan pass 1, only to see some problem come up with scan pass 2 that depended on it. Overall we'll only know by measuring on several large programs.

Apologies not got a lot more to add at the moment.

the mutableData/makeMutableDataCopy is a bit of a gross hack that could potentially go wrong if consumers of data hang on to now-stale pointers,

If we were single threaded I think the risk of holding on to stale pointers is low and probably could be managed with warnings in comments and code review. I think our most likely cause of problems would be during multithreaded parts of the program: if Thread 1 gets an address to the read-only data and, before Thread 1 has finished with it, Thread 2 makes a mutable copy, there is a possibility of inconsistencies. I've not got any great suggestions on how to fix this. One way might be to make the interface a bit more explicit, for example any section containing relaxations gets copied, even if they aren't relaxed.
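
The stale-pointer hazard is easy to reproduce even without threads. The following is a minimal single-threaded sketch, assuming a copy-on-first-write section similar in spirit to mutableData()/makeMutableDataCopy() in this diff; ToySection and its members are hypothetical names and do not correspond to LLD's InputSection.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Toy model: bytes start as a read-only view of the input file and are
// copied into a private buffer the first time a writer asks for them.
class ToySection {
public:
  ToySection(const uint8_t *data, size_t size) : data_(data), size_(size) {}

  const uint8_t *data() const { return data_; }
  size_t size() const { return size_; }

  // Copy on first write; later calls return the same private buffer.
  // Not thread-safe: a real implementation would need synchronisation.
  uint8_t *mutableData() {
    if (!owned_) {
      owned_ = std::make_unique<uint8_t[]>(size_);
      std::memcpy(owned_.get(), data_, size_);
      data_ = owned_.get(); // readers from now on see the copy
    }
    return owned_.get();
  }

private:
  const uint8_t *data_;
  size_t size_;
  std::unique_ptr<uint8_t[]> owned_;
};
```

A reader that cached data() before the first mutableData() call keeps pointing at the original bytes and never observes later rewrites, which is the inconsistency described above. Copying every section that contains relaxations up front, as suggested, would remove the window in which both views exist.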

The current place is wrong for a thunk target (e.g. AArch64). I did not write code because I am still unsure how to properly do linker relaxations.

One thing that concerns me is convergence. In finalizeAddressDependentContent() we have had to be careful to avoid the edge case where simultaneously adding and subtracting content ends up with oscillations and non-convergence. I think that we've managed that so far by locking the size of some sections like the Thumb relocations. My guess is that architectures will choose Thunks or Relaxations but can't easily support both at the same time, with Thunks adding content and Relaxations shrinking it. I do hope that RISCV doesn't end up having to write the equivalent of the Erratum patches; I guess there would be an implementation that could use relaxations.

We need to move scanRelocations() as late as possible and move some passes (including finalizeSynthetic()) into finalizeAddressDependentContent(). These passes need to be refactored to work if called more than once.

After investigating https://bugs.llvm.org/show_bug.cgi?id=44824, .ARM.exidx needs this right now, as a linker script can have non-monotonically increasing VMA despite having a monotonically increasing sectionIndex. This leads to an incorrect order in the .ARM.exidx table. I'll start work on pr44824 next week.

Hello, thanks for uploading this, it's great to see more work in the same area. I have also been working on relocation relaxation in LLD. I have been focussing on relaxing R_RISCV_CALL relocations for now. My approach is not too dissimilar from yours. I plan to upload a similar patch in the future.

lld/ELF/Arch/RISCV.cpp
37	This needs to be re-added.

jrtc27 marked an inline comment as done. · Apr 12 2020, 3:44 PM

jrtc27 added inline comments.

lld/ELF/Arch/RISCV.cpp
37	Oops, yes, obviously wrong. As I said, this was just a quick-and-dirty rebase on top of several months of upstream changes and I haven't even tried building it, but was uploaded for others to take if they so desire (especially the `deleteRanges` as doing that efficiently is nasty code to write and debug; bfd just takes the simple-but-inefficient quadratic approach...).
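
As an illustration of the non-quadratic approach mentioned here, the byte deletion and the relocation offset fix-up can each be done in one left-to-right sweep. This is a standalone sketch on plain vectors, not the patch's InputSection::deleteRanges; Range, deleteRangesLinear and adjustOffsets are hypothetical names. It assumes the ranges are sorted and non-overlapping, the offsets are sorted, and no offset falls strictly inside a deleted range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

using Range = std::pair<uint64_t, uint64_t>; // {offset, length}

// Deletes the given byte ranges from buf. Each surviving run of bytes is
// moved exactly once, so the cost is linear in buf.size() + ranges.size().
void deleteRangesLinear(std::vector<uint8_t> &buf,
                        const std::vector<Range> &ranges) {
  uint64_t removed = 0;
  for (size_t i = 0; i < ranges.size(); ++i) {
    uint64_t gapEnd = ranges[i].first + ranges[i].second;
    uint64_t next = i + 1 < ranges.size() ? ranges[i + 1].first : buf.size();
    assert(gapEnd <= next && "ranges must be sorted and non-overlapping");
    // Slide the bytes between this gap and the next one down over the gap.
    std::memmove(buf.data() + (ranges[i].first - removed),
                 buf.data() + gapEnd, next - gapEnd);
    removed += ranges[i].second;
  }
  buf.resize(buf.size() - removed);
}

// Shifts each offset down by the number of bytes deleted before it.
void adjustOffsets(std::vector<uint64_t> &offsets,
                   const std::vector<Range> &ranges) {
  size_t r = 0;
  uint64_t removed = 0;
  for (uint64_t &off : offsets) {
    while (r < ranges.size() && ranges[r].first < off) {
      removed += ranges[r].second;
      ++r;
    }
    off -= removed;
  }
}
```

Deleting {2, 2} and {6, 1} from the bytes 0..9 leaves 0 1 4 5 7 8 9, and offsets 1, 4 and 9 become 1, 2 and 6. The quadratic alternative would shift the whole tail of the buffer once per deleted range.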

MaskRay mentioned this in D90686: [lld][ELF] Add additional time trace categories. · Nov 3 2020, 10:32 AM

tangxingxin1008 added a subscriber: tangxingxin1008. · Mar 10 2021, 2:06 AM

Herald added subscribers: vkmr, frasercrmck. · Mar 10 2021, 2:06 AM

lenary removed a subscriber: lenary. · Mar 10 2021, 10:42 AM

PkmX mentioned this in D100835: [WIP][LLD][RISCV] Linker Relaxation. · Apr 20 2021, 2:54 AM

alistair23 added a subscriber: alistair23. · Apr 4 2022, 5:20 PM

Herald added a project: Restricted Project. · Apr 4 2022, 5:20 PM

Herald added subscribers: sunshaoce, • pcwang-thead, eopXD and 3 others.

rkruppe removed a subscriber: rkruppe. · Apr 6 2022, 10:24 AM

Superseded by D127581

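Revision Contents

Path                        Size
lld/ELF/Arch/RISCV.cpp      131 lines
lld/ELF/InputSection.h      16 lines
lld/ELF/InputSection.cpp    101 lines
lld/ELF/Relocations.h       1 line
lld/ELF/Relocations.cpp     4 lines
lld/ELF/Target.h            4 lines
lld/ELF/Target.cpp          4 lines
lld/ELF/Writer.cpp          31 lines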

Diff 255860

lld/ELF/Arch/RISCV.cpp

Show All 27 Lines	public:
void writeGotHeader(uint8_t *buf) const override;		void writeGotHeader(uint8_t *buf) const override;
void writeGotPlt(uint8_t *buf, const Symbol &s) const override;		void writeGotPlt(uint8_t *buf, const Symbol &s) const override;
void writePltHeader(uint8_t *buf) const override;		void writePltHeader(uint8_t *buf) const override;
void writePlt(uint8_t *buf, const Symbol &sym,		void writePlt(uint8_t *buf, const Symbol &sym,
uint64_t pltEntryAddr) const override;		uint64_t pltEntryAddr) const override;
RelType getDynRel(RelType type) const override;		RelType getDynRel(RelType type) const override;
RelExpr getRelExpr(RelType type, const Symbol &s,		RelExpr getRelExpr(RelType type, const Symbol &s,
const uint8_t *loc) const override;		const uint8_t *loc) const override;
void relocate(uint8_t *loc, const Relocation &rel,		bool relaxSection(InputSection *isec, int pass) const override;
uint64_t val) const override;
[Inline comment] s.egerton: This needs to be re-added.
[Inline comment] jrtc27 (author): Oops, yes, obviously wrong. As I said, this was just a quick-and-dirty rebase on top of several months of upstream changes and I haven't even tried building it, but was uploaded for others to take if they so desire (especially the `deleteRanges` as doing that efficiently is nasty code to write and debug; bfd just takes the simple-but-inefficient quadratic approach...).
};		};

} // end anonymous namespace		} // end anonymous namespace

const uint64_t dtpOffset = 0x800;		const uint64_t dtpOffset = 0x800;

enum Op {		enum Op {
ADDI = 0x13,		ADDI = 0x13,
AUIPC = 0x17,		AUIPC = 0x17,
JALR = 0x67,		JALR = 0x67,
LD = 0x3003,		LD = 0x3003,
LW = 0x2003,		LW = 0x2003,
SRLI = 0x5013,		SRLI = 0x5013,
SUB = 0x40000033,		SUB = 0x40000033,
};		};

enum Reg {		enum Reg {
		X_ZERO = 0,
X_RA = 1,		X_RA = 1,
X_T0 = 5,		X_T0 = 5,
X_T1 = 6,		X_T1 = 6,
X_T2 = 7,		X_T2 = 7,
X_T3 = 28,		X_T3 = 28,
};		};

		enum RelaxPass {
		PASS_OPT,
		PASS_ALIGN,
		};

static uint32_t hi20(uint32_t val) { return (val + 0x800) >> 12; }		static uint32_t hi20(uint32_t val) { return (val + 0x800) >> 12; }
static uint32_t lo12(uint32_t val) { return val & 4095; }		static uint32_t lo12(uint32_t val) { return val & 4095; }

static uint32_t itype(uint32_t op, uint32_t rd, uint32_t rs1, uint32_t imm) {		static uint32_t itype(uint32_t op, uint32_t rd, uint32_t rs1, uint32_t imm) {
return op | (rd << 7) | (rs1 << 15) | (imm << 20);		return op | (rd << 7) | (rs1 << 15) | (imm << 20);
}		}
static uint32_t rtype(uint32_t op, uint32_t rd, uint32_t rs1, uint32_t rs2) {		static uint32_t rtype(uint32_t op, uint32_t rd, uint32_t rs1, uint32_t rs2) {
return op | (rd << 7) | (rs1 << 15) | (rs2 << 20);		return op | (rd << 7) | (rs1 << 15) | (rs2 << 20);
}		}
static uint32_t utype(uint32_t op, uint32_t rd, uint32_t imm) {		static uint32_t utype(uint32_t op, uint32_t rd, uint32_t imm) {
return op | (rd << 7) | (imm << 12);		return op | (rd << 7) | (imm << 12);
}		}

		const uint32_t nopInstr = itype(ADDI, X_ZERO, X_ZERO, 0);
		const uint16_t rvcNopInstr = 0x1;

RISCV::RISCV() {		RISCV::RISCV() {
copyRel = R_RISCV_COPY;		copyRel = R_RISCV_COPY;
noneRel = R_RISCV_NONE;		noneRel = R_RISCV_NONE;
pltRel = R_RISCV_JUMP_SLOT;		pltRel = R_RISCV_JUMP_SLOT;
relativeRel = R_RISCV_RELATIVE;		relativeRel = R_RISCV_RELATIVE;
iRelativeRel = R_RISCV_IRELATIVE;		iRelativeRel = R_RISCV_IRELATIVE;
if (config->is64) {		if (config->is64) {
symbolicRel = R_RISCV_64;		symbolicRel = R_RISCV_64;
Show All 13 Lines	RISCV::RISCV() {
gotHeaderEntriesNum = 1;		gotHeaderEntriesNum = 1;

// .got.plt[0] = _dl_runtime_resolve, .got.plt[1] = link_map		// .got.plt[0] = _dl_runtime_resolve, .got.plt[1] = link_map
gotPltHeaderEntriesNum = 2;		gotPltHeaderEntriesNum = 2;

pltHeaderSize = 32;		pltHeaderSize = 32;
pltEntrySize = 16;		pltEntrySize = 16;
ipltEntrySize = 16;		ipltEntrySize = 16;

		relaxPasses = {PASS_OPT, PASS_ALIGN};
}		}

static uint32_t getEFlags(InputFile *f) {		static uint32_t getEFlags(InputFile *f) {
if (config->is64)		if (config->is64)
return cast<ObjFile<ELF64LE>>(f)->getObj().getHeader()->e_flags;		return cast<ObjFile<ELF64LE>>(f)->getObj().getHeader()->e_flags;
return cast<ObjFile<ELF32LE>>(f)->getObj().getHeader()->e_flags;		return cast<ObjFile<ELF32LE>>(f)->getObj().getHeader()->e_flags;
}		}

▲ Show 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	case R_RISCV_TLS_GD_HI20:
return R_TLSGD_PC;		return R_TLSGD_PC;
case R_RISCV_TLS_GOT_HI20:		case R_RISCV_TLS_GOT_HI20:
config->hasStaticTlsModel = true;		config->hasStaticTlsModel = true;
return R_GOT_PC;		return R_GOT_PC;
case R_RISCV_TPREL_HI20:		case R_RISCV_TPREL_HI20:
case R_RISCV_TPREL_LO12_I:		case R_RISCV_TPREL_LO12_I:
case R_RISCV_TPREL_LO12_S:		case R_RISCV_TPREL_LO12_S:
return R_TLS;		return R_TLS;
		case R_RISCV_ALIGN:
		return R_RISCV_RELAX_HINT;
		// TODO: implement linker relaxation optimisation pass for these
case R_RISCV_RELAX:		case R_RISCV_RELAX:
case R_RISCV_TPREL_ADD:		case R_RISCV_TPREL_ADD:
return R_NONE;		return R_NONE;
case R_RISCV_ALIGN:
// Not just a hint; always padded to the worst-case number of NOPs, so may
// not currently be aligned, and without linker relaxation support we can't
// delete NOPs to realign.
errorOrWarn(getErrorLocation(loc) + "relocation R_RISCV_ALIGN requires "
"unimplemented linker relaxation; recompile with -mno-relax");
return R_NONE;
default:		default:
error(getErrorLocation(loc) + "unknown relocation (" + Twine(type) +		error(getErrorLocation(loc) + "unknown relocation (" + Twine(type) +
") against symbol " + toString(s));		") against symbol " + toString(s));
return R_NONE;		return R_NONE;
}		}
}		}

// Extract bits V[Begin:End], where range is inclusive, and Begin must be < 63.		// Extract bits V[Begin:End], where range is inclusive, and Begin must be < 63.
▲ Show 20 Lines • Show All 175 Lines • ▼ Show 20 Lines	void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {

case R_RISCV_TLS_DTPREL32:		case R_RISCV_TLS_DTPREL32:
write32le(loc, val - dtpOffset);		write32le(loc, val - dtpOffset);
break;		break;
case R_RISCV_TLS_DTPREL64:		case R_RISCV_TLS_DTPREL64:
write64le(loc, val - dtpOffset);		write64le(loc, val - dtpOffset);
break;		break;

		case R_RISCV_ALIGN:
		assert(config->relocatable);
		return;
case R_RISCV_RELAX:		case R_RISCV_RELAX:
return; // Ignored (for now)		// Either this is a relocatable link, or we have disabled relaxations.
		return;

default:		default:
llvm_unreachable("unknown relocation");		llvm_unreachable("unknown relocation");
}		}
}		}

		static void relaxAlign(InputSection *isec, Relocation &rel,
		std::vector<std::pair<uint64_t, uint64_t>> &remove,
		uint64_t &removeTotal) {
		// Addend is the number of bytes of nops currently present. The alignment
		// required is therefore the next power of two bigger than this.
		uint64_t align = NextPowerOf2(rel.addend);
		uint64_t offset = rel.offset - removeTotal;
		uint64_t aligned = alignTo(offset, align);
		uint64_t requiredBytes = aligned - offset;
		uint8_t *loc = isec->mutableData().data() + rel.offset;
		if (requiredBytes > (uint64_t)rel.addend) {
		errorOrWarn(getErrorLocation(loc) + "need " + Twine(requiredBytes) +
		" bytes to align to " + Twine(align) +
		"-byte boundary, but only " + Twine(rel.addend) + " present");
		return;
		}

		// Delete relocation
		rel.expr = R_NONE;

		if (requiredBytes == (uint64_t)rel.addend)
		return;

		// Fill with nops
		for (uint8_t *p = loc, *e = loc + requiredBytes; p + 3 < e; p += 4)
		write32le(p, nopInstr);

		// Write a single compressed nop if required
		if (requiredBytes % 4 != 0) {
		if (requiredBytes % 4 != 2) {
		errorOrWarn(getErrorLocation(loc) + "need " + Twine(requiredBytes) +
		" bytes to align to " + Twine(align) +
		"-byte boundary, which is not an integral number of " +
		"instructions");
		return;
		} else if (!(config->eflags & EF_RISCV_RVC)) {
		errorOrWarn(getErrorLocation(loc) + "need " + Twine(requiredBytes) +
		" bytes to align to " + Twine(align) +
		"-byte boundary, which requires a compressed nop, but " +
		"extension not present");
		return;
		}
		write16le(loc + requiredBytes - 2, rvcNopInstr);
		}

		remove.emplace_back(rel.offset + requiredBytes, rel.addend - requiredBytes);
		removeTotal += rel.addend - requiredBytes;
		}

		bool RISCV::relaxSection(InputSection *isec, int pass) const {
		if (config->relocatable)
		return false;

		std::vector<std::pair<uint64_t, uint64_t>> remove;
		uint64_t removeTotal = 0;
		for (auto i = isec->relocations.begin(), e = isec->relocations.end(); i != e;
		++i) {
		Relocation &rel = *i;
		switch (pass) {
		default:
		llvm_unreachable("unknown relaxation pass");

		case PASS_OPT: {
		// Check if this is paired with an R_RISCV_RELAX
		if (i + 1 == e || (i + 1)->type != R_RISCV_RELAX ||
		rel.offset != (i + 1)->offset)
		continue;

		// Skip the R_RISCV_RELAX next time
		++i;

		switch (rel.type) {
		case R_RISCV_CALL:
		case R_RISCV_CALL_PLT:
		break;
		}
		break;
		}

		case PASS_ALIGN:
		if (rel.type == R_RISCV_ALIGN)
		relaxAlign(isec, rel, remove, removeTotal);
		break;
		}
		}

		auto brel = isec->relocations.begin();
		auto dest = brel;
		for (auto src = brel, erel = isec->relocations.end(); src != erel; ++src) {
		if (src->expr != R_NONE)
		*dest++ = *src;
		}
		isec->relocations.resize(dest - brel);
		[Inline comment] MaskRay: llvm::erase_if

		if (remove.size() == 0)
		return false;

		isec->deleteRanges(remove);
		return true;
		}

TargetInfo *getRISCVTargetInfo() {		TargetInfo *getRISCVTargetInfo() {
static RISCV target;		static RISCV target;
return &target;		return &target;
}		}

} // namespace elf		} // namespace elf
} // namespace lld		} // namespace lld

lld/ELF/InputSection.h

Show First 20 Lines • Show All 342 Lines • ▼ Show 20 Lines	public:
void relocateNonAlloc(uint8_t *buf, llvm::ArrayRef<RelTy> rels);		void relocateNonAlloc(uint8_t *buf, llvm::ArrayRef<RelTy> rels);

// Used by ICF.		// Used by ICF.
uint32_t eqClass[2] = {0, 0};		uint32_t eqClass[2] = {0, 0};

// Called by ICF to merge two input sections.		// Called by ICF to merge two input sections.
void replace(InputSection *other);		void replace(InputSection *other);

		MutableArrayRef<uint8_t> mutableData() {
		if (!copiedData)
		makeMutableDataCopy();
		return llvm::makeMutableArrayRef(const_cast<uint8_t *>(rawData.data()),
		rawData.size());
		}

		void deleteRanges(std::vector<std::pair<uint64_t, uint64_t>> &ranges);

static InputSection discarded;		static InputSection discarded;

private:		private:
template <class ELFT, class RelTy>		template <class ELFT, class RelTy>
void copyRelocations(uint8_t *buf, llvm::ArrayRef<RelTy> rels);		void copyRelocations(uint8_t *buf, llvm::ArrayRef<RelTy> rels);

template <class ELFT> void copyShtGroup(uint8_t *buf);		template <class ELFT> void copyShtGroup(uint8_t *buf);

		void makeMutableDataCopy();

		// This field stores whether we have made a mutable copy of the data, either
		// because we have uncompressed it or because during relaxation we have had
		// to rewrite the contents.
		mutable bool copiedData = false;
};		};

inline bool isDebugSection(const InputSectionBase &sec) {		inline bool isDebugSection(const InputSectionBase &sec) {
return sec.name.startswith(".debug") || sec.name.startswith(".zdebug");		return sec.name.startswith(".debug") || sec.name.startswith(".zdebug");
}		}

// The list of all input sections.		// The list of all input sections.
extern std::vector<InputSectionBase *> inputSections;		extern std::vector<InputSectionBase *> inputSections;

} // namespace elf		} // namespace elf

std::string toString(const elf::InputSectionBase *);		std::string toString(const elf::InputSectionBase *);
} // namespace lld		} // namespace lld

#endif		#endif

lld/ELF/InputSection.cpp

Show All 14 Lines
#include "Relocations.h"		#include "Relocations.h"
#include "SymbolTable.h"		#include "SymbolTable.h"
#include "Symbols.h"		#include "Symbols.h"
#include "SyntheticSections.h"		#include "SyntheticSections.h"
#include "Target.h"		#include "Target.h"
#include "Thunks.h"		#include "Thunks.h"
#include "lld/Common/ErrorHandler.h"		#include "lld/Common/ErrorHandler.h"
#include "lld/Common/Memory.h"		#include "lld/Common/Memory.h"
		#include "llvm/ADT/PriorityQueue.h"
#include "llvm/Support/Compiler.h"		#include "llvm/Support/Compiler.h"
#include "llvm/Support/Compression.h"		#include "llvm/Support/Compression.h"
#include "llvm/Support/Endian.h"		#include "llvm/Support/Endian.h"
#include "llvm/Support/Threading.h"		#include "llvm/Support/Threading.h"
#include "llvm/Support/xxhash.h"		#include "llvm/Support/xxhash.h"
#include <algorithm>		#include <algorithm>
#include <mutex>		#include <mutex>
#include <set>		#include <set>
▲ Show 20 Lines • Show All 798 Lines • ▼ Show 20 Lines	static uint64_t getRelocTargetVA(const InputFile *file, RelType type, int64_t a,
case R_TLSGD_PC:		case R_TLSGD_PC:
return in.got->getGlobalDynAddr(sym) + a - p;		return in.got->getGlobalDynAddr(sym) + a - p;
case R_TLSLD_GOTPLT:		case R_TLSLD_GOTPLT:
return in.got->getVA() + in.got->getTlsIndexOff() + a - in.gotPlt->getVA();		return in.got->getVA() + in.got->getTlsIndexOff() + a - in.gotPlt->getVA();
case R_TLSLD_GOT:		case R_TLSLD_GOT:
return in.got->getTlsIndexOff() + a;		return in.got->getTlsIndexOff() + a;
case R_TLSLD_PC:		case R_TLSLD_PC:
return in.got->getTlsIndexVA() + a - p;		return in.got->getTlsIndexVA() + a - p;
		case R_RISCV_RELAX_HINT:
		return 0;
default:		default:
llvm_unreachable("invalid expression");		llvm_unreachable("invalid expression");
}		}
}		}

// This function applies relocations to sections without SHF_ALLOC bit.		// This function applies relocations to sections without SHF_ALLOC bit.
// Such sections are never mapped to memory at runtime. Debug sections are		// Such sections are never mapped to memory at runtime. Debug sections are
// an example. Relocations in non-alloc sections are much easier to		// an example. Relocations in non-alloc sections are much easier to
▲ Show 20 Lines • Show All 331 Lines • ▼ Show 20 Lines	if (partition != other->partition) {
for (InputSection *isec : dependentSections)		for (InputSection *isec : dependentSections)
isec->partition = 1;		isec->partition = 1;
}		}

other->repl = repl;		other->repl = repl;
other->markDead();		other->markDead();
}		}

		void InputSection::makeMutableDataCopy() {
		static std::mutex mu;
		std::lock_guard<std::mutex> lock(mu);

		ArrayRef<uint8_t> oldRef = data();
		// In case the above just uncompressed
		if (copiedData)
		return;

		size_t size = oldRef.size();
		uint8_t *newData = bAlloc.Allocate<uint8_t>(size);
		memcpy(newData, oldRef.data(), size);
		rawData = makeArrayRef(newData, size);
		}

		void InputSection::deleteRanges(
		std::vector<std::pair<uint64_t, uint64_t>> &ranges) {
		// Delete bytes from data.

		uint64_t removed = 0;
		MutableArrayRef<uint8_t> mutRef = mutableData();
		uint8_t *buf = mutRef.data();
		size_t size = mutRef.size();
		for (auto i = ranges.begin(), e = ranges.end(); i != e; ++i) {
		uint8_t *moveTo = buf + (i->first - removed);
		uint8_t *moveFrom = buf + i->first + i->second;
		uint64_t nextOffset = i + 1 == e ? size : (i + 1)->first;
		memmove(moveTo, moveFrom, nextOffset - i->first - i->second);
		removed += i->second;
		}
		rawData = makeArrayRef(buf, size - removed);

		// Update relocations; assumes already sorted.

		auto irange = ranges.begin();
		auto erange = ranges.end();
		removed = 0;
		for (Relocation &rel : relocations) {
		while (irange != erange && irange->first < rel.offset) {
		removed += irange->second;
		++irange;
		}
		rel.offset -= removed;
		}

		// Update symbols.

		std::vector<Defined *> symbols;
		for (Symbol *s : file->getSymbols())
		if (auto *dr = dyn_cast<Defined>(s))
		if (!dr->isSection() && dr->section == this)
		symbols.push_back(dr);

		llvm::sort(symbols,
		[](Defined *a, Defined *b) { return a->value < b->value; });

		using DefinedEndPair = std::pair<Defined *, uint64_t>;
		auto compareEnds = [](DefinedEndPair &a, DefinedEndPair &b) {
		uint64_t aend = a.first->value + a.second + a.first->size;
		uint64_t bend = b.first->value + b.second + b.first->size;
		return aend > bend;
		};
		PriorityQueue<DefinedEndPair, std::vector<DefinedEndPair>,
		decltype(compareEnds)>
		symbolEnds(compareEnds);

		auto isym = symbols.begin();
		auto esym = symbols.end();
		irange = ranges.begin();
		erange = ranges.end();
		removed = 0;
		while (isym != esym || irange != erange || !symbolEnds.empty()) {
		while (irange != erange || !symbolEnds.empty()) {
		// Adjust the size of any earlier symbols whose ends do not overlap with
		// the current range.
		if (isym != esym && irange != erange && irange->first >= (*isym)->value)
		break;
		while (!symbolEnds.empty()) {
		auto top = symbolEnds.top();
		uint64_t end = top.first->value + top.second + top.first->size;
		if (irange != erange && end > irange->first)
		break;
		top.first->size -= removed - top.second;
		symbolEnds.pop();
		}
		if (irange != erange) {
		removed += irange->second;
		++irange;
		}
		}
		if (isym != esym) {
		(*isym)->value -= removed;
		symbolEnds.emplace(*isym, removed);
		++isym;
		}
		}
		}

template <class ELFT>		template <class ELFT>
EhInputSection::EhInputSection(ObjFile<ELFT> &f,		EhInputSection::EhInputSection(ObjFile<ELFT> &f,
const typename ELFT::Shdr &header,		const typename ELFT::Shdr &header,
StringRef name)		StringRef name)
: InputSectionBase(f, header, name, InputSectionBase::EHFrame) {}		: InputSectionBase(f, header, name, InputSectionBase::EHFrame) {}

SyntheticSection *EhInputSection::getParent() const {		SyntheticSection *EhInputSection::getParent() const {
return cast_or_null<SyntheticSection>(parent);		return cast_or_null<SyntheticSection>(parent);
▲ Show 20 Lines • Show All 185 Lines • Show Last 20 Lines

lld/ELF/Relocations.h

Show First 20 Lines • Show All 91 Lines • ▼ Show 20 Lines	enum RelExpr {
R_MIPS_TLSLD,		R_MIPS_TLSLD,
R_PPC32_PLTREL,		R_PPC32_PLTREL,
R_PPC64_CALL,		R_PPC64_CALL,
R_PPC64_CALL_PLT,		R_PPC64_CALL_PLT,
R_PPC64_RELAX_TOC,		R_PPC64_RELAX_TOC,
R_PPC64_TOCBASE,		R_PPC64_TOCBASE,
R_RISCV_ADD,		R_RISCV_ADD,
R_RISCV_PC_INDIRECT,		R_RISCV_PC_INDIRECT,
		R_RISCV_RELAX_HINT,
};		};

// Architecture-neutral representation of relocation.		// Architecture-neutral representation of relocation.
struct Relocation {		struct Relocation {
RelExpr expr;		RelExpr expr;
RelType type;		RelType type;
uint64_t offset;		uint64_t offset;
int64_t addend;		int64_t addend;
▲ Show 20 Lines • Show All 83 Lines • Show Last 20 Lines

lld/ELF/Relocations.cpp

Show First 20 Lines • Show All 391 Lines • ▼ Show 20 Lines	static bool isStaticLinkTimeConstant(RelExpr e, RelType type, const Symbol &sym,
InputSectionBase &s, uint64_t relOff) {		InputSectionBase &s, uint64_t relOff) {
// These expressions always compute a constant		// These expressions always compute a constant
if (oneof<R_DTPREL, R_GOTPLT, R_GOT_OFF, R_TLSLD_GOT_OFF,		if (oneof<R_DTPREL, R_GOTPLT, R_GOT_OFF, R_TLSLD_GOT_OFF,
R_MIPS_GOT_LOCAL_PAGE, R_MIPS_GOTREL, R_MIPS_GOT_OFF,		R_MIPS_GOT_LOCAL_PAGE, R_MIPS_GOTREL, R_MIPS_GOT_OFF,
R_MIPS_GOT_OFF32, R_MIPS_GOT_GP_PC, R_MIPS_TLSGD,		R_MIPS_GOT_OFF32, R_MIPS_GOT_GP_PC, R_MIPS_TLSGD,
R_AARCH64_GOT_PAGE_PC, R_GOT_PC, R_GOTONLY_PC, R_GOTPLTONLY_PC,		R_AARCH64_GOT_PAGE_PC, R_GOT_PC, R_GOTONLY_PC, R_GOTPLTONLY_PC,
R_PLT_PC, R_TLSGD_GOT, R_TLSGD_GOTPLT, R_TLSGD_PC, R_PPC32_PLTREL,		R_PLT_PC, R_TLSGD_GOT, R_TLSGD_GOTPLT, R_TLSGD_PC, R_PPC32_PLTREL,
R_PPC64_CALL_PLT, R_PPC64_RELAX_TOC, R_RISCV_ADD, R_TLSDESC_CALL,		R_PPC64_CALL_PLT, R_PPC64_RELAX_TOC, R_RISCV_ADD, R_TLSDESC_CALL,
R_TLSDESC_PC, R_AARCH64_TLSDESC_PAGE, R_TLSLD_HINT, R_TLSIE_HINT>(		R_TLSDESC_PC, R_AARCH64_TLSDESC_PAGE, R_TLSLD_HINT, R_TLSIE_HINT,
e))		R_RISCV_RELAX_HINT>(e))
return true;		return true;

// These never do, except if the entire file is position dependent or if		// These never do, except if the entire file is position dependent or if
// only the low bits are used.		// only the low bits are used.
if (e == R_GOT || e == R_PLT || e == R_TLSDESC)		if (e == R_GOT || e == R_PLT || e == R_TLSDESC)
return target->usesOnlyLowPageBits(type) || !config->isPic;		return target->usesOnlyLowPageBits(type) || !config->isPic;

if (sym.isPreemptible)		if (sym.isPreemptible)
▲ Show 20 Lines • Show All 1,623 Lines • Show Last 20 Lines

lld/ELF/Target.h

Show First 20 Lines • Show All 138 Lines • ▼ Show 20 Lines	virtual void relaxTlsGdToIe(uint8_t *loc, const Relocation &rel,
uint64_t val) const;		uint64_t val) const;
virtual void relaxTlsGdToLe(uint8_t *loc, const Relocation &rel,		virtual void relaxTlsGdToLe(uint8_t *loc, const Relocation &rel,
uint64_t val) const;		uint64_t val) const;
virtual void relaxTlsIeToLe(uint8_t *loc, const Relocation &rel,		virtual void relaxTlsIeToLe(uint8_t *loc, const Relocation &rel,
uint64_t val) const;		uint64_t val) const;
virtual void relaxTlsLdToLe(uint8_t *loc, const Relocation &rel,		virtual void relaxTlsLdToLe(uint8_t *loc, const Relocation &rel,
uint64_t val) const;		uint64_t val) const;

		llvm::SmallVector<int, 4> relaxPasses;

		virtual bool relaxSection(InputSection *isec, int pass) const;

protected:		protected:
// On FreeBSD x86_64 the first page cannot be mmaped.		// On FreeBSD x86_64 the first page cannot be mmaped.
// On Linux this is controlled by vm.mmap_min_addr. At least on some x86_64		// On Linux this is controlled by vm.mmap_min_addr. At least on some x86_64
// installs this is set to 65536, so the first 15 pages cannot be used.		// installs this is set to 65536, so the first 15 pages cannot be used.
// Given that, the smallest value that can be used in here is 0x10000.		// Given that, the smallest value that can be used in here is 0x10000.
uint64_t defaultImageBase = 0x10000;		uint64_t defaultImageBase = 0x10000;
};		};

[... 113 lines not shown ...]

lld/ELF/Target.cpp

[... 174 lines not shown ...]	void TargetInfo::relaxTlsIeToLe(uint8_t *loc, const Relocation &rel,
llvm_unreachable("Should not have claimed to be relaxable");		llvm_unreachable("Should not have claimed to be relaxable");
}		}

void TargetInfo::relaxTlsLdToLe(uint8_t *loc, const Relocation &rel,		void TargetInfo::relaxTlsLdToLe(uint8_t *loc, const Relocation &rel,
uint64_t val) const {		uint64_t val) const {
llvm_unreachable("Should not have claimed to be relaxable");		llvm_unreachable("Should not have claimed to be relaxable");
}		}

		bool TargetInfo::relaxSection(InputSection *isec, int pass) const {
		llvm_unreachable("Should not have claimed to have relaxation passes");
		}

uint64_t TargetInfo::getImageBase() const {		uint64_t TargetInfo::getImageBase() const {
// Use -image-base if set. Fall back to the target default if not.		// Use -image-base if set. Fall back to the target default if not.
if (config->imageBase)		if (config->imageBase)
return *config->imageBase;		return *config->imageBase;
return config->isPic ? 0 : defaultImageBase;		return config->isPic ? 0 : defaultImageBase;
}		}

} // namespace elf		} // namespace elf
} // namespace lld		} // namespace lld
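
The Target.h and Target.cpp hunks add an opt-in hook: relaxPasses is empty by default, and the base relaxSection() is unreachable, so only targets that list passes are ever asked to relax. A self-contained sketch of that pattern follows. It does not use the LLVM headers; InputSection here is a one-field stand-in, ToyTarget and relaxAll are hypothetical names, and std::abort() stands in for llvm_unreachable.

```cpp
#include <cstdlib>
#include <vector>

// Stand-in for lld's InputSection; only a size is modelled here.
struct InputSection {
  unsigned size;
};

class TargetInfo {
public:
  virtual ~TargetInfo() = default;

  // Empty by default, so a driver that iterates over it never calls
  // relaxSection() on a target that has not opted in.
  std::vector<int> relaxPasses;

  // Reaching the default implementation is a bug (std::abort() stands in
  // for llvm_unreachable in this sketch).
  virtual bool relaxSection(InputSection *, int) const { std::abort(); }
};

// Hypothetical target that opts in with a single pass, numbered 0.
class ToyTarget : public TargetInfo {
public:
  ToyTarget() { relaxPasses.push_back(0); }

  // Pretend relaxation: remove 2 bytes from any section larger than 4 bytes.
  // Returns true if the section changed.
  bool relaxSection(InputSection *isec, int pass) const override {
    if (pass != 0 || isec->size <= 4)
      return false;
    isec->size -= 2;
    return true;
  }
};

// Hypothetical driver: repeats each pass until no section changes and
// returns how many relaxSection() calls reported a change.
inline unsigned relaxAll(const TargetInfo &target,
                         std::vector<InputSection> &secs) {
  unsigned changes = 0;
  for (int pass : target.relaxPasses) {
    bool changed = true;
    while (changed) {
      changed = false;
      for (InputSection &isec : secs)
        if (target.relaxSection(&isec, pass)) {
          changed = true;
          ++changes;
        }
    }
  }
  return changes;
}
```

Calling relaxAll() with a plain TargetInfo does nothing and never reaches the aborting default, which is the property the llvm_unreachable message in the hunk relies on.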

lld/ELF/Writer.cpp

[... 1,662 lines not shown ...]	template <class ELFT> void Writer<ELFT>::finalizeAddressDependentContent() {
// If addrExpr is set, the address may not be a multiple of the alignment.		// If addrExpr is set, the address may not be a multiple of the alignment.
// Warn because this is error-prone.		// Warn because this is error-prone.
for (BaseCommand *cmd : script->sectionCommands)		for (BaseCommand *cmd : script->sectionCommands)
if (auto *os = dyn_cast<OutputSection>(cmd))		if (auto *os = dyn_cast<OutputSection>(cmd))
if (os->addr % os->alignment != 0)		if (os->addr % os->alignment != 0)
warn("address (0x" + Twine::utohexstr(os->addr) + ") of section " +		warn("address (0x" + Twine::utohexstr(os->addr) + ") of section " +
os->name + " is not a multiple of alignment (" +		os->name + " is not a multiple of alignment (" +
Twine(os->alignment) + ")");		Twine(os->alignment) + ")");

		// We cannot relax until after thunk creation has finished, since that causes
		// code to increase in size and potentially invalidate some relaxations.
		for (int pass : target->relaxPasses) {
		MaskRay (inline comment, unsubmitted, not done): This loop should be merged with the previous for loop.
		assignPasses = 0;
		for (;;) {
		bool changed = false;
		for (OutputSection *osec : outputSections)
		for (InputSection *isec : getInputSections(osec))
		MaskRay (inline comment, unsubmitted, not done): Check `SHF_EXECINSTR`
		changed |= target->relaxSection(isec, pass);

		const Defined *changedSym = script->assignAddresses();
		if (!changed) {
		// Some symbols may be dependent on section addresses. When we break the
		// loop, the symbol values are finalized because a previous
		// assignAddresses() finalized section addresses.
		if (!changedSym)
		break;
		if (++assignPasses == 5) {
		errorOrWarn("assignment to symbol " + toString(*changedSym) +
		" does not converge after relaxation pass " +
		toString(pass));
		break;
		}
		}
		}
		}
}		}

static void finalizeSynthetic(SyntheticSection *sec) {		static void finalizeSynthetic(SyntheticSection *sec) {
if (sec && sec->isNeeded() && sec->getParent())		if (sec && sec->isNeeded() && sec->getParent())
sec->finalizeContents();		sec->finalizeContents();
}		}

// In order to allow users to manipulate linker-synthesized sections,		// In order to allow users to manipulate linker-synthesized sections,
[... 299 lines not shown ...]	template <class ELFT> void Writer<ELFT>::finalizeSections() {
// because this is the earliest point where we know sizes of sections and		// because this is the earliest point where we know sizes of sections and
// their layouts (that are needed to determine if jump targets are in		// their layouts (that are needed to determine if jump targets are in
// range).		// range).
//		//
// 2) Update the sections. We need to generate content that depends on the		// 2) Update the sections. We need to generate content that depends on the
// address of InputSections. For example, MIPS GOT section content or		// address of InputSections. For example, MIPS GOT section content or
// android packed relocations sections content.		// android packed relocations sections content.
//		//
// 3) Assign the final values for the linker script symbols. Linker scripts		// 3) Perform any linker relaxations that are address-dependent.
		//
		// 4) Assign the final values for the linker script symbols. Linker scripts
// sometimes using forward symbol declarations. We want to set the correct		// sometimes using forward symbol declarations. We want to set the correct
// values. They also might change after adding the thunks.		// values. They also might change after adding the thunks.
finalizeAddressDependentContent();		finalizeAddressDependentContent();

// finalizeAddressDependentContent may have added local symbols to the static symbol table.		// finalizeAddressDependentContent may have added local symbols to the static symbol table.
finalizeSynthetic(in.symTab);		finalizeSynthetic(in.symTab);
finalizeSynthetic(in.ppc64LongBranchTarget);		finalizeSynthetic(in.ppc64LongBranchTarget);

[... 817 lines not shown ...]
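
The loop added to finalizeAddressDependentContent() in the Writer.cpp hunk is a fixed-point iteration: relax every section, reassign addresses, and stop once neither step changes anything, giving up after five address-only iterations. The sketch below models just that control flow under simplifying assumptions. Section, assignAddresses and runPass are hypothetical stand-ins, and assignAddresses returns a plain "something moved" flag rather than the changed symbol that script->assignAddresses() returns in the patch.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for an input section: a size and an assigned address.
struct Section {
  uint64_t size;
  uint64_t addr;
};

// Stand-in for script->assignAddresses(): lays sections out back to back
// from address 0 and reports whether any address moved.
inline bool assignAddresses(std::vector<Section> &secs) {
  bool moved = false;
  uint64_t addr = 0;
  for (Section &s : secs) {
    if (s.addr != addr) {
      s.addr = addr;
      moved = true;
    }
    addr += s.size;
  }
  return moved;
}

// Runs one relaxation pass to a fixed point. relax() must return true if it
// changed the section, and must eventually stop returning true. Returns the
// number of iterations taken, or -1 if addresses kept moving for
// maxAssignPasses iterations in which no section was relaxed.
inline int runPass(std::vector<Section> &secs,
                   const std::function<bool(Section &)> &relax,
                   int maxAssignPasses = 5) {
  int assignPasses = 0;
  for (int iter = 1;; ++iter) {
    bool changed = false;
    for (Section &s : secs)
      changed |= relax(s);

    bool moved = assignAddresses(secs);
    if (!changed) {
      if (!moved)
        return iter; // Nothing relaxed and nothing moved: converged.
      if (++assignPasses == maxAssignPasses)
        return -1; // Mirrors the "does not converge" diagnostic.
    }
  }
}
```

As in the patch, addresses are reassigned after every sweep, including the last one, so the final layout always reflects the final section sizes.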
