This is an archive of the discontinued LLVM Phabricator instance.


Differential D39489

[ELF] - Linkerscript: fix issue with multiple output sections definitions.
Abandoned · Public

Authored by grimar on Nov 1 2017, 5:20 AM.

Details

Reviewers

ruiu
rafael

Summary

Currently, for the following script, LLD silently produces broken output:

bar : ONLY_IF_RO { *(no_such_section) }
bar : ONLY_IF_RW { *(foo_aw) }
sym1 = SIZEOF(bar); 
sym2 = ADDR(bar);

The values of sym1 and sym2 are both zero.

That happens because the NameToOutputSection member is currently a
mapping from a name to a single OutputSection. In the case above, however, there are two different
output section commands with the same name, so I believe it should be a mapping to a vector.

This patch changes the implementation of the mapping, which fixes the observed issue.
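
The failure mode described in the summary can be illustrated with a minimal, self-contained C++ sketch. This is not lld code: `Section`, `OneToOne`, and `OneToMany` are hypothetical stand-ins that use standard containers instead of `llvm::DenseMap`, and the `live` flag is assumed to model a section that survives its ONLY_IF_RO / ONLY_IF_RW constraint.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for lld's OutputSection.
struct Section {
  std::string name;
  bool live;      // false when a constraint discarded the section
  uint64_t size;
};

// Model of the old scheme: one slot per name, so the first definition
// is the only one a later SIZEOF(name) can see.
struct OneToOne {
  std::unordered_map<std::string, Section *> byName;

  // emplace() does not overwrite, so later definitions are not remembered.
  void define(Section *s) { byName.emplace(s->name, s); }

  uint64_t sizeOf(const std::string &n) const { return byName.at(n)->size; }
};

// Model of the proposed scheme: every definition is remembered and
// lookups prefer the first live one.
struct OneToMany {
  std::unordered_map<std::string, std::vector<Section *>> byName;

  void define(Section *s) { byName[s->name].push_back(s); }

  uint64_t sizeOf(const std::string &n) const {
    const std::vector<Section *> &v = byName.at(n);
    assert(!v.empty() && "a name is only inserted together with a section");
    for (const Section *s : v)
      if (s->live)
        return s->size;
    return v.front()->size; // no live section: fall back to the first one
  }
};
```

With a discarded `bar` of size 0 defined first and a live `bar` of size 8 defined second, `OneToOne::sizeOf("bar")` yields 0, while `OneToMany::sizeOf("bar")` yields 8, which mirrors the zero values reported for sym1 and sym2.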

Event Timeline

grimar created this revision. Nov 1 2017, 5:20 AM

Herald added a subscriber: emaste. Nov 1 2017, 5:20 AM

Rebased. Ping.

Herald added a subscriber: arichardson. Nov 15 2017, 4:16 AM

If I understand correctly, there is no restriction on OutputSection names, so something like:

bar : { *.bar.1 }
bar : { *.bar.2 }

would be legal but not very sensible, and a user could just change the name of the second bar. Whereas the case you describe:

bar : ONLY_IF_RO { *(no_such_section) }
bar : ONLY_IF_RW { *(foo_aw) }
sym1 = SIZEOF(bar); 
sym2 = ADDR(bar);

is useful, as it lets a program get the [base, limit) of an OutputSection that can contain either RO or RW? If so, that would make the added complexity of the change worthwhile.

If I've understood their purpose correctly, I'm not too sure that the forward declarations are a good idea. Recent ld manuals prohibit forward references for many of the builtin functions.

ELF/ScriptParser.cpp
974	If I understand correctly, this might create a forward reference to some OutputSection name that we've not seen yet? I don't think that this is allowed in this case. From https://sourceware.org/binutils/docs/ld/Builtin-Functions.html: "Return the address (VMA) of the named section. Your script must previously have defined the location of that section."
999	If this is a forward reference, then the manual says we should give an error message: "If the section has not been allocated when this is evaluated, the linker will report an error."
1050	The docs don't say anything about forward references for this particular builtin. I think there is a possibility of creating a cycle in address allocation, though, with something like . = LOADADDR(forward)
1073	Another case where we are supposed to give an error if there is a forward reference: "If the section has not been allocated when this is evaluated, the linker will report an error."

grimar added inline comments. Nov 15 2017, 7:20 AM

ELF/ScriptParser.cpp
974	That is actually what the original code already did before my change: `getOrCreateOutputSection` creates a forward reference. We rely on that in our test cases; for example, absolute.s has just "PROVIDE(foo = 1 + ABSOLUTE(ADDR(.text)));" and expects that we can evaluate it. I believe we have the same behavior as the GNU linkers here. Below, you referred to the spec saying for ALIGNOF: "If the section has not been allocated when this is evaluated, the linker will report an error." I think that is what it should say for all the commands you mentioned. We use `checkIfExists` to verify that the section was allocated at the moment we evaluate `ADDR` or the other commands; while parsing the script, we create forward references at the moment.
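
The split between parse time and evaluation time discussed in this inline thread can be sketched as follows. This is a simplified model under stated assumptions, not the lld implementation: `Script`, `parseAddr`, and `define` are hypothetical names, a thrown exception stands in for the error that `checkIfExists` would report, and an empty `location` is assumed to mean "declared but never defined".

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an output section.
struct Section {
  std::string location; // empty: only forward-declared, never defined
  uint64_t addr;
};

struct Script {
  std::unordered_map<std::string, Section> sections;

  // Parse time: ADDR(name) only records a forward reference and returns a
  // deferred expression. Nothing is validated yet.
  std::function<uint64_t()> parseAddr(const std::string &name) {
    sections.emplace(name, Section{"", 0}); // no-op if the name is known
    // The Script object must outlive the returned expression.
    return [this, name]() -> uint64_t {
      // Evaluation time: the section must have been defined by now.
      const Section &sec = sections.at(name);
      if (sec.location.empty())
        throw std::runtime_error("undefined section " + name);
      return sec.addr;
    };
  }

  // An output section description turns the declaration into a definition.
  void define(const std::string &name, const std::string &location,
              uint64_t addr) {
    Section &sec = sections[name];
    sec.location = location;
    sec.addr = addr;
  }
};
```

Evaluating the expression returned by `parseAddr(".text")` before `define(".text", ...)` has run raises the error; evaluating it afterwards returns the address. A script may therefore mention a section before its description, as long as the expression is only evaluated once the section exists.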

Before getting into details, I'd like to ask whether such a linker script is valid in the GNU linkers. Also, I wonder how you noticed this.

In D39489#927076, @ruiu wrote:

Before getting into details, I'd like to ask whether such a linker script is valid in the GNU linkers.

Yes, both gold and bfd accept this script and produce a valid result.

Also, I wonder how you noticed this.

I just found, while reviewing the code, that our logic is probably weak here, and tried to break it
with the test case from this patch.

If this is hypothetical, I wouldn't work this hard to "fix" it.

In D39489#927124, @ruiu wrote:

If this is hypothetical, I wouldn't work this hard to "fix" it.

It does not only fix the issue. Our current 'createOutputSection' logic honestly looks very odd to me.
Depending on conditions, it can:

Create and return a new output section and remember it in NameToOutputSection.
Create and return a new output section and not remember it.
Just define and return a previously declared output section.

And it treats the name-to-output-section relationship as 1:1, though we know it is semantically 1:many.

I believe the new logic and naming introduced in this patch are much clearer.
That is what I tried to improve here as well.
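
The three behaviours listed above can be reproduced with a small standalone model of the pre-patch logic. This is a hedged sketch rather than lld code: `OldScript`, `Outcome`, and the `std::unique_ptr` storage are hypothetical, and only the branching follows the left-hand side of the LinkerScript.cpp diff.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an output section command.
struct Section {
  std::string name;
  std::string location; // empty until the section is actually defined
};

enum class Outcome {
  CreatedAndRemembered,
  CreatedNotRemembered,
  ReusedForwardReference
};

struct OldScript {
  std::vector<std::unique_ptr<Section>> storage; // owns every section
  std::unordered_map<std::string, Section *> nameToSection;

  Section *make(const std::string &name) {
    storage.push_back(std::unique_ptr<Section>(new Section{name, ""}));
    return storage.back().get();
  }

  // Forward reference, as created at parse time for ADDR(name) and friends.
  Section *getOrCreate(const std::string &name) {
    Section *&ref = nameToSection[name];
    if (!ref)
      ref = make(name);
    return ref;
  }

  // Same branching as the old createOutputSection, but reporting which of
  // the three paths was taken instead of returning the section.
  Outcome create(const std::string &name, const std::string &location) {
    Section *&ref = nameToSection[name];
    if (ref && ref->location.empty()) {
      ref->location = location; // there was a forward reference
      return Outcome::ReusedForwardReference;
    }
    Section *sec = make(name);
    sec->location = location;
    if (!ref) {
      ref = sec;
      return Outcome::CreatedAndRemembered;
    }
    return Outcome::CreatedNotRemembered; // the map still points at the first
  }
};
```

Calling `create("bar", ...)` twice yields `CreatedAndRemembered` and then `CreatedNotRemembered`, so a later lookup by name only ever sees the first `bar`.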

But with this patch the code is longer than before. If it doesn't have any practical benefits, it is probably not a good idea to add more code.

grimar abandoned this revision. Dec 1 2017, 4:14 AM

Revision Contents

Path	Size
ELF/LinkerScript.h	7 lines
ELF/LinkerScript.cpp	50 lines
ELF/OutputSections.h	4 lines
ELF/ScriptParser.cpp	26 lines
test/ELF/linkerscript/sections-constraint6.s	37 lines

Diff 123004

ELF/LinkerScript.h

[199 lines hidden]
struct AddressState {
AddressState();		AddressState();
uint64_t ThreadBssOffset = 0;		uint64_t ThreadBssOffset = 0;
OutputSection *OutSec = nullptr;		OutputSection *OutSec = nullptr;
MemoryRegion *MemRegion = nullptr;		MemoryRegion *MemRegion = nullptr;
llvm::DenseMap<const MemoryRegion *, uint64_t> MemRegionOffset;		llvm::DenseMap<const MemoryRegion *, uint64_t> MemRegionOffset;
std::function<uint64_t()> LMAOffset;		std::function<uint64_t()> LMAOffset;
};		};

llvm::DenseMap<StringRef, OutputSection *> NameToOutputSection;		llvm::DenseMap<StringRef, std::vector<OutputSection *>> NameToOutputSection;

void addSymbol(SymbolAssignment *Cmd);		void addSymbol(SymbolAssignment *Cmd);
void assignSymbol(SymbolAssignment *Cmd, bool InSec);		void assignSymbol(SymbolAssignment *Cmd, bool InSec);
void setDot(Expr E, const Twine &Loc, bool InSec);		void setDot(Expr E, const Twine &Loc, bool InSec);

std::vector<InputSection *>		std::vector<InputSection *>
computeInputSections(const InputSectionDescription *,		computeInputSections(const InputSectionDescription *,
const llvm::DenseMap<SectionBase *, int> &Order);		const llvm::DenseMap<SectionBase *, int> &Order);
[20 lines hidden]
class LinkerScript final {
// LinkerScript.		// LinkerScript.
AddressState *Ctx = nullptr;		AddressState *Ctx = nullptr;

OutputSection *Aether;		OutputSection *Aether;

uint64_t Dot;		uint64_t Dot;

public:		public:
OutputSection *createOutputSection(StringRef Name, StringRef Location);		void declareOutputSection(StringRef Name);
OutputSection *getOrCreateOutputSection(StringRef Name);		OutputSection *defineOutputSection(StringRef Name, StringRef Location);
		OutputSection *getOutputSection(StringRef Name);

bool hasPhdrsCommands() { return !PhdrsCommands.empty(); }		bool hasPhdrsCommands() { return !PhdrsCommands.empty(); }
uint64_t getDot() { return Dot; }		uint64_t getDot() { return Dot; }
void discard(ArrayRef<InputSection *> V);		void discard(ArrayRef<InputSection *> V);

ExprValue getSymbolValue(StringRef Name, const Twine &Loc);		ExprValue getSymbolValue(StringRef Name, const Twine &Loc);

void addOrphanSections();		void addOrphanSections();
[38 lines hidden]

ELF/LinkerScript.cpp

[73 lines hidden]
uint64_t ExprValue::getSectionOffset() const {
// If the alignment is trivial, we don't have to compute the full		// If the alignment is trivial, we don't have to compute the full
// value to know the offset. This allows this function to succeed in		// value to know the offset. This allows this function to succeed in
// cases where the output section is not yet known.		// cases where the output section is not yet known.
if (Alignment == 1)		if (Alignment == 1)
return Val;		return Val;
return getValue() - getSecAddr();		return getValue() - getSecAddr();
}		}

OutputSection *LinkerScript::createOutputSection(StringRef Name,		OutputSection *LinkerScript::defineOutputSection(StringRef Name,
StringRef Location) {		StringRef Location) {
OutputSection *&SecRef = NameToOutputSection[Name];		std::vector<OutputSection *> &V = NameToOutputSection[Name];
OutputSection *Sec;		// If we have no output sections with Name or all sections are already defined,
if (SecRef && SecRef->Location.empty()) {		// we should create new section definition. This is used to handle
// There was a forward reference.		// multiple output section definitions when constraints are used, for example:
Sec = SecRef;		// bar : ONLY_IF_RO { *(...) }
} else {		// bar : ONLY_IF_RW { *(...) }
Sec = make<OutputSection>(Name, SHT_PROGBITS, 0);		if (V.empty() || V.back()->isDefined()) {
if (!SecRef)		OutputSection *Sec = make<OutputSection>(Name, SHT_PROGBITS, 0);
SecRef = Sec;		Sec->Location = Location;
		V.push_back(Sec);
		return Sec;
}		}

		// When we have forward declaration we should define it and return.
		OutputSection *Sec = V.back();
Sec->Location = Location;		Sec->Location = Location;
return Sec;		return Sec;
}		}

OutputSection *LinkerScript::getOrCreateOutputSection(StringRef Name) {		// Returns first live section by Name. Normally only one section should be live.
OutputSection *&CmdRef = NameToOutputSection[Name];		// Otherwise commands like SIZEOF(section) will use first alive output section
if (!CmdRef)		// for calculation. Ideally we would like to report an error then, but GNU
CmdRef = make<OutputSection>(Name, SHT_PROGBITS, 0);		// linkers allow that and we just follow for simplicity.
return CmdRef;		OutputSection *LinkerScript::getOutputSection(StringRef Name) {
		std::vector<OutputSection *> &V = NameToOutputSection[Name];
		for (OutputSection *Sec : V)
		if (Sec->Live)
		return Sec;
		return V.front();
		}

		// Used for creating forward references to output section.
		void LinkerScript::declareOutputSection(StringRef Name) {
		std::vector<OutputSection *> &V = NameToOutputSection[Name];
		if (!V.empty())
		return;
		V.push_back(make<OutputSection>(Name, SHT_PROGBITS, 0));
}		}

void LinkerScript::setDot(Expr E, const Twine &Loc, bool InSec) {		void LinkerScript::setDot(Expr E, const Twine &Loc, bool InSec) {
uint64_t Val = E().getValue();		uint64_t Val = E().getValue();
if (Val < Dot && InSec)		if (Val < Dot && InSec)
error(Loc + ": unable to move location counter backward for: " +		error(Loc + ": unable to move location counter backward for: " +
Ctx->OutSec->Name);		Ctx->OutSec->Name);
Dot = Val;		Dot = Val;
[313 lines hidden]
for (BaseCommand *Base : Vec)
if (auto *Sec = dyn_cast<OutputSection>(Base))		if (auto *Sec = dyn_cast<OutputSection>(Base))
if (Sec->Name == Name)		if (Sec->Name == Name)
return Sec;		return Sec;
return nullptr;		return nullptr;
}		}

static OutputSection *createSection(InputSectionBase *IS,		static OutputSection *createSection(InputSectionBase *IS,
StringRef OutsecName) {		StringRef OutsecName) {
OutputSection *Sec = Script->createOutputSection(OutsecName, "<internal>");		OutputSection *Sec = Script->defineOutputSection(OutsecName, "<internal>");
Sec->addSection(cast<InputSection>(IS));		Sec->addSection(cast<InputSection>(IS));
return Sec;		return Sec;
}		}

static OutputSection *addInputSec(StringMap<OutputSection *> &Map,		static OutputSection *addInputSec(StringMap<OutputSection *> &Map,
InputSectionBase *IS, StringRef OutsecName) {		InputSectionBase *IS, StringRef OutsecName) {
// Sections with SHT_GROUP or SHF_GROUP attributes reach here only when the -r		// Sections with SHT_GROUP or SHF_GROUP attributes reach here only when the -r
// option is given. A section with SHT_GROUP defines a "section group", and		// option is given. A section with SHT_GROUP defines a "section group", and
[569 lines hidden]

ELF/OutputSections.h

[46 lines hidden]
static bool classof(const SectionBase *S) {
return S->kind() == SectionBase::Output;		return S->kind() == SectionBase::Output;
}		}

static bool classof(const BaseCommand *C);		static bool classof(const BaseCommand *C);

uint64_t getLMA() const { return Addr + LMAOffset; }		uint64_t getLMA() const { return Addr + LMAOffset; }
template <typename ELFT> void writeHeaderTo(typename ELFT::Shdr *SHdr);		template <typename ELFT> void writeHeaderTo(typename ELFT::Shdr *SHdr);

		// Linker script might create forward declarations for output sections.
		// This method returns true if section known to have definition.
		bool isDefined() { return !Location.empty(); }

unsigned SectionIndex;		unsigned SectionIndex;
unsigned SortRank;		unsigned SortRank;

uint32_t getPhdrFlags() const;		uint32_t getPhdrFlags() const;

// Pointer to the PT_LOAD segment, which this section resides in. This field		// Pointer to the PT_LOAD segment, which this section resides in. This field
// is used to correctly compute file offset of a section. When two sections		// is used to correctly compute file offset of a section. When two sections
// share the same load segment, difference between their file offsets should		// share the same load segment, difference between their file offsets should
[100 lines hidden]

ELF/ScriptParser.cpp

[658 lines hidden]
if (!isPowerOf2_64(Alignment)) {
return (uint64_t)1; // Return a dummy value.		return (uint64_t)1; // Return a dummy value.
}		}
return Alignment;		return Alignment;
};		};
}		}

OutputSection *ScriptParser::readOutputSectionDescription(StringRef OutSec) {		OutputSection *ScriptParser::readOutputSectionDescription(StringRef OutSec) {
OutputSection *Cmd =		OutputSection *Cmd =
Script->createOutputSection(OutSec, getCurrentLocation());		Script->defineOutputSection(OutSec, getCurrentLocation());

if (peek() != ":")		if (peek() != ":")
readSectionAddressType(Cmd);		readSectionAddressType(Cmd);
expect(":");		expect(":");

std::string Location = getCurrentLocation();		std::string Location = getCurrentLocation();
if (consume("AT"))		if (consume("AT"))
Cmd->LMAExpr = readParenExpr();		Cmd->LMAExpr = readParenExpr();
[290 lines hidden]
if (Tok == "ABSOLUTE") {
return [=] {		return [=] {
ExprValue I = Inner();		ExprValue I = Inner();
I.ForceAbsolute = true;		I.ForceAbsolute = true;
return I;		return I;
};		};
}		}
if (Tok == "ADDR") {		if (Tok == "ADDR") {
StringRef Name = readParenLiteral();		StringRef Name = readParenLiteral();
OutputSection *Sec = Script->getOrCreateOutputSection(Name);		Script->declareOutputSection(Name);
		peter.smith: If I understand correctly, this might create a forward reference to some OutputSection name that we've not seen yet? I don't think that this is allowed in this case. From https://sourceware.org/binutils/docs/ld/Builtin-Functions.html: "Return the address (VMA) of the named section. Your script must previously have defined the location of that section."
		grimar (author): That is actually what the original code already did before my change: `getOrCreateOutputSection` creates a forward reference. We rely on that in our test cases; for example, absolute.s has just "PROVIDE(foo = 1 + ABSOLUTE(ADDR(.text)));" and expects that we can evaluate it. I believe we have the same behavior as the GNU linkers here. Below, you referred to the spec saying for ALIGNOF: "If the section has not been allocated when this is evaluated, the linker will report an error." I think that is what it should say for all the commands you mentioned. We use `checkIfExists` to verify that the section was allocated at the moment we evaluate `ADDR` or the other commands; while parsing the script, we create forward references at the moment.
return [=]() -> ExprValue {		return [=]() -> ExprValue {
		OutputSection *Sec = Script->getOutputSection(Name);
checkIfExists(Sec, Location);		checkIfExists(Sec, Location);
return {Sec, false, 0, Location};		return {Sec, false, 0, Location};
};		};
}		}
if (Tok == "ALIGN") {		if (Tok == "ALIGN") {
expect("(");		expect("(");
Expr E = readExpr();		Expr E = readExpr();
if (consume(")")) {		if (consume(")")) {
E = checkAlignment(E, Location);		E = checkAlignment(E, Location);
return [=] { return alignTo(Script->getDot(), E().getValue()); };		return [=] { return alignTo(Script->getDot(), E().getValue()); };
}		}
expect(",");		expect(",");
Expr E2 = checkAlignment(readExpr(), Location);		Expr E2 = checkAlignment(readExpr(), Location);
expect(")");		expect(")");
return [=] {		return [=] {
ExprValue V = E();		ExprValue V = E();
V.Alignment = E2().getValue();		V.Alignment = E2().getValue();
return V;		return V;
};		};
}		}
if (Tok == "ALIGNOF") {		if (Tok == "ALIGNOF") {
StringRef Name = readParenLiteral();		StringRef Name = readParenLiteral();
OutputSection *Cmd = Script->getOrCreateOutputSection(Name);		Script->declareOutputSection(Name);
		peter.smith: If this is a forward reference, then the manual says we should give an error message: "If the section has not been allocated when this is evaluated, the linker will report an error."
return [=] {		return [=] {
checkIfExists(Cmd, Location);		OutputSection *Sec = Script->getOutputSection(Name);
return Cmd->Alignment;		checkIfExists(Sec, Location);
		return Sec->Alignment;
};		};
}		}
if (Tok == "ASSERT")		if (Tok == "ASSERT")
return readAssertExpr();		return readAssertExpr();
if (Tok == "CONSTANT")		if (Tok == "CONSTANT")
return readConstant();		return readConstant();
if (Tok == "DATA_SEGMENT_ALIGN") {		if (Tok == "DATA_SEGMENT_ALIGN") {
expect("(");		expect("(");
[30 lines hidden]
Expr ScriptParser::readPrimary() {
if (Tok == "LENGTH") {		if (Tok == "LENGTH") {
StringRef Name = readParenLiteral();		StringRef Name = readParenLiteral();
if (Script->MemoryRegions.count(Name) == 0)		if (Script->MemoryRegions.count(Name) == 0)
setError("memory region not defined: " + Name);		setError("memory region not defined: " + Name);
return [=] { return Script->MemoryRegions[Name]->Length; };		return [=] { return Script->MemoryRegions[Name]->Length; };
}		}
if (Tok == "LOADADDR") {		if (Tok == "LOADADDR") {
StringRef Name = readParenLiteral();		StringRef Name = readParenLiteral();
OutputSection *Cmd = Script->getOrCreateOutputSection(Name);		Script->declareOutputSection(Name);
		peter.smith: The docs don't say anything about forward references for this particular builtin. I think there is a possibility of creating a cycle in address allocation, though, with something like . = LOADADDR(forward)
return [=] {		return [=] {
checkIfExists(Cmd, Location);		OutputSection *Sec = Script->getOutputSection(Name);
return Cmd->getLMA();		checkIfExists(Sec, Location);
		return Sec->getLMA();
};		};
}		}
if (Tok == "ORIGIN") {		if (Tok == "ORIGIN") {
StringRef Name = readParenLiteral();		StringRef Name = readParenLiteral();
if (Script->MemoryRegions.count(Name) == 0)		if (Script->MemoryRegions.count(Name) == 0)
setError("memory region not defined: " + Name);		setError("memory region not defined: " + Name);
return [=] { return Script->MemoryRegions[Name]->Origin; };		return [=] { return Script->MemoryRegions[Name]->Origin; };
}		}
if (Tok == "SEGMENT_START") {		if (Tok == "SEGMENT_START") {
expect("(");		expect("(");
skip();		skip();
expect(",");		expect(",");
Expr E = readExpr();		Expr E = readExpr();
expect(")");		expect(")");
return [=] { return E(); };		return [=] { return E(); };
}		}
if (Tok == "SIZEOF") {		if (Tok == "SIZEOF") {
StringRef Name = readParenLiteral();		StringRef Name = readParenLiteral();
OutputSection *Cmd = Script->getOrCreateOutputSection(Name);		Script->declareOutputSection(Name);
		peter.smith: Another case where we are supposed to give an error if there is a forward reference: "If the section has not been allocated when this is evaluated, the linker will report an error."
// Linker script does not create an output section if its content is empty.		// Linker script does not create an output section if its content is empty.
// We want to allow SIZEOF(.foo) where .foo is a section which happened to		// We want to allow SIZEOF(.foo) where .foo is a section which happened to
// be empty.		// be empty.
return [=] { return Cmd->Size; };		return [=] {
		OutputSection *Sec = Script->getOutputSection(Name);
		return Sec->Size;
		};
}		}
if (Tok == "SIZEOF_HEADERS")		if (Tok == "SIZEOF_HEADERS")
return [=] { return elf::getHeaderSize(); };		return [=] { return elf::getHeaderSize(); };

// Tok is the dot.		// Tok is the dot.
if (Tok == ".")		if (Tok == ".")
return [=] { return Script->getSymbolValue(Tok, Location); };		return [=] { return Script->getSymbolValue(Tok, Location); };

[259 lines hidden]

test/ELF/linkerscript/sections-constraint6.s

				# REQUIRES: x86
				# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
				# RUN: echo "SECTIONS { . = 0x10000; \
				# RUN: symSize1 = SIZEOF(bar); symAddr1 = ADDR(bar); \
				# RUN: bar : ONLY_IF_RO { *(unknown) } \
				# RUN: bar : ONLY_IF_RW { *(foo_aw) } \
				# RUN: symSize2 = SIZEOF(bar); symAddr2 = ADDR(bar); \
				# RUN: }" > %t.script
				# RUN: ld.lld -o %t -T %t.script %t.o
				# RUN: llvm-readobj -s -t %t | FileCheck %s

				## Check values of symbols are (1) not zeroes and (2) corresponds to
				## size and address of bar that contains foo_aw input section.
				# CHECK: Sections [
				# CHECK: Name: bar
				# CHECK-NEXT: Type: SHT_PROGBITS
				# CHECK-NEXT: Flags [
				# CHECK-NEXT: SHF_ALLOC
				# CHECK-NEXT: SHF_WRITE
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x10001
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size: 8
				# CHECK: Name: symSize1
				# CHECK-NEXT: Value: 0x8
				# CHECK: Name: symAddr1
				# CHECK-NEXT: Value: 0x10001
				# CHECK: Name: symSize2
				# CHECK-NEXT: Value: 0x8
				# CHECK: Name: symAddr2
				# CHECK-NEXT: Value: 0x10001

				.section foo_a,"a"
				.byte 0

				.section foo_aw,"aw"
				.quad 0