This is an archive of the discontinued LLVM Phabricator instance.

wasm/InputChunks.cpp
95	I haven't read this function yet, but this function is not OK because it is too long and doesn't have comments to help readers understand it. Please read other code in lld carefully and follow local conventions, and then reorganize the code in this patch so that a first-time reader can easily understand your code.

yurydelendik retitled this revision from [WebAssembly] Update .debug_line section PC addresses during LEB optimization to [WIP] [WebAssembly] Update .debug_line section PC addresses during LEB optimization. Jun 7 2018, 1:00 PM

Is there a way we can avoid needing to do this at all? How about we simply don't allow LEB compression and DWARF in the same output? It seems OK if builds with DWARF in them don't have this optimization.

Sorry if my suggestion means your work is wasted. Maybe I'm missing something.

In D47901#1125643, @sbc100 wrote:

Is there a way we can avoid needing to do this at all? How about we simply don't allow LEB compression and DWARF in the same output? It seems OK if builds with DWARF in them don't have this optimization.

It is not OK for the presence of debug info to affect the construction of the rest of the executable. Normally this is something only the compiler needs to worry about, but it appears the linker also needs to care, if I understand the motivation here correctly.

In D47901#1125643, @sbc100 wrote:

Is there a way we can avoid needing to do this at all? How about we simply don't allow LEB compression and DWARF in the same output? It seems OK if builds with DWARF in them don't have this optimization.

In D47901#1125645, @sbc100 wrote:

Sorry if my suggestion means your work is wasted. Maybe I'm missing something.

We can do that, yes. We can also consider that at a later point in time. I'll keep this patch open until we open a different patch that disables LEB compression when DWARF sections are present. (Do we want to track it per function or per file?)

Update comments

yurydelendik marked an inline comment as done. Jun 7 2018, 2:39 PM

In D47901#1125651, @probinson wrote:

In D47901#1125643, @sbc100 wrote:

Is there a way we can avoid needing to do this at all? How about we simply don't allow LEB compression and DWARF in the same output? It seems OK if builds with DWARF in them don't have this optimization.

It is not OK for the presence of debug info to affect the construction of the rest of the executable. Normally this is something only the compiler needs to worry about, but it appears the linker also needs to care, if I understand the motivation here correctly.

Hmm, I do see your point. Maybe I can explain in a little more detail: there is an optional optimization that the wasm lld linker can perform, which effectively shrinks the code section by converting padded LEB128s (used at relocation sites) to non-padded LEB128s. This only occurs if you pass -O2 to the linker. We could even make it a separate optional argument. Obviously, when we do this code section compression we change the instruction offsets, which breaks the DWARF information.
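To make the padded versus non-padded distinction concrete, here is a minimal standalone sketch. It is not lld's code; the helper names are hypothetical and the 5-byte padded width is the assumption used for 32-bit relocation sites. The same value takes 5 bytes when padded and as little as 1 byte when compact, which is why every later offset in the function shifts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode Value as unsigned LEB128 in the fewest bytes.
std::vector<uint8_t> encodeCompactULEB128(uint32_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = static_cast<uint8_t>(Value & 0x7f);
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}

// Hypothetical helper: encode Value as unsigned LEB128 padded to exactly
// Width bytes, the form assumed here for relocation sites in object files.
std::vector<uint8_t> encodePaddedULEB128(uint32_t Value, unsigned Width = 5) {
  assert(Width >= 1 && Width <= 5);
  std::vector<uint8_t> Out;
  for (unsigned I = 0; I + 1 < Width; ++I) {
    Out.push_back(static_cast<uint8_t>((Value & 0x7f) | 0x80));
    Value >>= 7;
  }
  assert(Value <= 0x7f && "value does not fit in Width bytes");
  Out.push_back(static_cast<uint8_t>(Value & 0x7f));
  return Out;
}

// Decode an unsigned LEB128 value; padded and compact forms decode alike.
uint32_t decodeULEB128(const std::vector<uint8_t> &Bytes) {
  uint32_t Value = 0;
  unsigned Shift = 0;
  for (uint8_t B : Bytes) {
    Value |= static_cast<uint32_t>(B & 0x7f) << Shift;
    Shift += 7;
    if (!(B & 0x80))
      break;
  }
  return Value;
}
```

Both encodings decode to the same number, so the compaction changes only byte offsets, not the meaning of the code.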

So I see a few options here:
(1) Increase the complexity of the linker so it can parse, modify and rewrite the DWARF sections rather than just blindly copying them.
(2) Remove this LEB compression feature from the linker and make it into a separate tool
(3) Make LEB compression and debug sections mutually exclusive. Would mean that --compress-lebs would automatically imply --strip-debug.

My understanding is that (1) goes against the general philosophy of linkers, but maybe I'm wrong. If I'm correct then perhaps (2) makes the most sense, as it keeps the linker dumb and fast.

@probinson After this more detailed explanation, would you still be strongly against (3)?
@ruiu Am I right about (1)? i.e. would DWARF parsing and rewriting be outside the normal purview of the linker?

I do imagine that "there's no way to create an executable with debug info if you are creating a release build" is unacceptable. So, we need to support -O2 together with debug info. But is this the only way to do this? This seems a bit too complicated to me. Maybe only the code is complicated and the algorithm is not, but it is hard to tell because the algorithm is not explained anywhere.

It looks like your patch recognizes all DWARF records that contain in-file offsets in order to adjust them. Is this the right approach? If DWARF is extended (that happens fairly frequently), do you have to make a change to the linker to produce non-broken debug info? If that's the case, I think it is too fragile.

I wonder if we can just emit DWARF as-is and let the debugger count the number of bytes in the expanded form, to compensate for the difference caused by the LEB128 compaction. Is this something you can do? Is that easier than making a change to the linker?

In D47901#1125770, @ruiu wrote:

I do imagine that "there's no way to create an executable with debug info if you are creating a release build" is unacceptable. So, we need to support -O2 together with debug info. But is this the only way to do this? This seems a bit too complicated to me. Maybe only the code is complicated and the algorithm is not, but it is hard to tell because the algorithm is not explained anywhere.

How about we decouple the compaction from -O2 or "release build" and make it a separate flag?

In D47901#1125774, @sbc100 wrote:

In D47901#1125770, @ruiu wrote:

I do imagine that "there's no way to create an executable with debug info if you are creating a release build" is unacceptable. So, we need to support -O2 together with debug info. But is this the only way to do this? This seems a bit too complicated to me. Maybe only the code is complicated and the algorithm is not, but it is hard to tell because the algorithm is not explained anywhere.

How about we decouple the compaction from -O2 or "release build" and make it a separate flag?

i.e., is it permissible to have an optional feature that implies --strip-debug?

In D47901#1125770, @ruiu wrote:

It looks like your patch recognizes all DWARF records that contain in-file offsets in order to adjust them. Is this the right approach? If DWARF is extended (that happens fairly frequently), do you have to make a change to the linker to produce non-broken debug info? If that's the case, I think it is too fragile.

.debug_line uses delta encoding for PC and line/column to reduce the amount of space used; otherwise it would need to refer to each source code statement/operator by a non-relative offset. At worst, that is every wasm operator, and adding relocation records would just increase the amount of space used. FWIW, the algorithm just fixes the delta-encoded values and cannot be simplified.
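As an illustration of the delta encoding mentioned above, the sketch below decodes a DWARF (v2 to v4) special opcode into its address and line advances, then rewrites it with a smaller address advance while keeping the line advance, which mirrors the adjustment the patch makes for special opcodes. The struct and function names are hypothetical, and it assumes min_inst_length is 1.

```cpp
#include <cassert>
#include <cstdint>

// Header fields of a .debug_line unit that special opcodes depend on.
struct LineHeader {
  int8_t LineBase;
  uint8_t LineRange;
  uint8_t OpcodeBase;
};

struct SpecialOp {
  uint32_t AddrAdvance; // delta added to the address register
  int32_t LineAdvance;  // delta added to the line register
};

// Split a special opcode into its two deltas, per the DWARF formula:
//   adjusted = opcode - opcode_base
//   address advance = adjusted / line_range
//   line advance    = line_base + adjusted % line_range
SpecialOp decodeSpecial(const LineHeader &H, uint8_t Opcode) {
  assert(Opcode >= H.OpcodeBase && H.LineRange != 0);
  uint8_t Adjusted = Opcode - H.OpcodeBase;
  return {static_cast<uint32_t>(Adjusted / H.LineRange),
          H.LineBase + static_cast<int32_t>(Adjusted % H.LineRange)};
}

// Return a special opcode with the same line advance but a smaller
// address advance. Shrinking never overflows, so the result stays valid.
uint8_t shrinkAddrAdvance(const LineHeader &H, uint8_t Opcode,
                          uint32_t NewAdvance) {
  SpecialOp Op = decodeSpecial(H, Opcode);
  assert(NewAdvance <= Op.AddrAdvance);
  return static_cast<uint8_t>(Opcode -
                              (Op.AddrAdvance - NewAdvance) * H.LineRange);
}
```

With sample header values line_base = -5, line_range = 14 and opcode_base = 13 (chosen only for illustration), opcode 62 advances the address by 3 and the line by 2; shrinking the address advance to 1 gives opcode 34 with the line advance unchanged.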

There's no notion of "PC" in wasm, is there? You cannot take the address of a function and call it indirectly later. Well, you can do that, but what you get is not an address in the regular sense. It's an index, if I understand correctly. So, what do you mean by PC?

How about my suggestion to make a change to the debugger, so that it computes offset not against the compacted raw wasm text section but in the fully-expanded form?

As to making it a separate tool to compress the wasm text section, I believe that's an option, though I'm not sure if it is ideal. I imagine that the tool would compress the text section by optimizing LEB128-encoded numbers and then strip the debug sections to produce a production binary. That doesn't sound like a bad idea, at least, and maybe it's a good one. But I'd like to hear from wasm developers about it, as it affects their workflow.

In D47901#1125788, @ruiu wrote:

There's no notion of "PC" in wasm, is there? You cannot take the address of a function and call it indirectly later. Well, you can do that, but what you get is not an address in the regular sense. It's an index, if I understand correctly. So, what do you mean by PC?

For now, the bytecode offset (function- or file-relative) is used in the role of the PC -- for call stacks, for the profiler, and we also successfully adapted it for source maps. I'm not sure what moving to an index (of a wasm operator?) base would change, but it will only add complexity ~~and will not solve this particular issue.~~

For now, the bytecode offset (function- or file-relative) is used in the role of the PC -- for call stacks, for the profiler, and we also successfully adapted it for source maps. I'm not sure what moving to an index (of a wasm operator?) base would change, but it will only add complexity and will not solve this particular issue.

I'm not pushing this idea, but what I was suggesting is not to use the function index or something like that instead of the PC. What I meant is to continue using the bytecode offset as before, but count the "bytecode offset" in the fully expanded form instead of the raw in-file form. So, any LEB128 number is counted as 5 bytes long. Then, that "virtual bytecode offset" should exactly match the original, uncompressed bytecode offset, eliminating the need to adjust any byte offset in DWARF.
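A rough sketch of that "virtual bytecode offset" idea follows. It assumes the debugger can already identify the relocated LEB128 immediates in the compacted function body and knows their offsets and widths; the RelocSite struct and toVirtualOffset function are hypothetical names, not an existing debugger or lld API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Width every relocated LEB128 immediate is assumed to have had before
// compaction.
constexpr uint32_t PaddedWidth = 5;

// One compacted LEB128 immediate, described in compressed coordinates.
struct RelocSite {
  uint32_t CompressedOffset; // where the immediate starts after compaction
  uint32_t CompactWidth;     // how many bytes it occupies after compaction
};

// Map an instruction offset in the compacted body to the offset it had in
// the fully expanded form. Sites must be sorted by CompressedOffset, and
// Offset must point at an instruction boundary, not inside an immediate.
uint32_t toVirtualOffset(const std::vector<RelocSite> &Sites,
                         uint32_t Offset) {
  uint32_t Virtual = Offset;
  for (const RelocSite &S : Sites) {
    assert(S.CompactWidth >= 1 && S.CompactWidth <= PaddedWidth);
    if (S.CompressedOffset + S.CompactWidth > Offset)
      break; // this immediate and all later ones lie at or after Offset
    Virtual += PaddedWidth - S.CompactWidth; // bytes saved by this site
  }
  return Virtual;
}
```

For example, with immediates at compressed offsets 1 (1 byte wide) and 4 (2 bytes wide), compressed offset 2 maps to virtual offset 6 and compressed offset 6 maps to virtual offset 13, so the unmodified DWARF offsets would still line up.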

In D47901#1125812, @ruiu wrote:

What I meant is to continue using the bytecode offset as before, but count the "bytecode offset" in the fully expanded form instead of the raw in-file form. So, any LEB128 number is counted as 5 bytes long. Then, that "virtual bytecode offset" should exactly match the original, uncompressed bytecode offset, eliminating the need to adjust any byte offset in DWARF.

Yes, that's a valid solution too -- we just need to specify a normalized form that every debugger will assume when it decodes the data. Thanks. I also had the idea of supplying "code transform" custom sections that reflect the same idea, but without burdening the debugger with knowledge of the normalized "virtual bytecode offset" format.

Per discussion at the WebAssembly toolchain meeting, this solution is too complex. I am going to close this WIP in favor of simpler solutions, such as disabling debug info during optimization or moving the optimization out of lld.

In D47901#1164512, @yurydelendik wrote:

Per discussion at the WebAssembly toolchain meeting, this solution is too complex. I am going to close this WIP in favor of simpler solutions, such as disabling debug info during optimization or moving the optimization out of lld.

That makes sense to me. Unlike compiler -O optimization, LEB128 compression by the linker doesn't change anything in the resulting executable. It just changes the encoding of immediates in the binary so that they are represented more compactly. So I can imagine that it is a rare situation that you have to enable both linker -O2 and -debug options.

Diff 150410

wasm/InputChunks.h

[146 lines not shown]	public:
}		}
uint32_t getFunctionIndex() const { return FunctionIndex.getValue(); }		uint32_t getFunctionIndex() const { return FunctionIndex.getValue(); }
bool hasFunctionIndex() const { return FunctionIndex.hasValue(); }		bool hasFunctionIndex() const { return FunctionIndex.hasValue(); }
void setFunctionIndex(uint32_t Index);		void setFunctionIndex(uint32_t Index);
uint32_t getTableIndex() const { return TableIndex.getValue(); }		uint32_t getTableIndex() const { return TableIndex.getValue(); }
bool hasTableIndex() const { return TableIndex.hasValue(); }		bool hasTableIndex() const { return TableIndex.hasValue(); }
void setTableIndex(uint32_t Index);		void setTableIndex(uint32_t Index);

		uint32_t translateCompressedPC(uint32_t PC) const;

// The size of a given input function can depend on the values of the		// The size of a given input function can depend on the values of the
// LEB relocations within it. This finalizeContents method is called after		// LEB relocations within it. This finalizeContents method is called after
// all the symbol values have be calcualted but before getSize() is ever		// all the symbol values have be calcualted but before getSize() is ever
// called.		// called.
void calculateSize();		void calculateSize();

const WasmSignature &Signature;		const WasmSignature &Signature;

[10 lines not shown]	uint32_t getInputSectionOffset() const override {
return Function->CodeSectionOffset;		return Function->CodeSectionOffset;
}		}

const WasmFunction *Function;		const WasmFunction *Function;
llvm::Optional<uint32_t> FunctionIndex;		llvm::Optional<uint32_t> FunctionIndex;
llvm::Optional<uint32_t> TableIndex;		llvm::Optional<uint32_t> TableIndex;
uint32_t CompressedFuncSize = 0;		uint32_t CompressedFuncSize = 0;
uint32_t CompressedSize = 0;		uint32_t CompressedSize = 0;
		std::vector<std::pair<uint32_t, uint32_t>> CompressTransform;
};		};

class SyntheticFunction : public InputFunction {		class SyntheticFunction : public InputFunction {
public:		public:
SyntheticFunction(const WasmSignature &S, StringRef Name,		SyntheticFunction(const WasmSignature &S, StringRef Name,
StringRef DebugName = {})		StringRef DebugName = {})
: InputFunction(S, nullptr, nullptr), Name(Name), DebugName(DebugName) {		: InputFunction(S, nullptr, nullptr), Name(Name), DebugName(DebugName) {
SectionKind = InputChunk::SyntheticFunction;		SectionKind = InputChunk::SyntheticFunction;
[48 lines not shown]

wasm/InputChunks.cpp

//===- InputChunks.cpp ----------------------------------------------------===//		//===- InputChunks.cpp ----------------------------------------------------===//
//		//
// The LLVM Linker		// The LLVM Linker
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "InputChunks.h"		#include "InputChunks.h"
#include "Config.h"		#include "Config.h"
#include "OutputSegment.h"		#include "OutputSegment.h"
#include "WriterUtils.h"		#include "WriterUtils.h"
#include "lld/Common/ErrorHandler.h"		#include "lld/Common/ErrorHandler.h"
#include "lld/Common/LLVM.h"		#include "lld/Common/LLVM.h"
#include "llvm/Support/LEB128.h"		#include "llvm/Support/LEB128.h"
		#include <algorithm>

#define DEBUG_TYPE "lld"		#define DEBUG_TYPE "lld"

using namespace llvm;		using namespace llvm;
using namespace llvm::wasm;		using namespace llvm::wasm;
using namespace llvm::support::endian;		using namespace llvm::support::endian;
		using namespace llvm::dwarf;
using namespace lld;		using namespace lld;
using namespace lld::wasm;		using namespace lld::wasm;

static StringRef ReloctTypeToString(uint8_t RelocType) {		static StringRef ReloctTypeToString(uint8_t RelocType) {
switch (RelocType) {		switch (RelocType) {
#define WASM_RELOC(NAME, REL) case REL: return #NAME;		#define WASM_RELOC(NAME, REL) case REL: return #NAME;
#include "llvm/BinaryFormat/WasmRelocs.def"		#include "llvm/BinaryFormat/WasmRelocs.def"
#undef WASM_RELOC		#undef WASM_RELOC
[54 lines not shown]	for (const WasmRelocation &Rel : Relocations) {
uint32_t ExpectedValue = File->calcExpectedValue(Rel);		uint32_t ExpectedValue = File->calcExpectedValue(Rel);
if (ExpectedValue != ExistingValue)		if (ExpectedValue != ExistingValue)
warn("unexpected existing value for " + ReloctTypeToString(Rel.Type) +		warn("unexpected existing value for " + ReloctTypeToString(Rel.Type) +
": existing=" + Twine(ExistingValue) +		": existing=" + Twine(ExistingValue) +
" expected=" + Twine(ExpectedValue));		" expected=" + Twine(ExpectedValue));
}		}
}		}

		static void updateDebugLine(ObjFile *File, const ArrayRef<uint8_t> &data,
		const std::vector<WasmRelocation> &Relocations,
		uint8_t *OutputBuf) {
		LLVM_DEBUG(dbgs() << "Processing .debug_line\n");

		// R_WEBASSEMBLY_FUNCTION_OFFSET_I32 entries will be found during .debug_info
		// units parsing. Save these entries to access them during DW_LNE_set_address
		// operation below.
		llvm::DenseMap<uint32_t, const WasmRelocation *> FunctionRelocations;
		for (const WasmRelocation &Rel : Relocations) {
		if (Rel.Type == R_WEBASSEMBLY_FUNCTION_OFFSET_I32)
		FunctionRelocations[Rel.Offset] = &Rel;
		}

		// Scan all CUs to find only related to the current data region.
		DWARFContext *DW = File->getDWARFContext();
		for (const auto &CU : DW->compile_units()) {
		const DWARFSection &S = CU->getLineSection();
		// Skip .debug_line section if is not contained in the data region.
		if (data.data() > (const unsigned char *)S.Data.data() \|\|
		(const unsigned char *)S.Data.data() + S.Data.size() >
		data.data() + data.size())
		continue;

		// Processing .debug_line section unit
		DWARFDataExtractor d(DW->getDWARFObj(), S, /* isLittleEndian = */ true,
		CU->getFormParams().AddrSize);
		size_t CUOffset = (const unsigned char *)S.Data.data() - data.data();

		// Parsing .debug_line prolog for DWARF format versions 2-4.
		// TODO add v5 support?
		uint32_t Offset = 0;
		uint64_t TotalLength = d.getU32(&Offset);
		assert(TotalLength < 0xffffff00);
		uint16_t Version = d.getU16(&Offset);
		assert(Version >= 2 && Version <= 4);
		const uint64_t SizeOfPrologueLength = 4;
		uint64_t PrologueLength = d.getUnsigned(&Offset, SizeOfPrologueLength);
		const uint64_t EndPrologueOffset = PrologueLength + Offset;
		const uint64_t SizeOfTotalLength = 4;
		const uint64_t EndOffset = TotalLength + SizeOfTotalLength;

		/* uint8_t MinInstLength = */ d.getU8(&Offset);
		uint8_t MaxOpsPerInst = Version >= 4 ? d.getU8(&Offset) : 1;
		assert(MaxOpsPerInst == 1);
		/* uint8_t DefaultIsStmt = */ d.getU8(&Offset);
		int8_t LineBase = (int8_t)d.getU8(&Offset);
		uint8_t LineRange = d.getU8(&Offset);
		uint8_t OpcodeBase = d.getU8(&Offset);

		std::vector<uint8_t> StandardOpcodeLengths;
		StandardOpcodeLengths.reserve(OpcodeBase - 1);
		for (uint32_t I = 1; I < OpcodeBase; ++I) {
		uint8_t OpLen = d.getU8(&Offset);
		StandardOpcodeLengths.push_back(OpLen);
		}

		assert(Offset < EndPrologueOffset);
		// Skip directory entries
		Offset = EndPrologueOffset;

		// TODO remove tracking of the Address (added for debugging)
		uint64_t Address = 0;
		std::unique_ptr<PCTransformIterator> PCTr;
		while (Offset < EndOffset) {
		const uint8_t OpcodeOffset = Offset;
		uint8_t Opcode = d.getU8(&Offset);
		if (Opcode == 0) {
		// Parse Extended Opcodes, mosly DW_LNE_set_address (see below)
		uint64_t Len = d.getULEB128(&Offset);
		assert(Len > 0);
		const uint64_t EndExtOp = Offset + Len;
		uint8_t SubOpcode = d.getU8(&Offset);
		switch (SubOpcode) {
		case DW_LNE_set_address: {
		const uint32_t RelocValueOffset = (uint32_t)CUOffset + Offset;
		Address = d.getRelocatedAddress(&Offset);
		LLVM_DEBUG(dbgs()
		<< "DW_LNE_set_address " << (void *)Address << "\n");
		// Get relocation entry to get get right InputFunction and start PC.
		const WasmRelocation *SecRelPtr =
		FunctionRelocations[RelocValueOffset];
		assert(SecRelPtr);
		PCTr = File->getPCTransformIterator(*SecRelPtr);
		break;
		}
		case DW_LNE_end_sequence: {
		LLVM_DEBUG(dbgs() << " - end_seq " << (void *)Address << "\n");
		break;
		}
		}
		Offset = EndExtOp;
		} else if (Opcode < OpcodeBase) {
		// Parse Standard Opcodes
		switch (Opcode) {
		case DW_LNS_copy: {
		LLVM_DEBUG(dbgs() << " - copy " << (void *)Address << "\n");
		break;
		}
		case DW_LNS_advance_pc: {
		// Check if PC advance value for DW_LNS_advance_pc was changed,
		// and change its argument LEB value to fit the smaller value
		// if needed
		const uint32_t LEBStart = Offset;
		uint64_t AddrOffset = d.getULEB128(&Offset);
		Address += AddrOffset;
		LLVM_DEBUG(dbgs() << "DW_LNE_advance_pc " << (void *)Address << "\n");
		const uint32_t Delta = PCTr->advance(AddrOffset);
		assert(Delta <= AddrOffset);
		if (Delta < AddrOffset) {
		encodeULEB128(Delta, OutputBuf + CUOffset + LEBStart,
		Offset - LEBStart);
		}
		break;
		}
		case DW_LNS_const_add_pc: {
		// This one tricky, adding extra table line here by replacing
		// standard opcode with special one, with shorter PC advance value,
		// if needed
		// TODO Is this the best we can do here?
		uint8_t AdjustOpcode = 255 - OpcodeBase;
		uint64_t AddrOffset = AdjustOpcode / LineRange;
		Address += AddrOffset;
		LLVM_DEBUG(dbgs()
		<< "DW_LNS_const_add_pc " << (void *)Address << "\n");
		const uint32_t Delta = PCTr->advance(AddrOffset);
		assert(Delta <= AddrOffset);
		if (Delta < AddrOffset) {
		uint8_t NewOpcode = OpcodeBase - LineBase + Delta * LineRange;
		OutputBuf[CUOffset + OpcodeOffset] = NewOpcode;
		}
		break;
		}
		case DW_LNS_fixed_advance_pc: {
		// Check if PC advance value for DW_LNS_fixed_advance_pc was changed,
		// and change its uint16_t argument if needed.
		uint16_t PCOffset = d.getU16(&Offset);
		Address += PCOffset;
		LLVM_DEBUG(dbgs()
		<< "DW_LNS_fixed_advance_pc " << (void *)Address << "\n");
		const uint32_t Delta = PCTr->advance(PCOffset);
		assert(Delta <= PCOffset);
		if (Delta < PCOffset)
		write16le(OutputBuf + CUOffset + OpcodeOffset + 1, Delta);
		break;
		}
		default: {
		uint8_t Skip = StandardOpcodeLengths[Opcode - 1U];
		for (uint8_t I = 0; I < Skip; ++I)
		d.getULEB128(&Offset);
		break;
		}
		}
		} else {
		// Parsing Special Opcode and updating only PC advance value
		// part if needed.
		uint8_t AdjustOpcode = Opcode - OpcodeBase;
		uint64_t AddrOffset = AdjustOpcode / LineRange;
		Address += AddrOffset;
		LLVM_DEBUG(dbgs() << " - special " << (void *)Address << "\n");
		const uint32_t Delta = PCTr->advance(AddrOffset);
		assert(Delta <= AddrOffset);
		if (Delta < AddrOffset) {
		uint8_t NewOpcode = Opcode - (AddrOffset - Delta) * LineRange;
		OutputBuf[CUOffset + OpcodeOffset] = NewOpcode;
		}
		}
		}
		}
		}

// Copy this input chunk to an mmap'ed output file and apply relocations.		// Copy this input chunk to an mmap'ed output file and apply relocations.
void InputChunk::writeTo(uint8_t *Buf) const {		void InputChunk::writeTo(uint8_t *Buf) const {
// Copy contents		// Copy contents
memcpy(Buf + OutputOffset, data().data(), data().size());		memcpy(Buf + OutputOffset, data().data(), data().size());

// Apply relocations		// Apply relocations
if (Relocations.empty())		if (Relocations.empty())
return;		return;

#ifndef NDEBUG		#ifndef NDEBUG
verifyRelocTargets();		verifyRelocTargets();
#endif		#endif

		if (Config->CompressRelocTargets && getName() == ".debug_line")
		updateDebugLine(File, data(), Relocations, Buf + OutputOffset);

LLVM_DEBUG(dbgs() << "applying relocations: " << getName()		LLVM_DEBUG(dbgs() << "applying relocations: " << getName()
<< " count=" << Relocations.size() << "\n");		<< " count=" << Relocations.size() << "\n");
int32_t Off = OutputOffset - getInputSectionOffset();		int32_t Off = OutputOffset - getInputSectionOffset();

for (const WasmRelocation &Rel : Relocations) {		for (const WasmRelocation &Rel : Relocations) {
uint8_t *Loc = Buf + Rel.Offset + Off;		uint8_t *Loc = Buf + Rel.Offset + Off;
uint32_t Value = File->calcNewValue(Rel);		uint32_t Value = File->calcNewValue(Rel);
LLVM_DEBUG(dbgs() << "apply reloc: type=" << ReloctTypeToString(Rel.Type)		LLVM_DEBUG(dbgs() << "apply reloc: type=" << ReloctTypeToString(Rel.Type)
[128 lines not shown]	void InputFunction::calculateSize() {
uint32_t End = Start + Function->Size;		uint32_t End = Start + Function->Size;

uint32_t LastRelocEnd = Start + FunctionSizeLength;		uint32_t LastRelocEnd = Start + FunctionSizeLength;
for (WasmRelocation &Rel : Relocations) {		for (WasmRelocation &Rel : Relocations) {
LLVM_DEBUG(dbgs() << " region: " << (Rel.Offset - LastRelocEnd) << "\n");		LLVM_DEBUG(dbgs() << " region: " << (Rel.Offset - LastRelocEnd) << "\n");
CompressedFuncSize += Rel.Offset - LastRelocEnd;		CompressedFuncSize += Rel.Offset - LastRelocEnd;
CompressedFuncSize += getRelocWidth(Rel, File->calcNewValue(Rel));		CompressedFuncSize += getRelocWidth(Rel, File->calcNewValue(Rel));
LastRelocEnd = Rel.Offset + getRelocWidthPadded(Rel);		LastRelocEnd = Rel.Offset + getRelocWidthPadded(Rel);

		CompressTransform.emplace_back(LastRelocEnd - Start - FunctionSizeLength,
		CompressedFuncSize);
}		}
LLVM_DEBUG(dbgs() << " final region: " << (End - LastRelocEnd) << "\n");		LLVM_DEBUG(dbgs() << " final region: " << (End - LastRelocEnd) << "\n");
CompressedFuncSize += End - LastRelocEnd;		CompressedFuncSize += End - LastRelocEnd;

		CompressTransform.emplace_back(Function->Size - FunctionSizeLength,
		CompressedFuncSize);

// Now we know how long the resulting function is we can add the encoding		// Now we know how long the resulting function is we can add the encoding
// of its length		// of its length
uint8_t Buf[5];		uint8_t Buf[5];
CompressedSize = CompressedFuncSize + encodeULEB128(CompressedFuncSize, Buf);		CompressedSize = CompressedFuncSize + encodeULEB128(CompressedFuncSize, Buf);

LLVM_DEBUG(dbgs() << " calculateSize orig: " << Function->Size << "\n");		LLVM_DEBUG(dbgs() << " calculateSize orig: " << Function->Size << "\n");
LLVM_DEBUG(dbgs() << " calculateSize new: " << CompressedSize << "\n");		LLVM_DEBUG(dbgs() << " calculateSize new: " << CompressedSize << "\n");
}		}
[26 lines not shown]	for (const WasmRelocation &Rel : Relocations) {
LastRelocEnd = SecStart + Rel.Offset + getRelocWidthPadded(Rel);		LastRelocEnd = SecStart + Rel.Offset + getRelocWidthPadded(Rel);
}		}

unsigned ChunkSize = End - LastRelocEnd;		unsigned ChunkSize = End - LastRelocEnd;
LLVM_DEBUG(dbgs() << " write final chunk: " << ChunkSize << "\n");		LLVM_DEBUG(dbgs() << " write final chunk: " << ChunkSize << "\n");
memcpy(Buf, LastRelocEnd, ChunkSize);		memcpy(Buf, LastRelocEnd, ChunkSize);
LLVM_DEBUG(dbgs() << " total: " << (Buf + ChunkSize - Orig) << "\n");		LLVM_DEBUG(dbgs() << " total: " << (Buf + ChunkSize - Orig) << "\n");
}		}

		uint32_t InputFunction::translateCompressedPC(uint32_t PC) const {
		if (CompressTransform.size() == 0)
		return PC;

		// Find address range: a pair of when the range starts and its new start.
		auto I =
		std::lower_bound(CompressTransform.rbegin(), CompressTransform.rend(), PC,
		[](const std::pair<uint32_t, uint32_t> &El,
		uint32_t Value) { return El.first > Value; });

		if (I == CompressTransform.rend()) {
		// If a range is not found, the PC was not transformed yet,
		// clamping result to the new start of the first range.
		return std::min(PC, CompressTransform[0].second);
		}

		uint32_t NewPC = I->second + (PC - I->first);
		// If a range is not last one, the result needs to be clamped to the new
		// start of the following range.
		if (I == CompressTransform.rbegin())
		return NewPC;
		return std::min(NewPC, std::prev(I)->second);
		}

wasm/InputFiles.h

//===- InputFiles.h ---------------------------------------------- C++ --===//		//===- InputFiles.h ---------------------------------------------- C++ --===//
//		//
// The LLVM Linker		// The LLVM Linker
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLD_WASM_INPUT_FILES_H		#ifndef LLD_WASM_INPUT_FILES_H
#define LLD_WASM_INPUT_FILES_H		#define LLD_WASM_INPUT_FILES_H

#include "Symbols.h"		#include "Symbols.h"
#include "lld/Common/LLVM.h"		#include "lld/Common/LLVM.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
		#include "llvm/DebugInfo/DWARF/DWARFContext.h"
#include "llvm/Object/Archive.h"		#include "llvm/Object/Archive.h"
#include "llvm/Object/Wasm.h"		#include "llvm/Object/Wasm.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"
#include <vector>		#include <vector>

		using llvm::DWARFContext;
using llvm::object::Archive;		using llvm::object::Archive;
using llvm::object::WasmObjectFile;		using llvm::object::WasmObjectFile;
using llvm::object::WasmSection;		using llvm::object::WasmSection;
using llvm::object::WasmSymbol;		using llvm::object::WasmSymbol;
using llvm::wasm::WasmGlobal;		using llvm::wasm::WasmGlobal;
using llvm::wasm::WasmImport;		using llvm::wasm::WasmImport;
using llvm::wasm::WasmRelocation;		using llvm::wasm::WasmRelocation;
using llvm::wasm::WasmSignature;		using llvm::wasm::WasmSignature;
[45 lines not shown]	public:

void parse() override;		void parse() override;

private:		private:
std::unique_ptr<Archive> File;		std::unique_ptr<Archive> File;
llvm::DenseSet<uint64_t> Seen;		llvm::DenseSet<uint64_t> Seen;
};		};

		class PCTransformIterator {
		public:
		PCTransformIterator(const InputFunction &Function, uint32_t PC)
		: Function(Function), OldPC(PC) {
		NewPC = translate(PC);
		}

		uint32_t getOldPC() const { return OldPC; }
		uint32_t getNewPC() const { return NewPC; }
		uint32_t advance(uint32_t Delta);

		private:
		const InputFunction &Function;
		uint32_t OldPC;
		uint32_t NewPC;

		uint32_t translate(uint32_t PC) const;
		};

// .o file (wasm object file)		// .o file (wasm object file)
class ObjFile : public InputFile {		class ObjFile : public InputFile {
public:		public:
explicit ObjFile(MemoryBufferRef M) : InputFile(ObjectKind, M) {}		explicit ObjFile(MemoryBufferRef M) : InputFile(ObjectKind, M) {}
static bool classof(const InputFile *F) { return F->kind() == ObjectKind; }		static bool classof(const InputFile *F) { return F->kind() == ObjectKind; }

void parse() override;		void parse() override;

// Returns the underlying wasm file.		// Returns the underlying wasm file.
const WasmObjectFile *getWasmObj() const { return WasmObj.get(); }		const WasmObjectFile *getWasmObj() const { return WasmObj.get(); }
		DWARFContext *getDWARFContext() const { return DWContext.get(); }

void dumpInfo() const;		void dumpInfo() const;

uint32_t calcNewIndex(const WasmRelocation &Reloc) const;		uint32_t calcNewIndex(const WasmRelocation &Reloc) const;
uint32_t calcNewValue(const WasmRelocation &Reloc) const;		uint32_t calcNewValue(const WasmRelocation &Reloc) const;
uint32_t calcNewAddend(const WasmRelocation &Reloc) const;		uint32_t calcNewAddend(const WasmRelocation &Reloc) const;
uint32_t calcExpectedValue(const WasmRelocation &Reloc) const;		uint32_t calcExpectedValue(const WasmRelocation &Reloc) const;

		std::unique_ptr<PCTransformIterator>
		getPCTransformIterator(const WasmRelocation &Reloc) const;

const WasmSection *CodeSection = nullptr;		const WasmSection *CodeSection = nullptr;
const WasmSection *DataSection = nullptr;		const WasmSection *DataSection = nullptr;

// Maps input type indices to output type indices		// Maps input type indices to output type indices
std::vector<uint32_t> TypeMap;		std::vector<uint32_t> TypeMap;
std::vector<bool> TypeIsUsed;		std::vector<bool> TypeIsUsed;
// Maps function indices to table indices		// Maps function indices to table indices
std::vector<uint32_t> TableEntries;		std::vector<uint32_t> TableEntries;
[16 lines not shown]	private:
Symbol *createUndefined(const WasmSymbol &Sym);		Symbol *createUndefined(const WasmSymbol &Sym);

bool isExcludedByComdat(InputChunk *Chunk) const;		bool isExcludedByComdat(InputChunk *Chunk) const;

// List of all symbols referenced or defined by this file.		// List of all symbols referenced or defined by this file.
std::vector<Symbol *> Symbols;		std::vector<Symbol *> Symbols;

std::unique_ptr<WasmObjectFile> WasmObj;		std::unique_ptr<WasmObjectFile> WasmObj;

		std::unique_ptr<DWARFContext> DWContext;
};		};

// Opens a given file.		// Opens a given file.
llvm::Optional<MemoryBufferRef> readFile(StringRef Path);		llvm::Optional<MemoryBufferRef> readFile(StringRef Path);

} // namespace wasm		} // namespace wasm

std::string toString(const wasm::InputFile *File);		std::string toString(const wasm::InputFile *File);

} // namespace lld		} // namespace lld

#endif		#endif

wasm/InputFiles.cpp

... 36 lines hidden ...	Optional<MemoryBufferRef> lld::wasm::readFile(StringRef Path) {
}		}
std::unique_ptr<MemoryBuffer> &MB = *MBOrErr;		std::unique_ptr<MemoryBuffer> &MB = *MBOrErr;
MemoryBufferRef MBRef = MB->getMemBufferRef();		MemoryBufferRef MBRef = MB->getMemBufferRef();
make<std::unique_ptr<MemoryBuffer>>(std::move(MB)); // take MB ownership		make<std::unique_ptr<MemoryBuffer>>(std::move(MB)); // take MB ownership

return MBRef;		return MBRef;
}		}

		uint32_t PCTransformIterator::advance(uint32_t Delta) {
		const uint32_t PastNewPC = NewPC;
		OldPC += Delta;
		NewPC = translate(OldPC);
		assert(NewPC >= PastNewPC);
		return NewPC - PastNewPC;
		}

		uint32_t PCTransformIterator::translate(uint32_t PC) const {
		auto i = Function.translateCompressedPC(PC);
		return i;
		}

void ObjFile::dumpInfo() const {		void ObjFile::dumpInfo() const {
log("info for: " + getName() +		log("info for: " + getName() +
"\n Symbols : " + Twine(Symbols.size()) +		"\n Symbols : " + Twine(Symbols.size()) +
"\n Function Imports : " + Twine(WasmObj->getNumImportedFunctions()) +		"\n Function Imports : " + Twine(WasmObj->getNumImportedFunctions()) +
"\n Global Imports : " + Twine(WasmObj->getNumImportedGlobals()));		"\n Global Imports : " + Twine(WasmObj->getNumImportedGlobals()));
}		}

// Relocations contain either symbol or type indices. This function takes a		// Relocations contain either symbol or type indices. This function takes a
... 90 lines hidden ...	case R_WEBASSEMBLY_FUNCTION_OFFSET_I32:
return 0;		return 0;
case R_WEBASSEMBLY_SECTION_OFFSET_I32:		case R_WEBASSEMBLY_SECTION_OFFSET_I32:
return getSectionSymbol(Reloc.Index)->Section->OutputOffset + Reloc.Addend;		return getSectionSymbol(Reloc.Index)->Section->OutputOffset + Reloc.Addend;
default:		default:
llvm_unreachable("unknown relocation type");		llvm_unreachable("unknown relocation type");
}		}
}		}

		std::unique_ptr<PCTransformIterator>
		ObjFile::getPCTransformIterator(const WasmRelocation &Reloc) const {
		if (auto *Sym = dyn_cast<DefinedFunction>(getFunctionSymbol(Reloc.Index)))
		return make_unique<PCTransformIterator>(*Sym->Function, Reloc.Addend);
		llvm_unreachable("InputFunction is not found");
		}

void ObjFile::parse() {		void ObjFile::parse() {
// Parse a memory buffer as a wasm file.		// Parse a memory buffer as a wasm file.
LLVM_DEBUG(dbgs() << "Parsing object: " << toString(this) << "\n");		LLVM_DEBUG(dbgs() << "Parsing object: " << toString(this) << "\n");
std::unique_ptr<Binary> Bin = CHECK(createBinary(MB), toString(this));		std::unique_ptr<Binary> Bin = CHECK(createBinary(MB), toString(this));

auto *Obj = dyn_cast<WasmObjectFile>(Bin.get());		auto *Obj = dyn_cast<WasmObjectFile>(Bin.get());
if (!Obj)		if (!Obj)
fatal(toString(this) + ": not a wasm file");		fatal(toString(this) + ": not a wasm file");
if (!Obj->isRelocatableObject())		if (!Obj->isRelocatableObject())
fatal(toString(this) + ": not a relocatable wasm file");		fatal(toString(this) + ": not a relocatable wasm file");

Bin.release();		Bin.release();
WasmObj.reset(Obj);		WasmObj.reset(Obj);

		DWContext = DWARFContext::create(*Obj);

// Build up a map of function indices to table indices for use when		// Build up a map of function indices to table indices for use when
// verifying the existing table index relocations		// verifying the existing table index relocations
uint32_t TotalFunctions =		uint32_t TotalFunctions =
WasmObj->getNumImportedFunctions() + WasmObj->functions().size();		WasmObj->getNumImportedFunctions() + WasmObj->functions().size();
TableEntries.resize(TotalFunctions);		TableEntries.resize(TotalFunctions);
for (const WasmElemSegment &Seg : WasmObj->elements()) {		for (const WasmElemSegment &Seg : WasmObj->elements()) {
if (Seg.Offset.Opcode != WASM_OPCODE_I32_CONST)		if (Seg.Offset.Opcode != WASM_OPCODE_I32_CONST)
fatal(toString(this) + ": invalid table elements");		fatal(toString(this) + ": invalid table elements");
... 210 lines hidden ...

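The delta bookkeeping in `PCTransformIterator::advance` can be illustrated with a standalone sketch. The patch does not show `translateCompressedPC`, so the mapping below is an assumption made purely for illustration: a hypothetical table of shrunk LEB fields, each recording where the padded field ended in the input function and how many bytes compression saved. The names `ShrunkField` and `PcDeltaTranslator` are not from the patch, and the sketch assumes every PC visited lies on an instruction boundary, outside any shrunk field.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical record: one padded LEB field that became shorter.
struct ShrunkField {
  uint32_t OldEnd; // offset just past the padded field in the input function
  uint32_t Saved;  // bytes removed by re-encoding the field compactly
};

// Illustrative stand-in for the patch's PCTransformIterator.
class PcDeltaTranslator {
public:
  PcDeltaTranslator(std::vector<ShrunkField> Table, uint32_t StartPC)
      : Fields(std::move(Table)), OldPC(StartPC), NewPC(translate(StartPC)) {}

  // Assumed mapping: subtract the bytes saved by every field that ends at or
  // before PC. Only valid for PCs that are not inside a shrunk field.
  uint32_t translate(uint32_t PC) const {
    uint32_t Saved = 0;
    for (const ShrunkField &F : Fields)
      if (F.OldEnd <= PC)
        Saved += F.Saved;
    return PC - Saved;
  }

  // Same shape as advance() in the diff: move the old PC forward by Delta and
  // return how far the new PC moved as a result.
  uint32_t advance(uint32_t Delta) {
    const uint32_t PastNewPC = NewPC;
    OldPC += Delta;
    NewPC = translate(OldPC);
    assert(NewPC >= PastNewPC && "translated PCs must not go backwards");
    return NewPC - PastNewPC;
  }

private:
  std::vector<ShrunkField> Fields; // declared first so translate() can use it
  uint32_t OldPC;
  uint32_t NewPC;
};
```

Under these assumptions, a line-table rewriter would call `advance` once per address-advance operation and write the returned delta in place of the original one. Returning deltas instead of absolute addresses matches how the line program encodes address changes.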
[WIP] [WebAssembly] Update .debug_line section PC addresses during LEB optimization
Abandoned · Public
