This is an archive of the discontinued LLVM Phabricator instance.

sbc100 retitled this revision from [WebAssembly] Use Symbol class heirarchy. NFC. to [WIP] [WebAssembly] Use Symbol class heirarchy. NFC..Feb 8 2018, 8:35 PM

remove line

Harbormaster completed remote builds in B14790: Diff 133557.Feb 8 2018, 8:40 PM

LGTM -- nothing obviously weird from my skim.

wasm/InputFiles.cpp
56	Do you need the else here?

This revision is now accepted and ready to land.Feb 8 2018, 9:19 PM

sbc100 retitled this revision from [WIP] [WebAssembly] Use Symbol class heirarchy. NFC. to [WebAssembly] Use Symbol class heirarchy. NFC..Feb 9 2018, 11:27 AM

sbc100 added reviewers: ncw, ruiu.

Herald added a subscriber: arichardson. · View Herald TranscriptFeb 9 2018, 11:27 AM

This change has some pretty serious merge conflicts with the symbol change that is in the pipeline, so I won't land it without a clear plan. (I'm currently looking at resolving the merge myself, but might wait until the larger change lands.)

wasm/InputFiles.cpp
56	I don't need it, no. I know it's not the lld style; it just reads better for me. I'll remove it though.

Overall looking very nice. It closes the gap between wasm and other lld ports and makes it easier to read.

wasm/InputFiles.cpp
307–318	Ideally it should return `FunctionSymbol *`.
wasm/InputFiles.h
111	Can you make this type safe? I.e. can you change the type of FunctionSymbols so that you don't need to use cast?
wasm/SymbolTable.cpp
148–149	It is not clear to me why we need addDefined function even though we have addDefinedFunction and addDefinedGlobal functions. What is this for?
198	Ditto -- having addUndefinedFunction and addUndefined doesn't seem orthogonal because the former seems like a subset of the latter.
wasm/Symbols.h
88	This class hierarchy is interesting

Symbol
  FunctionSymbol
    DefinedFunction
    UndefinedFunction
  GlobalSymbol
    DefinedGlobal
    UndefinedGlobal
  Lazy

because in ELF and COFF, defined/undefined is a broader category than function/non-function. But I can see that your class hierarchy just represents how symbols are organized in wasm. (That's one of the reasons why I think the current design of lld, in which all ports share only the design but not a concrete implementation, is the right one. If we had to implement coff/elf/wasm/etc. on one "unified" code base, that would have been extremely hard to do.)
125	Can you move UndefinedFunction here so that related classes are adjacent in source code?

sbc100 added inline comments.Feb 9 2018, 1:52 PM

wasm/Symbols.h
88	Indeed. It is quite a major difference from ELF here. It only occurred to me when doing this work that ELF doesn't have different undefined types, whereas in wasm an undefined symbol has a lot of type information (for example, undefined functions carry their type signature). We are a little constrained by the typical OO class hierarchy here because, for example, under this scheme there is no common base class for defined or undefined symbols, not that it's a huge problem. I think this scheme is still the best option.
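For readers following the hierarchy discussion, here is a minimal sketch of the shape being described. It is not the code under review: `Signature`, the constructors, and the accessor names are hypothetical, and `classof` is hand-rolled rather than wired into LLVM's casting helpers. It shows that an undefined function still carries a signature, and that "defined or not" has to be answered from the kind because no shared Defined base class exists.

```cpp
#include <cassert>

// Hypothetical stand-in for a wasm function signature.
struct Signature {
  int NumParams;
  int NumReturns;
};

class Symbol {
public:
  enum Kind {
    DefinedFunctionKind,
    UndefinedFunctionKind,
    DefinedGlobalKind,
    UndefinedGlobalKind,
    LazyKind,
  };

  Kind kind() const { return SymbolKind; }
  const char *getName() const { return Name; }

  // There is no common "Defined" base class, so this is derived from the kind.
  bool isDefined() const {
    return SymbolKind == DefinedFunctionKind || SymbolKind == DefinedGlobalKind;
  }

protected:
  Symbol(const char *Name, Kind K) : Name(Name), SymbolKind(K) {}

private:
  const char *Name;
  Kind SymbolKind;
};

class FunctionSymbol : public Symbol {
public:
  static bool classof(const Symbol *S) {
    return S->kind() == DefinedFunctionKind ||
           S->kind() == UndefinedFunctionKind;
  }
  const Signature *getSignature() const { return Sig; }

protected:
  FunctionSymbol(const char *Name, Kind K, const Signature *Sig)
      : Symbol(Name, K), Sig(Sig) {}

private:
  const Signature *Sig;
};

class DefinedFunction : public FunctionSymbol {
public:
  DefinedFunction(const char *Name, const Signature *Sig)
      : FunctionSymbol(Name, DefinedFunctionKind, Sig) {}
};

// Unlike ELF, an undefined function still knows its type signature.
class UndefinedFunction : public FunctionSymbol {
public:
  UndefinedFunction(const char *Name, const Signature *Sig)
      : FunctionSymbol(Name, UndefinedFunctionKind, Sig) {}
};

class GlobalSymbol : public Symbol {
public:
  static bool classof(const Symbol *S) {
    return S->kind() == DefinedGlobalKind || S->kind() == UndefinedGlobalKind;
  }

protected:
  GlobalSymbol(const char *Name, Kind K) : Symbol(Name, K) {}
};

class DefinedGlobal : public GlobalSymbol {
public:
  explicit DefinedGlobal(const char *Name)
      : GlobalSymbol(Name, DefinedGlobalKind) {}
};

class UndefinedGlobal : public GlobalSymbol {
public:
  explicit UndefinedGlobal(const char *Name)
      : GlobalSymbol(Name, UndefinedGlobalKind) {}
};

class Lazy : public Symbol {
public:
  explicit Lazy(const char *Name) : Symbol(Name, LazyKind) {}
};
```

Every class in this sketch holds only raw pointers and an enum, so all of them stay trivially destructible.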

sbc100 added inline comments.Feb 9 2018, 4:23 PM

wasm/InputFiles.h
111	Hmm... I just tried it and realized why I didn't do it like this in the first place. The problem is that FunctionSymbols is constructed as we parse the object, before all symbols have been resolved, and at that time archive symbols are Lazy; they will eventually transform into DefinedFunction or DefinedGlobal symbols. A Lazy symbol can't be cast to either of those types. I could perhaps work around this by delaying the construction of this vector, or perhaps by introducing LazyGlobal and LazyFunction subtypes?
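The problem described here can be illustrated with a small sketch. The types and the `dynCast` helper below are hypothetical simplifications (similar in spirit to `llvm::dyn_cast`, but not the LLVM implementation): while an archive symbol is still Lazy, a checked downcast to `FunctionSymbol` fails, so a vector typed as `FunctionSymbol *` could not hold it at parse time.

```cpp
#include <cassert>
#include <vector>

struct Symbol {
  enum Kind { DefinedFunctionKind, UndefinedFunctionKind, LazyKind };
  explicit Symbol(Kind K) : SymbolKind(K) {}
  Kind SymbolKind;
};

struct FunctionSymbol : Symbol {
  explicit FunctionSymbol(Kind K) : Symbol(K) {}
  static bool classof(const Symbol *S) {
    return S->SymbolKind == DefinedFunctionKind ||
           S->SymbolKind == UndefinedFunctionKind;
  }
};

struct Lazy : Symbol {
  Lazy() : Symbol(LazyKind) {}
  static bool classof(const Symbol *S) { return S->SymbolKind == LazyKind; }
};

// Hypothetical checked downcast: returns nullptr when the kind does not match.
template <typename T> T *dynCast(Symbol *S) {
  return T::classof(S) ? static_cast<T *>(S) : nullptr;
}

// Counts the entries that are already function symbols; Lazy entries are
// skipped because they have not been resolved to a concrete type yet.
inline unsigned countResolved(const std::vector<Symbol *> &Syms) {
  unsigned N = 0;
  for (Symbol *S : Syms)
    if (dynCast<FunctionSymbol>(S))
      ++N;
  return N;
}
```

This is why the accessor in the diff keeps a `Symbol *` vector and casts on access, once resolution has replaced the Lazy entries.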

ncw added inline comments.Feb 13 2018, 2:23 AM

wasm/Symbols.h
250–257	Yikes, this is scary!

For one, doing `make<SymbolUnion>` means that the actual symbol's destructor won't be called at program end. We're relying on the Symbol classes being "simple" with trivial destructors, but what if someone forgets and sticks a std::vector in there as a member...? It's not ideal, quite fragile.

And similarly, replaceSymbol doesn't deallocate the previous data in the union, it just writes straight over it. The new object will be OK, but the previous one will leak any members. Again not a problem as long as the Symbols all have trivial dtors, but it feels like an accident waiting to happen.

Is it technically Undefined Behaviour? You seem to be relying on the exact details of the base-to-derived pointer adjustments. First we allocate the union, then reinterpret-cast it to a `Symbol*`, then placement-construct an UndefinedFunction symbol. Then the first bit of UB happens: we assume that the newly-constructed UndefinedFunction, when cast to a Symbol, has the same address. That is, we assume the following:

`(void*)(BaseClass*)(new (addr) DerivedClass) == (void*)addr`

Then the second bit of UB happens when we do replaceSymbol and make the same assumption again, but in replaceSymbol we further assume that all derived classes will construct their Symbol base at the same offset within the memory block we're placement-constructing them in (it's just another assumption about object layout). The basic assumption is that base classes are constructed before the derived class, and the base class is at offset zero within the memory when placement-constructed at a specific address.

I can see it's just copying the existing LLD code. What you're trying to achieve is to dynamically change the derived type of an object, so that previously-created pointers to the base class remain valid as pointers to the new object's base class.

There is a solution I can think of that's "safe". Instead of doing `reinterpret_cast<Symbol*>(make<SymbolUnion>())`, why not instead put the Kind member next to the union, something like this: `struct SymbolHandle { int Kind; SymbolUnion U; }` and then store pointers to the union-with-kind. You can then safely destruct the union using a switch, and safely replace the union too; and finally, you can add automatic casting operators that allow casting a SymbolHandle to an `UndefinedFunction&` etc. with an assertion on Kind and a cast on the appropriate member of the union.
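A rough sketch of what the suggested handle could look like follows. Only the `SymbolHandle` name and the idea of keeping the kind next to the union come from the comment above; the member types, `replace`, and the `as...` accessors are hypothetical, and named accessors are used here in place of casting operators.

```cpp
#include <cassert>
#include <new>

// Hypothetical payload types; a real linker's symbols carry far more state.
struct UndefinedFunction {
  const char *Name;
  int SigIndex;
};
struct DefinedFunction {
  const char *Name;
  int OutputIndex;
};

class SymbolHandle {
public:
  enum Kind { Empty, UndefinedFunctionKind, DefinedFunctionKind };

  SymbolHandle() : K(Empty) {}
  SymbolHandle(const SymbolHandle &) = delete;
  SymbolHandle &operator=(const SymbolHandle &) = delete;
  ~SymbolHandle() { destroy(); }

  // Replace the payload in place; pointers to the handle stay valid.
  void replace(const UndefinedFunction &V) {
    destroy();
    new (&U.Undef) UndefinedFunction(V);
    K = UndefinedFunctionKind;
  }
  void replace(const DefinedFunction &V) {
    destroy();
    new (&U.Def) DefinedFunction(V);
    K = DefinedFunctionKind;
  }

  Kind kind() const { return K; }

  // Checked accessors: assert on the kind, then return the matching member.
  UndefinedFunction &asUndefinedFunction() {
    assert(K == UndefinedFunctionKind);
    return U.Undef;
  }
  DefinedFunction &asDefinedFunction() {
    assert(K == DefinedFunctionKind);
    return U.Def;
  }

private:
  // Destroy whichever member is currently live, selected by the kind.
  void destroy() {
    switch (K) {
    case UndefinedFunctionKind:
      U.Undef.~UndefinedFunction();
      break;
    case DefinedFunctionKind:
      U.Def.~DefinedFunction();
      break;
    case Empty:
      break;
    }
    K = Empty;
  }

  union Storage {
    Storage() {}
    ~Storage() {}
    UndefinedFunction Undef;
    DefinedFunction Def;
  };

  Kind K;
  Storage U;
};
```

The trade-off is an extra tag check on every access and a switch on destruction, which is the cost the existing lld approach avoids by requiring trivially destructible symbols.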

sbc100 added inline comments.Feb 13 2018, 8:49 AM

wasm/Symbols.h
250–257	I think perhaps you raise some good points, but I see this as more of an lld-wide discussion. The other ports have always worked in this way AFAICT, and it was always my intention to have the wasm port do the same thing. Perhaps @ruiu can add some documentation about this technique, why it is actually safe in this context, and what the motivating factors are for using it. In any case I don't think we should block this change on this design discussion. Let's make the linkers consistent and iterate (together) from there.

As to the use of the placement new, it is an intentional design choice that symbols are trivially destructible. We shouldn't add anything that needs a destructor to Symbol for performance reasons. We sometimes allocate literally millions of symbols, and we don't want to call destructors millions of times. Also, even if we had a destructor for Symbol, it wouldn't be called because we usually use _exit to terminate the process as quickly as possible, without calling the destructors of objects that are used throughout the linking process. lld may not look very "object-oriented", and that's intentional.

As to the union type, we don't really use it as a union. It is a convenient way to allocate memory that is large enough to hold any Symbol-derived type, but we don't access any member of it. As you wrote, maybe we should make it explicit by changing (Symbol *)make<SymbolUnion>() to reinterpret_cast<Symbol *>(make<SymbolUnion>()) (I actually wrote that cast with reinterpret_cast in mind, without noticing that there's another way of interpreting it), or if the existence of the union causes confusion like that, we should remove SymbolUnion and define char[maximum symbol size] for a symbol instead. The point is that the union isn't important. We just want to allocate blank memory for symbols. We identify object type using LLVM's classof mechanism, and I think that works just fine.
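The "blank memory" idea can be sketched as follows. This is an illustration under assumptions, not lld's code: `SymbolStorage`, the two symbol types, and the `void *` parameter of this `replaceSymbol` are hypothetical (the real function takes a `Symbol *`). The union exists only to give storage that is large and aligned enough for any symbol type, and its members are never read.

```cpp
#include <cassert>
#include <new>
#include <utility>

struct Symbol {
  enum Kind { UndefinedKind, DefinedKind };
  Kind SymbolKind;
  const char *Name;

protected:
  Symbol(Kind K, const char *N) : SymbolKind(K), Name(N) {}
};

struct Undefined : Symbol {
  explicit Undefined(const char *N) : Symbol(UndefinedKind, N) {}
};

struct Defined : Symbol {
  Defined(const char *N, unsigned A) : Symbol(DefinedKind, N), Address(A) {}
  unsigned Address;
};

// Blank storage, sized and aligned for the largest symbol type. The byte
// arrays are placeholders; no member is ever accessed as such.
union SymbolStorage {
  alignas(Undefined) unsigned char A[sizeof(Undefined)];
  alignas(Defined) unsigned char B[sizeof(Defined)];
};

// Construct a T over whatever currently occupies the storage. No destructor
// runs for the previous occupant, so symbol types must be trivially
// destructible for this to be leak-free.
template <typename T, typename... ArgT>
T *replaceSymbol(void *Storage, ArgT &&...Arg) {
  static_assert(sizeof(T) <= sizeof(SymbolStorage), "storage too small");
  static_assert(alignof(T) <= alignof(SymbolStorage), "storage under-aligned");
  return new (Storage) T(std::forward<ArgT>(Arg)...);
}
```

A slot can start out as `Undefined` and later be overwritten with `Defined` at the same address, which is what lets symbol-table entries change type during resolution.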

Add static assert

Harbormaster completed remote builds in B14923: Diff 134059.Feb 13 2018, 9:26 AM

On second thought, it is already interpreted as reinterpret_cast<> and is not UB, no? SymbolUnion is not a union of Symbols, but a union of char[].

Add static assert
reinterpret_cast

Harbormaster completed remote builds in B14927: Diff 134073.Feb 13 2018, 10:13 AM

Add static assert
reinterpret_cast
cleanup

Harbormaster completed remote builds in B14934: Diff 134088.Feb 13 2018, 11:10 AM

sbc100 added inline comments.Feb 13 2018, 11:57 AM

wasm/Symbols.h
250–257	FWIW, it looks like it should be easy to protect against (1) and (2) by adding: static_assert(std::is_trivially_destructible<T>(), "Symbol types must be trivially destructible"); The spec on this explicitly says `Storage occupied by trivially destructible objects may be reused without calling the destructor.`. That still leaves the concerns you raise in (3) of course.
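A small sketch of how such a guard behaves is shown below. The `static_assert` message is the one quoted above; `isSafeToOverwrite`, `LeakySymbol`, and the `void *` storage parameter are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

struct Symbol {
  explicit Symbol(int K) : Kind(K) {}
  int Kind;
};

struct DefinedGlobal : Symbol {
  explicit DefinedGlobal(unsigned A) : Symbol(1), Address(A) {}
  unsigned Address;
};

// A symbol type that someone "forgot" about: the vector member needs a
// destructor call, so overwriting it in place would leak.
struct LeakySymbol : Symbol {
  LeakySymbol() : Symbol(2) {}
  std::vector<int> Extra;
};

// Hypothetical helper naming the property the static_assert enforces.
template <typename T> constexpr bool isSafeToOverwrite() {
  return std::is_base_of<Symbol, T>::value &&
         std::is_trivially_destructible<T>::value;
}

template <typename T, typename... ArgT>
T *replaceSymbol(void *Storage, ArgT &&...Arg) {
  static_assert(std::is_trivially_destructible<T>(),
                "Symbol types must be trivially destructible");
  return new (Storage) T(std::forward<ArgT>(Arg)...);
}
```

With this in place, `replaceSymbol<LeakySymbol>(...)` fails to compile, so the mistake is caught at build time rather than showing up as a leak.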

reorder classes

sbc100 marked an inline comment as done.Feb 13 2018, 12:05 PM

sbc100 added inline comments.

wasm/SymbolTable.cpp
148–149	Renamed to reflect their purpose: addSyntheticGlobal etc.

Harbormaster completed remote builds in B14938: Diff 134093.Feb 13 2018, 12:07 PM

rebase

LGTM

Feel free to commit whenever convenient for you.

In D43112#1006394, @ruiu wrote:

As to the union type, we don't really use it as a union. It is a convenient way to allocate memory that is large enough to hold any Symbol-derived type, but we don't access any member of it. As you wrote, maybe we should make it explicit by changing (Symbol *)make<SymbolUnion>() to reinterpret_cast<Symbol *>(make<SymbolUnion>()) (I actually wrote that cast with reinterpret_cast in mind, without noticing that there's another way of interpreting it), or if the existence of the union causes confusion like that, we should remove SymbolUnion and define char[maximum symbol size] for a symbol instead. The point is that the union isn't important. We just want to allocate blank memory for symbols. We identify object type using LLVM's classof mechanism, and I think that works just fine.

On second thought, it is already interpreted as reinterpret_cast<> and is not UB, no? SymbolUnion is not a union of Symbols, but a union of char[].

I'm happy with the assertions that the types are trivially-destructible, that makes good sense.

I'm still a bit uneasy that there might be UB. You're assuming that the base class is positioned right at the front of the class: you take the address of the union and reinterpret it as a Symbol*. Then, when you actually construct the object using placement new, you're assuming that the base-class pointer is unadjusted:

template <typename T, typename... ArgT>
T *replaceSymbol(Symbol *S, ArgT &&... Arg) {
  T *Derived = new (S) T(std::forward<ArgT>(Arg)...);
  assert(static_cast<Symbol*>(Derived) == S); // This is what you are requiring to be true
  return Derived;
}

And yet: the compiler's (implementation-defined) rules for object layout might not put the base class there at the same address as the start of the memory. (And possibly the compile environment could redefine placement new to use padding, possibly by some kind of sanitizer guards?)

I think that for a "standard layout" class, the object layout has to put the base class first in memory (http://en.cppreference.com/w/cpp/language/data_members). Unfortunately, the Symbol classes are not "standard layout", so I can't see any standards-provided guarantee that the base class is where you want it to be.

I don't want to be a pesky language lawyer. Would it be OK to add the assertion above?
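To make the layout concern concrete, here is a small sketch that compares the address of a base subobject with the address of the complete object. All types are hypothetical. The outcome depends on the ABI rather than on the standard: on common ABIs a single non-polymorphic base sits at offset zero, while a second non-empty base does not. The check uses a real object because converting a null pointer always yields a null pointer, whatever the layout.

```cpp
#include <cassert>

struct Symbol {
  int Kind;
};

// Single inheritance: common ABIs place the Symbol base at offset zero.
struct FunctionSymbol : Symbol {
  int TableIndex;
};

// A layout that would break the assumption: Symbol is the second base, so
// on common ABIs it is placed after Extra.
struct Extra {
  int Padding;
};
struct Unusual : Extra, Symbol {
  int Value;
};

// Returns true when the Base subobject starts at the same address as the
// complete Derived object, which is what the placement-new trick relies on.
template <typename Base, typename Derived>
bool baseSharesAddress(Derived &D) {
  return static_cast<void *>(static_cast<Base *>(&D)) ==
         static_cast<void *>(&D);
}
```

An assertion of this kind inside replaceSymbol would turn a silent layout mismatch into an immediate failure.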

wasm/Symbols.h
250–257	"have always worked this way" - I think it's actually recent, looks like the change dates to 31 Oct 2017. Agreed that it's OK to change Wasm to match.

In terms of placement new, the implementation can only ever assume sizeof(T) storage. It's documented that users of placement new are supposed to allocate sizeof(T) bytes and pass it in: http://en.cppreference.com/w/cpp/language/new.

I think adding the extra assert seems reasonable. I'm not familiar with object layout rules. I think that can happen as a followup and we can add it to all three implementations.

wasm/Symbols.h
250–257	I guess so. I was thinking that they have always done similar tricks involving in-place replacement of the symbols with sub-types of symbol. But I don't know if the old behaviour was more or less risky.

Also, doesn't the existing assert do exactly what you want already:

assert(static_cast<Symbol *>(static_cast<T *>(nullptr)) == nullptr &&          
         "Not a Symbol");

Closed by commit rL325150: [WebAssembly] Use a Symbol class heirarchy. NFC. (authored by sbc). · Explain WhyFeb 14 2018, 10:30 AM

Closed by commit rLLD325150: [WebAssembly] Use a Symbol class heirarchy. NFC. (authored by sbc). · Explain Why

This revision was automatically updated to reflect the committed changes.

In D43112#1007692, @sbc100 wrote:
Also, doesn't the existing assert do exactly what you want already:
assert(static_cast<Symbol *>(static_cast<T *>(nullptr)) == nullptr &&          
         "Not a Symbol");
?

D'oh! I hadn't spotted that one in amongst the rest, it covers exactly one of the cases I was worried about.

OK, I think the existing asserts cover it all then, plus your helpful comment about placement new clears that up.

Whew, thanks for adding the other assertions. I agree now it's all safe. (Although, a standards-conforming compiler could quite legally fail the assertion, no compilers will in practice and you'd just get a build failure. So that's OK.)

Diff 134261

wasm/InputFiles.h

Show First 20 Lines • Show All 101 Lines • ▼ Show 20 Lines	public:

std::vector<uint32_t> TypeMap;		std::vector<uint32_t> TypeMap;
std::vector<bool> TypeIsUsed;		std::vector<bool> TypeIsUsed;
std::vector<InputSegment *> Segments;		std::vector<InputSegment *> Segments;
std::vector<InputFunction *> Functions;		std::vector<InputFunction *> Functions;

ArrayRef<Symbol *> getSymbols() const { return Symbols; }		ArrayRef<Symbol *> getSymbols() const { return Symbols; }

Symbol *getFunctionSymbol(uint32_t Index) const {		FunctionSymbol *getFunctionSymbol(uint32_t Index) const {
return FunctionSymbols[Index];		return cast<FunctionSymbol>(FunctionSymbols[Index]);
}		}

Symbol *getGlobalSymbol(uint32_t Index) const { return GlobalSymbols[Index]; }		GlobalSymbol *getGlobalSymbol(uint32_t Index) const {
		return cast<GlobalSymbol>(GlobalSymbols[Index]);
		}

private:		private:
uint32_t relocateVirtualAddress(uint32_t Index) const;		uint32_t relocateVirtualAddress(uint32_t Index) const;
uint32_t relocateTypeIndex(uint32_t Original) const;		uint32_t relocateTypeIndex(uint32_t Original) const;
uint32_t relocateGlobalIndex(uint32_t Original) const;		uint32_t relocateGlobalIndex(uint32_t Original) const;
uint32_t relocateTableIndex(uint32_t Original) const;		uint32_t relocateTableIndex(uint32_t Original) const;

Symbol *createDefined(const WasmSymbol &Sym, Symbol::Kind Kind,		Symbol *createDefinedGlobal(const WasmSymbol &Sym, InputChunk *Chunk,
InputChunk *Chunk = nullptr,		uint32_t Address);
uint32_t Address = UINT32_MAX);		Symbol *createDefinedFunction(const WasmSymbol &Sym, InputChunk *Chunk);
Symbol *createUndefined(const WasmSymbol &Sym, Symbol::Kind Kind,		Symbol *createUndefined(const WasmSymbol &Sym, Symbol::Kind Kind,
const WasmSignature *Signature = nullptr);		const WasmSignature *Signature = nullptr);
void initializeSymbols();		void initializeSymbols();
InputSegment *getSegment(const WasmSymbol &WasmSym) const;		InputSegment *getSegment(const WasmSymbol &WasmSym) const;
const WasmSignature *getFunctionSig(const WasmSymbol &Sym) const;		const WasmSignature *getFunctionSig(const WasmSymbol &Sym) const;
uint32_t getGlobalValue(const WasmSymbol &Sym) const;		uint32_t getGlobalValue(const WasmSymbol &Sym) const;
InputFunction *getFunction(const WasmSymbol &Sym) const;		InputFunction *getFunction(const WasmSymbol &Sym) const;
bool isExcludedByComdat(InputChunk *Chunk) const;		bool isExcludedByComdat(InputChunk *Chunk) const;
Show All 25 Lines

wasm/InputFiles.cpp

Show First 20 Lines • Show All 45 Lines • ▼ Show 20 Lines	void ObjFile::dumpInfo() const {
log("info for: " + getName() + "\n" +		log("info for: " + getName() + "\n" +
" Total Functions : " + Twine(FunctionSymbols.size()) + "\n" +		" Total Functions : " + Twine(FunctionSymbols.size()) + "\n" +
" Total Globals : " + Twine(GlobalSymbols.size()) + "\n" +		" Total Globals : " + Twine(GlobalSymbols.size()) + "\n" +
" Function Imports : " + Twine(NumFunctionImports) + "\n" +		" Function Imports : " + Twine(NumFunctionImports) + "\n" +
" Global Imports : " + Twine(NumGlobalImports) + "\n");		" Global Imports : " + Twine(NumGlobalImports) + "\n");
}		}

uint32_t ObjFile::relocateVirtualAddress(uint32_t GlobalIndex) const {		uint32_t ObjFile::relocateVirtualAddress(uint32_t GlobalIndex) const {
return getGlobalSymbol(GlobalIndex)->getVirtualAddress();		if (auto *DG = dyn_cast<DefinedGlobal>(getGlobalSymbol(GlobalIndex)))
		return DG->getVirtualAddress();
		else
		return 0;
}		}

uint32_t ObjFile::relocateFunctionIndex(uint32_t Original) const {		uint32_t ObjFile::relocateFunctionIndex(uint32_t Original) const {
const Symbol *Sym = getFunctionSymbol(Original);		const FunctionSymbol *Sym = getFunctionSymbol(Original);
uint32_t Index = Sym->getOutputIndex();		uint32_t Index = Sym->getOutputIndex();
DEBUG(dbgs() << "relocateFunctionIndex: " << toString(*Sym) << ": "		DEBUG(dbgs() << "relocateFunctionIndex: " << toString(*Sym) << ": "
<< Original << " -> " << Index << "\n");		<< Original << " -> " << Index << "\n");
return Index;		return Index;
}		}

uint32_t ObjFile::relocateTypeIndex(uint32_t Original) const {		uint32_t ObjFile::relocateTypeIndex(uint32_t Original) const {
assert(TypeIsUsed[Original]);		assert(TypeIsUsed[Original]);
return TypeMap[Original];		return TypeMap[Original];
}		}

uint32_t ObjFile::relocateTableIndex(uint32_t Original) const {		uint32_t ObjFile::relocateTableIndex(uint32_t Original) const {
const Symbol *Sym = getFunctionSymbol(Original);		const FunctionSymbol *Sym = getFunctionSymbol(Original);
uint32_t Index = Sym->hasTableIndex() ? Sym->getTableIndex() : 0;		uint32_t Index = Sym->hasTableIndex() ? Sym->getTableIndex() : 0;
DEBUG(dbgs() << "relocateTableIndex: " << toString(*Sym) << ": " << Original		DEBUG(dbgs() << "relocateTableIndex: " << toString(*Sym) << ": " << Original
<< " -> " << Index << "\n");		<< " -> " << Index << "\n");
return Index;		return Index;
}		}

uint32_t ObjFile::relocateGlobalIndex(uint32_t Original) const {		uint32_t ObjFile::relocateGlobalIndex(uint32_t Original) const {
const Symbol *Sym = getGlobalSymbol(Original);		const Symbol *Sym = getGlobalSymbol(Original);
▲ Show 20 Lines • Show All 164 Lines • ▼ Show 20 Lines	void ObjFile::initializeSymbols() {
// in the object		// in the object
for (const SymbolRef &Sym : WasmObj->symbols()) {		for (const SymbolRef &Sym : WasmObj->symbols()) {
const WasmSymbol &WasmSym = WasmObj->getWasmSymbol(Sym.getRawDataRefImpl());		const WasmSymbol &WasmSym = WasmObj->getWasmSymbol(Sym.getRawDataRefImpl());
Symbol *S;		Symbol *S;
switch (WasmSym.Type) {		switch (WasmSym.Type) {
case WasmSymbol::SymbolType::FUNCTION_EXPORT: {		case WasmSymbol::SymbolType::FUNCTION_EXPORT: {
InputFunction *Function = getFunction(WasmSym);		InputFunction *Function = getFunction(WasmSym);
if (!isExcludedByComdat(Function)) {		if (!isExcludedByComdat(Function)) {
S = createDefined(WasmSym, Symbol::Kind::DefinedFunctionKind, Function);		S = createDefinedFunction(WasmSym, Function);
break;		break;
} else {		} else {
Function->Live = false;		Function->Live = false;
LLVM_FALLTHROUGH; // Exclude function, and add the symbol as undefined		LLVM_FALLTHROUGH; // Exclude function, and add the symbol as undefined
}		}
}		}
case WasmSymbol::SymbolType::FUNCTION_IMPORT:		case WasmSymbol::SymbolType::FUNCTION_IMPORT:
S = createUndefined(WasmSym, Symbol::Kind::UndefinedFunctionKind,		S = createUndefined(WasmSym, Symbol::Kind::UndefinedFunctionKind,
getFunctionSig(WasmSym));		getFunctionSig(WasmSym));
break;		break;
case WasmSymbol::SymbolType::GLOBAL_EXPORT: {		case WasmSymbol::SymbolType::GLOBAL_EXPORT: {
InputSegment *Segment = getSegment(WasmSym);		InputSegment *Segment = getSegment(WasmSym);
if (!isExcludedByComdat(Segment)) {		if (!isExcludedByComdat(Segment)) {
S = createDefined(WasmSym, Symbol::Kind::DefinedGlobalKind, Segment,		S = createDefinedGlobal(WasmSym, Segment, getGlobalValue(WasmSym));
getGlobalValue(WasmSym));
break;		break;
} else {		} else {
Segment->Live = false;		Segment->Live = false;
LLVM_FALLTHROUGH; // Exclude global, and add the symbol as undefined		LLVM_FALLTHROUGH; // Exclude global, and add the symbol as undefined
}		}
}		}
case WasmSymbol::SymbolType::GLOBAL_IMPORT:		case WasmSymbol::SymbolType::GLOBAL_IMPORT:
S = createUndefined(WasmSym, Symbol::Kind::UndefinedGlobalKind);		S = createUndefined(WasmSym, Symbol::Kind::UndefinedGlobalKind);
Show All 21 Lines	void ObjFile::initializeSymbols() {
DEBUG(dbgs() << "Globals : " << GlobalSymbols.size() << "\n");		DEBUG(dbgs() << "Globals : " << GlobalSymbols.size() << "\n");
}		}

Symbol *ObjFile::createUndefined(const WasmSymbol &Sym, Symbol::Kind Kind,		Symbol *ObjFile::createUndefined(const WasmSymbol &Sym, Symbol::Kind Kind,
const WasmSignature *Signature) {		const WasmSignature *Signature) {
return Symtab->addUndefined(Sym.Name, Kind, Sym.Flags, this, Signature);		return Symtab->addUndefined(Sym.Name, Kind, Sym.Flags, this, Signature);
}		}

Symbol *ObjFile::createDefined(const WasmSymbol &Sym, Symbol::Kind Kind,		Symbol *ObjFile::createDefinedFunction(const WasmSymbol &Sym,
InputChunk *Chunk, uint32_t Address) {		InputChunk *Chunk) {
Symbol *S;		if (Sym.isBindingLocal())
if (Sym.isBindingLocal()) {		return make<DefinedFunction>(Sym.Name, Sym.Flags, this, Chunk);
S = make<Symbol>(Sym.Name, true);		return Symtab->addDefined(true, Sym.Name, Sym.Flags, this, Chunk);
S->update(Kind, this, Sym.Flags, Chunk, Address);
return S;
}		}
return Symtab->addDefined(Sym.Name, Kind, Sym.Flags, this, Chunk, Address);
		Symbol *ObjFile::createDefinedGlobal(const WasmSymbol &Sym, InputChunk *Chunk,
		uint32_t Address) {
		if (Sym.isBindingLocal())
		return make<DefinedGlobal>(Sym.Name, Sym.Flags, this, Chunk, Address);
		return Symtab->addDefined(false, Sym.Name, Sym.Flags, this, Chunk, Address);
}		}

void ArchiveFile::parse() {		void ArchiveFile::parse() {
// Parse a MemoryBufferRef as an archive file.		// Parse a MemoryBufferRef as an archive file.
DEBUG(dbgs() << "Parsing library: " << toString(this) << "\n");		DEBUG(dbgs() << "Parsing library: " << toString(this) << "\n");
File = CHECK(Archive::create(MB), toString(this));		File = CHECK(Archive::create(MB), toString(this));

// Read the symbol table to construct Lazy symbols.		// Read the symbol table to construct Lazy symbols.
▲ Show 20 Lines • Show All 46 Lines • Show Last 20 Lines

wasm/SymbolTable.h

Show First 20 Lines • Show All 43 Lines • ▼ Show 20 Lines	public:

void reportDuplicate(Symbol *Existing, InputFile *NewFile);		void reportDuplicate(Symbol *Existing, InputFile *NewFile);
void reportRemainingUndefines();		void reportRemainingUndefines();

ArrayRef<Symbol *> getSymbols() const { return SymVector; }		ArrayRef<Symbol *> getSymbols() const { return SymVector; }
Symbol *find(StringRef Name);		Symbol *find(StringRef Name);
ObjFile *findComdat(StringRef Name) const;		ObjFile *findComdat(StringRef Name) const;

Symbol *addDefined(StringRef Name, Symbol::Kind Kind, uint32_t Flags,		Symbol *addDefined(bool IsFunction, StringRef Name, uint32_t Flags,
InputFile *F, InputChunk *Chunk = nullptr,		InputFile *F, InputChunk *Chunk = nullptr,
uint32_t Address = 0);		uint32_t Address = 0);
Symbol *addUndefined(StringRef Name, Symbol::Kind Kind, uint32_t Flags,		Symbol *addUndefined(StringRef Name, Symbol::Kind Kind, uint32_t Flags,
InputFile *F, const WasmSignature *Signature = nullptr);		InputFile *F, const WasmSignature *Signature = nullptr);
Symbol *addUndefinedFunction(StringRef Name, const WasmSignature *Type);		Symbol *addUndefinedFunction(StringRef Name, const WasmSignature *Type);
void addLazy(ArchiveFile *F, const Archive::Symbol *Sym);		void addLazy(ArchiveFile *F, const Archive::Symbol *Sym);
bool addComdat(StringRef Name, ObjFile *);		bool addComdat(StringRef Name, ObjFile *);

Symbol *addSyntheticGlobal(StringRef Name);		DefinedGlobal *addSyntheticGlobal(StringRef Name, uint32_t Flags = 0);
Symbol *addSyntheticFunction(StringRef Name, const WasmSignature *Type,		DefinedFunction *addSyntheticFunction(StringRef Name,
uint32_t Flags);		const WasmSignature *Type,
		uint32_t Flags = 0);
private:		private:
std::pair<Symbol *, bool> insert(StringRef Name);		std::pair<Symbol *, bool> insert(StringRef Name);

llvm::DenseMap<llvm::CachedHashStringRef, ObjFile *> ComdatMap;		llvm::DenseMap<llvm::CachedHashStringRef, ObjFile *> ComdatMap;
llvm::DenseMap<llvm::CachedHashStringRef, Symbol *> SymMap;		llvm::DenseMap<llvm::CachedHashStringRef, Symbol *> SymMap;
std::vector<Symbol *> SymVector;		std::vector<Symbol *> SymVector;
};		};

extern SymbolTable *Symtab;		extern SymbolTable *Symtab;

} // namespace wasm		} // namespace wasm
} // namespace lld		} // namespace lld

#endif		#endif

wasm/SymbolTable.cpp

Show First 20 Lines • Show All 60 Lines • ▼ Show 20 Lines	if (It == SymMap.end())
return nullptr;		return nullptr;
return It->second;		return It->second;
}		}

std::pair<Symbol *, bool> SymbolTable::insert(StringRef Name) {		std::pair<Symbol *, bool> SymbolTable::insert(StringRef Name) {
Symbol *&Sym = SymMap[CachedHashStringRef(Name)];		Symbol *&Sym = SymMap[CachedHashStringRef(Name)];
if (Sym)		if (Sym)
return {Sym, false};		return {Sym, false};
Sym = make<Symbol>(Name, false);		Sym = reinterpret_cast<Symbol *>(make<SymbolUnion>());
SymVector.emplace_back(Sym);		SymVector.emplace_back(Sym);
return {Sym, true};		return {Sym, true};
}		}

void SymbolTable::reportDuplicate(Symbol *Existing, InputFile *NewFile) {		void SymbolTable::reportDuplicate(Symbol *Existing, InputFile *NewFile) {
error("duplicate symbol: " + toString(*Existing) + "\n>>> defined in " +		error("duplicate symbol: " + toString(*Existing) + "\n>>> defined in " +
toString(Existing->getFile()) + "\n>>> defined in " +		toString(Existing->getFile()) + "\n>>> defined in " +
toString(NewFile));		toString(NewFile));
}		}

// Check the type of new symbol matches that of the symbol is replacing.		// Check the type of new symbol matches that of the symbol is replacing.
// For functions this can also involve verifying that the signatures match.		// For functions this can also involve verifying that the signatures match.
static void checkSymbolTypes(const Symbol &Existing, const InputFile &F,		static void checkSymbolTypes(const Symbol &Existing, const InputFile &F,
Symbol::Kind Kind, const WasmSignature *NewSig) {		bool NewIsFunction, const WasmSignature *NewSig) {
if (Existing.isLazy())		if (Existing.isLazy())
return;		return;

bool NewIsFunction = Kind == Symbol::Kind::UndefinedFunctionKind ||
Kind == Symbol::Kind::DefinedFunctionKind;

// First check the symbol types match (i.e. either both are function		// First check the symbol types match (i.e. either both are function
// symbols or both are data symbols).		// symbols or both are data symbols).
if (Existing.isFunction() != NewIsFunction) {		if (Existing.isFunction() != NewIsFunction) {
error("symbol type mismatch: " + Existing.getName() + "\n>>> defined as " +		error("symbol type mismatch: " + Existing.getName() + "\n>>> defined as " +
(Existing.isFunction() ? "Function" : "Global") + " in " +		(Existing.isFunction() ? "Function" : "Global") + " in " +
toString(Existing.getFile()) + "\n>>> defined as " +		toString(Existing.getFile()) + "\n>>> defined as " +
(NewIsFunction ? "Function" : "Global") + " in " + F.getName());		(NewIsFunction ? "Function" : "Global") + " in " + F.getName());
return;		return;
}		}

// For function symbols, optionally check the function signature matches too.		// For function symbols, optionally check the function signature matches too.
if (!NewIsFunction || !Config->CheckSignatures)		auto *ExistingFunc = dyn_cast<FunctionSymbol>(&Existing);
		if (!ExistingFunc || !Config->CheckSignatures)
return;		return;

// Skip the signature check if the existing function has no signature (e.g.		// Skip the signature check if the existing function has no signature (e.g.
// if it is an undefined symbol generated by --undefined command line flag).		// if it is an undefined symbol generated by --undefined command line flag).
if (!Existing.hasFunctionType())		if (!ExistingFunc->hasFunctionType())
return;		return;

DEBUG(dbgs() << "checkSymbolTypes: " << Existing.getName() << "\n");		DEBUG(dbgs() << "checkSymbolTypes: " << ExistingFunc->getName() << "\n");
assert(NewSig);		assert(NewSig);

const WasmSignature &OldSig = Existing.getFunctionType();		const WasmSignature &OldSig = ExistingFunc->getFunctionType();
if (*NewSig == OldSig)		if (*NewSig == OldSig)
return;		return;

error("function signature mismatch: " + Existing.getName() +		error("function signature mismatch: " + ExistingFunc->getName() +
"\n>>> defined as " + toString(OldSig) + " in " +		"\n>>> defined as " + toString(OldSig) + " in " +
toString(Existing.getFile()) + "\n>>> defined as " + toString(*NewSig) +		toString(ExistingFunc->getFile()) + "\n>>> defined as " +
" in " + F.getName());		toString(*NewSig) + " in " + F.getName());
}		}

static void checkSymbolTypes(const Symbol &Existing, const InputFile &F,		static void checkSymbolTypes(const Symbol &Existing, const InputFile &F,
Symbol::Kind Kind, const InputChunk *Chunk) {		bool IsFunction, const InputChunk *Chunk) {
const WasmSignature *Sig = nullptr;		const WasmSignature *Sig = nullptr;
if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))		if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))
Sig = &F->Signature;		Sig = &F->Signature;
return checkSymbolTypes(Existing, F, Kind, Sig);		return checkSymbolTypes(Existing, F, IsFunction, Sig);
}		}

Symbol *SymbolTable::addSyntheticFunction(StringRef Name,		DefinedFunction *SymbolTable::addSyntheticFunction(StringRef Name,
const WasmSignature *Type,		const WasmSignature *Type,
uint32_t Flags) {		uint32_t Flags) {
DEBUG(dbgs() << "addSyntheticFunction: " << Name << "\n");		DEBUG(dbgs() << "addSyntheticFunction: " << Name << "\n");
Symbol *S;		Symbol *S;
bool WasInserted;		bool WasInserted;
std::tie(S, WasInserted) = insert(Name);		std::tie(S, WasInserted) = insert(Name);
assert(WasInserted);		assert(WasInserted);
S->update(Symbol::DefinedFunctionKind, nullptr, Flags);		return replaceSymbol<DefinedFunction>(S, Name, Flags, Type);
S->setFunctionType(Type);
return S;
}		}

Symbol *SymbolTable::addSyntheticGlobal(StringRef Name) {		DefinedGlobal *SymbolTable::addSyntheticGlobal(StringRef Name, uint32_t Flags) {
DEBUG(dbgs() << "addSyntheticGlobal: " << Name << "\n");		DEBUG(dbgs() << "addSyntheticGlobal: " << Name << "\n");
Symbol *S;		Symbol *S;
bool WasInserted;		bool WasInserted;
std::tie(S, WasInserted) = insert(Name);		std::tie(S, WasInserted) = insert(Name);
assert(WasInserted);		assert(WasInserted);
S->update(Symbol::DefinedGlobalKind);		return replaceSymbol<DefinedGlobal>(S, Name, Flags);
return S;
}		}

Symbol *SymbolTable::addDefined(StringRef Name, Symbol::Kind Kind,		Symbol *SymbolTable::addDefined(bool IsFunction, StringRef Name, uint32_t Flags,
uint32_t Flags, InputFile *F, InputChunk *Chunk,		InputFile *F, InputChunk *Chunk,
		ruiu: It is not clear to me why we need an addDefined function when we already have addDefinedFunction and addDefinedGlobal. What is this for?
		sbc100 (author): Renamed to reflect their purpose: addSyntheticGlobal, etc.
uint32_t Address) {		uint32_t Address) {
DEBUG(dbgs() << "addDefined: " << Name << " addr:" << Address << "\n");		if (IsFunction)
		DEBUG(dbgs() << "addDefined: func:" << Name << "\n");
		else
		DEBUG(dbgs() << "addDefined: global:" << Name << " addr:" << Address
		<< "\n");
Symbol *S;		Symbol *S;
bool WasInserted;		bool WasInserted;
		bool Replace = false;
		bool CheckTypes = false;

std::tie(S, WasInserted) = insert(Name);		std::tie(S, WasInserted) = insert(Name);
if (WasInserted) {		if (WasInserted) {
S->update(Kind, F, Flags, Chunk, Address);		Replace = true;
} else if (S->isLazy()) {		} else if (S->isLazy()) {
// The existing symbol is lazy. Replace it without checking types since		// Existing symbol is lazy. Replace it without checking types since
// lazy symbols don't have any type information.		// lazy symbols don't have any type information.
DEBUG(dbgs() << "replacing existing lazy symbol: " << Name << "\n");		DEBUG(dbgs() << "replacing existing lazy symbol: " << Name << "\n");
S->update(Kind, F, Flags, Chunk, Address);		Replace = true;
} else if (!S->isDefined()) {		} else if (!S->isDefined()) {
// The existing symbol table entry is undefined. The new symbol replaces		// Existing symbol is undefined: replace it, while checking types.
// it, after checking the type matches
DEBUG(dbgs() << "resolving existing undefined symbol: " << Name << "\n");		DEBUG(dbgs() << "resolving existing undefined symbol: " << Name << "\n");
checkSymbolTypes(*S, *F, Kind, Chunk);		Replace = true;
S->update(Kind, F, Flags, Chunk, Address);		CheckTypes = true;
} else if ((Flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_WEAK) {		} else if ((Flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_WEAK) {
// the new symbol is weak we can ignore it		// the new symbol is weak we can ignore it
DEBUG(dbgs() << "existing symbol takes precedence\n");		DEBUG(dbgs() << "existing symbol takes precedence\n");
} else if (S->isWeak()) {		} else if (S->isWeak()) {
// the new symbol is not weak and the existing symbol is, so we replace		// the existing symbol is weak, so we replace it
// it
DEBUG(dbgs() << "replacing existing weak symbol\n");		DEBUG(dbgs() << "replacing existing weak symbol\n");
checkSymbolTypes(*S, *F, Kind, Chunk);		Replace = true;
S->update(Kind, F, Flags, Chunk, Address);		CheckTypes = true;
} else {		} else {
// neither symbol is weak. They conflict.		// neither symbol is weak. They conflict.
reportDuplicate(S, F);		reportDuplicate(S, F);
}		}
return S;
}

Symbol *SymbolTable::addUndefinedFunction(StringRef Name,		if (Replace) {
const WasmSignature *Type) {		if (CheckTypes)
DEBUG(dbgs() << "addUndefinedFunction: " << Name << "\n");		checkSymbolTypes(*S, *F, IsFunction, Chunk);
Symbol *S;		if (IsFunction)
bool WasInserted;		replaceSymbol<DefinedFunction>(S, Name, Flags, F, Chunk);
std::tie(S, WasInserted) = insert(Name);		else
if (WasInserted) {		replaceSymbol<DefinedGlobal>(S, Name, Flags, F, Chunk, Address);
S->update(Symbol::UndefinedFunctionKind);
S->setFunctionType(Type);
} else if (!S->isFunction()) {
error("symbol type mismatch: " + Name);
}		}
return S;		return S;
}		}
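
The precedence rules that addDefined applies in the new version can be summarised as a small decision function. This is only a sketch for readers of the diff: the `Existing` and `Action` enums and the `resolveDefined` name are hypothetical and do not appear in lld, and the real code acts on `Symbol` objects rather than returning a value.

```cpp
#include <cassert>

// Hypothetical summary of what the symbol table already holds for a name.
enum class Existing { Absent, Lazy, Undefined, DefinedWeak, DefinedStrong };

// Hypothetical outcome of adding a new *defined* symbol with that name.
enum class Action { Replace, ReplaceAndCheckTypes, KeepExisting, ReportDuplicate };

// Mirrors the order of the if/else chain in the diff above.
Action resolveDefined(Existing E, bool NewIsWeak) {
  switch (E) {
  case Existing::Absent: // freshly inserted entry
  case Existing::Lazy:   // lazy symbols carry no type information
    return Action::Replace;
  case Existing::Undefined:
    return Action::ReplaceAndCheckTypes;
  case Existing::DefinedWeak:
  case Existing::DefinedStrong:
    break;
  }
  // Both symbols are defined: a weak newcomer never wins.
  if (NewIsWeak)
    return Action::KeepExisting;
  // A strong newcomer replaces a weak definition, after a type check.
  if (E == Existing::DefinedWeak)
    return Action::ReplaceAndCheckTypes;
  // Two strong definitions conflict.
  return Action::ReportDuplicate;
}
```

Note that the weak check on the new symbol comes first, so when both definitions are weak the existing one is kept.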

Symbol *SymbolTable::addUndefined(StringRef Name, Symbol::Kind Kind,		Symbol *SymbolTable::addUndefined(StringRef Name, Symbol::Kind Kind,
		ruiu: Ditto -- having addUndefinedFunction and addUndefined doesn't seem orthogonal because the former seems like a subset of the latter.
uint32_t Flags, InputFile *F,		uint32_t Flags, InputFile *F,
const WasmSignature *Type) {		const WasmSignature *Type) {
DEBUG(dbgs() << "addUndefined: " << Name << "\n");		DEBUG(dbgs() << "addUndefined: " << Name << "\n");
Symbol *S;		Symbol *S;
bool WasInserted;		bool WasInserted;
std::tie(S, WasInserted) = insert(Name);		std::tie(S, WasInserted) = insert(Name);
		bool IsFunction = Kind == Symbol::UndefinedFunctionKind;
if (WasInserted) {		if (WasInserted) {
S->update(Kind, F, Flags);		if (IsFunction)
if (Type)		replaceSymbol<UndefinedFunction>(S, Name, Flags, F, Type);
S->setFunctionType(Type);		else
} else if (S->isLazy()) {		replaceSymbol<UndefinedGlobal>(S, Name, Flags, F);
		} else if (auto *LazySym = dyn_cast<LazySymbol>(S)) {
DEBUG(dbgs() << "resolved by existing lazy\n");		DEBUG(dbgs() << "resolved by existing lazy\n");
auto *AF = cast<ArchiveFile>(S->getFile());		auto *AF = cast<ArchiveFile>(LazySym->getFile());
AF->addMember(&S->getArchiveSymbol());		AF->addMember(&LazySym->getArchiveSymbol());
} else if (S->isDefined()) {		} else if (S->isDefined()) {
DEBUG(dbgs() << "resolved by existing\n");		DEBUG(dbgs() << "resolved by existing\n");
checkSymbolTypes(*S, *F, Kind, Type);		checkSymbolTypes(*S, *F, IsFunction, Type);
}		}
return S;		return S;
}		}

void SymbolTable::addLazy(ArchiveFile *F, const Archive::Symbol *Sym) {		void SymbolTable::addLazy(ArchiveFile *F, const Archive::Symbol *Sym) {
DEBUG(dbgs() << "addLazy: " << Sym->getName() << "\n");		DEBUG(dbgs() << "addLazy: " << Sym->getName() << "\n");
StringRef Name = Sym->getName();		StringRef Name = Sym->getName();
Symbol *S;		Symbol *S;
bool WasInserted;		bool WasInserted;
std::tie(S, WasInserted) = insert(Name);		std::tie(S, WasInserted) = insert(Name);
if (WasInserted) {		if (WasInserted) {
S->update(Symbol::LazyKind, F);		replaceSymbol<LazySymbol>(S, Name, F, *Sym);
S->setArchiveSymbol(*Sym);
} else if (S->isUndefined()) {		} else if (S->isUndefined()) {
// There is an existing undefined symbol. We can load from the		// There is an existing undefined symbol. We can load from the
// archive.		// archive.
DEBUG(dbgs() << "replacing existing undefined\n");		DEBUG(dbgs() << "replacing existing undefined\n");
F->addMember(Sym);		F->addMember(Sym);
}		}
}		}

Show All 15 Lines

wasm/Symbols.h

	Show All 17 Lines
	using llvm::wasm::WasmSignature;			using llvm::wasm::WasmSignature;

	namespace lld {			namespace lld {
	namespace wasm {			namespace wasm {

	class InputFile;			class InputFile;
	class InputChunk;			class InputChunk;

				// The base class for real symbol classes.
	class Symbol {			class Symbol {
	public:			public:
	enum Kind {			enum Kind {
	DefinedFunctionKind,			DefinedFunctionKind,
	DefinedGlobalKind,			DefinedGlobalKind,

	LazyKind,			LazyKind,
	UndefinedFunctionKind,			UndefinedFunctionKind,
	UndefinedGlobalKind,			UndefinedGlobalKind,

	LastDefinedKind = DefinedGlobalKind,			LastDefinedKind = DefinedGlobalKind,
	InvalidKind,			InvalidKind,
	};			};

	Symbol(StringRef Name, uint32_t Flags) : Flags(Flags), Name(Name) {}			Kind kind() const { return static_cast<Kind>(SymbolKind); }

	Kind getKind() const { return SymbolKind; }

	bool isLazy() const { return SymbolKind == LazyKind; }			bool isLazy() const { return SymbolKind == LazyKind; }
	bool isDefined() const { return SymbolKind <= LastDefinedKind; }			bool isDefined() const { return SymbolKind <= LastDefinedKind; }
	bool isUndefined() const {			bool isUndefined() const {
	return SymbolKind == UndefinedGlobalKind ||			return SymbolKind == UndefinedGlobalKind ||
	SymbolKind == UndefinedFunctionKind;			SymbolKind == UndefinedFunctionKind;
	}			}
	bool isFunction() const {			bool isFunction() const {
	return SymbolKind == DefinedFunctionKind ||			return SymbolKind == DefinedFunctionKind ||
	SymbolKind == UndefinedFunctionKind;			SymbolKind == UndefinedFunctionKind;
	}			}
	bool isGlobal() const { return !isFunction(); }			bool isGlobal() const { return !isFunction(); }
	bool isLocal() const;			bool isLocal() const;
	bool isWeak() const;			bool isWeak() const;
	bool isHidden() const;			bool isHidden() const;

	// Returns the symbol name.			// Returns the symbol name.
	StringRef getName() const { return Name; }			StringRef getName() const { return Name; }

	// Returns the file from which this symbol was created.			// Returns the file from which this symbol was created.
	InputFile *getFile() const { return File; }			InputFile *getFile() const { return File; }
	InputChunk *getChunk() const { return Chunk; }			InputChunk *getChunk() const { return Chunk; }

	bool hasFunctionType() const { return FunctionType != nullptr; }
	const WasmSignature &getFunctionType() const;
	void setFunctionType(const WasmSignature *Type);
	void setHidden(bool IsHidden);			void setHidden(bool IsHidden);

	uint32_t getOutputIndex() const;			uint32_t getOutputIndex() const;

	// Returns true if an output index has been set for this symbol			// Returns true if an output index has been set for this symbol
	bool hasOutputIndex() const;			bool hasOutputIndex() const;

	// Set the output index of the symbol (in the function or global index			// Set the output index of the symbol (in the function or global index
	// space of the output object.			// space of the output object.
	void setOutputIndex(uint32_t Index);			void setOutputIndex(uint32_t Index);

				protected:
				Symbol(StringRef Name, Kind K, uint32_t Flags, InputFile *F, InputChunk *C)
				: Name(Name), SymbolKind(K), Flags(Flags), File(F), Chunk(C) {}

				StringRef Name;
				Kind SymbolKind;
				uint32_t Flags;
				InputFile *File;
				InputChunk *Chunk;
				llvm::Optional<uint32_t> OutputIndex;
				};

				class FunctionSymbol : public Symbol {
				ruiu: This class hierarchy is interesting:

				  Symbol
				    FunctionSymbol
				      DefinedFunction
				      UndefinedFunction
				    GlobalSymbol
				      DefinedGlobal
				      UndefinedGlobal
				    Lazy

				because in ELF and COFF, defined/undefined is a broader category than function/non-function. But I can see that your class hierarchy just represents how symbols are organized in wasm. (That's one of the reasons why I think the current design of lld, in which all ports share only the design but not a concrete implementation, is a good one. If we had to implement coff/elf/wasm/etc. on one "unified" code base, that would have been extremely hard to do.)

				sbc100 (author): Indeed. It is quite a major difference from ELF here. It only occurred to me when doing this work that ELF doesn't have different undefined types, whereas in wasm an undefined symbol has a lot of type information (for example, undefined functions carry their type signature). We are a little constrained by the typical OO class hierarchy here because, for example, under this scheme there is no common base class for defined or undefined symbols, not that it's a huge problem. I think this scheme is still the best option.
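
For readers unfamiliar with the `Kind` + `classof` idiom this hierarchy relies on, here is a minimal standalone sketch. It is an assumption-laden illustration, not lld code: the classes are stripped-down stand-ins for those in wasm/Symbols.h, and `dynCast` is a hypothetical substitute for `llvm::dyn_cast` so that the example builds without LLVM headers.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in for lld::wasm::Symbol: the Kind tag drives all casts.
class Symbol {
public:
  enum Kind {
    DefinedFunctionKind,
    DefinedGlobalKind,
    LazyKind,
    UndefinedFunctionKind,
    UndefinedGlobalKind,
  };
  Kind kind() const { return SymbolKind; }
  const std::string &getName() const { return Name; }

protected:
  Symbol(std::string Name, Kind K) : Name(std::move(Name)), SymbolKind(K) {}

private:
  std::string Name;
  Kind SymbolKind;
};

// Intermediate class: matches both defined and undefined functions.
class FunctionSymbol : public Symbol {
public:
  static bool classof(const Symbol *S) {
    return S->kind() == DefinedFunctionKind ||
           S->kind() == UndefinedFunctionKind;
  }

protected:
  FunctionSymbol(std::string Name, Kind K) : Symbol(std::move(Name), K) {}
};

class DefinedFunction : public FunctionSymbol {
public:
  explicit DefinedFunction(std::string Name)
      : FunctionSymbol(std::move(Name), DefinedFunctionKind) {}
  static bool classof(const Symbol *S) {
    return S->kind() == DefinedFunctionKind;
  }
};

class GlobalSymbol : public Symbol {
public:
  static bool classof(const Symbol *S) {
    return S->kind() == DefinedGlobalKind || S->kind() == UndefinedGlobalKind;
  }

protected:
  GlobalSymbol(std::string Name, Kind K) : Symbol(std::move(Name), K) {}
};

class UndefinedGlobal : public GlobalSymbol {
public:
  explicit UndefinedGlobal(std::string Name)
      : GlobalSymbol(std::move(Name), UndefinedGlobalKind) {}
  static bool classof(const Symbol *S) {
    return S->kind() == UndefinedGlobalKind;
  }
};

// Hypothetical replacement for llvm::dyn_cast: null when the kind mismatches.
template <typename To> const To *dynCast(const Symbol *S) {
  return To::classof(S) ? static_cast<const To *>(S) : nullptr;
}
```

As sbc100 notes, this layout gives function/global a common base but leaves "defined" and "undefined" without one; that distinction is only available through the `Kind` value.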
				public:
				static bool classof(const Symbol *S) {
				return S->kind() == DefinedFunctionKind ||
				S->kind() == UndefinedFunctionKind;
				}

				bool hasFunctionType() const { return FunctionType != nullptr; }
				const WasmSignature &getFunctionType() const;

	uint32_t getTableIndex() const;			uint32_t getTableIndex() const;

	// Returns true if a table index has been set for this symbol			// Returns true if a table index has been set for this symbol
	bool hasTableIndex() const;			bool hasTableIndex() const;

	// Set the table index of the symbol			// Set the table index of the symbol
	void setTableIndex(uint32_t Index);			void setTableIndex(uint32_t Index);

	// Returns the virtual address of a defined global.			protected:
	// Only works for globals, not functions.			void setFunctionType(const WasmSignature *Type);

				FunctionSymbol(StringRef Name, Kind K, uint32_t Flags, InputFile *F,
				InputChunk *C)
				: Symbol(Name, K, Flags, F, C) {}

				llvm::Optional<uint32_t> TableIndex;

				// Explicit function type, needed for undefined or synthetic functions only.
				const WasmSignature *FunctionType = nullptr;
				};

				class DefinedFunction : public FunctionSymbol {
				public:
				DefinedFunction(StringRef Name, uint32_t Flags, InputFile *F = nullptr,
				InputChunk *C = nullptr)
				: FunctionSymbol(Name, DefinedFunctionKind, Flags, F, C) {}

				DefinedFunction(StringRef Name, uint32_t Flags, const WasmSignature *Type)
				ruiu (marked done): Can you move UndefinedFunction here so that related classes are adjacent in source code?
				: FunctionSymbol(Name, DefinedFunctionKind, Flags, nullptr, nullptr) {
				setFunctionType(Type);
				}

				static bool classof(const Symbol *S) {
				return S->kind() == DefinedFunctionKind;
				}
				};

				class UndefinedFunction : public FunctionSymbol {
				public:
				UndefinedFunction(StringRef Name, uint32_t Flags, InputFile *File = nullptr,
				const WasmSignature *Type = nullptr)
				: FunctionSymbol(Name, UndefinedFunctionKind, Flags, File, nullptr) {
				setFunctionType(Type);
				}

				static bool classof(const Symbol *S) {
				return S->kind() == UndefinedFunctionKind;
				}
				};

				class GlobalSymbol : public Symbol {
				public:
				static bool classof(const Symbol *S) {
				return S->kind() == DefinedGlobalKind || S->kind() == UndefinedGlobalKind;
				}

				protected:
				GlobalSymbol(StringRef Name, Kind K, uint32_t Flags, InputFile *F,
				InputChunk *C)
				: Symbol(Name, K, Flags, F, C) {}
				};

				class DefinedGlobal : public GlobalSymbol {
				public:
				DefinedGlobal(StringRef Name, uint32_t Flags, InputFile *F = nullptr,
				InputChunk *C = nullptr, uint32_t Address = 0)
				: GlobalSymbol(Name, DefinedGlobalKind, Flags, F, C),
				VirtualAddress(Address) {}

				static bool classof(const Symbol *S) {
				return S->kind() == DefinedGlobalKind;
				}

	uint32_t getVirtualAddress() const;			uint32_t getVirtualAddress() const;

	void setVirtualAddress(uint32_t VA);			void setVirtualAddress(uint32_t VA);

	void update(Kind K, InputFile *F = nullptr, uint32_t Flags = 0,			protected:
	InputChunk *chunk = nullptr, uint32_t Address = UINT32_MAX);			uint32_t VirtualAddress;
				};

				class UndefinedGlobal : public GlobalSymbol {
				public:
				UndefinedGlobal(StringRef Name, uint32_t Flags, InputFile *File = nullptr)
				: GlobalSymbol(Name, UndefinedGlobalKind, Flags, File, nullptr) {}
				static bool classof(const Symbol *S) {
				return S->kind() == UndefinedGlobalKind;
				}
				};

				class LazySymbol : public Symbol {
				public:
				LazySymbol(StringRef Name, InputFile *File, const Archive::Symbol &Sym)
				: Symbol(Name, LazyKind, 0, File, nullptr), ArchiveSymbol(Sym) {}

				static bool classof(const Symbol *S) { return S->kind() == LazyKind; }

	void setArchiveSymbol(const Archive::Symbol &Sym) { ArchiveSymbol = Sym; }
	const Archive::Symbol &getArchiveSymbol() { return ArchiveSymbol; }			const Archive::Symbol &getArchiveSymbol() { return ArchiveSymbol; }

	protected:			protected:
	uint32_t Flags;			Archive::Symbol ArchiveSymbol;
	uint32_t VirtualAddress = 0;

	StringRef Name;
	Archive::Symbol ArchiveSymbol = {nullptr, 0, 0};
	Kind SymbolKind = InvalidKind;
	InputFile *File = nullptr;
	InputChunk *Chunk = nullptr;
	llvm::Optional<uint32_t> OutputIndex;
	llvm::Optional<uint32_t> TableIndex;
	const WasmSignature *FunctionType = nullptr;
	};			};

	// linker-generated symbols			// linker-generated symbols
	struct WasmSym {			struct WasmSym {
	// __stack_pointer			// __stack_pointer
	// Global that holds the address of the top of the explicit value stack in			// Global that holds the address of the top of the explicit value stack in
	// linear memory.			// linear memory.
	static Symbol *StackPointer;			static DefinedGlobal *StackPointer;

	// __data_end			// __data_end
	// Symbol marking the end of the data and bss.			// Symbol marking the end of the data and bss.
	static Symbol *DataEnd;			static DefinedGlobal *DataEnd;

	// __heap_base			// __heap_base
	// Symbol marking the end of the data, bss and explicit stack. Any linear			// Symbol marking the end of the data, bss and explicit stack. Any linear
	// memory following this address is not used by the linked code and can			// memory following this address is not used by the linked code and can
	// therefore be used as a backing store for brk()/malloc() implementations.			// therefore be used as a backing store for brk()/malloc() implementations.
	static Symbol *HeapBase;			static DefinedGlobal *HeapBase;

	// __wasm_call_ctors			// __wasm_call_ctors
	// Function that directly calls all ctors in priority order.			// Function that directly calls all ctors in priority order.
	static Symbol *CallCtors;			static DefinedFunction *CallCtors;

	// __dso_handle			// __dso_handle
	// Global used in calls to __cxa_atexit to determine current DLL			// Global used in calls to __cxa_atexit to determine current DLL
	static Symbol *DsoHandle;			static DefinedGlobal *DsoHandle;
	};			};

				// A buffer class that is large enough to hold any Symbol-derived
				// object. We allocate memory using this class and instantiate a symbol
				// using the placement new.
				union SymbolUnion {
				alignas(DefinedFunction) char A[sizeof(DefinedFunction)];
				alignas(DefinedGlobal) char B[sizeof(DefinedGlobal)];
				alignas(LazySymbol) char C[sizeof(LazySymbol)];
				alignas(UndefinedFunction) char D[sizeof(UndefinedFunction)];
				alignas(UndefinedGlobal) char E[sizeof(UndefinedGlobal)];
				};

				template <typename T, typename... ArgT>
				T *replaceSymbol(Symbol *S, ArgT &&... Arg) {
				static_assert(std::is_trivially_destructible<T>(),
				"Symbol types must be trivially destructible");
				static_assert(sizeof(T) <= sizeof(SymbolUnion), "SymbolUnion too small");
				static_assert(alignof(T) <= alignof(SymbolUnion),
				"SymbolUnion not aligned enough");
				assert(static_cast<Symbol *>(static_cast<T *>(nullptr)) == nullptr &&
				"Not a Symbol");
				return new (S) T(std::forward<ArgT>(Arg)...);
				}

	} // namespace wasm			} // namespace wasm

	// Returns a symbol name for an error message.			// Returns a symbol name for an error message.
	std::string toString(const wasm::Symbol &Sym);			std::string toString(const wasm::Symbol &Sym);
	std::string toString(wasm::Symbol::Kind Kind);			std::string toString(wasm::Symbol::Kind Kind);

	} // namespace lld			} // namespace lld

				ncw: Yikes, this is scary!

				1. For one, doing `make<SymbolUnion>` means that the actual symbol's destructor won't be called at program end. We're relying on the Symbol classes being "simple" with trivial destructors, but what if someone forgets and sticks a std::vector in there as a member? It's not ideal, and quite fragile.

				2. Similarly, replaceSymbol doesn't deallocate the previous data in the union; it just writes straight over it. The new object will be OK, but the previous one will leak any members. Again, not a problem as long as the Symbols all have trivial dtors, but it feels like an accident waiting to happen.

				3. Is it technically Undefined Behaviour? You seem to be relying on the exact details of the base-to-derived pointer adjustments. First we allocate the union, then reinterpret-cast it to a `Symbol*`, then placement-construct an UndefinedFunction symbol. Then the first bit of UB happens: we assume that the newly-constructed UndefinedFunction, when cast to a Symbol, has the same address. That is, we assume the following: `(void*)(BaseClass*)(new (addr) DerivedClass) == (void*)addr`. Then the second bit of UB happens when we do replaceSymbol and make the same assumption again, but in replaceSymbol we further assume that all derived classes will construct their Symbol base at the same offset within the memory block we're placement-constructing them in (it's just another assumption about object layout). The basic assumption is that base classes are constructed before the derived class, and that the base class is at offset zero within the memory when placement-constructed at a specific address.

				I can see it's just copying the existing LLD code. What you're trying to achieve is to dynamically change the derived type of an object, so that previously-created pointers to the base class remain valid as pointers to the new object's base class.

				There is a solution I can think of that's "safe". Instead of doing `reinterpret_cast<Symbol*>(make<SymbolUnion>())`, why not put the Kind member next to the union, something like `struct SymbolHandle { int Kind; SymbolUnion U; }`, and then store pointers to the union-with-kind? You can then safely destruct the union using a switch, and safely replace the union too. Finally, you can add automatic casting operators that allow casting a SymbolHandle to an `UndefinedFunction&` etc., with an assertion on Kind and a cast on the appropriate member of the union.

				sbc100 (author): I think perhaps you raise some good points, but I see this as more of an lld-wide discussion. The other ports have always worked this way AFAICT, and it was always my intention to have the wasm port do the same thing. Perhaps @ruiu can add some documentation about this technique, why it is actually safe in this context, and what the motivating factors are for using it. In any case, I don't think we should block this change on this design discussion. Let's make the linkers consistent and iterate (together) from there.

				sbc100 (author): FWIW, it looks like it should be easy to protect against (1) and (2) by adding `static_assert(std::is_trivially_destructible<T>(), "Symbol types must be trivially destructible");`. The spec on this explicitly says `Storage occupied by trivially destructible objects may be reused without calling the destructor.` That still leaves the concerns you raise in (3), of course.

				ncw: "have always worked this way" - I think it's actually recent; it looks like the change dates to 31 Oct 2017. Agreed that it's OK to change Wasm to match.

				sbc100 (author): I guess so. I was thinking that they have always done similar tricks involving in-place replacement of the symbols with sub-types of symbol. But I don't know if the old behaviour was more or less risky.
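
To make the technique under discussion concrete, the following is a reduced sketch of in-place replacement via placement new, guarded by the same kind of static_asserts. It assumes two toy symbol types (`Undefined` and `Defined`, both hypothetical) rather than the real wasm classes, and it does not settle the layout question raised in (3).

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <type_traits>
#include <utility>

// Toy base class; the real Symbol carries more state.
struct Symbol {
  enum Kind { UndefinedKind, DefinedKind };
  Kind SymbolKind;
  const char *Name;

protected:
  Symbol(Kind K, const char *Name) : SymbolKind(K), Name(Name) {}
};

struct Undefined : Symbol {
  explicit Undefined(const char *Name) : Symbol(UndefinedKind, Name) {}
};

struct Defined : Symbol {
  Defined(const char *Name, uint32_t Address)
      : Symbol(DefinedKind, Name), Address(Address) {}
  uint32_t Address;
};

// Raw storage large and aligned enough for any of the derived types.
union SymbolUnion {
  alignas(Undefined) char A[sizeof(Undefined)];
  alignas(Defined) char B[sizeof(Defined)];
};

// Constructs a T over the storage that S points at. No destructor runs for
// the old object, which is only acceptable because T is required to be
// trivially destructible.
template <typename T, typename... ArgT>
T *replaceSymbol(Symbol *S, ArgT &&... Arg) {
  static_assert(std::is_trivially_destructible<T>::value,
                "Symbol types must be trivially destructible");
  static_assert(sizeof(T) <= sizeof(SymbolUnion), "SymbolUnion too small");
  static_assert(alignof(T) <= alignof(SymbolUnion),
                "SymbolUnion not aligned enough");
  return new (S) T(std::forward<ArgT>(Arg)...);
}
```

A caller would first construct an `Undefined` into a `SymbolUnion` and later call `replaceSymbol<Defined>(S, Name, Address)` on the same storage. The sketch only uses the pointer returned by placement new; whether older `Symbol *` copies may keep being used after the replacement is exactly the point ncw questions.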
	#endif			#endif

wasm/Symbols.cpp

	Show All 16 Lines

	#define DEBUG_TYPE "lld"			#define DEBUG_TYPE "lld"

	using namespace llvm;			using namespace llvm;
	using namespace llvm::wasm;			using namespace llvm::wasm;
	using namespace lld;			using namespace lld;
	using namespace lld::wasm;			using namespace lld::wasm;

	Symbol *WasmSym::CallCtors;			DefinedFunction *WasmSym::CallCtors;
	Symbol *WasmSym::DsoHandle;			DefinedGlobal *WasmSym::DsoHandle;
	Symbol *WasmSym::DataEnd;			DefinedGlobal *WasmSym::DataEnd;
	Symbol *WasmSym::HeapBase;			DefinedGlobal *WasmSym::HeapBase;
	Symbol *WasmSym::StackPointer;			DefinedGlobal *WasmSym::StackPointer;

	const WasmSignature &Symbol::getFunctionType() const {
	if (Chunk != nullptr)
	return dyn_cast<InputFunction>(Chunk)->Signature;

	assert(FunctionType != nullptr);
	return *FunctionType;
	}

	void Symbol::setFunctionType(const WasmSignature *Type) {
	assert(FunctionType == nullptr);
	assert(!Chunk);
	FunctionType = Type;
	}

	uint32_t Symbol::getVirtualAddress() const {
	assert(isGlobal());
	DEBUG(dbgs() << "getVirtualAddress: " << getName() << "\n");
	return Chunk ? dyn_cast<InputSegment>(Chunk)->translateVA(VirtualAddress)
	: VirtualAddress;
	}

	bool Symbol::hasOutputIndex() const {			bool Symbol::hasOutputIndex() const {
	if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))			if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))
	return F->hasOutputIndex();			return F->hasOutputIndex();
	return OutputIndex.hasValue();			return OutputIndex.hasValue();
	}			}

	uint32_t Symbol::getOutputIndex() const {			uint32_t Symbol::getOutputIndex() const {
	if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))			if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))
	return F->getOutputIndex();			return F->getOutputIndex();
	return OutputIndex.getValue();			return OutputIndex.getValue();
	}			}

	void Symbol::setVirtualAddress(uint32_t Value) {
	DEBUG(dbgs() << "setVirtualAddress " << Name << " -> " << Value << "\n");
	assert(isGlobal());
	VirtualAddress = Value;
	}

	void Symbol::setOutputIndex(uint32_t Index) {			void Symbol::setOutputIndex(uint32_t Index) {
	DEBUG(dbgs() << "setOutputIndex " << Name << " -> " << Index << "\n");			DEBUG(dbgs() << "setOutputIndex " << Name << " -> " << Index << "\n");
	assert(!dyn_cast_or_null<InputFunction>(Chunk));			assert(!dyn_cast_or_null<InputFunction>(Chunk));
	assert(!OutputIndex.hasValue());			assert(!OutputIndex.hasValue());
	OutputIndex = Index;			OutputIndex = Index;
	}			}

	uint32_t Symbol::getTableIndex() const {			bool Symbol::isWeak() const {
				return (Flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_WEAK;
				}

				bool Symbol::isLocal() const {
				return (Flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_LOCAL;
				}

				bool Symbol::isHidden() const {
				return (Flags & WASM_SYMBOL_VISIBILITY_MASK) == WASM_SYMBOL_VISIBILITY_HIDDEN;
				}

				void Symbol::setHidden(bool IsHidden) {
				DEBUG(dbgs() << "setHidden: " << Name << " -> " << IsHidden << "\n");
				Flags &= ~WASM_SYMBOL_VISIBILITY_MASK;
				if (IsHidden)
				Flags |= WASM_SYMBOL_VISIBILITY_HIDDEN;
				else
				Flags |= WASM_SYMBOL_VISIBILITY_DEFAULT;
				}

				const WasmSignature &FunctionSymbol::getFunctionType() const {
				if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))
				return F->Signature;

				assert(FunctionType != nullptr);
				return *FunctionType;
				}

				void FunctionSymbol::setFunctionType(const WasmSignature *Type) {
				assert(FunctionType == nullptr);
				assert(!Chunk);
				FunctionType = Type;
				}

				uint32_t FunctionSymbol::getTableIndex() const {
	if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))			if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))
	return F->getTableIndex();			return F->getTableIndex();
	return TableIndex.getValue();			return TableIndex.getValue();
	}			}

	bool Symbol::hasTableIndex() const {			bool FunctionSymbol::hasTableIndex() const {
	if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))			if (auto *F = dyn_cast_or_null<InputFunction>(Chunk))
	return F->hasTableIndex();			return F->hasTableIndex();
	return TableIndex.hasValue();			return TableIndex.hasValue();
	}			}

	void Symbol::setTableIndex(uint32_t Index) {			void FunctionSymbol::setTableIndex(uint32_t Index) {
	// For imports, we set the table index here on the Symbol; for defined			// For imports, we set the table index here on the Symbol; for defined
	// functions we set the index on the InputFunction so that we don't export			// functions we set the index on the InputFunction so that we don't export
	// the same thing twice (keeps the table size down).			// the same thing twice (keeps the table size down).
	if (auto *F = dyn_cast_or_null<InputFunction>(Chunk)) {			if (auto *F = dyn_cast_or_null<InputFunction>(Chunk)) {
	F->setTableIndex(Index);			F->setTableIndex(Index);
	return;			return;
	}			}
	DEBUG(dbgs() << "setTableIndex " << Name << " -> " << Index << "\n");			DEBUG(dbgs() << "setTableIndex " << Name << " -> " << Index << "\n");
	assert(!TableIndex.hasValue());			assert(!TableIndex.hasValue());
	TableIndex = Index;			TableIndex = Index;
	}			}

	void Symbol::update(Kind K, InputFile *F, uint32_t Flags_, InputChunk *Chunk_,			uint32_t DefinedGlobal::getVirtualAddress() const {
	uint32_t Address) {			assert(isGlobal());
	SymbolKind = K;			DEBUG(dbgs() << "getVirtualAddress: " << getName() << "\n");
	File = F;			return Chunk ? dyn_cast<InputSegment>(Chunk)->translateVA(VirtualAddress)
	Flags = Flags_;			: VirtualAddress;
	Chunk = Chunk_;
	if (Address != UINT32_MAX)
	setVirtualAddress(Address);
	}

	bool Symbol::isWeak() const {
	return (Flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_WEAK;
	}

	bool Symbol::isLocal() const {
	return (Flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_LOCAL;
	}

	bool Symbol::isHidden() const {
	return (Flags & WASM_SYMBOL_VISIBILITY_MASK) == WASM_SYMBOL_VISIBILITY_HIDDEN;
	}			}

	void Symbol::setHidden(bool IsHidden) {			void DefinedGlobal::setVirtualAddress(uint32_t Value) {
	DEBUG(dbgs() << "setHidden: " << Name << " -> " << IsHidden << "\n");			DEBUG(dbgs() << "setVirtualAddress " << Name << " -> " << Value << "\n");
	Flags &= ~WASM_SYMBOL_VISIBILITY_MASK;			assert(isGlobal());
	if (IsHidden)			VirtualAddress = Value;
	Flags |= WASM_SYMBOL_VISIBILITY_HIDDEN;
	else
	Flags |= WASM_SYMBOL_VISIBILITY_DEFAULT;
	}			}

	std::string lld::toString(const wasm::Symbol &Sym) {			std::string lld::toString(const wasm::Symbol &Sym) {
	if (Config->Demangle)			if (Config->Demangle)
	if (Optional<std::string> S = demangleItanium(Sym.getName()))			if (Optional<std::string> S = demangleItanium(Sym.getName()))
	return "`" + *S + "'";			return "`" + *S + "'";
	return Sym.getName();			return Sym.getName();
	}			}
	Show All 16 Lines

wasm/Writer.cpp

Show First 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	private:
void writeSections();		void writeSections();

uint64_t FileSize = 0;		uint64_t FileSize = 0;
uint32_t DataSize = 0;		uint32_t DataSize = 0;
uint32_t NumMemoryPages = 0;		uint32_t NumMemoryPages = 0;

std::vector<const WasmSignature *> Types;		std::vector<const WasmSignature *> Types;
DenseMap<WasmSignature, int32_t, WasmSignatureDenseMapInfo> TypeIndices;		DenseMap<WasmSignature, int32_t, WasmSignatureDenseMapInfo> TypeIndices;
std::vector<const Symbol *> ImportedFunctions;		std::vector<const FunctionSymbol *> ImportedFunctions;
std::vector<const Symbol *> ImportedGlobals;		std::vector<const GlobalSymbol *> ImportedGlobals;
std::vector<WasmExportEntry> ExportedSymbols;		std::vector<WasmExportEntry> ExportedSymbols;
std::vector<const Symbol *> DefinedGlobals;		std::vector<const DefinedGlobal *> DefinedGlobals;
std::vector<InputFunction *> DefinedFunctions;		std::vector<InputFunction *> DefinedFunctions;
std::vector<const Symbol *> IndirectFunctions;		std::vector<const FunctionSymbol *> IndirectFunctions;
std::vector<WasmInitFunc> InitFunctions;		std::vector<WasmInitFunc> InitFunctions;

// Elements that are used to construct the final output		// Elements that are used to construct the final output
std::string Header;		std::string Header;
std::vector<OutputSection *> OutputSections;		std::vector<OutputSection *> OutputSections;

std::unique_ptr<FileOutputBuffer> Buffer;		std::unique_ptr<FileOutputBuffer> Buffer;
std::unique_ptr<SyntheticFunction> CtorFunction;		std::unique_ptr<SyntheticFunction> CtorFunction;
Show All 23 Lines	void Writer::createImportSection() {
if (NumImports == 0)		if (NumImports == 0)
return;		return;

SyntheticSection *Section = createSyntheticSection(WASM_SEC_IMPORT);		SyntheticSection *Section = createSyntheticSection(WASM_SEC_IMPORT);
raw_ostream &OS = Section->getStream();		raw_ostream &OS = Section->getStream();

writeUleb128(OS, NumImports, "import count");		writeUleb128(OS, NumImports, "import count");

for (const Symbol *Sym : ImportedFunctions) {		for (const FunctionSymbol *Sym : ImportedFunctions) {
WasmImport Import;		WasmImport Import;
Import.Module = "env";		Import.Module = "env";
Import.Field = Sym->getName();		Import.Field = Sym->getName();
Import.Kind = WASM_EXTERNAL_FUNCTION;		Import.Kind = WASM_EXTERNAL_FUNCTION;
Import.SigIndex = lookupType(Sym->getFunctionType());		Import.SigIndex = lookupType(Sym->getFunctionType());
writeImport(OS, Import);		writeImport(OS, Import);
}		}

▲ Show 20 Lines • Show All 53 Lines • ▼ Show 20 Lines
void Writer::createGlobalSection() {		void Writer::createGlobalSection() {
if (DefinedGlobals.empty())		if (DefinedGlobals.empty())
return;		return;

SyntheticSection *Section = createSyntheticSection(WASM_SEC_GLOBAL);		SyntheticSection *Section = createSyntheticSection(WASM_SEC_GLOBAL);
raw_ostream &OS = Section->getStream();		raw_ostream &OS = Section->getStream();

writeUleb128(OS, DefinedGlobals.size(), "global count");		writeUleb128(OS, DefinedGlobals.size(), "global count");
for (const Symbol *Sym : DefinedGlobals) {		for (const DefinedGlobal *Sym : DefinedGlobals) {
WasmGlobal Global;		WasmGlobal Global;
Global.Type.Type = WASM_TYPE_I32;		Global.Type.Type = WASM_TYPE_I32;
Global.Type.Mutable = Sym == WasmSym::StackPointer;		Global.Type.Mutable = Sym == WasmSym::StackPointer;
Global.InitExpr.Opcode = WASM_OPCODE_I32_CONST;		Global.InitExpr.Opcode = WASM_OPCODE_I32_CONST;
Global.InitExpr.Value.Int32 = Sym->getVirtualAddress();		Global.InitExpr.Value.Int32 = Sym->getVirtualAddress();
writeGlobal(OS, Global);		writeGlobal(OS, Global);
}		}
}		}
▲ Show 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	void Writer::createElemSection() {
writeUleb128(OS, 0, "table index");		writeUleb128(OS, 0, "table index");
WasmInitExpr InitExpr;		WasmInitExpr InitExpr;
InitExpr.Opcode = WASM_OPCODE_I32_CONST;		InitExpr.Opcode = WASM_OPCODE_I32_CONST;
InitExpr.Value.Int32 = kInitialTableOffset;		InitExpr.Value.Int32 = kInitialTableOffset;
writeInitExpr(OS, InitExpr);		writeInitExpr(OS, InitExpr);
writeUleb128(OS, IndirectFunctions.size(), "elem count");		writeUleb128(OS, IndirectFunctions.size(), "elem count");

uint32_t TableIndex = kInitialTableOffset;		uint32_t TableIndex = kInitialTableOffset;
for (const Symbol *Sym : IndirectFunctions) {		for (const FunctionSymbol *Sym : IndirectFunctions) {
assert(Sym->getTableIndex() == TableIndex);		assert(Sym->getTableIndex() == TableIndex);
writeUleb128(OS, Sym->getOutputIndex(), "function index");		writeUleb128(OS, Sym->getOutputIndex(), "function index");
++TableIndex;		++TableIndex;
}		}
}		}

void Writer::createCodeSection() {		void Writer::createCodeSection() {
if (DefinedFunctions.empty())		if (DefinedFunctions.empty())
▲ Show 20 Lines • Show All 286 Lines • ▼ Show 20 Lines	void Writer::createSections() {
}		}
}		}

void Writer::calculateImports() {		void Writer::calculateImports() {
for (Symbol *Sym : Symtab->getSymbols()) {		for (Symbol *Sym : Symtab->getSymbols()) {
if (!Sym->isUndefined() || (Sym->isWeak() && !Config->Relocatable))		if (!Sym->isUndefined() || (Sym->isWeak() && !Config->Relocatable))
continue;		continue;

if (Sym->isFunction()) {		if (auto *F = dyn_cast<FunctionSymbol>(Sym)) {
Sym->setOutputIndex(ImportedFunctions.size());		F->setOutputIndex(ImportedFunctions.size());
ImportedFunctions.push_back(Sym);		ImportedFunctions.push_back(F);
} else {		} else if (auto *G = dyn_cast<GlobalSymbol>(Sym)) {
Sym->setOutputIndex(ImportedGlobals.size());		G->setOutputIndex(ImportedGlobals.size());
ImportedGlobals.push_back(Sym);		ImportedGlobals.push_back(G);
}		}
}		}
}		}

void Writer::calculateExports() {		void Writer::calculateExports() {
bool ExportHidden = Config->Relocatable;		bool ExportHidden = Config->Relocatable;
StringSet<> UsedNames;		StringSet<> UsedNames;

▲ Show 20 Lines • Show All 71 Lines • ▼ Show 20 Lines	void Writer::calculateTypes() {

for (ObjFile *File : Symtab->ObjectFiles) {		for (ObjFile *File : Symtab->ObjectFiles) {
ArrayRef<WasmSignature> Types = File->getWasmObj()->types();		ArrayRef<WasmSignature> Types = File->getWasmObj()->types();
for (uint32_t I = 0; I < Types.size(); I++)		for (uint32_t I = 0; I < Types.size(); I++)
if (File->TypeIsUsed[I])		if (File->TypeIsUsed[I])
File->TypeMap[I] = registerType(Types[I]);		File->TypeMap[I] = registerType(Types[I]);
}		}

for (const Symbol *Sym : ImportedFunctions)		for (const FunctionSymbol *Sym : ImportedFunctions)
registerType(Sym->getFunctionType());		registerType(Sym->getFunctionType());

for (const InputFunction *F : DefinedFunctions)		for (const InputFunction *F : DefinedFunctions)
registerType(F->Signature);		registerType(F->Signature);
}		}

void Writer::assignIndexes() {		void Writer::assignIndexes() {
uint32_t GlobalIndex = ImportedGlobals.size() + DefinedGlobals.size();		uint32_t GlobalIndex = ImportedGlobals.size() + DefinedGlobals.size();
uint32_t FunctionIndex = ImportedFunctions.size() + DefinedFunctions.size();		uint32_t FunctionIndex = ImportedFunctions.size() + DefinedFunctions.size();

auto AddDefinedGlobal = [&](Symbol* Sym) {		auto AddDefinedGlobal = [&](DefinedGlobal *Sym) {
if (Sym) {		if (Sym) {
DefinedGlobals.emplace_back(Sym);		DefinedGlobals.emplace_back(Sym);
Sym->setOutputIndex(GlobalIndex++);		Sym->setOutputIndex(GlobalIndex++);
}		}
};		};
AddDefinedGlobal(WasmSym::StackPointer);		AddDefinedGlobal(WasmSym::StackPointer);
AddDefinedGlobal(WasmSym::HeapBase);		AddDefinedGlobal(WasmSym::HeapBase);
AddDefinedGlobal(WasmSym::DataEnd);		AddDefinedGlobal(WasmSym::DataEnd);

if (Config->Relocatable)		if (Config->Relocatable)
DefinedGlobals.reserve(Symtab->getSymbols().size());		DefinedGlobals.reserve(Symtab->getSymbols().size());

uint32_t TableIndex = kInitialTableOffset;		uint32_t TableIndex = kInitialTableOffset;

if (Config->Relocatable) {		if (Config->Relocatable) {
for (ObjFile *File : Symtab->ObjectFiles) {		for (ObjFile *File : Symtab->ObjectFiles) {
DEBUG(dbgs() << "Globals: " << File->getName() << "\n");		DEBUG(dbgs() << "Globals: " << File->getName() << "\n");
for (Symbol *Sym : File->getSymbols()) {		for (Symbol *Sym : File->getSymbols()) {
// Create wasm globals for data symbols defined in this file		// Create wasm globals for data symbols defined in this file
if (!Sym->isDefined() || File != Sym->getFile())		if (File != Sym->getFile())
continue;		continue;
if (Sym->isFunction())		if (auto *G = dyn_cast<DefinedGlobal>(Sym))
continue;		AddDefinedGlobal(G);

AddDefinedGlobal(Sym);
}		}
}		}
}		}

for (ObjFile *File : Symtab->ObjectFiles) {		for (ObjFile *File : Symtab->ObjectFiles) {
DEBUG(dbgs() << "Functions: " << File->getName() << "\n");		DEBUG(dbgs() << "Functions: " << File->getName() << "\n");
for (InputFunction *Func : File->Functions) {		for (InputFunction *Func : File->Functions) {
if (!Func->Live)		if (!Func->Live)
continue;		continue;
DefinedFunctions.emplace_back(Func);		DefinedFunctions.emplace_back(Func);
Func->setOutputIndex(FunctionIndex++);		Func->setOutputIndex(FunctionIndex++);
}		}
}		}

for (ObjFile *File : Symtab->ObjectFiles) {		for (ObjFile *File : Symtab->ObjectFiles) {
DEBUG(dbgs() << "Handle relocs: " << File->getName() << "\n");		DEBUG(dbgs() << "Handle relocs: " << File->getName() << "\n");
auto HandleRelocs = [&](InputChunk *Chunk) {		auto HandleRelocs = [&](InputChunk *Chunk) {
if (!Chunk->Live)		if (!Chunk->Live)
return;		return;
ArrayRef<WasmSignature> Types = File->getWasmObj()->types();		ArrayRef<WasmSignature> Types = File->getWasmObj()->types();
for (const WasmRelocation& Reloc : Chunk->getRelocations()) {		for (const WasmRelocation& Reloc : Chunk->getRelocations()) {
if (Reloc.Type == R_WEBASSEMBLY_TABLE_INDEX_I32 ||		if (Reloc.Type == R_WEBASSEMBLY_TABLE_INDEX_I32 ||
Reloc.Type == R_WEBASSEMBLY_TABLE_INDEX_SLEB) {		Reloc.Type == R_WEBASSEMBLY_TABLE_INDEX_SLEB) {
Symbol *Sym = File->getFunctionSymbol(Reloc.Index);		FunctionSymbol *Sym = File->getFunctionSymbol(Reloc.Index);
if (Sym->hasTableIndex() || !Sym->hasOutputIndex())		if (Sym->hasTableIndex() || !Sym->hasOutputIndex())
continue;		continue;
Sym->setTableIndex(TableIndex++);		Sym->setTableIndex(TableIndex++);
IndirectFunctions.emplace_back(Sym);		IndirectFunctions.emplace_back(Sym);
} else if (Reloc.Type == R_WEBASSEMBLY_TYPE_INDEX_LEB) {		} else if (Reloc.Type == R_WEBASSEMBLY_TYPE_INDEX_LEB) {
Chunk->File->TypeMap[Reloc.Index] = registerType(Types[Reloc.Index]);		Chunk->File->TypeMap[Reloc.Index] = registerType(Types[Reloc.Index]);
Chunk->File->TypeIsUsed[Reloc.Index] = true;		Chunk->File->TypeIsUsed[Reloc.Index] = true;
}		}
▲ Show 20 Lines • Show All 172 Lines • Show Last 20 Lines
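
The `Writer::calculateImports` change above replaces the `isFunction()` branch with `dyn_cast<FunctionSymbol>` / `dyn_cast<GlobalSymbol>`, so each import list can hold a typed pointer. The following is a minimal standalone sketch of that pattern, not the lld code: the classes are simplified stand-ins, `dynCast` is a hypothetical substitute for `llvm::dyn_cast`, and the `Imports` struct and free function `calculateImports` are assumptions made so the example compiles without LLVM.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the wasm Symbol base class (hypothetical).
class Symbol {
public:
  enum Kind {
    DefinedFunctionKind,
    UndefinedFunctionKind,
    DefinedGlobalKind,
    UndefinedGlobalKind,
  };

  Kind kind() const { return SymbolKind; }
  const std::string &getName() const { return Name; }
  bool isUndefined() const {
    return SymbolKind == UndefinedFunctionKind ||
           SymbolKind == UndefinedGlobalKind;
  }

protected:
  Symbol(std::string N, Kind K) : Name(std::move(N)), SymbolKind(K) {}

private:
  std::string Name;
  Kind SymbolKind;
};

// Function symbols: classof() lets dynCast recognise them by kind tag.
class FunctionSymbol : public Symbol {
public:
  FunctionSymbol(std::string N, bool Defined)
      : Symbol(std::move(N),
               Defined ? DefinedFunctionKind : UndefinedFunctionKind) {}
  static bool classof(const Symbol *S) {
    return S->kind() == DefinedFunctionKind ||
           S->kind() == UndefinedFunctionKind;
  }
  void setOutputIndex(uint32_t I) { OutputIndex = I; }
  uint32_t getOutputIndex() const { return OutputIndex; }

private:
  uint32_t OutputIndex = UINT32_MAX;
};

// Global symbols, same shape as FunctionSymbol.
class GlobalSymbol : public Symbol {
public:
  GlobalSymbol(std::string N, bool Defined)
      : Symbol(std::move(N),
               Defined ? DefinedGlobalKind : UndefinedGlobalKind) {}
  static bool classof(const Symbol *S) {
    return S->kind() == DefinedGlobalKind ||
           S->kind() == UndefinedGlobalKind;
  }
  void setOutputIndex(uint32_t I) { OutputIndex = I; }
  uint32_t getOutputIndex() const { return OutputIndex; }

private:
  uint32_t OutputIndex = UINT32_MAX;
};

// Hypothetical minimal analogue of llvm::dyn_cast: returns nullptr when the
// kind tag does not match the requested class.
template <typename To> To *dynCast(Symbol *S) {
  return To::classof(S) ? static_cast<To *>(S) : nullptr;
}

// Typed import lists, mirroring the new Writer member types.
struct Imports {
  std::vector<const FunctionSymbol *> Functions;
  std::vector<const GlobalSymbol *> Globals;
};

// Collect undefined symbols into typed lists and assign output indexes.
Imports calculateImports(const std::vector<Symbol *> &Symbols) {
  Imports Out;
  for (Symbol *Sym : Symbols) {
    if (!Sym->isUndefined())
      continue;
    if (auto *F = dynCast<FunctionSymbol>(Sym)) {
      F->setOutputIndex(static_cast<uint32_t>(Out.Functions.size()));
      Out.Functions.push_back(F);
    } else if (auto *G = dynCast<GlobalSymbol>(Sym)) {
      G->setOutputIndex(static_cast<uint32_t>(Out.Globals.size()));
      Out.Globals.push_back(G);
    }
  }
  return Out;
}
```

Because the lists store `const FunctionSymbol *` and `const GlobalSymbol *`, later loops such as the one in `createImportSection` can call function-only accessors without a cast at the use site.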


[WebAssembly] Use Symbol class heirarchy. NFC.ClosedPublic
