This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Fix linkage and visibility info of IR global syms
AbandonedPublic

Authored by pmatos on Dec 14 2021, 12:06 PM.

Details

Reviewers
sbc100
Summary

The visibility and linkage information of symbols referring to IR
globals is broken. This patches fixes this while refactoring the code
setting symbol types in WasmSyms and emitting the external
declarations.

Emission of external declarations is now done in the first execution
of emitConstantPool rather than in emitLinkage (and a few other
places). This is the point where we have already gathered information
about used symbols (by running the MC Lower PrePass) and not yet
started emitting any functions so that any declarations that need to
be emitted are done so at the top of the file before any functions.

This changes the order of a few directives in the final asm file which
required an update to a few tests.

Diff Detail

Event Timeline

pmatos created this revision.Dec 14 2021, 12:06 PM
pmatos requested review of this revision.Dec 14 2021, 12:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 14 2021, 12:06 PM

My understanding is that labels like that in the assembly mean you are defining a table ... for referencing an external table it should be enough to just include a .tabletype.

Maybe I'm missing the motivation for this change though?

llvm/test/CodeGen/WebAssembly/externref-tableset.ll
97

Wouldn't this line be declaring a defined table rather than an external/undefined one?

pmatos added inline comments.Dec 14 2021, 11:45 PM
llvm/test/CodeGen/WebAssembly/externref-tableset.ll
97

But we are defining a table, right? In line 6 with @externref_table = local_unnamed_addr addrspace(1) global [0 x %externref] undef.

You are right, I think in that .tabletype is enough to access an external table, which is why we don't need one for the __funcref_call_table in funcref-table_call.ll.

sbc100 added inline comments.Dec 15 2021, 8:11 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
204

Are all symbols here defined? (As opposed to external?) Below we check GV->hasInitializer().. should we do something like that? (also the below code add OutStreamer->AddBlankLine(); after, should we do that too, and try to share code maybe?

llvm/test/CodeGen/WebAssembly/externref-tableset.ll
97

Oh I see, in that case that makes sense. I was thrown off by the undef on that line.. what does that signal?

pmatos added inline comments.Dec 21 2021, 3:33 AM
llvm/test/CodeGen/WebAssembly/externref-tableset.ll
97

That just makes the initial table as undefined. Although I think it could be null as well. Unsure in practical terms what the different is though.

pmatos updated this revision to Diff 398894.Jan 11 2022, 3:40 AM

Simplified code to emit labels for tables and tabletype directive.

@sbc100 not emitting .globl for tables since i think those are not necessary given .tabletype. Is this the case? That's the reason why I am not calling emitLinkage if the symbol is a table in emitGlobalVariable.

sbc100 added inline comments.Jan 11 2022, 4:30 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
214

Are you saying that for undefined/external tables we call emitTableType elsewhere? IIUC the table type needs to be declared ragardless of whether the table has a definition (hasInitializer()) or not, right?

This comment seems a little odd to me given that the code below doesn't just handle table symbols.

pmatos added inline comments.Jan 11 2022, 8:34 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
214

Yes, so if the table is extern it's tabletype is emitted in emitExternDecls. If it's a global defined in the current compilation unit, then it needs to be emitted here. That's what I was trying to say with my comment, but maybe I need to be more precise.

Regarding the mention of hasInitializer(), I am unsure how to actually test this. I haven't managed to create a table as a global that doesn't enter the condition, i.e. without an initializer. Do you have a suggestion?

sbc100 added inline comments.Jan 11 2022, 8:44 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
214

I'm not sure... with globals and functions and data symbols you just skip the label, and that makes them externally defined (i.e. without an initializer).

It could be that for some reason we don't support that yet, but maybe thats a case for anther CL?

pmatos added inline comments.Jan 11 2022, 9:26 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
214

we might be talking about different things when referring to globals. Sorry. Just to clarify. You are referring to globals in asm, as far as I understand. I am referring to globals in LLVM IR. Tables are implemented as LLVM IR globals (https://llvm.org/docs/LangRef.html#global-variables) .

If I define a table as an LLVM IR global, the question is, what asm directives are needed to define that table. I would think we need just .tabletype and the label to actually allocate the table symbol. At the moment, all tables start by being initialized with undef in IR. However, that might mean that at the asm level they always have an initializer.

sbc100 added inline comments.Jan 11 2022, 9:47 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
214

Perhaps it would help to distinguish between "define a table" and "declare a table".

I don't know about the LLVM IR but at the asm level, one declares something simply by giving it a type, but defines it by giving it a label. In addition to the label one can declare it ".globl" in which case other compilations units can also see it.

pmatos added inline comments.Jan 11 2022, 11:20 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
214

Ah - that's something that was not totally clear to me. Now I understand what .globl is about. Let me restructure the patch given what you said. Thanks.

@sbc100 Unfortunately while adding some tests to this including testing private IR globals, which don't emit a .globl for the symbol I found a crash, which I now fixed and an issue with multiple globals in the same compilation unit. For example, two tables don't work because .tabletype is only emitted for the first. emitLinkage is called on each symbol in turn, and emitExternalDecls emits the tabletype for the symbol, however it only runs for the first symbol because emitExternalDecls only actually runs the first time it's called. This is very unfortunately and will need a bit of refactoring. I am working on it.

pmatos updated this revision to Diff 400550.Jan 17 2022, 7:26 AM

Refactored places where type related utility functions are and refactored some of the symbol type creation.

Noticed some issues with generating duplicate .tabletype which was fixed by moving emitExternalDecls to onFinalization. However, this doesn't seem like a good solution since the .*type directives need to show up before being used, and therefore cannot come at the end of the file.

@sbc100 I think I understand where this is going wrong. As far as I understand emitExternalDecls should only be emitting .functype, etc directives for external symbols and yet it's doing so for all symbols in OutContext, even those which are not external. I will have to revisit my assumptions and take a look at the code again, since the current solution of running emitExternalDecls in onFinalization means all directives come at the end and that doesn't work because it complains if, for example a reference to bar comes before bars type declaration.

@sbc100 I think I understand where this is going wrong. As far as I understand emitExternalDecls should only be emitting .functype, etc directives for external symbols and yet it's doing so for all symbols in OutContext, even those which are not external. I will have to revisit my assumptions and take a look at the code again, since the current solution of running emitExternalDecls in onFinalization means all directives come at the end and that doesn't work because it complains if, for example a reference to bar comes before bars type declaration.

Yes, it makes sense to me that emitExternalDecls only applies to external symbols (those not defined in the module). However, I would have thought that for typechecking reasons (we do basic type checking, unlike other asm backend) that all the external decls always come first.

@sbc100 I think I understand where this is going wrong. As far as I understand emitExternalDecls should only be emitting .functype, etc directives for external symbols and yet it's doing so for all symbols in OutContext, even those which are not external. I will have to revisit my assumptions and take a look at the code again, since the current solution of running emitExternalDecls in onFinalization means all directives come at the end and that doesn't work because it complains if, for example a reference to bar comes before bars type declaration.

Yes, it makes sense to me that emitExternalDecls only applies to external symbols (those not defined in the module). However, I would have thought that for typechecking reasons (we do basic type checking, unlike other asm backend) that all the external decls always come first.

Yes, it's the type checking that's tripping the current patch. My plan is to go back to emitExternalDecls and ensure we really only emit the external decls first.

pmatos updated this revision to Diff 402494.Jan 24 2022, 6:11 AM

Full patch for required refactoring and fix.

I understand this will require some explanations, which will follow as well as an update to the issue description.

Refactored the setting of the Wasm Symbol type into WasmSymbolSetType in WebAssemblyAsmPrinter.cpp and WebAssemblyMCInstLower.cpp. However, this forced us to move the function to WebAssemblyTypeUtilities.h and refactor some other type-related functions from WebAssemblyUtilities to WebAssemblyTypeUtilities.

This allowed us to simplify the emitLinkage implementation by removing the overridden function and moving the call of emitExternalDecls to emitConstantPool, which is the first function to be called when a function header is emitted.

So now we can write something like table-types.ll, which before didn't work at all because global symbols like tables where not being emitted besides the first one. Also declared only tables where not being emitted but now they are. After the clarification between .tabletype, .globl and label, I feel like this ensures the right directives are correctly emitted. The different combinations are checked through table-types.ll.

pmatos updated this revision to Diff 402537.Jan 24 2022, 8:18 AM
pmatos retitled this revision from [WebAssembly] Emit symbol labels for table global symbols to [WebAssembly] Fix linkage and visibility info of IR global syms.
pmatos edited the summary of this revision. (Show Details)

Update the bug summary, no code changes.

I general,g this seems like a good change..

I wonder if you could split out the emitLinkage -> emitConstant pool change? That would hopefully capture a lot of the test churn and seems maybe logically separable?

llvm/lib/Target/WebAssembly/Utils/WebAssemblyTypeUtilities.h
111 ↗(On Diff #402537)

The other functions in this file start with a lowercase letter..

111 ↗(On Diff #402537)

Is this the right place for this function to live? I don't see any other MC stuff in this file.

llvm/lib/Target/WebAssembly/Utils/WebAssemblyUtilities.cpp
19 ↗(On Diff #402537)

Is this change needed?

llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
287

Is this re-entrancy check still needed? i.e. maybe we now just call emitExternalDecls once?

545

It seems odd to do anything substantive here, but despite the fact that "WebAssembly disables constant pools".

I guess we you choose this point because its happens to occur at the correct point in the asm printing process.. and maybe there is not other hook we can/should use at that point?

pmatos abandoned this revision.Jan 25 2022, 3:45 AM

@sbc100 Thanks for the comments, I have split this into D118121 and D118122. I will now apply the comments from your reviews to the new revisions.

llvm/lib/Target/WebAssembly/Utils/WebAssemblyUtilities.cpp
19 ↗(On Diff #402537)

No - this will be removed.

pmatos added inline comments.Jan 25 2022, 4:10 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
287

Yes, the reentrancy check is still needed. The function emitConstantPool is called once for each function when emitting function headers. We want to emit this before the first function.

545

Yes, unfortunately, it would have been much better to override emitFunctionHeader however that function is not virtual. So the candidates become emitConstantPool or (called a bit later) emitFunctionEntryLabel. The later is not great thought because some visibility/linkage directives for the function might be emitted before the label and we might end up mixing declaration of globals with function directives which doesn't look so good. Therefore, we are left with doing this in emitConstantPool.

sbc100 added inline comments.Jan 26 2022, 11:00 AM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
545

Ho about we add a comment here... something like "We use emitConstantPool to perform emitExternalDecls the first time its called so that the type checker sees all extarnal decls before any function bodies. We probably should use emitFunctionHeader instead but that is not virtual".

Perhaps as a followup we could consider making emitFunctionHeader virtual?