This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly][lld] Preassign table number 0 to indirect function table for MVP inputs
ClosedPublic

Authored by wingo on Feb 3 2021, 11:50 PM.

Details

Summary

MVP object files may import at most one table, and if they do, it must
be assigned table number zero in the output, as the references to that
table are not relocatable. Ensure that this is the case, even if some
inputs define other tables.

Diff Detail

Event Timeline

wingo created this revision.Feb 3 2021, 11:50 PM
wingo requested review of this revision.Feb 3 2021, 11:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 3 2021, 11:50 PM
wingo added a comment.Feb 4 2021, 1:51 AM

Looking over this changeset, I see that we need some tests. But also there is a fair amount of code duplication for the "precoloring" logic. I can make it be a part of TableSymbol::setTableNumber and InputTable::setTableNumber; it would seem that that much duplication is necessary. But then the errors would probably be less helpful because there's less context at that point. Thoughts?

Looking over this changeset, I see that we need some tests. But also there is a fair amount of code duplication for the "precoloring" logic. I can make it be a part of TableSymbol::setTableNumber and InputTable::setTableNumber; it would seem that that much duplication is necessary. But then the errors would probably be less helpful because there's less context at that point. Thoughts?

IMO having helpful error messages is well worth some code duplication. Code duplication won't cause users to file the same bug every other week about their links failing, but unhelpful error messages will!

lld/wasm/Driver.cpp
804

Can this dyn_cast ever fail? If not, it would be better to use cast here or even make the parameter a TableSymbol.

lld/wasm/InputFiles.cpp
340

Same question about this dyn_cast.

lld/wasm/SyntheticSections.cpp
118–119

Would it be possible to include the name of an object file that contains the offending non-relocatable table? I've anecdotally found that including file names in similar error messages about unlinkable MVP and atomics objects has dramatically decreased the number of bugs that get filed about it.

lld/wasm/Writer.cpp
597

It would be good to have a short comment here as well about why some tables need to be deferred.

I'm not familiar with the use of term precoloring. Perhaps I should be.. but maybe we could use something more common like preassignTableNumber rather than precolorTableNumber? Certainly I don't think we want to use the term in the user-facing errors.

lld/wasm/InputFiles.cpp
343

I'm not sure this is a very usefull user-facing error. Can it ever actually happen? Can we improve it or perhaps make it an assert?

345

Can we simplify the code here by asserting that the table number is always zero and that there is only one table.

In fact, perhaps this should be called synthesizeTableSymbol (non-plural) and just return after for first call to addTable?

lld/wasm/SymbolTable.cpp
213 ↗(On Diff #321326)

I think this can be an unconditionally cast<> can't it?

219 ↗(On Diff #321326)

Similarly, since there is exactly one nonrelocatable table number (zero)... can we simplify this code based on that fact?

sbc100 added inline comments.Feb 4 2021, 12:45 PM
lld/wasm/SyntheticSections.cpp
121

Can this error actually happy? Wouldn't it be a bug in lld if it did? It seems like in order for this to happen there would be a bug in the pre-coloring logic.

wingo added a comment.Feb 4 2021, 11:52 PM

Thanks very much for the review!

The "precoloring" name is a bit sloppy -- comes from register allocation FWIW, where you solve the graph coloring problem for the interference graph of register uses. Some nodes require that a register be in a specific place, e.g. %rdi for argument 0 on x86; imposing that constraint is called precoloring. So for tables, we also have a kind of precoloring constraint. But, probably best to just used the term "preassign".

lld/wasm/Driver.cpp
804

I was thinking that it could fail if the existing symbol isn't a table. In that case we would have already issued an error, but control flow could still fall through here. But now I see that we make an early-return in that case below; will fix.

lld/wasm/InputFiles.cpp
340

Unfortunately it is possible that e.g. SymbolTable::createUndefinedTable returns a non-table symbol, if there was an existing symtab entry. We would have issued an error at that point though.

343

This is a sloppy error and I need to reword it; tx for catching.

345

Humm, good question! Old linkers would not propagate "additional" tables from input to output. Also, old compilers wouldn't define a table locally -- the indirect function table would only be imported. So maybe it would be good to avoid adding a new form of acceptable input, in which object files have many table symbols.

If we tighten this restriction, we can issue an error for inputs that

  • have no table symbols. and:
    • import any table not named __indirect_function_table, or
    • import more than one table, or
    • define any table

WDYT?

lld/wasm/SymbolTable.cpp
213 ↗(On Diff #321326)

AFAICS, sadly, no :( Otherwise we could simplify checkTableType as well.

219 ↗(On Diff #321326)

Sure. I can add a isPreassignedToTableNumberZero method or so. Surely with a better name.

lld/wasm/SyntheticSections.cpp
118–119

Sure.

121

I was thinking that it could happen if you had an MVP input that defined a table locally (not imported), but a reftypes input imported a table. But prompted by your questions I think I see that MVP inputs wouldn't ever define tables, so this error can't happen.

lld/wasm/Writer.cpp
597

Sure.

wingo updated this revision to Diff 321787.Feb 5 2021, 8:48 AM

Address feedback

wingo retitled this revision from [WebAssembly][lld] Precolor table numbers for MVP inputs to [WebAssembly][lld] Preassign table number 0 to indirect function table for MVP inputs.Feb 5 2021, 8:50 AM
wingo edited the summary of this revision. (Show Details)
sbc100 added inline comments.Feb 5 2021, 9:09 AM
lld/wasm/InputFiles.cpp
345

Sounds great!

wingo updated this revision to Diff 322368.Feb 9 2021, 6:11 AM
wingo edited the summary of this revision. (Show Details)

Fixup casts

wingo updated this revision to Diff 322379.Feb 9 2021, 6:48 AM

Tighten up restrictions on MVP inputs

wingo updated this revision to Diff 322384.Feb 9 2021, 7:20 AM

Add test

wingo added a comment.Feb 9 2021, 7:21 AM

So.... this one is ready for review again I think. I haven't been able to test some of the the error cases in synthesizeMVPIndirectFunctionTableIfNeeded because you can't actually make files with those characteristics currently.

sbc100 edited the summary of this revision. (Show Details)Feb 9 2021, 7:49 AM
sbc100 added inline comments.Feb 9 2021, 8:05 AM
lld/wasm/InputFiles.cpp
351

For this last paragraphs how about:

"Table usage in MVP objects cannot be relocatates so the linker must ensure that this
table gets assigned index zero".

wingo marked 7 inline comments as done.Feb 9 2021, 8:14 AM
sbc100 added inline comments.Feb 9 2021, 9:00 AM
lld/wasm/Driver.cpp
804

Can these two lines be moved into SymbolTable::addSyntheticTable (which already has a check for an existing symbol). Then you don't need to pass the extra existing arg here, and you don't need to extra existingTable table below?

hmm.. maybe that would make addSyntheticTable a little less tidy, but it seems like logical place... either that or replaceSymbol itself.

In general this notion of preassigned index seems to complicate things in a way that I'm not a fan of. I wish we could find a simpler way to force the indirect table to always have index 0.

wingo marked 2 inline comments as done.Feb 10 2021, 12:40 AM
wingo added inline comments.
lld/wasm/Driver.cpp
804

Right, the whole notion of symbols with preassigned addresses (table numbers in this case) is... unsatisfying. But it does seem to be essential to the constraints we're operating under. I thought that keeping the bits that much about with preassigned addresses local to the parts that deal in tables (and specifically the indirect function table) would limit the "taint".

Probably replaceSymbol is not the place for this to go, as other kinds of symbols rightly don't have a notion of preassigned addresses.

Incidentally, looking again, I don't think the code as it is works, because of the placement new in replaceSymbol :/ Will fix, I guess in addSyntheticTable.

wingo updated this revision to Diff 322626.Feb 10 2021, 2:20 AM

Address feedback; fix placement-new bug

wingo marked 3 inline comments as done.Feb 11 2021, 1:33 AM
wingo updated this revision to Diff 322937.Feb 11 2021, 3:11 AM

Fix comment

PTAL, I think all comments are addressed. Adding tlively in case he would like to weigh in.

This looks great, thanks @wingo! My only comments are about comments and error message contents. It would also be good to add tests for more of the error messages, especially the one about "Cannot preassign table number 0 to indirect function table..." because I expect that to be a common user-facing issue.

lld/wasm/InputFiles.cpp
333
368–369

I would remove the last sentence here and in the errors below. -mattr=+reference-types is an llc flag but not a clang flag, so it is probably not what the user wants. Also, clang should never produce objects that produce this error no matter what flags are passed to it.

lld/wasm/Symbols.cpp
391

It would be good to add a comment here along the lines of "zero is the only table number that can be preassigned." Otherwise from just the name of the function, it looks strange that other numbers aren't handled gracefully.

lld/wasm/SyntheticSections.cpp
237–238

The indirect function table is imported in Emscripten's dynamic linking scheme, so this isn't necessarily unexpected.

248–250

-mreference-types is the clang flags for enabling this feature.

Or another option would be to be tool- and flag-agnostic by saying "feature 'reference-types'" instead of naming particular flags.

we when we comes to assigning indexes we do something like this:

lld/wasm/Symbols.h
382

I feel a little bad for pushing back more on the this change, but it still feels a little more invasive than it could be.

Rather than adding these two methods to the table class what do you think about this appoach:

  • Add a global config variable that gets set if we have MVP objects in the link. We could call this something like`bool legacyFunctionTable = false`. Or alternatively "forceFunctioTableToBeZero"? Or "functionTableMustBeZero"? I kind of like the idea of a bool here because it clearly indicates that are two different modes.
  • We already track the global symbol indirectFunctionTable.
  • Then, we when we comes to assigning indexes we do something like this:
if (condig->legacyFunctionTable) {
     placeIndirectFunctionTableFirst()   <-- here we could report errors if the sorting is impossible
}

This sorting and error reporting could be done in the TableSection::assignIndexes().

Hopefully this should avoid adding the preassignTableNumberZero/hasPreassignedTableNumberZero methods and should avoid extra logic to too much extra logic addTable method.

WDYT?

(I guess it would also need to happen in ImportSection::assignIndexes...)

These are mostly cosmetic changes, that actual behaviour LGTM.

sbc100 added inline comments.Feb 11 2021, 2:03 PM
lld/wasm/SyntheticSections.cpp
250

Can we be more succinct here? If you feel its is helpful/necessary to convey all this information than perhaps this verbosity is unavoidable.

Would something shorter such as "object file was not built -mattr=+reference-types enabled. This is incompatible with the use tables imports in <other-object-name>" be enough?

wingo marked 3 inline comments as done.Feb 12 2021, 12:36 AM
wingo added inline comments.
lld/wasm/Symbols.h
382

This can work. It's not quite as minimal as this because tables can also get assigned numbers via the import section. I'll give it a go and see.

lld/wasm/SyntheticSections.cpp
250

We can certainly be more succinct but I don't know how to be as informative. As the comment notes, we don't have a good way to know where the "conflicting" import comes from AFAIU.

wingo updated this revision to Diff 323240.Feb 12 2021, 1:02 AM

Fix error messages and rebase

wingo updated this revision to Diff 323264.Feb 12 2021, 2:09 AM

Instead of tracking unrelocatable uses of the indirect function table on the table symbol, just use a global flag in the config. The writer will take care to handle the indirect function table symbol specially if this flag is set, writing it before any other table.

wingo marked 4 inline comments as done.Feb 12 2021, 2:11 AM

Addressed feedback; @sbc100 would you mind having another look?

sbc100 accepted this revision.Feb 12 2021, 7:50 AM

Nice! I like this much better now. Thanks for making all those changes.

Just a few more nits and I think we can land this.

lld/test/wasm/invalid-mvp-table-use.s
3 ↗(On Diff #323264)

Can you elaborate on this comment. What makes this an MVP file ? Is it simply the lack of -mattr=+reference-types.

Not part of this PR, but should we have llvm-mc fail here to catch this earlier?

lld/wasm/InputFiles.cpp
380–382

Can we replace this above block with simply:

lld/wasm/InputFiles.h
161

Can we call this addLegacyIndirectFunctionTableSymbolIfNeeded now so matches the config name?

lld/wasm/SyntheticSections.cpp
220

nit: Moving these method back down below writeBody would make the diff easier to read (smaller).

227

Can we combine these two into a single condition to remove one level of testing? In fact can we combine all three maybe?

lld/wasm/Writer.cpp
589

This can be combined into single condition (with no braces).

593

Oh nice so we always import this first, regardless of the config flag.. I like that.

This revision is now accepted and ready to land.Feb 12 2021, 7:50 AM
wingo updated this revision to Diff 323335.Feb 12 2021, 8:05 AM
wingo marked 4 inline comments as done.

Address feedback

lld/test/wasm/invalid-mvp-table-use.s
3 ↗(On Diff #323264)

Elaborated. As regards llvm-mc, there will need to be some changes when we finally add (again) support for symtab entries; will follow up then.

lld/wasm/InputFiles.cpp
380–382

Missing comment here, perhaps?

wingo marked 2 inline comments as done.Feb 12 2021, 8:06 AM
sbc100 added inline comments.Feb 12 2021, 8:10 AM
lld/wasm/InputFiles.cpp
380–382

Sorry, didn't mean to send that.