MVP object files may import at most one table, and if they do, it must
be assigned table number zero in the output, as the references to that
table are not relocatable. Ensure that this is the case, even if some
inputs define other tables.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looking over this changeset, I see that we need some tests. But also there is a fair amount of code duplication for the "precoloring" logic. I can make it be a part of TableSymbol::setTableNumber and InputTable::setTableNumber; it would seem that that much duplication is necessary. But then the errors would probably be less helpful because there's less context at that point. Thoughts?
IMO having helpful error messages is well worth some code duplication. Code duplication won't cause users to file the same bug every other week about their links failing, but unhelpful error messages will!
lld/wasm/Driver.cpp | ||
---|---|---|
804 | Can this dyn_cast ever fail? If not, it would be better to use cast here or even make the parameter a TableSymbol. | |
lld/wasm/InputFiles.cpp | ||
340 | Same question about this dyn_cast. | |
lld/wasm/SyntheticSections.cpp | ||
118–119 | Would it be possible to include the name of an object file that contains the offending non-relocatable table? I've anecdotally found that including file names in similar error messages about unlinkable MVP and atomics objects has dramatically decreased the number of bugs that get filed about it. | |
lld/wasm/Writer.cpp | ||
597 | It would be good to have a short comment here as well about why some tables need to be deferred. |
I'm not familiar with the use of term precoloring. Perhaps I should be.. but maybe we could use something more common like preassignTableNumber rather than precolorTableNumber? Certainly I don't think we want to use the term in the user-facing errors.
lld/wasm/InputFiles.cpp | ||
---|---|---|
343 | I'm not sure this is a very usefull user-facing error. Can it ever actually happen? Can we improve it or perhaps make it an assert? | |
345 | Can we simplify the code here by asserting that the table number is always zero and that there is only one table. In fact, perhaps this should be called synthesizeTableSymbol (non-plural) and just return after for first call to addTable? | |
lld/wasm/SymbolTable.cpp | ||
213 ↗ | (On Diff #321326) | I think this can be an unconditionally cast<> can't it? |
219 ↗ | (On Diff #321326) | Similarly, since there is exactly one nonrelocatable table number (zero)... can we simplify this code based on that fact? |
lld/wasm/SyntheticSections.cpp | ||
---|---|---|
121 | Can this error actually happy? Wouldn't it be a bug in lld if it did? It seems like in order for this to happen there would be a bug in the pre-coloring logic. |
Thanks very much for the review!
The "precoloring" name is a bit sloppy -- comes from register allocation FWIW, where you solve the graph coloring problem for the interference graph of register uses. Some nodes require that a register be in a specific place, e.g. %rdi for argument 0 on x86; imposing that constraint is called precoloring. So for tables, we also have a kind of precoloring constraint. But, probably best to just used the term "preassign".
lld/wasm/Driver.cpp | ||
---|---|---|
804 | I was thinking that it could fail if the existing symbol isn't a table. In that case we would have already issued an error, but control flow could still fall through here. But now I see that we make an early-return in that case below; will fix. | |
lld/wasm/InputFiles.cpp | ||
340 | Unfortunately it is possible that e.g. SymbolTable::createUndefinedTable returns a non-table symbol, if there was an existing symtab entry. We would have issued an error at that point though. | |
343 | This is a sloppy error and I need to reword it; tx for catching. | |
345 | Humm, good question! Old linkers would not propagate "additional" tables from input to output. Also, old compilers wouldn't define a table locally -- the indirect function table would only be imported. So maybe it would be good to avoid adding a new form of acceptable input, in which object files have many table symbols. If we tighten this restriction, we can issue an error for inputs that
WDYT? | |
lld/wasm/SymbolTable.cpp | ||
213 ↗ | (On Diff #321326) | AFAICS, sadly, no :( Otherwise we could simplify checkTableType as well. |
219 ↗ | (On Diff #321326) | Sure. I can add a isPreassignedToTableNumberZero method or so. Surely with a better name. |
lld/wasm/SyntheticSections.cpp | ||
118–119 | Sure. | |
121 | I was thinking that it could happen if you had an MVP input that defined a table locally (not imported), but a reftypes input imported a table. But prompted by your questions I think I see that MVP inputs wouldn't ever define tables, so this error can't happen. | |
lld/wasm/Writer.cpp | ||
597 | Sure. |
lld/wasm/InputFiles.cpp | ||
---|---|---|
345 | Sounds great! |
So.... this one is ready for review again I think. I haven't been able to test some of the the error cases in synthesizeMVPIndirectFunctionTableIfNeeded because you can't actually make files with those characteristics currently.
lld/wasm/InputFiles.cpp | ||
---|---|---|
351 | For this last paragraphs how about: "Table usage in MVP objects cannot be relocatates so the linker must ensure that this |
lld/wasm/Driver.cpp | ||
---|---|---|
804 | Can these two lines be moved into SymbolTable::addSyntheticTable (which already has a check for an existing symbol). Then you don't need to pass the extra existing arg here, and you don't need to extra existingTable table below? hmm.. maybe that would make addSyntheticTable a little less tidy, but it seems like logical place... either that or replaceSymbol itself. In general this notion of preassigned index seems to complicate things in a way that I'm not a fan of. I wish we could find a simpler way to force the indirect table to always have index 0. |
lld/wasm/Driver.cpp | ||
---|---|---|
804 | Right, the whole notion of symbols with preassigned addresses (table numbers in this case) is... unsatisfying. But it does seem to be essential to the constraints we're operating under. I thought that keeping the bits that much about with preassigned addresses local to the parts that deal in tables (and specifically the indirect function table) would limit the "taint". Probably replaceSymbol is not the place for this to go, as other kinds of symbols rightly don't have a notion of preassigned addresses. Incidentally, looking again, I don't think the code as it is works, because of the placement new in replaceSymbol :/ Will fix, I guess in addSyntheticTable. |
PTAL, I think all comments are addressed. Adding tlively in case he would like to weigh in.
This looks great, thanks @wingo! My only comments are about comments and error message contents. It would also be good to add tests for more of the error messages, especially the one about "Cannot preassign table number 0 to indirect function table..." because I expect that to be a common user-facing issue.
lld/wasm/InputFiles.cpp | ||
---|---|---|
333 | ||
368–369 | I would remove the last sentence here and in the errors below. -mattr=+reference-types is an llc flag but not a clang flag, so it is probably not what the user wants. Also, clang should never produce objects that produce this error no matter what flags are passed to it. | |
lld/wasm/Symbols.cpp | ||
391 | It would be good to add a comment here along the lines of "zero is the only table number that can be preassigned." Otherwise from just the name of the function, it looks strange that other numbers aren't handled gracefully. | |
lld/wasm/SyntheticSections.cpp | ||
237–238 | The indirect function table is imported in Emscripten's dynamic linking scheme, so this isn't necessarily unexpected. | |
248–250 | -mreference-types is the clang flags for enabling this feature. Or another option would be to be tool- and flag-agnostic by saying "feature 'reference-types'" instead of naming particular flags. |
we when we comes to assigning indexes we do something like this:
lld/wasm/Symbols.h | ||
---|---|---|
382 | I feel a little bad for pushing back more on the this change, but it still feels a little more invasive than it could be. Rather than adding these two methods to the table class what do you think about this appoach:
if (condig->legacyFunctionTable) { placeIndirectFunctionTableFirst() <-- here we could report errors if the sorting is impossible } This sorting and error reporting could be done in the TableSection::assignIndexes(). Hopefully this should avoid adding the preassignTableNumberZero/hasPreassignedTableNumberZero methods and should avoid extra logic to too much extra logic addTable method. WDYT? (I guess it would also need to happen in ImportSection::assignIndexes...) These are mostly cosmetic changes, that actual behaviour LGTM. |
lld/wasm/SyntheticSections.cpp | ||
---|---|---|
250 | Can we be more succinct here? If you feel its is helpful/necessary to convey all this information than perhaps this verbosity is unavoidable. Would something shorter such as "object file was not built -mattr=+reference-types enabled. This is incompatible with the use tables imports in <other-object-name>" be enough? |
lld/wasm/Symbols.h | ||
---|---|---|
382 | This can work. It's not quite as minimal as this because tables can also get assigned numbers via the import section. I'll give it a go and see. | |
lld/wasm/SyntheticSections.cpp | ||
250 | We can certainly be more succinct but I don't know how to be as informative. As the comment notes, we don't have a good way to know where the "conflicting" import comes from AFAIU. |
Instead of tracking unrelocatable uses of the indirect function table on the table symbol, just use a global flag in the config. The writer will take care to handle the indirect function table symbol specially if this flag is set, writing it before any other table.
Nice! I like this much better now. Thanks for making all those changes.
Just a few more nits and I think we can land this.
lld/test/wasm/invalid-mvp-table-use.s | ||
---|---|---|
3 ↗ | (On Diff #323264) | Can you elaborate on this comment. What makes this an MVP file ? Is it simply the lack of -mattr=+reference-types. Not part of this PR, but should we have llvm-mc fail here to catch this earlier? |
lld/wasm/InputFiles.cpp | ||
380–382 | Can we replace this above block with simply: | |
lld/wasm/InputFiles.h | ||
161 | Can we call this addLegacyIndirectFunctionTableSymbolIfNeeded now so matches the config name? | |
lld/wasm/SyntheticSections.cpp | ||
220 | nit: Moving these method back down below writeBody would make the diff easier to read (smaller). | |
227 | Can we combine these two into a single condition to remove one level of testing? In fact can we combine all three maybe? | |
lld/wasm/Writer.cpp | ||
589 | This can be combined into single condition (with no braces). | |
593 | Oh nice so we always import this first, regardless of the config flag.. I like that. |
lld/wasm/InputFiles.cpp | ||
---|---|---|
380–382 | Sorry, didn't mean to send that. |
Can this dyn_cast ever fail? If not, it would be better to use cast here or even make the parameter a TableSymbol.