This is an archive of the discontinued LLVM Phabricator instance.

[lld][wasm] Split __wasm_apply_data_relocs when it exceeds the function size limit
Needs ReviewPublic

Authored by k1nder10 on Dec 15 2022, 8:29 AM.

Details

Reviewers
sbc100
Summary

__wasm_apply_data_relocs can grow drastically exceeding the function size limit.
This change introduces __wasm_apply_data_relocs_tail function which is supposed
to include instructions that caused the limit hit. __wasm_apply_data_relocs
calls __wasm_apply_data_relocs_tail at the end of the function body.
__wasm_apply_data_relocs_tail is empty if wasm module doesn't hit the function
size limit.

discussed here:
https://github.com/llvm/llvm-project/issues/55608
https://github.com/emscripten-core/emscripten/discussions/17374

Diff Detail

Event Timeline

k1nder10 created this revision.Dec 15 2022, 8:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 8:29 AM
k1nder10 requested review of this revision.Dec 15 2022, 8:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 8:29 AM

It's my first time pushing the changes to LLVM.
Should I add some tests for this? If so, where could I find examples of them?

lld/wasm/Driver.cpp
767

I tried to move the creation of this function to Writer.cpp 1349 under 'else' statement, so that we don't have any changes in out .wasm files
unless we exceed the function size limit. However, I got some strange runtime errors.

So, for now, we will have this function created even If we don't hit the limit (it just will be empty). Is it okay?

k1nder10 added inline comments.Dec 15 2022, 10:33 AM
lld/wasm/SplitApplyDataRelocsStrategy.cpp
40 ↗(On Diff #483201)

This should be changed to work properly. Will upload a new patchset tomorrow.

const size_t pos = function_to_split_.rfind(instruction_opcode, config->maxFunctionSize - skip_instruction_count);

k1nder10 updated this revision to Diff 483237.Dec 15 2022, 10:45 AM

[lld][wasm] Split __wasm_apply_data_relocs when it exceeds the function size limit

wasm_apply_data_relocs can grow drastically exceeding the function size limit.
This change introduces
wasm_apply_data_relocs_tail function which is supposed
to include instructions that caused the limit hit. wasm_apply_data_relocs
calls
wasm_apply_data_relocs_tail at the end of the function body.
__wasm_apply_data_relocs_tail is empty if wasm module doesn't hit the function
size limit.

discussed here:
https://github.com/llvm/llvm-project/issues/55608
https://github.com/emscripten-core/emscripten/discussions/17374

k1nder10 updated this revision to Diff 483898.Dec 19 2022, 3:39 AM

[lld][wasm] Split __wasm_apply_data_relocs when it exceeds the function size limit

wasm_apply_data_relocs can grow drastically exceeding the function size limit.
This change introduces
wasm_apply_data_relocs_tail function which is supposed
to include instructions that caused the limit hit. wasm_apply_data_relocs
calls
wasm_apply_data_relocs_tail at the end of the function body.
__wasm_apply_data_relocs_tail is empty if wasm module doesn't hit the function
size limit.

discussed here:
https://github.com/llvm/llvm-project/issues/55608
https://github.com/emscripten-core/emscripten/discussions/17374

sbc100 edited the summary of this revision. (Show Details)Dec 19 2022, 4:29 AM

I feel like a better approach might be to generate N different functions during the generation phase rather than trying to split the function after generation?

Also, I think we (also) want to investigate using a data driven approach for the relocations that don't involve symbol-specific globals. See the discussion on that here: https://github.com/emscripten-core/emscripten/issues/12682#issuecomment-1297755521

lld/test/wasm/shared.s
1 ↗(On Diff #483898)

Why are alll these test files showing up in the diff like every line of the file has changed? Did the line endings change maybe?

lld/wasm/Driver.cpp
767

I'd rather only emit the new function if needed.

Also, it seems like we should splitting into N possible sub-functions, not just 1, otherwise we just push the problem down the road a little.

769

We don't need either of these flags for the sub-function(s) I think

lld/wasm/SplitApplyDataRelocsStrategy.h
8 ↗(On Diff #483898)

Creating a class hierarchy seems unnecessary since there is only a single sub-class. I'm not sure its worth adding new headers and source files either

I feel like a better approach might be to generate N different functions during the generation phase rather than trying to split the function after generation?

Also, I think we (also) want to investigate using a data driven approach for the relocations that don't involve symbol-specific globals. See the discussion on that here: https://github.com/emscripten-core/emscripten/issues/12682#issuecomment-1297755521

Do we know the number of relocations (the size of __wasm_apply_data_relocs) during the generation phase?

Speaking of the discussion you provided, I also think it is a good idea. However, I don't have much time to work on it.

k1nder10 updated this revision to Diff 484242.Dec 20 2022, 5:56 AM

Fix the line ending in tests. Fix tests. Removed unnecessary classes.

k1nder10 marked an inline comment as done.Dec 20 2022, 6:03 AM
k1nder10 added inline comments.
lld/wasm/ApplyDataRelocsSplitter.cpp
42

As I understand, you build a wasm linker with -fno-exceptions right?
How should I rewrite such code?
Adding an assertion would work in a debug build but what about the release one?

k1nder10 added inline comments.Dec 20 2022, 6:25 AM
lld/wasm/Driver.cpp
767

Speaking of splitting this into N functions, it looks unnecessary as it must be enough to have 2 functions actually.
We have a more than million lines of code and a plethora of class hierarchies and we exceed the limit only by 200k instructions. So, there's a big stock (>7 million) of instructions to hit the limit again.

k1nder10 updated this revision to Diff 485994.Jan 3 2023, 7:46 AM

Got rid of exceptions

@sbc100
So, what do you think about this change? Can we merge it? Or should I try to split this up into N functions?
BTW, Is it okay that I have to modify the data-segments.ll & pie.ll files?

sbc100 added a comment.Jan 4 2023, 6:33 AM

I still have some concerns here. Firstly, I think it should be possible to do this without generating the tail function when its not needed. Can't applyDataRelocsTail simply be null in this case?

Secondly I'm not sure about and a new source file and header, or the name of the those new files, but thats mostly aesthetic.

sbc100 added a comment.Jan 4 2023, 6:37 AM

Also, I think it might be cleaner to simply generate the second function while the main function is being generated, rather then generate the long/unsplit function completely and then split it up.

Something like this phsudocode:

for (r in relocations) {
  if (current_size_is_over_limit()) {
    generate_tail_call_to_new_function():
    switch_current_output_function_to_new_function();
  }
  generate_code_for_relocation(r);
}

Then no splitting would be needed.