This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Make BR_TABLE non-duplicable
ClosedPublic

Authored by tlively on Jun 10 2020, 7:45 PM.

Details

Summary

After their range checks were removed in 7f50c15be5c0, br_tables
started being duplicated into their predecessors by tail
folding. Unfortunately, when the br_tables were in loops this
transformation introduced bad irreducible control flow which was later
expanded into even more br_tables. This commit abuses the
isNotDuplicable property to prevent this irreducible control flow
from being introduced. This change saves a few dozen bytes of code
size and has a negligible affect on performance for most of the large
Emscripten benchmarks, but can improve performance significantly on
microbenchmarks of switches in loops.

Diff Detail

Event Timeline

tlively created this revision.Jun 10 2020, 7:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2020, 7:45 PM
arsenm added a subscriber: arsenm.Jun 10 2020, 8:17 PM
arsenm added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
61

Does isConvergent accomplish the same thing here?

tlively marked an inline comment as done.Jun 11 2020, 12:02 AM
tlively added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
61

It would, but to be honest I'm still not sure what it means for an instruction to be convergent or not. I know convergent instructions can't get any new control flow dependencies, but I don't have a good sense of why that might be an important property and what kind of instructions would need it. Do you have a good intuitive explanation for that? If it makes more sense, I would be happy to use isConvergent rather than isNotDuplicable. I looked at the places they are used earlier and either seems fine for this instruction.

aheejin accepted this revision.Jun 11 2020, 4:54 AM

LGTM either with isNotDuplicable or isConvergent, as long as it works. I'm not familiar with what isConvergent is used for either, so..

Just to confirm, this is separate from why final.c in https://bugs.chromium.org/p/v8/issues/detail?id=10606 is slow, right?

llvm/test/CodeGen/WebAssembly/indirectbr.ll
30

Were we duplicating br_table even for this simple example? Is that why the test result changed?

This revision is now accepted and ready to land.Jun 11 2020, 4:54 AM
tlively marked an inline comment as done.Jun 11 2020, 10:03 AM

LGTM either with isNotDuplicable or isConvergent, as long as it works. I'm not familiar with what isConvergent is used for either, so..

Just to confirm, this is separate from why final.c in https://bugs.chromium.org/p/v8/issues/detail?id=10606 is slow, right?

Right, if I'm reading that bug correctly, there is also a register allocation issue in V8 making switches in loops slow.

llvm/test/CodeGen/WebAssembly/indirectbr.ll
30

That's right 👍

dschuff accepted this revision.Jun 11 2020, 1:15 PM

(discussed with @tlively offline). The root issue is that the br_table gets duplicated outside the wasm block that encloses the branch targets. That's what causes the irreducible control flow in this case. Because the target-independent passes that do the duplication can't reason about those wasm blocks (since we haven't done the analysis that places them until later in the pass pipeline) it probably just makes sense to use noduplicate, which is a stronger limitation than convergent.

This revision was automatically updated to reflect the committed changes.