After their range checks were removed in 7f50c15be5c0, br_tables
started being duplicated into their predecessors by tail
folding. Unfortunately, when the br_tables were in loops this
transformation introduced bad irreducible control flow which was later
expanded into even more br_tables. This commit abuses the
isNotDuplicable property to prevent this irreducible control flow
from being introduced. This change saves a few dozen bytes of code
size and has a negligible affect on performance for most of the large
Emscripten benchmarks, but can improve performance significantly on
microbenchmarks of switches in loops.
Details
- Reviewers
aheejin dschuff - Commits
- rGc5d012341e58: [WebAssembly] Make BR_TABLE non-duplicable
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td | ||
---|---|---|
61 | Does isConvergent accomplish the same thing here? |
llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td | ||
---|---|---|
61 | It would, but to be honest I'm still not sure what it means for an instruction to be convergent or not. I know convergent instructions can't get any new control flow dependencies, but I don't have a good sense of why that might be an important property and what kind of instructions would need it. Do you have a good intuitive explanation for that? If it makes more sense, I would be happy to use isConvergent rather than isNotDuplicable. I looked at the places they are used earlier and either seems fine for this instruction. |
LGTM either with isNotDuplicable or isConvergent, as long as it works. I'm not familiar with what isConvergent is used for either, so..
Just to confirm, this is separate from why final.c in https://bugs.chromium.org/p/v8/issues/detail?id=10606 is slow, right?
llvm/test/CodeGen/WebAssembly/indirectbr.ll | ||
---|---|---|
30 | Were we duplicating br_table even for this simple example? Is that why the test result changed? |
Right, if I'm reading that bug correctly, there is also a register allocation issue in V8 making switches in loops slow.
llvm/test/CodeGen/WebAssembly/indirectbr.ll | ||
---|---|---|
30 | That's right 👍 |
(discussed with @tlively offline). The root issue is that the br_table gets duplicated outside the wasm block that encloses the branch targets. That's what causes the irreducible control flow in this case. Because the target-independent passes that do the duplication can't reason about those wasm blocks (since we haven't done the analysis that places them until later in the pass pipeline) it probably just makes sense to use noduplicate, which is a stronger limitation than convergent.
Does isConvergent accomplish the same thing here?