Page MenuHomePhabricator

[WebAssembly] CFG Stackifier

Authored by sunfish on Sep 9 2015, 9:21 AM.



This is a prototype of the CFG "stackifier" algorithm for creating structured control flow out of a CFG.

Diff Detail


Event Timeline

sunfish updated this revision to Diff 34346.Sep 9 2015, 9:21 AM
sunfish retitled this revision from to [WebAssembly] CFG Stackifier.
sunfish updated this object.
sunfish set the repository for this revision to rL LLVM.
sunfish added a subscriber: jfb.
sunfish updated this revision to Diff 34388.Sep 9 2015, 4:39 PM
  • Implement switch statements
  • Misc fixes
sunfish updated this revision to Diff 34390.Sep 9 2015, 5:09 PM
  • Rebase
  • Include AsmPrinter.cpp change
sunfish updated this revision to Diff 34455.Sep 10 2015, 9:30 AM
  • add tests for interesting loop-with-split-backedge cases
  • rewrite the successor sorting code to be more readable
  • misc cleanups
sunfish updated this revision to Diff 34456.Sep 10 2015, 9:33 AM
  • include the AsmPrinter.h diff (again)
jfb added inline comments.Sep 10 2015, 11:31 AM
239 ↗(On Diff #34456)

This should probably be in its own patch once this one is ready to go, just so other folks who care can notice it.

11 ↗(On Diff #34456)

Could you have a bit of a longer description of what that means?

58 ↗(On Diff #34456)

std::vector, or non-zero size.

67 ↗(On Diff #34456)

Can this be Succs(MBB->successors())?

70 ↗(On Diff #34456)

Could you explain the strategy? You leave me hanging!

87 ↗(On Diff #34456)

It's a bit obvious, but it would be nice to have a comment here that says that A and B are now in the same loop depth. It doesn't mean it's the same loop, though! Would it make sense to check that too, and order them based on which of their loop heads is earlier in PO?

95 ↗(On Diff #34456)

Could you also take into account "furthest from backedge" in ordering?
We can also be a bit silly by trying to line up similarly-conditioned diamonds the same way, but that requires keeping some more state around for previous conditions so I'm not sure it's worth it.

It also seems like you want a tie-breaker here, and that can be as arbitrary as the block name? Maybe also something about instructions contained, including "ends with a non-reachable instr".

What does this do with unlikely/cold/landingpad?

104 ↗(On Diff #34456)

Expand on why we want that control.

107 ↗(On Diff #34456)

4 seems pretty conservative for most functions! I'd do 16 as a random guess (though of course data would win here...).

121 ↗(On Diff #34456)

This code is written as a test for itself? ;-)

jfb added a comment.Sep 10 2015, 1:54 PM

A few more high-level comments...

130 ↗(On Diff #34456)

Shouldn't this instead test for isConditionalBranch, not isBranch?

Also: conservatively, isInlineAsm?

61 ↗(On Diff #34456)

Why (select Int64:$a? Isn't the condition just a bool, so Int32 suffices (as for the other types)?

69 ↗(On Diff #34456)

Aggressive multiclass for SELECT?

52 ↗(On Diff #34456)

Why instructions can these be?

9 ↗(On Diff #34456)

Add a test with:

  • A single-block function.
  • An empty block.
  • Deeper nest.
  • Back-to-back diamond.
  • Diamonds in different loops.
  • TODO on inline asm as a terminator (though wasm probably won't have it, good to know it's missing).
  • TODO on landingpad.
  • Some of the code deals with instructions after branches, it would be good to test these.
6 ↗(On Diff #34456)

TODO on testing computed goto.

sunfish added inline comments.Sep 10 2015, 5:06 PM
58 ↗(On Diff #34456)

std::vector doesn't support Succs(MBB->successors()) below ;-).

87 ↗(On Diff #34456)

Ah, this was just a thinko on my part. This code should just check the actual loop, not the loop depth.

95 ↗(On Diff #34456)

I'm just trying to do something simple that does reasonable things in most cases right now. There will be plenty of time for tweaking this function.

Unlikely/cold are handled by MachineBlockPlacementID, and this pass attempts to respect that by preserving the original order of the blocks when it can. However, note that MachineBlockPlacementID is currently disabled; see the comments for details.

landingpad gets lowered away before we get here.

121 ↗(On Diff #34456)

Ok, it's not hard to rewrite this code to avoid a goto for the benefit of people who make remarks like this, so I've now done that.

130 ↗(On Diff #34456)

It needs to accept isBranch too because moving the block could mean that the branch should turn into a fallthrough.

LLVM doesn't support 'asm goto' so we can't get isInlineAsm here. I added a check and a report_fatal_error to catch cases like this in case they do get added though.

69 ↗(On Diff #34456)

SELECT isn't really part of this patch, so I dropped it for now.

52 ↗(On Diff #34456)

switches, for example.

9 ↗(On Diff #34456)
A single-block function.
An empty block.

We have these already.

Deeper nest.
Back-to-back diamond.
Diamonds in different loops.

Patches welcome. ;-)

TODO on inline asm as a terminator (though wasm probably won't have it, good to know it's missing).

There are no asm terminators in LLVM, so it's not possible get this right now. And, I added checking code which should catch this case in case LLVM ever adds them in the future.

TODO on landingpad.

The TODO is to set our ExceptionHandling type to something other than ExceptionHandling::None so that landingpad doesn't get stripped out entirely before it gets to this pass.

Some of the code deals with instructions after branches, it would be good to test these.

They deal with multiple terminators at the MachineBasicBlock level, which isn't exposed or testable from an LLVM IR unittest.

sunfish updated this revision to Diff 34704.Sep 14 2015, 10:48 AM
  • Update and post the patch from D12787
jfb added a reviewer: jfb.Sep 14 2015, 11:19 AM
sunfish updated this revision to Diff 34714.Sep 14 2015, 11:36 AM

Now with the correct version of the patch.

jfb edited edge metadata.Sep 14 2015, 4:06 PM

You should mark comments that are addressed as "done"!

58 ↗(On Diff #34456)

Non-zero size then? Most blocks won't have more than 8 successors, and that's cheap to put on the stack compared to heap allocating!

95 ↗(On Diff #34456)

Could you add TODOs for the things that make sense to try out?

You still want a final tie breaker though: as it stands the code breaks std::stable_sort's contract by having A < B and B < A both false!

sunfish marked 15 inline comments as done.Sep 14 2015, 4:37 PM
sunfish added inline comments.
59 ↗(On Diff #34714)

(Notice how this comment is no longer displayed anywhere near the code that it's about.)
(Am I supposed to check the Done box in my own comments too?)
(There's a weird bug in phabricator where the Done checkbox is modal and you can't always click it. I am attempting to work around it. Be aware of the lengths I am attempting to go for your benefit here.)

It's inside another vector, so making it zero size is nice because it means move constructing is trivial. However, it's not really that important since neither one of us has profiled this ;-).

96 ↗(On Diff #34714)

Ok, I added a TODO.

stable_sort just wants a strict weak ordering. When A < B and B < A are both false, then stable_sort treats them as equal and orders them according to their original order.

108 ↗(On Diff #34714)

When data arrives, we can update these accordingly.

10 ↗(On Diff #34714)

FYI, I have added several more tests now.

7 ↗(On Diff #34714)

That would go in a new file, rather than switch.ll.

sunfish updated this revision to Diff 34757.Sep 14 2015, 4:39 PM
sunfish edited edge metadata.
sunfish marked 4 inline comments as done.
  • addressed jfb's comments
jfb added a comment.Sep 14 2015, 5:25 PM

I think this is pretty good now. I like the approach as it reuses some LLVM infrastructure, and is overall simpler than the relooper. I'd like us to revisit and compare what we're missing from the relooper, but at this point in time we want to move things forward so I think this is a good start, and we'll do more comparative science later.

59 ↗(On Diff #34714)

Much appreciated :-)

41 ↗(On Diff #34757)

Shouldn't we support switch 64 on either architecture?

sunfish updated this revision to Diff 34770.Sep 14 2015, 6:00 PM
  • Allow SWITCH_I64 in wasm32 mode and vice versa, and add a TODO comment to actually make use of this.
sunfish marked an inline comment as done.Sep 14 2015, 6:01 PM
sunfish added inline comments.
41 ↗(On Diff #34770)

Good point. LLVM's jump table lowering forces the index to be pointer-sized, but we don't need that restriction. I fixed this, added a test, and added a TODO comment.

jfb accepted this revision.Sep 15 2015, 5:25 PM
jfb edited edge metadata.

One small comment, then lgtm. As we discussed offline, I'll unhook D12744 and commit it, since we'll probably want a hybrid approach for a while (or at least compare approaches).

161 ↗(On Diff #34770)

Shouldn't the above be in a separate patch, since you removed select handling?

This revision is now accepted and ready to land.Sep 15 2015, 5:25 PM
This revision was automatically updated to reflect the committed changes.