Page MenuHomePhabricator

[mlir][Transforms] Add pass to perform sparse conditional constant propagation
ClosedPublic

Authored by rriddle on Apr 17 2020, 12:54 PM.

Details

Summary

This revision adds the initial pass for performing SCCP generically in MLIR. SCCP is an algorithm for propagating constants across control flow, and optimistically assumes all values to be constant unless proven otherwise. It currently supports branching control, with support for regions and inter-procedural propagation being added in followups.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Herald added a project: Restricted Project. · View Herald TranscriptApr 17 2020, 12:54 PM
rriddle updated this revision to Diff 258405.Apr 17 2020, 12:58 PM

Cleanup pass summary

bondhugula added a subscriber: bondhugula.EditedApr 17 2020, 1:47 PM

I think the test cases are missing one of the key scenarios where a control flow path / block isn't executed because the predicate is a constant but that is only known by detecting that the same constant propagates across the back edge as well as the other predecessor. And to know that the same constant propagates, you need to know that the aforementioned control flow path doesn't execute - sort of a catch22. An example (mixed basic blocks with some C for brevity).

func @foo

^bb0:
  c1 = constant 1
  br ^bb1(c1)

^bb1(x1)
 if x1 < 20
   x2 = 1
   br ^bb1(x2)
 else 
   x3 = 50
   br ^bb1(x3)
mlir/include/mlir/Transforms/Passes.td
282

remove -> removed

mlir/test/Transforms/sccp.mlir
38

Nit: No phi's please! PHI values -> arguments ?

rriddle updated this revision to Diff 258430.Apr 17 2020, 2:48 PM

Resolve comments

rriddle marked 2 inline comments as done.Apr 17 2020, 2:50 PM

Thanks for the review Uday!

I added the test case as simple_loop_inner_control_flow, let me know if that is what you had in mind or if there are any other tests you'd like to see.

bondhugula added inline comments.Apr 17 2020, 10:20 PM
mlir/lib/Transforms/SCCP.cpp
44–45

Nit: A value with dynamic values can be confusing - instead, A value that cannot statically be determined to be a constant

130

Could you expand this comment?

157

Nit: do either backticks and regular single quotes work for parameters in doc comments?

314

You can with the last output parameter on tryToFold - is this what you were looking for?

bool inPlaceUpdate;
FoldUtils::tryToFold(op, nullptr, nullptr, &inPlaceUpdate);
// operation is folded if inPlaceUpdate is false
345–364

Could this reuse tryToFold?

374–378

Trivial braces.

mlir/test/Transforms/sccp.mlir
101

Mention here to the effect that this is not just inner control flow but you are testing sensitivity to non-executable edges in the CFG.

122–124

You could make this a little stronger/realistic by just doing an IV increment here.

%iv_inc = addi %iv, %cst_1
br ^bb1(%iv_inc)

Still the same result.

126–127

This looks good, but reading the test case in isolation won't make it clear as to what happens to the non-executable edge/path post SCCP on this test case. Does it stay as is, or becomes a dead/unreachable block, or is deleted? Add additional checks depending on what it is? (I just jumped here without looking at all of the actual code)

Thanks for the review Uday!

I added the test case as simple_loop_inner_control_flow, let me know if that is what you had in mind or if there are any other tests you'd like to see.

The test case looks good - thanks. A round of mostly superficial comments. Btw, why is the build failing? I can't tell by chasing the links here:
https://reviews.llvm.org/harbormaster/build/61458/ and the status here is fine as well: https://buildkite.com/mlir/mlir-core
(This is for my future reference as well.)

rriddle updated this revision to Diff 258480.Apr 17 2020, 10:50 PM
rriddle marked 14 inline comments as done.

Resolve comments

Thanks for the review!

mlir/lib/Transforms/SCCP.cpp
157

I'm not sure as we don't generate doxygen ATM for me to look at. The codebase is fairly inconsistent to either, but changed all within this file to consistently use ''. We should figure that out though, and standardize to using one.

314

This is a bit of a tricky situation. We don't actually want to fold here, this is essentially just simulated execution with constant parameters. The constants we have here aren't guaranteed to be those at runtime, so it's better to avoid the extra overhead of OperationFolder as we don't want to generated anything at this point.

345–364

Same comment above. We don't want to generate any constant operations here, we just want to know what the output would be. As stated above, the constant values are the current values from the lattice not the current IR. So even if we did use it, it wouldn't give us the result we want.

374–378

Thanks for the catch, they weren't trivial at one point but I forgot to cleanup.

mlir/test/Transforms/sccp.mlir
126–127

Added a few extra checks for making sure various things get folded. ATM we just insert constants. I've debated back and forth about adding in the CFG canonicalizations here, but given that canonicalize can already take care of those I decided to keep the initial versions focused on the propagation.

Btw, why is the build failing? I can't tell by chasing the links here:
https://reviews.llvm.org/harbormaster/build/61458/ and the status here is fine as well: https://buildkite.com/mlir/mlir-core
(This is for my future reference as well.)

I tried looking at the log, but it seems like it's just failing to create a local git branch to apply the diff to. I have been seeing a lot of harbormaster failures lately that just result to "failed to apply patch". I've just been ignoring these failures if local testing works.

bondhugula added inline comments.Apr 17 2020, 11:13 PM
mlir/lib/Transforms/SCCP.cpp
314

Right - while the algorithm is still running, a current 'constant' lattice value could be later lowered to the bottom symbol ('overdefined'). But once the algorithm has converged, all 'Unknown' lattice values will become either 'constant' or 'overdefined' at which point we could fold the relevant ops. Do you want to collect such ops? You'll still ideally need a new pattern rewriting driver that takes in a list of ops to process as the starting point. In any case, the comment above on "Don't try to fold the results ... won't be in-place" needs to be fixed - it's misleading because tryToFold does in-place updates as well. Instead, change to "can't guarantee that ... will be out of place"?

rriddle marked 2 inline comments as done.Apr 17 2020, 11:18 PM
rriddle added inline comments.
mlir/lib/Transforms/SCCP.cpp
314

Yeah, this is what I mentioned in the comment in the test file. I'm not sure all of the types of foldings we want to do within SCCP itself at this point. Right now we replace any results that were found to be constant and erase any operations that we can, but that doesn't cover all of the potential simplifications we could do. Right now my main goal is to get the constant propagation working across regions and inter-procedurally. After that I think we can tune the amount of additional simplifications we do here based on how it fits into user pipelines. Also, thank you for the comment suggestion!

Btw, why is the build failing? I can't tell by chasing the links here:
https://reviews.llvm.org/harbormaster/build/61458/ and the status here is fine as well: https://buildkite.com/mlir/mlir-core
(This is for my future reference as well.)

I tried looking at the log, but it seems like it's just failing to create a local git branch to apply the diff to. I have been seeing a lot of harbormaster failures lately that just result to "failed to apply patch". I've just been ignoring these failures if local testing works.

Yes, I think nearly all the harbormaster builds from at least the last few days are failing (all for me) - this is also the reason many commits in the last few days broke builds - everyone is ignoring notifications. This is a slippery slope esp with the revisions that are changing CMake files.

rriddle updated this revision to Diff 258485.Apr 17 2020, 11:22 PM
rriddle marked an inline comment as done.

Clarify in-place fold comments

bondhugula requested changes to this revision.Apr 17 2020, 11:52 PM

Some more comments. Again, I've been looking at things a bit locally. Will take a global look in the next round. Are your test cases still missing scenarios like:

  1. paths propagating different constants meeting (you have one with back edges (the last one) but none with the straightforward if/else style).
  2. a test case where you have say a multi operand op (say addi) that has one operand coming directly from a constant op and the other via a block argument whose both predecessors send in the same constant.
mlir/lib/Transforms/SCCP.cpp
324

But getResults() is empty!

354

Nit: Try to fold -> Simulate folding .... This is the part that's contradictory in the face of FoldUtils::tryToFold and the comment above "Don't try to fold ..."

359

Why isn't this conservative? If you mark it overdefined already, it can never be highered again. What if some of its operand lattice values are later lowered from unknown to constant? Are you treating both unknown and overdefined in the same way here for simulated folding purposes?

This revision now requires changes to proceed.Apr 17 2020, 11:52 PM
rriddle updated this revision to Diff 258489.Apr 18 2020, 12:23 AM
rriddle marked 5 inline comments as done.

Resolve comments

Some more comments. Again, I've been looking at things a bit locally. Will take a global look in the next round. Are your test cases still missing scenarios like:

  1. paths propagating different constants meeting (you have one with back edges (the last one) but none with the straightforward if/else style).
  2. a test case where you have say a multi operand op (say addi) that has one operand coming directly from a constant op and the other via a block argument whose both predecessors send in the same constant.

Added a test case for 1, 2 doesn't seem to cover any new code paths. There is already a test for folding an addi that has an input from a loop iv.

mlir/lib/Transforms/SCCP.cpp
324

getResults isn't necessarily empty if the operations has regions.

359

It already is conservative, if you look above we only get to this point if all of the operands have been resolved. At this point the operands can only go to overdefined, so there isn't a way that more information can be propagated to this point.

bondhugula added inline comments.Apr 18 2020, 12:37 AM
mlir/lib/Transforms/SCCP.cpp
324

But you've already handled the > 0 regions case in the block above and have returned for all those cases. You don't need the '|| conditional' altogether here.

rriddle updated this revision to Diff 258490.Apr 18 2020, 12:49 AM
rriddle marked an inline comment as done.

Resolve comments

mlir/lib/Transforms/SCCP.cpp
324

Oh duh, sorry about that. Thanks for pointing that out. Accidentally missed that when changing region handling.

Since the 'S' in SCCP is for sparse because this is for SSA representations and given that MLIR is all already SSA, there won't be a CCP. So you could just call this file ConditionalConstantPropagation.cpp if you prefer too instead of adding yet another hard to type four letter uppercase acronym. The CL flag could continue to use -sccp or -ccp.

bondhugula marked an inline comment as done.Apr 18 2020, 2:03 AM

Overall, this code looks really good. I can do a complete review unless Jacques/Mehdi who were originally added as reviewers plan to review it anyway.

mlir/lib/Transforms/SCCP.cpp
90

The standard term for this in the SCCP paper and the literature is meet - these are the meet rules. mergeIn -> meet?

132

Typo: any -> many.

359

Thanks - that's what I missed - the check above for unknown operands.

456

argLattice is actually invariant? Hoist it out of the loop, and use it in the block above where you are marking it overdefined.

rriddle updated this revision to Diff 258533.Apr 18 2020, 11:12 AM
rriddle marked 4 inline comments as done.

Resolve comments

mlir/lib/Transforms/SCCP.cpp
456

It is recomputed in the loop to avoid potential iterator invalidation w.r.t the lattice for the branch operand. Refactored to avoid eagerly constructing the branch operand lattice.

fhahn added a subscriber: fhahn.Apr 18 2020, 12:39 PM

Conceptually this looks very similar to LLVM's SCCP. Do you anticipate using any MLIR specific properties to cover more cases than the LLVM version? I am not really up-to-speed with the latest on MLIR dialects, but to me it seems like there is a set of transformations where there is currently no additional information to use from MLIR dialects compared to LLVM IR, like SCCP or GVN.

Viewed very simplistically, they only require the following from an IR: SSA, a way to traverse a function along CFG edges, a way to simplify instructions and a way to replace values. Do you think it would be feasible to provide some kind of interface to abstract those and then work towards sharing the implementations of the underlying algorithms between LLVM & MLIR? Or is the plan to duplicate various passes from LLVM also in MLIR?

Conceptually this looks very similar to LLVM's SCCP. Do you anticipate using any MLIR specific properties to cover more cases than the LLVM version? I am not really up-to-speed with the latest on MLIR dialects, but to me it seems like there is a set of transformations where there is currently no additional information to use from MLIR dialects compared to LLVM IR, like SCCP or GVN.

Viewed very simplistically, they only require the following from an IR: SSA, a way to traverse a function along CFG edges, a way to simplify instructions and a way to replace values. Do you think it would be feasible to provide some kind of interface to abstract those and then work towards sharing the implementations of the underlying algorithms between LLVM & MLIR? Or is the plan to duplicate various passes from LLVM also in MLIR?

Thanks for the comment Florian! I would be very +1 on sharing common implementations of the more generalized algorithms if we can. From a technical perspective, the major problems I've encountered with bridging the gap are a few impedance mismatches between LLVM and MLIR that arise from different implementation decisions:

  • The multi-level aspect leads to special handling for various constructs in MLIR(e.g. regions/structured control flow/etc.) that aren't present in LLVM.
  • MLIR globals(like functions) are not SSA values, and require specific handling because they do not use or interact with the traditional SSA use list.
  • Block arguments vs PHIs

When adding things in MLIR that already exist in LLVM, at least for me, it is a cost-benefit computation between the amount of work it would take to refactor the thing in LLVM to be usable by MLIR. By amount of work I mean not only the technical aspects, but also the effort to convince the community that it is the right thing to do. I'd love to share as much as possible(as we do for things like ADT/Support/etc) and I'm happy to collaborate in that direction, but if I'm driving it alone the cost has often out-weighed the benefits.

rriddle updated this revision to Diff 258552.Apr 18 2020, 1:32 PM

Tidy up a few things

Viewed very simplistically, they only require the following from an IR: SSA, a way to traverse a function along CFG edges, a way to simplify instructions and a way to replace values. Do you think it would be feasible to provide some kind of interface to abstract those and then work towards sharing the implementations of the underlying algorithms between LLVM & MLIR? Or is the plan to duplicate various passes from LLVM also in MLIR?

That'll be an interesting and likely recurring question moving forward. I wonder however if when applicable LLVM would be OK to take a slow-down from generalizing passes to MLIR and going through interfaces for the sake of sharing the implementation?

rriddle updated this revision to Diff 258568.Apr 18 2020, 6:36 PM

Tidy up terminator handling

fhahn added a comment.Apr 19 2020, 1:31 PM

Conceptually this looks very similar to LLVM's SCCP. Do you anticipate using any MLIR specific properties to cover more cases than the LLVM version? I am not really up-to-speed with the latest on MLIR dialects, but to me it seems like there is a set of transformations where there is currently no additional information to use from MLIR dialects compared to LLVM IR, like SCCP or GVN.

Viewed very simplistically, they only require the following from an IR: SSA, a way to traverse a function along CFG edges, a way to simplify instructions and a way to replace values. Do you think it would be feasible to provide some kind of interface to abstract those and then work towards sharing the implementations of the underlying algorithms between LLVM & MLIR? Or is the plan to duplicate various passes from LLVM also in MLIR?

Thanks for the comment Florian! I would be very +1 on sharing common implementations of the more generalized algorithms if we can. From a technical perspective, the major problems I've encountered with bridging the gap are a few impedance mismatches between LLVM and MLIR that arise from different implementation decisions:

  • The multi-level aspect leads to special handling for various constructs in MLIR(e.g. regions/structured control flow/etc.) that aren't present in LLVM.
  • MLIR globals(like functions) are not SSA values, and require specific handling because they do not use or interact with the traditional SSA use list.
  • Block arguments vs PHIs

Thanks for sharing the list. It seems like handling nested regions/structured control flow would be the most tricky to abstract. I am not really up-to-date on what is possible with nested regions, but would the interactions allowed between different regions be similar across all types of regions or could it be dependent on the region type? I.e. is it safe to propagate constants to all regions or is it unsafe to propagate constants between certain types of regions?

I think the other 2 issues should be fairly straight-forward to abstract.

When adding things in MLIR that already exist in LLVM, at least for me, it is a cost-benefit computation between the amount of work it would take to refactor the thing in LLVM to be usable by MLIR. By amount of work I mean not only the technical aspects, but also the effort to convince the community that it is the right thing to do. I'd love to share as much as possible(as we do for things like ADT/Support/etc) and I'm happy to collaborate in that direction, but if I'm driving it alone the cost has often out-weighed the benefits.

I think that's a reasonable trade-off. I mostly wanted to get a discussions started on that topic, which would definitely not be a short-term kind of project. I'm not too familiar with the MLIR side of things, but I might have a bit of time to collaborate on the LLVM side of such a project. It might be good to sync up/discuss ideas somewhere else than the current review ;)

Viewed very simplistically, they only require the following from an IR: SSA, a way to traverse a function along CFG edges, a way to simplify instructions and a way to replace values. Do you think it would be feasible to provide some kind of interface to abstract those and then work towards sharing the implementations of the underlying algorithms between LLVM & MLIR? Or is the plan to duplicate various passes from LLVM also in MLIR?

That'll be an interesting and likely recurring question moving forward. I wonder however if when applicable LLVM would be OK to take a slow-down from generalizing passes to MLIR and going through interfaces for the sake of sharing the implementation?

I think the answer here depends a lot on how this would look like in practice. Are you referring to slow-down in terms of runtime or development?

In terms of run-time overhead, the abstraction would have to be very cheap I think, but that should be feasible as long as it is limited to a few key aspects (like traversing functions/blocks/regions and an interface to simplify instructions given a set of input values (which LLVM has)). In terms of slowing down development, I am personally not too concerned. The passes that seem likely candidates (SCCP, GVN) don't seem to see a huge amount of ongoing development.

I think sharing implementations could be beneficial for both LLVM and MLIR, as it would hopefully mean more users = more testing = more people willing to work on fixes/improvements.

To clarify I was referring to compile-time slow-down as in LLVM would get slower.

Are @jpienaar or @mehdi_amini planning to review this anyway?

bondhugula accepted this revision.Apr 20 2020, 10:55 AM
bondhugula marked an inline comment as done.
bondhugula added inline comments.
mlir/lib/Transforms/SCCP.cpp
157

The doxygen is actually generated - it's here. https://mlir.llvm.org/doxygen/

309

If the value is an op result, do you want to / can you erase this op? You already have a pre inc iterator at its call site.

This revision is now accepted and ready to land.Apr 20 2020, 10:55 AM
rriddle marked 3 inline comments as done.Apr 20 2020, 11:35 AM
rriddle added inline comments.
mlir/lib/Transforms/SCCP.cpp
157

Oh, nice.

309

If you look at Line 281 we erase the op if it is valid and all of the results were replaced.

This revision was automatically updated to reflect the committed changes.
rriddle marked an inline comment as done.