This is an archive of the discontinued LLVM Phabricator instance.

Paths

mlir/include/mlir/Transforms/Passes.h
mlir/include/mlir/Transforms/Passes.td
mlir/lib/Transforms/CMakeLists.txt
mlir/lib/Transforms/RemoveNonLiveValues.cpp
mlir/test/Transforms/remove-non-live-values.mlir

Differential D157049

[MLIR][transforms] Add an optimization pass to remove dead values
ClosedPublic

Authored by srishti-pm on Aug 3 2023, 3:47 PM.

Details

Reviewers

matthiaskramm
jpienaar
Mogball
jcai19
mehdi_amini

Commits

rG0e98fb9fadb0: [MLIR][transforms] Add an optimization pass to remove dead values

Summary

Large deep learning models rely on heavy computations. However, not
every computation is necessary. And even when a computation is
necessary, it helps if the values it needs are available in registers
(which have low latency) rather than in memory (which has high
latency).

Compilers can use liveness analysis to:-
(1) Remove extraneous computations from a program before it executes on
hardware, and,
(2) Optimize register allocation.

Both these tasks help achieve one very important goal: reducing runtime.

Recently, liveness analysis was added to MLIR. Thus, this commit uses
the recently added liveness analysis utility to try to accomplish task
(1).

It adds a pass called remove-dead-values whose goal is
optimization (reducing runtime) by removing unnecessary instructions.
Unlike other passes that rely on local information gathered from
patterns to accomplish optimization, this pass uses a full analysis of
the IR, specifically, liveness analysis, and is thus more powerful.

Currently, this pass performs the following optimizations:
(A) Removes function arguments that are not live,
(B) Removes function return values that are not live across all callers of
the function,
(C) Removes unnecessary operands, results, region arguments, and region
terminator operands of region branch ops, and,
(D) Removes simple and region branch ops that have all non-live results and
don't affect memory in any way,

iff

the IR doesn't have any non-function symbol ops, non-call symbol user ops
and branch ops.

Here, a "simple op" refers to an op that isn't a symbol op, symbol-user op,
region branch op, branch op, region branch terminator op, or return-like.

It is noteworthy that we do not refer to non-live values as "dead" in this
file to avoid confusing it with dead code analysis's "dead", which refers to
unreachable code (code that never executes on hardware) while "non-live"
refers to code that executes on hardware but is unnecessary. Thus, while the
removal of dead code helps little in reducing runtime, removing non-live
values should theoretically have significant impact (depending on the amount
removed).

It is also important to note that unlike other passes (like canonicalize)
that apply op-specific optimizations through patterns, this pass uses
different interfaces to handle various types of ops and tries to cover all
existing ops through these interfaces.

Its reliance on (a) liveness analysis and (b) interfaces is what makes it
powerful: it can optimize ops that don't have a canonicalizer, and even when
an op does have a canonicalizer, it can perform more aggressive
optimizations, as observed in the test files associated with this pass.

Example of optimization (A):-

int add_2_to_y(int x, int y) {
  return 2 + y
}

print(add_2_to_y(3, 4))
print(add_2_to_y(5, 6))

becomes

int add_2_to_y(int y) {
  return 2 + y
}

print(add_2_to_y(4))
print(add_2_to_y(6))

Example of optimization (B):-

int, int get_incremented_values(int y) {
  store y somewhere in memory
  return y + 1, y + 2
}

y1, y2 = get_incremented_values(4)
y3, y4 = get_incremented_values(6)
print(y2)

becomes

int get_incremented_values(int y) {
  store y somewhere in memory
  return y + 2
}

y2 = get_incremented_values(4)
y4 = get_incremented_values(6)
print(y2)

Example of optimization (C):-

Assume only %result1 is live here. Then,

%result1, %result2, %result3 = scf.while (%arg1 = %operand1, %arg2 = %operand2) {
  %terminator_operand2 = add %arg2, %arg2
  %terminator_operand3 = mul %arg2, %arg2
  %terminator_operand4 = add %arg1, %arg1
  scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3, %terminator_operand4
} do {
^bb0(%arg3, %arg4, %arg5):
  %terminator_operand6 = add %arg4, %arg4
  %terminator_operand5 = add %arg5, %arg5
  scf.yield %terminator_operand5, %terminator_operand6
}

becomes

%result1, %result2 = scf.while (%arg2 = %operand2) {
  %terminator_operand2 = add %arg2, %arg2
  %terminator_operand3 = mul %arg2, %arg2
  scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3
} do {
^bb0(%arg3, %arg4):
  %terminator_operand6 = add %arg4, %arg4
  scf.yield %terminator_operand6
}

Note that %result2 is not removed even though it is not live. It is fed by
%terminator_operand3, which cannot be removed because it also forwards to
%arg4, which is live.
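The forwarding argument above can be checked with a small toy model. This is only a sketch in Python, not the pass's implementation: the `depends_on` table and the `propagate_liveness` helper are hypothetical names, the value names are shortened (`t2` stands for %terminator_operand2, and so on), and the condition operand of scf.condition is left out.

```python
# Toy model of the scf.while example: liveness is plain backward reachability
# from the values known to be live. Not MLIR code; all names are illustrative.

def propagate_liveness(depends_on, roots):
    """Return every value that a live root (transitively) depends on."""
    live, worklist = set(), list(roots)
    while worklist:
        value = worklist.pop()
        if value in live:
            continue
        live.add(value)
        worklist.extend(depends_on.get(value, []))
    return live


depends_on = {
    # Results and after-region arguments are fed by scf.condition operands.
    "result1": ["t2"], "result2": ["t3"], "result3": ["t4"],
    "arg3": ["t2"], "arg4": ["t3"], "arg5": ["t4"],
    # Before-region arguments are fed by init operands and scf.yield operands.
    "arg1": ["operand1", "t5"], "arg2": ["operand2", "t6"],
    # Ordinary ops inside the two regions.
    "t2": ["arg2"], "t3": ["arg2"], "t4": ["arg1"],
    "t5": ["arg5"], "t6": ["arg4"],
}

live = propagate_liveness(depends_on, roots=["result1"])

# A result position can only be dropped if the operand feeding it can be
# dropped, i.e. if that operand is not live for any of its successors.
fed_by = [("result1", "t2"), ("result2", "t3"), ("result3", "t4")]
kept_results = [result for result, operand in fed_by if operand in live]
print(kept_results)
```

In this model %result2 is not in the live set, yet its position survives because t3 is live through %arg4, while %result3, %arg1 and %arg5 only feed each other and are all dropped.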

Example of optimization (D):-

int, int square_and_double_of_y(int y) {
  square = y ^ 2
  double = y * 2
  return square, double
}

sq, do = square_and_double_of_y(5)
print(do)

becomes

int square_and_double_of_y(int y) {
  double = y * 2
  return double
}

do = square_and_double_of_y(5)
print(do)

Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com>

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

srishti-pm created this revision.Aug 3 2023, 3:47 PM

Herald added a project: Restricted Project.Aug 3 2023, 3:47 PM

Herald added subscribers: bviyer, Moerafaat, zero9178 and 21 others.

srishti-pm requested review of this revision.Aug 3 2023, 3:47 PM

Herald added a project: Restricted Project.Aug 3 2023, 3:47 PM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

Make nit comment changes.

srishti-pm edited the summary of this revision.Aug 3 2023, 4:13 PM

srishti-pm added reviewers: matthiaskramm, jcai19, jpienaar, mehdi_amini.

Update commit summary.

srishti-pm edited the summary of this revision.Aug 3 2023, 4:35 PM

Herald added a subscriber: wangpc.Aug 3 2023, 4:35 PM

Can you add a few more examples that showcase how this pass catches cases that canonicalization cannot handle?

mlir/lib/Transforms/RemoveNonLiveValues.cpp
63	Nit: `ArrayRef<T>` should always be a replacement for `const SmallVector<T> &` I believe (Same elsewhere)
mlir/test/Transforms/remove-non-live-values.mlir
45	Canonicalize nukes everything here right now
265	Canonicalize already does this simplification.
293	Canonicalization already almost does that, in any case nothing requires a liveness analysis here I think

Enhance unit tests of cleanSimpleOp().

In D157049#4559295, @mehdi_amini wrote:

Can you add a few more examples that showcase how this pass catches cases that canonicalization cannot handle?

Sure, I'll look into this.

mehdi_amini added inline comments.Aug 3 2023, 4:58 PM

mlir/test/Transforms/remove-non-live-values.mlir

FYI canonicalize simplifies down to:

func.func @clean_simple_ops(%arg0: i32, %arg1: memref<i32>, %arg2: i32) -> i32 {
  %c6_i32 = arith.constant 6 : i32
  %0 = arith.addi %arg0, %arg0 : i32
  memref.store %c6_i32, %arg1[] : memref<i32>
  return %0 : i32
}

srishti-pm added inline comments.Aug 3 2023, 5:41 PM

mlir/test/Transforms/remove-non-live-values.mlir
74	Yup, thanks. I'm doing the comparison with `canonicalize` right now.

Add notes about comparisons with the canonicalize pass. More tests and notes to be added.

More tests and notes to be added.

I'm working on this.

Harbormaster completed remote builds in B250230: Diff 547070.Aug 3 2023, 9:14 PM

Enhance test cases to reveal functionality difference from the canonicalize pass.

In D157049#4559445, @srishti-pm wrote:

More tests and notes to be added.

I'm working on this.

This is done. @mehdi_amini, can you have a look at it and share your views?

Harbormaster completed remote builds in B250661: Diff 547626.Aug 6 2023, 7:06 PM

Nit changes

Harbormaster completed remote builds in B250663: Diff 547628.Aug 6 2023, 7:28 PM

Do you have any case of non-interprocedural effect of this?

In D157049#4564578, @mehdi_amini wrote:

Do you have any case of non-interprocedural effect of this?

Yes. Was trying to find some. Hadn't found one earlier but I have found some now. Adding them. Will update you once added.

srishti-pm retitled this revision from [MLIR][TRANSFORMS] Add an optimization pass to remove non-live values to [MLIR][transforms] Add an optimization pass to remove non-live values.Aug 7 2023, 11:01 AM

Update unit tests.

@mehdi_amini, I have updated the unit tests. Here's how you can navigate them:-

Non-interprocedural optimizations (positive tests) shown in:-
@clean_region_branch_op_dont_remove_first_2_results_but_remove_first_operand
@clean_region_branch_op_remove_last_2_results_last_2_arguments_and_last_operand
@clean_region_branch_op_remove_result

Interprocedural optimizations (positive tests) shown in:-
@clean_func_op_remove_argument_and_return_value
@clean_func_op_remove_arguments
@clean_func_op_remove_return_values
@clean_simple_ops
@clean_region_branch_op_erase_it

Negative tests:-
@dont_touch_unacceptable_ir
@dont_touch_unacceptable_ir_has_cleanable_simple_op_with_branch_op
@clean_func_op_dont_remove_return_values

All positive tests show effects that are different from the canonicalize pass.

Harbormaster completed remote builds in B250910: Diff 547940.Aug 7 2023, 6:51 PM

Can you try to elaborate with how you see the difference with canonicalize in the intra-procedural case?

In D157049#4568318, @mehdi_amini wrote:

Can you try to elaborate with how you see the difference with canonicalize in the intra-procedural case?

They're just optimizations that don't happen in canonicalize because patterns to perform those canonicalizations haven't been added to that particular op (like, say, the scf.while op). On the tests that I have added for the intra-procedural cases, we see significant optimizations that this pass is able to do, but if we run canonicalize on the same tests, it does nothing to the IR. Is this what you're asking or did I misunderstand your question?

srishti-pm added a reviewer: Mogball.Aug 9 2023, 11:25 AM

Naming nit: Can this be called removeDeadValues?

Mogball added inline comments.Aug 9 2023, 11:39 AM

mlir/include/mlir/Transforms/Passes.td
94	Please list the required invariants for the pass to succeed on the IR. There are invariants enforced in the pass definition but they are not listed here.
mlir/lib/Transforms/RemoveNonLiveValues.cpp
22	Please use `"` includes for MLIR/LLVM files
571	Please make this an error. This indicates a phase-ordering problem in the pipeline. Also, please improve the error message

In D157049#4574028, @Mogball wrote:

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware. So, theoretically, we could say that their removal should not influence the runtime much because these instructions were never executing on the hardware anyways. An example of dead code is lines 2-4 here:

1 x = 28
2 if (false) {
3   x = 99
4 }
5 print(x)

Liveness is different. An instruction could be producing a non-live result but still not be dead code. An example of such an instruction is the one on line 6:

6 x = 28
7 x = 99

Here, both instructions execute on hardware, so neither of them is dead code. But x is non-live after the first instruction, which means the first instruction executes unnecessarily and increases runtime. Conversely, the x = 99 on line 3 in the first code block produces a live result (x is live there), yet the instruction is still dead.
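To make the "non-live but not dead" case concrete, here is a minimal Python sketch of a backward liveness walk over straight-line code. It is a toy model, not the MLIR analysis: the `(target, operands, has_side_effect)` tuple layout and the `non_live_statements` helper are assumptions made for illustration, and a trailing `print(x)` is added (it is not in the snippet above) so that something observes x.

```python
# Toy model: each statement is (target, operands, has_side_effect).
# Names and structure are illustrative only, not MLIR APIs.

def non_live_statements(stmts):
    """Return indices of statements whose result is never needed afterwards."""
    live = set()       # variables needed by later statements
    non_live = []
    for idx in range(len(stmts) - 1, -1, -1):   # walk backwards
        target, operands, has_side_effect = stmts[idx]
        if has_side_effect or target in live:
            live.discard(target)     # this definition satisfies the later uses
            live.update(operands)    # and its own operands become needed
        else:
            non_live.append(idx)
    return sorted(non_live)


program = [
    ("x", [], False),       # x = 28
    ("x", [], False),       # x = 99
    (None, ["x"], True),    # print(x), assumed here to observe x
]
print(non_live_statements(program))   # [0]
```

Both assignments would execute, so neither is unreachable, but only the first one is reported because its value is overwritten before anything reads it.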

The goal of this pass is to try to reduce the runtime by removing "non-live" values.

These could be function arguments that are present in a function but aren't live, in which case we modify the function signature to remove them. For example,

int add_2_to_y(int x, int y) {
  return 2 + y
}

print(add_2_to_y(3, 4))
print(add_2_to_y(5, 6))

This gets transformed to

int add_2_to_y(int y) {
  return 2 + y
}

print(add_2_to_y(4))
print(add_2_to_y(6))

Note that here x could be non-live even if it has uses, if ultimately, all its uses didn't affect memory or the final output of the program.
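The point that an argument can have uses and still be non-live can be sketched as follows. This is a hypothetical Python model that assumes each name is defined once; `live_arguments` and the op tuples are illustrative and are not part of the pass.

```python
# Toy model: ops are (result, operands, has_side_effect) in program order.
# An argument is live only if it reaches a returned value or a side effect.

def live_arguments(args, ops, returned):
    live = set(returned)
    for result, operands, has_side_effect in reversed(ops):
        if has_side_effect or result in live:
            live.update(operands)
    return [arg for arg in args if arg in live]


ops = [
    ("t", ["x", "x"], False),   # t = x * x, but t is never returned or stored
    ("r", ["y"], False),        # r = 2 + y
]
print(live_arguments(["x", "y"], ops, returned=["r"]))   # ['y']
```

Here x is used by the op that defines t, but since t never reaches the return value or memory, x stays non-live and could be dropped from the signature.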

They could be function return values that are returned by a function but never live in any of its callers, in which case we again modify the function signature to remove them.

int, int get_incremented_values(int y) {
  return y + 1, y + 2
}

y1, y2 = get_incremented_values(4)
y3, y4 = get_incremented_values(6)
print(y2)

This gets transformed to

int get_incremented_values(int y) {
  return y + 2
}

y2 = get_incremented_values(4)
y4 = get_incremented_values(6)
print(y2)
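The "never live in any of its callers" condition amounts to a per-position check over all call sites. A minimal Python sketch, with a hypothetical `removable_return_positions` helper and hand-written liveness flags for the example above:

```python
# Toy model: per_call_liveness[c][i] is True if result i is live at call c.
# A return position can be removed only if it is non-live at every call site.

def removable_return_positions(per_call_liveness):
    live_anywhere = [any(column) for column in zip(*per_call_liveness)]
    return [i for i, is_live in enumerate(live_anywhere) if not is_live]


calls = [
    [False, True],    # y1, y2 = get_incremented_values(4); y2 is printed
    [False, False],   # y3, y4 = get_incremented_values(6); neither is used
]
print(removable_return_positions(calls))   # [0]
```

Position 1 is kept even though y4 is unused, because y2 is live at the other call site and the function signature is shared by both callers.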

Similarly, these could be unnecessary operands of an operation that say, forwards these operands to different regions but these values are not live in any of the regions (look at tests @clean_region_branch_op_dont_remove_first_2_results_but_remove_first_operand, @clean_region_branch_op_remove_last_2_results_last_2_arguments_and_last_operand, @clean_region_branch_op_remove_result for such examples).

Does this make things clearer?

jcai19 accepted this revision.Aug 9 2023, 3:43 PM

jcai19 added inline comments.

mlir/lib/Transforms/RemoveNonLiveValues.cpp
561	Maybe rename it to module to make it clearer.

This revision is now accepted and ready to land.Aug 9 2023, 3:43 PM

jcai19 removed a reviewer: jcai19.Aug 9 2023, 3:44 PM

This revision now requires review to proceed.Aug 9 2023, 3:44 PM

jcai19 added a reviewer: jcai19.Aug 9 2023, 3:44 PM

srishti-pm added inline comments.Aug 9 2023, 5:44 PM

mlir/lib/Transforms/RemoveNonLiveValues.cpp
63	I actually tried this but seems like this cannot be done. When I replace `const SmallVector<T> &` with `ArrayRef<T>`, I get the `error: no known conversion from 'mlir::Operation::result_range' (aka 'mlir::ResultRange') to 'ArrayRef<mlir::Value>' for 1st argument`. Am I doing something wrong?

Address comments.

mlir/lib/Transforms/RemoveNonLiveValues.cpp
22	Done.

They're just optimizations that don't happen in canonicalize because patterns to perform those canonicalizations haven't been added to that particular op (like, say, the scf.while op). On the tests that I have added for the intra-procedural cases, we see significant optimizations that this pass is able to do, but if we run canonicalize on the same tests, it does nothing to the IR. Is this what you're asking or did I misunderstand your question?

Yes: basically you are proposing to introduce a new pass, if it is just "this pass does not do more than canonicalization", the value proposal isn't clear. Right now there is nothing in either the review description or the pass description that provides any context on this.
So, my questions are all about prompting on the value proposition of this work, by contrasting it to something we all know (canonicalize).

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

srishti-pm added inline comments.Aug 9 2023, 6:42 PM

mlir/lib/Transforms/RemoveNonLiveValues.cpp
571	Actually this isn't a phase ordering issue but rather a demonstration of a weakness of the pass. It is potentially possible that an IR cannot be converted to something that will be "acceptable". The pass needs to handle such "unacceptable" IRs but it doesn't, as of now. It can be made to, by making more incremental changes to this pass but since those changes would be quite complex (and lengthy), they weren't all done in this patch. Based on this, do you still think it should be an error rather than a warning?

srishti-pm marked 4 inline comments as done.Aug 9 2023, 6:46 PM

srishti-pm added inline comments.

mlir/test/Transforms/remove-non-live-values.mlir
45	Comment made obsolete since the tests have been updated (canonicalize doesn't do it).
74	Done.
265	Comment made obsolete since the tests have been updated (canonicalize doesn't do it).
293	I think maybe it does require liveness analysis! Checkout the modified tests and let me know what you think :)

In D157049#4575202, @mehdi_amini wrote:

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

In D157049#4575202, @mehdi_amini wrote:

They're just optimizations that don't happen in canonicalize because patterns to perform those canonicalizations haven't been added to that particular op (like, say, the scf.while op). On the tests that I have added for the intra-procedural cases, we see significant optimizations that this pass is able to do, but if we run canonicalize on the same tests, it does nothing to the IR. Is this what you're asking or did I misunderstand your question?

Yes: basically you are proposing to introduce a new pass, if it is just "this pass does not do more than canonicalization", the value proposal isn't clear. Right now there is nothing in either the review description or the pass description that provides any context on this.
So, my questions are all about prompting on the value proposition of this work, by contrasting it to something we all know (canonicalize).

I understand the issue. Working on making the "value added by this pass" clearer.

In D157049#4575215, @srishti-pm wrote:

In D157049#4575202, @mehdi_amini wrote:

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

It is about idempotence of canonicalization + this pass.
So if I run: --pass-pipeline=builtin.module(remove-non-live-values, canonicalize), do I always get the same result as --pass-pipeline=builtin.module(remove-non-live-values, canonicalize, remove-non-live-values, canonicalize)?

Harbormaster completed remote builds in B251549: Diff 548845.Aug 9 2023, 10:10 PM

In D157049#4575475, @mehdi_amini wrote:

In D157049#4575215, @srishti-pm wrote:

In D157049#4575202, @mehdi_amini wrote:

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

It is about idempotence of canonicalization + this pass.
So if I run: --pass-pipeline=builtin.module(remove-non-live-values, canonicalize), do I always get the same result as --pass-pipeline=builtin.module(remove-non-live-values, canonicalize, remove-non-live-values, canonicalize)?

Yes, you'll get the same result, assuming that canonicalize is idempotent. Without that assumption, all we can say is that the result of -remove-non-live-values -canonicalize will be the same as the result of -remove-non-live-values -canonicalize -remove-non-live-values.

Basically, I think any run of remove-non-live-values after its first run will essentially do nothing to the IR, irrespective of what other passes execute in between.

This is because this pass uses an analysis called liveness analysis, and the liveness of a value doesn't change after some (any) pass is applied to the IR because no pass changes the semantics of an IR (to my understanding). And since the only thing that this pass relies on is liveness analysis, it will not, for example, perform better optimizations after an IR has been canonicalized or inlined or anything. It will remove non-live values, no matter what form the IR is in. Again, it should be noted here that since this patch introduces this pass, it may not be perfect. The pass relies on liveness analysis, which isn't perfect, and liveness analysis relies on sparse backward dataflow, which isn't perfect. But, theoretically speaking, if we assume that sparse backward dataflow analysis is perfect, liveness analysis is perfect, and this pass is perfect (as in, no forms of IR have been missed), the above holds true. I am also working in parallel on making these things perfect. Some patches that work towards that are here: https://reviews.llvm.org/D156376, https://reviews.llvm.org/D157261. It is potentially a long way to go. Hopefully we can achieve that 😅

Does this make sense?

srishti-pm updated this revision to Diff 549127.Aug 10 2023, 11:52 AM

srishti-pm marked an inline comment as done.

Address Jian's comment.

Harbormaster completed remote builds in B251753: Diff 549127.Aug 10 2023, 4:47 PM

Rebase to main.

Harbormaster completed remote builds in B252008: Diff 549465.Aug 11 2023, 12:56 PM

Address @Mogball and @mehdi_amini's comments.

mlir/include/mlir/Transforms/Passes.td
94	Done.
mlir/lib/Transforms/RemoveNonLiveValues.cpp
571	Have improved the message.

srishti-pm edited the summary of this revision.Aug 14 2023, 2:03 PM

srishti-pm edited the summary of this revision.

In D157049#4575228, @srishti-pm wrote:

I understand the issue. Working on making the "value added by this pass" clearer.

@mehdi_amini, I have done this now.

All comments here have been addressed and/or answered. Awaiting a re-review @mehdi_amini, @Mogball, and @jcai19.

Harbormaster completed remote builds in B252454: Diff 550072.Aug 14 2023, 6:16 PM

clang-format

Harbormaster completed remote builds in B252516: Diff 550164.Aug 14 2023, 11:07 PM

matthiaskramm accepted this revision.Aug 15 2023, 11:02 AM

This revision is now accepted and ready to land.Aug 15 2023, 11:02 AM

Fix handling of region branch and region branch terminator ops based on the recent changes to their interface methods.

Harbormaster completed remote builds in B252705: Diff 550419.Aug 15 2023, 1:35 PM

Nit changes.

srishti-pm edited the summary of this revision.Aug 15 2023, 1:57 PM

Harbormaster completed remote builds in B252734: Diff 550461.Aug 15 2023, 2:56 PM

In D157049#4577762, @srishti-pm wrote:

In D157049#4575475, @mehdi_amini wrote:

In D157049#4575215, @srishti-pm wrote:

In D157049#4575202, @mehdi_amini wrote:

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

It is about idempotence of canonicalization + this pass.
So if I run: --pass-pipeline=builtin.module(remove-non-live-values, canonicalize), do I always get the same result as --pass-pipeline=builtin.module(remove-non-live-values, canonicalize, remove-non-live-values, canonicalize)?

Yes, you'll get the same result, assuming that canonicalize is idempotent. [...]

This is because: This pass uses an analysis called liveness analysis. And, the liveness of a value doesn't change after some (any) pass is applied to the IR because no pass changes the semantics of an IR (to my understanding)
[...]
Does this make sense?

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I understand the issue. Working on making the "value added by this pass" clearer.

Any update on this?

This revision now requires changes to proceed.Aug 15 2023, 10:36 PM

@mehdi_amini

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I don't understand your question. Can you kindly re-state it?

Any update on this?

Yes, I have updated the commit summary, revision summary, and the pass description to make the "value added" clear.

@mehdi_amini , theoretically, do you agree liveness analysis gets no benefit from canonicalization?

In D157049#4590921, @srishti-pm wrote:

@mehdi_amini

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I don't understand your question. Can you kindly re-state it?

I mean that liveness must say that a value is live when there is a doubt, otherwise we would kill a useful value, for example. That's the conservative part.
Now I would think that the analysis cannot provide good results without canonicalization, that is, it would most often say "I don't know so I'll say it is live".
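The conservatism concern can be illustrated with a toy Python model. Everything here is hypothetical: the `effect` tags, the `removable` helper, and the "simplified" program are made up for illustration and say nothing about what canonicalize actually does in MLIR.

```python
# Toy model: ops are (result, operands, effect), with effect being one of
# "pure", "writes" or "unknown". Anything not provably "pure" is kept, which
# is the "I don't know so I'll say it is live" behaviour.

def removable(ops, returned):
    live = set(returned)
    dead = []
    for result, operands, effect in reversed(ops):
        if effect != "pure" or result in live:
            live.update(operands)    # op is kept, so its operands are needed
        else:
            dead.append(result)
    return sorted(dead)


before = [
    ("a", ["x"], "pure"),
    ("b", ["a"], "unknown"),   # the analysis cannot prove b has no effects
]
# Hypothetical simplification that rewrites the opaque op into a pure one.
after = [
    ("a", ["x"], "pure"),
    ("b", ["a"], "pure"),
]
print(removable(before, returned=[]))   # []
print(removable(after, returned=[]))    # ['a', 'b']
```

In this model the semantics of the program never change, but the second run finds more to remove because the simplified form is easier to prove harmless.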

In D157049#4591198, @mehdi_amini wrote:

In D157049#4590921, @srishti-pm wrote:

@mehdi_amini

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I don't understand your question. Can you kindly re-state it?

I mean that liveness must say that a value is live when there is a doubt, otherwise we would kill a useful value, for example. That's the conservative part.
Now I would think that the analysis cannot provide good results without canonicalization, that is, it would most often say "I don't know so I'll say it is live".

That is true. This would be rare given the current state of liveness analysis, because it is pretty strong. But yes, I agree with this. I actually already stated this as well. The idempotence holds only if we assume liveness analysis is perfect, and it is not. But it is also true that it is very close to perfect, and there are many instances where it can be perceived as stronger than any existing canonicalization pattern.

In D157049#4577762, @srishti-pm wrote:
Again, it should be noted here that since this patch introduces this pass, it may not be perfect. The pass relies on liveness analysis, which isn't perfect, and liveness analysis relies on sparse backward dataflow, which isn't perfect. But, theoretically speaking, if we assume that sparse backward dataflow analysis is perfect, liveness analysis is perfect, and this pass is perfect (as in, no forms of IR have been missed), the above holds true. I am also working in parallel on making these things perfect. Some patches that work towards that are here: https://reviews.llvm.org/D156376, https://reviews.llvm.org/D157261. It is potentially a long way to go. Hopefully we can achieve that 😅

Does this make sense?

Highlighting this comment again.

matthiaskramm accepted this revision.Aug 18 2023, 11:50 AM

srishti-pm requested review of this revision.Aug 18 2023, 3:16 PM

In D157049#4574384, @srishti-pm wrote:

In D157049#4574028, @Mogball wrote:

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware.

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

jcai19 accepted this revision.Aug 21 2023, 10:28 AM

Mogball added inline comments.Aug 21 2023, 10:53 AM

mlir/lib/Transforms/RemoveNonLiveValues.cpp
58–59	But please use angle includes for C++ stdlib includes...
63	In this scenario, I believe you are in fact copying the values into a small vector and then passing that by const reference. It would be better to use a ValueRange
80	Please use `ValueRange` instead
96	Same here. It is unusual to pass SmallVector by const reference
131–135	Please drop trivial braces
140	same here
140
144	please use structured bindings
151	same here
168	This seems like a hack. If an API returned an OperandRange that is not mutable, const-casting it to a mutable range can break all sorts of invariants. If you need an API that returns a mutable range, please add it. Also, `OperandRange` should not be passed by reference.
210	please use structured bindings
mlir/test/Transforms/remove-non-live-values.mlir
10	Optimization passes should fail either with an error or be conservative and silently pass anyways.

Thanks for your comments, @Mogball. Working on addressing them.

mlir/test/Transforms/remove-non-live-values.mlir
10	Okay. So in this case, should I fail with an error or silently pass, what do you think? I think I'd prefer silently passing. What is your suggestion?

Mogball added inline comments.Aug 21 2023, 11:02 AM

mlir/test/Transforms/remove-non-live-values.mlir
10	I think because this pass has pretty strong invariants as to when it works or does not work, an error would be preferable.

In D157049#4604164, @jcai19 wrote:

In D157049#4574384, @srishti-pm wrote:

In D157049#4574028, @Mogball wrote:

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware.

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

There are counter examples to this: "Dead-store elimination" for example is not about "stores that are unreachable", but ones that don't have observable effects.

In D157049#4604332, @mehdi_amini wrote:

In D157049#4604164, @jcai19 wrote:

In D157049#4574384, @srishti-pm wrote:

In D157049#4574028, @Mogball wrote:

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware.

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

There are counter examples to this: "Dead-store elimination" for example is not about "stores that are unreachable", but ones that don't have observable effects.

Interesting. @matthiaskramm, what are your views here? I think you have more knowledge than me on this.

mlir/test/Transforms/remove-non-live-values.mlir
10	Understood. Will mark it as an error. Thanks!

srishti-pm marked an inline comment as done. Aug 21 2023, 11:38 AM

There are counter examples to this: "Dead-store elimination" for example is not about "stores that are unreachable", but ones that don't have observable effects.

Yeah, that's fair!

Here, the distinction between "dead" and "non-live" is mainly to avoid confusion between the two analyses (DCE and Liveness) that are both, at the same time, loaded into the same dataflow solver.
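
The orthogonality discussed in this thread can be made concrete with a toy model. The sketch below is not MLIR code: `OpFacts`, `executable`, `resultLive`, and `describe` are hypothetical names that stand in for what a dead-code analysis and a liveness analysis would each report about the same op when both are loaded into one solver.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-op facts; each field stands in for the verdict of one
// analysis running in the same dataflow solver.
struct OpFacts {
  bool executable; // dead-code analysis: can control flow reach this op?
  bool resultLive; // liveness analysis: is the op's result needed later?
};

// Name the four combinations so the two notions stay distinguishable.
std::string describe(const OpFacts &facts) {
  if (!facts.executable)
    return facts.resultLive ? "dead, live" : "dead, non-live";
  return facts.resultLive ? "reachable, live" : "reachable, non-live";
}
```

Because the two flags vary independently, all four combinations are representable, which is the reason given above for keeping the two terms apart inside the pass.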

Matt added a subscriber: Matt. Aug 21 2023, 1:14 PM

Address all comments given so far.

In D157049#4604164, @jcai19 wrote:

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

Done.

@Mogball and @jcai19, I have addressed all your comments. Thanks for the review!
@matthiaskramm, thank you for the clarification on the naming!
@mehdi_amini, awaiting your re-review and/or response, thanks! :)

Update commit summary to reflect recent changes.

srishti-pm edited the summary of this revision. Aug 21 2023, 5:02 PM

Nit changes.

Harbormaster completed remote builds in B253953: Diff 552169. Aug 21 2023, 5:22 PM

I haven't carefully reviewed the code, I don't find the explanation very substantiated, nor the positioning of this work very clear in the current stack right now, but I won't just block it either, so feel free to go ahead if everyone is satisfied here.

This revision is now accepted and ready to land. Aug 22 2023, 9:13 PM

Rebase to main

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass does -- dead argument elimination, dead result elimination, etc.

Harbormaster completed remote builds in B254457: Diff 552870. Aug 23 2023, 2:50 PM

I also have concerns regarding the term nonlive. If someone searches for "dead value elimination", would this pass show up among the top results, or would it show up at all? Currently nowhere in the pass mentions

In D157049#4611702, @Mogball wrote:

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass does -- dead argument elimination, dead result elimination, etc.

I agree that NonLive may cause unintended consequences, e.g. if someone wants to look for this pass, they probably won't know to search with NonLive. That said, I'm not sure "dead value" is widely used either. I googled "llvm dead value" and, at a glance, the results were mostly about DCE. Maybe we could call this "removeUnusedValues", since we plan to substitute this pass for https://github.com/tensorflow/tensorflow/blob/ffe5d2c2723adf969b3da78fb76ce1ceb9302f92/tensorflow/compiler/mlir/tensorflow/transforms/tf_passes.td#L2539?

In D157049#4611702, @Mogball wrote:

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass does -- dead argument elimination, dead result elimination, etc.

I think either name works, for the pass. It's only within the pass that I'd be careful about using "dead" when referring to values that aren't live. Since the pass also hooks in DeadCodeAnalysis, that might cause confusion.

matthiaskramm accepted this revision. Aug 23 2023, 2:58 PM

Call op arg operands handling.

In D157049#4611793, @matthiaskramm wrote:

In D157049#4611702, @Mogball wrote:

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass does -- dead argument elimination, dead result elimination, etc.

I think either name works, for the pass. It's only within the pass that I'd be careful about using "dead" when referring to values that aren't live. Since the pass also hooks in DeadCodeAnalysis, that might cause confusion.

I think "removeUnusedValues" might make it clearer that it's different from DCE but it's just my preference. Whatever name you use, please make sure your commit message and comments are consistent.

Thank you all for your comments. Based on the above comments from @Mogball, @jcai19, and @matthiaskramm, I'm renaming the pass to removeDeadValues.

Rename pass to RemoveDeadValues and propagate this change to code comments and commit summary.

srishti-pm retitled this revision from [MLIR][transforms] Add an optimization pass to remove non-live values to [MLIR][transforms] Add an optimization pass to remove dead values. Aug 23 2023, 3:45 PM

srishti-pm edited the summary of this revision. Aug 23 2023, 3:48 PM

Also, thanks @matthiaskramm, @Mogball, and @jcai19 for reviewing and approving this patch. I'm waiting for the build and tests to pass here and then will land this.

Harbormaster completed remote builds in B254484: Diff 552911. Aug 23 2023, 4:43 PM

Closed by commit rG0e98fb9fadb0: [MLIR][transforms] Add an optimization pass to remove dead values (authored by srishti-pm). Aug 23 2023, 4:55 PM

This revision was automatically updated to reflect the committed changes.

srishti-pm added a commit: rG0e98fb9fadb0: [MLIR][transforms] Add an optimization pass to remove dead values.

Revision Contents

Path	Size
mlir/include/mlir/Transforms/Passes.h	4 lines
mlir/include/mlir/Transforms/Passes.td	149 lines
mlir/lib/Transforms/CMakeLists.txt	1 line
mlir/lib/Transforms/RemoveNonLiveValues.cpp	622 lines
mlir/test/Transforms/remove-non-live-values.mlir	336 lines

Diff 550419

mlir/include/mlir/Transforms/Passes.h

[13 lines not shown]
#ifndef MLIR_TRANSFORMS_PASSES_H
#define MLIR_TRANSFORMS_PASSES_H

#include "mlir/Pass/Pass.h"
#include "mlir/Transforms/LocationSnapshot.h"
#include "mlir/Transforms/ViewOpGraph.h"
#include "llvm/Support/Debug.h"
#include <limits>
+ #include <memory>

namespace mlir {

class GreedyRewriteConfig;

//===----------------------------------------------------------------------===//
// Passes
//===----------------------------------------------------------------------===//
[70 lines not shown]
/// Creates an instance of the inliner pass, and use the provided pass managers
/// when optimizing callable operations with names matching the key type.
/// Callable operations with a name not within the provided map will use the
/// provided default pipeline builder.
std::unique_ptr<Pass>
createInlinerPass(llvm::StringMap<OpPassManager> opPipelines,
std::function<void(OpPassManager &)> defaultPipelineBuilder);

+ /// Creates an optimization pass to remove non-live values.
+ std::unique_ptr<Pass> createRemoveNonLiveValuesPass();

/// Creates a pass which performs sparse conditional constant propagation over
/// nested operations.
std::unique_ptr<Pass> createSCCPPass();

/// Creates a pass which delete symbol operations that are unreachable. This
/// pass may only be scheduled on an operation that defines a SymbolTable.
std::unique_ptr<Pass> createSymbolDCEPass();

[21 lines not shown]

mlir/include/mlir/Transforms/Passes.td

[79 lines not shown]
def CSE : Pass<"cse"> {
}];
let constructor = "mlir::createCSEPass()";
let statistics = [
Statistic<"numCSE", "num-cse'd", "Number of operations CSE'd">,
Statistic<"numDCE", "num-dce'd", "Number of operations DCE'd">
];
}

		def RemoveNonLiveValues : Pass<"remove-non-live-values"> {
		let summary = "Remove non-live values";
		let description = [{
		The goal of this pass is optimization (reducing runtime) by removing
		unnecessary instructions. Unlike other passes that rely on local information
		gathered from patterns to accomplish optimization, this pass uses a full
		analysis of the IR, specifically, liveness analysis, and is thus more
		Mogball (Done): Please list the required invariants for the pass to succeed on the IR. There are invariants enforced in the pass definition but they are not listed here.
		srishti-pm (Author, Done): Done.
		powerful.

		Currently, this pass performs the following optimizations:
		(A) Removes function arguments that are not live,
		(B) Removes function return values that are not live across all callers of
		the function,
		(C) Removes unnecessary operands, results, region arguments, and region
		terminator operands of region branch ops, and,
		(D) Removes simple and region branch ops that have all non-live results and
		don't affect memory in any way,

		iff

		the IR doesn't have any non-function symbol ops, non-call symbol user ops
		and branch ops.

		Here, a "simple op" refers to an op that isn't a symbol op, symbol-user op,
		region branch op, branch op, region branch terminator op, or return-like.

		It is important to note that unlike other passes (like `canonicalize`) that
		apply op-specific optimizations through patterns, this pass uses different
		interfaces to handle various types of ops and tries to cover all existing
		ops through these interfaces.

		Its reliance on (a) liveness analysis and (b) interfaces is what makes this
		pass powerful: it can optimize ops that don't have a canonicalizer, and even
		when an op does have a canonicalizer, it can perform more aggressive
		optimizations, as observed in the test files associated with this pass.

		Example of optimization (A):-

		```
		int add_2_to_y(int x, int y) {
		return 2 + y
		}

		print(add_2_to_y(3, 4))
		print(add_2_to_y(5, 6))
		```

		becomes

		```
		int add_2_to_y(int y) {
		return 2 + y
		}

		print(add_2_to_y(4))
		print(add_2_to_y(6))
		```

		Example of optimization (B):-

		```
		int, int get_incremented_values(int y) {
		store y somewhere in memory
		return y + 1, y + 2
		}

		y1, y2 = get_incremented_values(4)
		y3, y4 = get_incremented_values(6)
		print(y2)
		```

		becomes

		```
		int get_incremented_values(int y) {
		store y somewhere in memory
		return y + 2
		}

		y2 = get_incremented_values(4)
		y4 = get_incremented_values(6)
		print(y2)
		```

		Example of optimization (C):-

		Assume only `%result1` is live here. Then,

		```
		%result1, %result2, %result3 = scf.while (%arg1 = %operand1, %arg2 = %operand2) {
		%terminator_operand2 = add %arg2, %arg2
		%terminator_operand3 = mul %arg2, %arg2
		%terminator_operand4 = add %arg1, %arg1
		scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3, %terminator_operand4
		} do {
		^bb0(%arg3, %arg4, %arg5):
		%terminator_operand6 = arith.addi %arg4, %arg4 : i32
		%terminator_operand5 = arith.addi %arg5, %arg5 : i32
		scf.yield %terminator_operand5, %terminator_operand6 : i32, i32
		}
		```

		becomes

		```
		%result1, %result2 = scf.while (%arg2 = %operand2) {
		%terminator_operand2 = add %arg2, %arg2
		%terminator_operand3 = mul %arg2, %arg2
		scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3
		} do {
		^bb0(%arg3, %arg4):
		%terminator_operand6 = arith.addi %arg4, %arg4 : i32
		scf.yield %terminator_operand6 : i32
		}
		```

		It is interesting to see that `%result2` won't be removed even though it is
		not live because `%terminator_operand3` forwards to it and cannot be
		removed. And, that is because it also forwards to `%arg4`, which is live.

		Example of optimization (D):-

		```
		int, int square_and_double_of_y(int y) {
		square = y ^ 2
		double = y * 2
		return square, double
		}

		sq, do = square_and_double_of_y(5)
		print(do)
		```

		becomes

		```
		int square_and_double_of_y(int y) {
		double = y * 2
		return double
		}

		do = square_and_double_of_y(5)
		print(do)
		```
		}];
		let constructor = "mlir::createRemoveNonLiveValuesPass()";
		}

def PrintIRPass : Pass<"print-ir"> {
let summary = "Print IR on the debug stream";
let description = [{
Print the entire IR on the debug stream. This is meant for debugging
purposes to inspect the IR at a specific point in the pipeline.
}];
let constructor = "mlir::createPrintIRPass()";
let options = [
[289 lines not shown]

mlir/lib/Transforms/CMakeLists.txt

add_subdirectory(Utils)

add_mlir_library(MLIRTransforms
Canonicalizer.cpp
ControlFlowSink.cpp
CSE.cpp
GenerateRuntimeVerification.cpp
Inliner.cpp
LocationSnapshot.cpp
LoopInvariantCodeMotion.cpp
Mem2Reg.cpp
OpStats.cpp
PrintIR.cpp
+ RemoveNonLiveValues.cpp
SCCP.cpp
SROA.cpp
StripDebugInfo.cpp
SymbolDCE.cpp
SymbolPrivatize.cpp
TopologicalSort.cpp
ViewOpGraph.cpp

[17 lines not shown]

mlir/lib/Transforms/RemoveNonLiveValues.cpp

This file was added.

//===- RemoveNonLiveValues.cpp - Remove Non-Live Values -------------------===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

// The goal of this pass is optimization (reducing runtime) by removing

// unnecessary instructions. Unlike other passes that rely on local information

// gathered from patterns to accomplish optimization, this pass uses a full

// analysis of the IR, specifically, liveness analysis, and is thus more

// powerful.

// Currently, this pass performs the following optimizations:

// (A) Removes function arguments that are not live,

// (B) Removes function return values that are not live across all callers of

// the function,

// (C) Removes unnecessary operands, results, region arguments, and region

// terminator operands of region branch ops, and,

// (D) Removes simple and region branch ops that have all non-live results and

// don't affect memory in any way,

Mogball (Done):

#include <memory>
- #include <mlir/Analysis/DataFlow/DeadCodeAnalysis.h>
+ #include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h"
#include <mlir/Analysis/DataFlow/LivenessAnalysis.h>

Mogball: Please use `"` includes for MLIR/LLVM files

srishti-pm (Author, Done): Done.

// iff

// the IR doesn't have any non-function symbol ops, non-call symbol user ops and

// branch ops.

// Here, a "simple op" refers to an op that isn't a symbol op, symbol-user op,

// region branch op, branch op, region branch terminator op, or return-like.

//===----------------------------------------------------------------------===//

#include "cassert"

#include "cstddef"

#include "memory"

#include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h"

#include "mlir/Analysis/DataFlow/LivenessAnalysis.h"

#include "mlir/IR/Attributes.h"

#include "mlir/IR/Builders.h"

#include "mlir/IR/BuiltinAttributes.h"

#include "mlir/IR/Dialect.h"

#include "mlir/IR/FunctionInterfaces.h"

#include "mlir/IR/IRMapping.h"

#include "mlir/IR/OperationSupport.h"

#include "mlir/IR/SymbolTable.h"

#include "mlir/IR/Value.h"

#include "mlir/IR/ValueRange.h"

#include "mlir/IR/Visitors.h"

#include "mlir/Interfaces/CallInterfaces.h"

#include "mlir/Interfaces/ControlFlowInterfaces.h"

#include "mlir/Interfaces/SideEffectInterfaces.h"

#include "mlir/Pass/Pass.h"

#include "mlir/Support/LLVM.h"

#include "mlir/Transforms/FoldUtils.h"

#include "mlir/Transforms/Passes.h"

#include "optional"

#include "vector"

#include "llvm/ADT/STLExtras.h"

Mogball (Done):

#include "mlir/Transforms/Passes.h"
- #include "optional"
- #include "vector"
+ #include <optional>
+ #include <vector>
#include "llvm/ADT/STLExtras.h"

Mogball: But please use angle includes for C++ stdlib includes...

namespace mlir {

#define GEN_PASS_DEF_REMOVENONLIVEVALUES

#include "mlir/Transforms/Passes.h.inc"

mehdi_amini (Done): Nit: `ArrayRef<T>` should always be a replacement for `const SmallVector<T> &` I believe (Same elsewhere)

srishti-pm (Author, Done): I actually tried this but seems like this cannot be done. When I replace `const SmallVector<T> &` with `ArrayRef<T>`, I get the error: no known conversion from 'mlir::Operation::result_range' (aka 'mlir::ResultRange') to 'ArrayRef<mlir::Value>' for 1st argument. Am I doing something wrong?

Mogball (Done): In this scenario, I believe you are in fact copying the values into a small vector and then passing that by const reference. It would be better to use a `ValueRange`
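
The copy concern raised in this thread can be illustrated outside MLIR. The sketch below uses hypothetical types only: `Handle` counts how often it is copied, and `HandleView` is a simplified non-owning pointer-plus-length view. It is closer in spirit to `ArrayRef` than to MLIR's `ValueRange` and is meant as a stand-in, not as a description of either class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical value handle that counts how often it is copied.
struct Handle {
  static int copies;
  bool live;
  explicit Handle(bool isLive) : live(isLive) {}
  Handle(const Handle &other) : live(other.live) { ++copies; }
};
int Handle::copies = 0;

// Simplified non-owning view: a pointer plus a length; no element is copied.
struct HandleView {
  const Handle *data;
  std::size_t size;
  HandleView(const std::vector<Handle> &storage)
      : data(storage.data()), size(storage.size()) {}
};

// Taking the view by value copies two words, never the elements.
bool hasLive(HandleView handles) {
  for (std::size_t i = 0; i < handles.size; ++i)
    if (handles.data[i].live)
      return true;
  return false;
}

// A `const std::vector &` parameter forces callers that hold some other
// range type to materialize a vector first, copying every element.
bool hasLiveViaVector(const std::vector<Handle> &handles) {
  return hasLive(HandleView(handles));
}
```

Calling `hasLive` on existing storage leaves the copy counter untouched, while building a temporary vector for `hasLiveViaVector` copies each element once before the function even starts.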

} // namespace mlir

using namespace mlir;

using namespace mlir::dataflow;

//===----------------------------------------------------------------------===//

// RemoveNonLiveValues Pass

//===----------------------------------------------------------------------===//

namespace {

// Some helper functions...

/// Return true iff at least one value in `values` is live, given the liveness

/// information in `la`.

static bool hasLive(const SmallVector<Value> &values, RunLivenessAnalysis &la) {

for (Value value : values) {

Mogball (Done): Please use `ValueRange` instead

// If there is a null value, it implies that it was dropped during the

// execution of this pass, implying that it was non-live.

if (!value)

continue;

const Liveness *liveness = la.getLiveness(value);

if (!liveness || liveness->isLive)

return true;

}

return false;

}

/// Return a BitVector of size `values.size()` where its i-th bit is 1 iff the

/// i-th value in `values` is live, given the liveness information in `la`.

static BitVector markLives(const SmallVector<Value> &values,

RunLivenessAnalysis &la) {

Mogball (Done): Same here. It is unusual to pass SmallVector by const reference

BitVector lives(values.size(), true);

for (auto it : llvm::enumerate(values)) {

Value value = it.value();

size_t index = it.index();

if (!value) {

lives.reset(index);

continue;

}

const Liveness *liveness = la.getLiveness(value);

// It is important to note that when `liveness` is null, we can't tell if

// `value` is live or not. So, the safe option is to consider it live. Also,

// the execution of this pass might create new SSA values when erasing some

// of the results of an op and we know that these new values are live

// (because they weren't erased) and also their liveness is null because

// liveness analysis ran before their creation.

if (liveness && !liveness->isLive)

lives.reset(index);

}

return lives;

}
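
The conservative rule in `markLives` (a dropped value counts as non-live, a value with no liveness record counts as live) can be restated without MLIR types. This is a minimal sketch under assumptions: `ValueInfo` and `markLivesSketch` are hypothetical names, and `std::optional<bool>` stands in for a possibly-null `Liveness` pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for an SSA value as seen by the pass.
struct ValueInfo {
  bool isNull;                  // value was dropped earlier in the pass
  std::optional<bool> liveness; // std::nullopt: the analysis has no record
};

// Bit i of the result is true iff values[i] must be treated as live.
std::vector<bool> markLivesSketch(const std::vector<ValueInfo> &values) {
  std::vector<bool> lives(values.size(), true);
  for (std::size_t i = 0; i < values.size(); ++i) {
    const ValueInfo &value = values[i];
    // A dropped value is non-live. A value without a liveness record is
    // conservatively kept, so only a recorded "not live" clears the bit.
    if (value.isNull || (value.liveness && !*value.liveness))
      lives[i] = false;
  }
  return lives;
}
```

The asymmetry matters for values created while the pass runs: they have no record because the analysis ran before they existed, and treating them as live avoids erasing something that was just rebuilt.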

/// Drop the uses of the i-th result of `op` and then erase it iff toErase[i]

/// is 1.

static void dropUsesAndEraseResults(Operation *op, BitVector toErase) {

assert(op->getNumResults() == toErase.size() &&

"expected the number of results in `op` and the size of `toErase` to "

"be the same");

std::vector<Type> newResultTypes;

for (OpResult result : op->getResults()) {

if (!toErase[result.getResultNumber()]) {

newResultTypes.push_back(result.getType());

}

OpBuilder builder(op);

Mogball (Done):

std::vector<Type> newResultTypes;
- for (OpResult result : op->getResults()) {
- if (!toErase[result.getResultNumber()]) {
+ for (OpResult result : op->getResults())
+ if (!toErase[result.getResultNumber()])
newResultTypes.push_back(result.getType());
- }
OpBuilder builder(op);

Mogball: Please drop trivial braces

builder.setInsertionPointAfter(op);

OperationState state(op->getLoc(), op->getName().getStringRef(),

op->getOperands(), newResultTypes, op->getAttrs());

for (unsigned i = 0; i < op->getNumRegions(); ++i) {

state.addRegion();

Mogball (Done): same here

Mogball (Done):

op->getOperands(), newResultTypes, op->getAttrs());
- for (unsigned i = 0; i < op->getNumRegions(); ++i) {
+ for (unsigned i = 0, e = op->getNumRegions(); i < e; ++i) {
state.addRegion();

}

Operation *newOp = builder.create(state);

for (const auto &indexed_regions : llvm::enumerate(op->getRegions())) {

Region &region = newOp->getRegion(indexed_regions.index());

Mogball (Done): please use structured bindings

IRMapping mapping;

indexed_regions.value().cloneInto(&region, mapping);

}

unsigned indexOfNextNewCallOpResultToReplace = 0;

for (auto it : llvm::enumerate(op->getResults())) {

Value result = it.value();

Mogball (Done): same here

size_t index = it.index();

assert(result && "expected result to be non-null");

if (toErase[index]) {

result.dropAllUses();

} else {

result.replaceAllUsesWith(

newOp->getResult(indexOfNextNewCallOpResultToReplace++));

}

op->erase();

}
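
The renumbering that `dropUsesAndEraseResults` performs when it rebuilds an op (surviving results take consecutive indices in the new op, erased results map to nothing) can be sketched in isolation. `remapResults` is a hypothetical helper written for illustration; it mirrors the role of the `indexOfNextNewCallOpResultToReplace` counter above but is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// For each old result index, return its index in the rebuilt op, or
// std::nullopt if that result is erased.
std::vector<std::optional<std::size_t>>
remapResults(const std::vector<bool> &toErase) {
  std::vector<std::optional<std::size_t>> newIndex(toErase.size());
  std::size_t next = 0; // index of the next surviving result in the new op
  for (std::size_t i = 0; i < toErase.size(); ++i)
    if (!toErase[i])
      newIndex[i] = next++;
  return newIndex;
}
```

With this mapping, uses of a surviving old result are redirected to the new op's result at the mapped index, while uses of an erased result are simply dropped.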

/// Convert a list of `Operand`s to a list of `OpOperand`s. This function is

/// borrowed from the Analysis/DataFlow/SparseAnalysis.cpp file.

static MutableArrayRef<OpOperand> operandsToOpOperands(OperandRange &operands) {

return MutableArrayRef<OpOperand>(operands.getBase(), operands.size());

Mogball (Done): This seems like a hack. If an API returned an OperandRange that is not mutable, const-casting it to a mutable range can break all sorts of invariants. If you need an API that returns a mutable range, please add it. Also, OperandRange should not be passed by reference.

}

/// Clean a simple op `op`, given the liveness analysis information in `la`.

/// Here, cleaning means:

/// (1) Dropping all its uses, AND

/// (2) Erasing it

/// iff it has no memory effects and none of its results are live.

///

/// It is assumed that `op` is simple. Here, a simple op is one which isn't a

/// symbol op, a symbol-user op, a region branch op, a branch op, a region

/// branch terminator op, or return-like.

static void cleanSimpleOp(Operation *op, RunLivenessAnalysis &la) {

if (!isMemoryEffectFree(op) || hasLive(op->getResults(), la))

return;

op->dropAllUses();

op->erase();

}

/// Clean a function-like op `funcOp`, given the liveness information in `la`

/// and the IR in `module`. Here, cleaning means:

/// (1) Dropping the uses of its unnecessary (non-live) arguments,

/// (2) Erasing these arguments,

/// (3) Erasing their corresponding operands from its callers,

/// (4) Erasing its unnecessary terminator operands (return values that are

/// non-live across all callers),

/// (5) Dropping the uses of these return values from its callers, AND

/// (6) Erasing these return values

/// iff it is not public.

static void cleanFuncOp(FunctionOpInterface funcOp, Operation *module,

RunLivenessAnalysis &la) {

if (funcOp.isPublic())

return;

// Get the list of unnecessary (non-live) arguments in `nonLiveArgs`.

SmallVector<Value> arguments(funcOp.getArguments());

BitVector nonLiveArgs = markLives(arguments, la);

nonLiveArgs = nonLiveArgs.flip();

// Do (1).

for (auto it : llvm::enumerate(arguments)) {

Value arg = it.value();

Mogball (Done): please use structured bindings

if (arg && nonLiveArgs[it.index()])

arg.dropAllUses();

}

// Do (2).

funcOp.eraseArguments(nonLiveArgs);

// Do (3).

SymbolTable::UseRange uses = *funcOp.getSymbolUses(module);

for (SymbolTable::SymbolUse use : uses) {

Operation *callOp = use.getUser();

assert(isa<CallOpInterface>(callOp) && "expected a call-like user");

callOp->eraseOperands(nonLiveArgs);

}

// Get the list of unnecessary terminator operands (return values that are

// non-live across all callers) in `nonLiveRets`. There is a very important

// subtlety here. Unnecessary terminator operands are NOT the operands of the

// terminator that are non-live. Instead, these are the return values of the

// callers such that a given return value is non-live across all callers. Such

// corresponding operands in the terminator could be live. An example to

// demonstrate this:

// func.func private @f(%arg0: memref<i32>) -> (i32, i32) {

// %c0_i32 = arith.constant 0 : i32

// %0 = arith.addi %c0_i32, %c0_i32 : i32

// memref.store %0, %arg0[] : memref<i32>

// return %c0_i32, %0 : i32, i32

// }

// func.func @main(%arg0: i32, %arg1: memref<i32>) -> (i32) {

// %1:2 = call @f(%arg1) : (memref<i32>) -> (i32, i32)

// return %1#0 : i32

// }

// Here, we can see that %1#1 is never used. It is non-live. Thus, @f doesn't

// need to return %0. But, %0 is live. And, still, we want to stop it from

// being returned, in order to optimize our IR. So, this demonstrates how we

// can make our optimization strong by even removing a live return value (%0),

// since it forwards only to non-live value(s) (%1#1).

Operation *lastReturnOp = funcOp.back().getTerminator();

size_t numReturns = lastReturnOp->getNumOperands();

BitVector nonLiveRets(numReturns, true);

for (SymbolTable::SymbolUse use : uses) {

Operation *callOp = use.getUser();

assert(isa<CallOpInterface>(callOp) && "expected a call-like user");

BitVector liveCallRets = markLives(callOp->getResults(), la);

nonLiveRets &= liveCallRets.flip();

}

// Do (4).

// Note that in the absence of control flow ops forcing the control to go from

// the entry (first) block to the other blocks, the control never reaches any

// block other than the entry block, because every block has a terminator.

for (Block &block : funcOp.getBlocks()) {

Operation *returnOp = block.getTerminator();

if (returnOp && returnOp->getNumOperands() == numReturns)

returnOp->eraseOperands(nonLiveRets);

}

funcOp.eraseResults(nonLiveRets);

// Do (5) and (6).

for (SymbolTable::SymbolUse use : uses) {

Operation *callOp = use.getUser();

assert(isa<CallOpInterface>(callOp) && "expected a call-like user");

dropUsesAndEraseResults(callOp, nonLiveRets);

}
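
The rule `cleanFuncOp` applies to return values (drop one only if it is non-live at every call site) amounts to intersecting the flipped per-call-site live masks. A minimal sketch, assuming plain `std::vector<bool>` masks in place of `BitVector`; `nonLiveAcrossCallers` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// liveAtCallSite[c][i] is true iff result i is live at call site c.
// Entry i of the returned mask is true iff return value i is needed by
// no call site and can therefore be dropped from the function.
std::vector<bool>
nonLiveAcrossCallers(std::size_t numReturns,
                     const std::vector<std::vector<bool>> &liveAtCallSite) {
  std::vector<bool> nonLive(numReturns, true);
  for (const std::vector<bool> &live : liveAtCallSite) {
    assert(live.size() == numReturns && "one liveness bit per return value");
    for (std::size_t i = 0; i < numReturns; ++i)
      nonLive[i] = nonLive[i] && !live[i];
  }
  return nonLive;
}
```

For example (B) in the pass description, the first call site uses only its second result and the second call site uses neither, so the masks `{false, true}` and `{false, false}` yield `{true, false}`: only `y + 1` stops being returned.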

/// Clean a region branch op `regionBranchOp`, given the liveness information in

/// `la`. Here, cleaning means:

/// (1') Dropping all its uses, AND

/// (2') Erasing it

/// if it has no memory effects and none of its results are live, AND

/// (1) Erasing its unnecessary operands (operands that are forwarded to

/// unnecessary results and arguments),

/// (2) Cleaning each of its regions,

/// (3) Dropping the uses of its unnecessary results (results that are

/// forwarded from unnecessary operands and terminator operands), AND

/// (4) Erasing these results

/// otherwise.

/// Note that here, cleaning a region means:

/// (2.a) Dropping the uses of its unnecessary arguments (arguments that are

/// forwarded from unnecessary operands and terminator operands),

/// (2.b) Erasing these arguments, AND

/// (2.c) Erasing its unnecessary terminator operands (terminator operands

/// that are forwarded to unnecessary results and arguments).

/// It is important to note that values in this op flow from operands and

/// terminator operands (successor operands) to arguments and results (successor

/// inputs).

static void cleanRegionBranchOp(RegionBranchOpInterface regionBranchOp,

RunLivenessAnalysis &la) {

// Mark live results of `regionBranchOp` in `liveResults`.

auto markLiveResults = [&](BitVector &liveResults) {

liveResults = markLives(regionBranchOp->getResults(), la);

};

// Mark live arguments in the regions of `regionBranchOp` in `liveArgs`.

auto markLiveArgs = [&](DenseMap<Region *, BitVector> &liveArgs) {

for (Region &region : regionBranchOp->getRegions()) {

SmallVector<Value> arguments(region.front().getArguments());

BitVector regionLiveArgs = markLives(arguments, la);

liveArgs[&region] = regionLiveArgs;

}

};

// Return the successors of `region` if the latter is not null. Else return

// the successors of `regionBranchOp`.

auto getSuccessors = [&](Region *region = nullptr) {

std::optional<unsigned> index =

region ? std::optional(region->getRegionNumber()) : std::nullopt;

SmallVector<Attribute> operandAttributes(regionBranchOp->getNumOperands(),

nullptr);

SmallVector<RegionSuccessor> successors;

if (!index)

regionBranchOp.getEntrySuccessorRegions(operandAttributes, successors);

else

regionBranchOp.getSuccessorRegions(index, successors);

return successors;

};

// Return the operands of `terminator` that are forwarded to `successor` if

// the former is not null. Else return the operands of `regionBranchOp`

// forwarded to `successor`.

auto getForwardedOpOperands = [&](const RegionSuccessor &successor,

Operation *terminator = nullptr) {

Region *successorRegion = successor.getSuccessor();

std::optional<unsigned> index =

successorRegion ? std::optional(successorRegion->getRegionNumber())

: std::nullopt;

OperandRange operands =

terminator ? cast<RegionBranchTerminatorOpInterface>(terminator)

.getSuccessorOperands(index)

: regionBranchOp.getEntrySuccessorOperands(index);

MutableArrayRef<OpOperand> opOperands = operandsToOpOperands(operands);

return opOperands;

};

// Mark the non-forwarded operands of `regionBranchOp` in

// `nonForwardedOperands`.

auto markNonForwardedOperands = [&](BitVector &nonForwardedOperands) {

nonForwardedOperands.resize(regionBranchOp->getNumOperands(), true);

for (const RegionSuccessor &successor : getSuccessors()) {

for (OpOperand &opOperand : getForwardedOpOperands(successor))

nonForwardedOperands.reset(opOperand.getOperandNumber());

}

};

// Mark the non-forwarded terminator operands of the various regions of

// `regionBranchOp` in `nonForwardedRets`.

auto markNonForwardedReturnValues =

[&](DenseMap<Operation *, BitVector> &nonForwardedRets) {

for (Region &region : regionBranchOp->getRegions()) {

Operation *terminator = region.front().getTerminator();

nonForwardedRets[terminator] =

BitVector(terminator->getNumOperands(), true);

for (const RegionSuccessor &successor : getSuccessors(&region)) {

for (OpOperand &opOperand :

getForwardedOpOperands(successor, terminator))

nonForwardedRets[terminator].reset(opOperand.getOperandNumber());

}

}

};

// Update `valuesToKeep` (which is expected to correspond to operands or

// terminator operands) based on `resultsToKeep` and `argsToKeep`, given

// `region`. When `valuesToKeep` correspond to operands, `region` is null.

// Else, `region` is the parent region of the terminator.

auto updateOperandsOrTerminatorOperandsToKeep =

[&](BitVector &valuesToKeep, BitVector &resultsToKeep,

DenseMap<Region *, BitVector> &argsToKeep, Region *region = nullptr) {

Operation *terminator =

region ? region->front().getTerminator() : nullptr;

for (const RegionSuccessor &successor : getSuccessors(region)) {

Region *successorRegion = successor.getSuccessor();

for (auto [opOperand, input] :

llvm::zip(getForwardedOpOperands(successor, terminator),

successor.getSuccessorInputs())) {

size_t operandNum = opOperand.getOperandNumber();

bool updateBasedOn =

successorRegion

? argsToKeep[successorRegion]

[cast<BlockArgument>(input).getArgNumber()]

: resultsToKeep[cast<OpResult>(input).getResultNumber()];

valuesToKeep[operandNum] = valuesToKeep[operandNum] | updateBasedOn;

}

}

};

// Recompute `resultsToKeep` and `argsToKeep` based on `operandsToKeep` and

// `terminatorOperandsToKeep`. Store true in `resultsOrArgsToKeepChanged` if a

// value is modified, else, false.

auto recomputeResultsAndArgsToKeep =

[&](BitVector &resultsToKeep, DenseMap<Region *, BitVector> &argsToKeep,

BitVector &operandsToKeep,

DenseMap<Operation *, BitVector> &terminatorOperandsToKeep,

bool &resultsOrArgsToKeepChanged) {

resultsOrArgsToKeepChanged = false;

// Recompute `resultsToKeep` and `argsToKeep` based on `operandsToKeep`.

for (const RegionSuccessor &successor : getSuccessors()) {

Region *successorRegion = successor.getSuccessor();

for (auto [opOperand, input] :

llvm::zip(getForwardedOpOperands(successor),

successor.getSuccessorInputs())) {

bool recomputeBasedOn =

operandsToKeep[opOperand.getOperandNumber()];

bool toRecompute =

successorRegion

? argsToKeep[successorRegion]

[cast<BlockArgument>(input).getArgNumber()]

: resultsToKeep[cast<OpResult>(input).getResultNumber()];

if (!toRecompute && recomputeBasedOn)

resultsOrArgsToKeepChanged = true;

if (successorRegion) {

argsToKeep[successorRegion][cast<BlockArgument>(input)

.getArgNumber()] =

argsToKeep[successorRegion]

[cast<BlockArgument>(input).getArgNumber()] |

recomputeBasedOn;

} else {

resultsToKeep[cast<OpResult>(input).getResultNumber()] =

resultsToKeep[cast<OpResult>(input).getResultNumber()] |

recomputeBasedOn;

}

}

}

// Recompute `resultsToKeep` and `argsToKeep` based on

// `terminatorOperandsToKeep`.

for (Region &region : regionBranchOp->getRegions()) {

Operation *terminator = region.front().getTerminator();

for (const RegionSuccessor &successor : getSuccessors(&region)) {

Region *successorRegion = successor.getSuccessor();

for (auto [opOperand, input] :

llvm::zip(getForwardedOpOperands(successor, terminator),

successor.getSuccessorInputs())) {

bool recomputeBasedOn =

terminatorOperandsToKeep[terminator]

[opOperand.getOperandNumber()];

bool toRecompute =

successorRegion

? argsToKeep[successorRegion]

[cast<BlockArgument>(input).getArgNumber()]

: resultsToKeep[cast<OpResult>(input).getResultNumber()];

if (!toRecompute && recomputeBasedOn)

resultsOrArgsToKeepChanged = true;

if (successorRegion) {

argsToKeep[successorRegion][cast<BlockArgument>(input)

.getArgNumber()] =

argsToKeep[successorRegion]

[cast<BlockArgument>(input).getArgNumber()] |

recomputeBasedOn;

} else {

resultsToKeep[cast<OpResult>(input).getResultNumber()] =

resultsToKeep[cast<OpResult>(input).getResultNumber()] |

recomputeBasedOn;

}

}

}

}

};

// Mark the values that we want to keep in `resultsToKeep`, `argsToKeep`,

// `operandsToKeep`, and `terminatorOperandsToKeep`.

auto markValuesToKeep =

[&](BitVector &resultsToKeep, DenseMap<Region *, BitVector> &argsToKeep,

BitVector &operandsToKeep,

DenseMap<Operation *, BitVector> &terminatorOperandsToKeep) {

bool resultsOrArgsToKeepChanged = true;

// We keep updating and recomputing the values until we reach a point

// where they stop changing.

while (resultsOrArgsToKeepChanged) {

// Update the operands that need to be kept.

updateOperandsOrTerminatorOperandsToKeep(operandsToKeep,

resultsToKeep, argsToKeep);

// Update the terminator operands that need to be kept.

for (Region &region : regionBranchOp->getRegions()) {

updateOperandsOrTerminatorOperandsToKeep(

terminatorOperandsToKeep[region.front().getTerminator()],

resultsToKeep, argsToKeep, &region);

}

// Recompute the results and arguments that need to be kept.

recomputeResultsAndArgsToKeep(

resultsToKeep, argsToKeep, operandsToKeep,

terminatorOperandsToKeep, resultsOrArgsToKeepChanged);

}

};

// Do (1') and (2'). This is the only case where the entire `regionBranchOp`

// is removed. It will not happen in any other scenario. Note that in this

// case, a non-forwarded operand of `regionBranchOp` could be live/non-live.

// It could never be live because of this op but its liveness could have been

// attributed to something else.

if (isMemoryEffectFree(regionBranchOp.getOperation()) &&

!hasLive(regionBranchOp->getResults(), la)) {

regionBranchOp->dropAllUses();

regionBranchOp->erase();

return;

}

// At this point, we know that every non-forwarded operand of `regionBranchOp`

// is live.

// Stores the results of `regionBranchOp` that we want to keep.

BitVector resultsToKeep;

// Stores the mapping from regions of `regionBranchOp` to their arguments that

// we want to keep.

DenseMap<Region *, BitVector> argsToKeep;

// Stores the operands of `regionBranchOp` that we want to keep.

BitVector operandsToKeep;

// Stores the mapping from region terminators in `regionBranchOp` to their

// operands that we want to keep.

DenseMap<Operation *, BitVector> terminatorOperandsToKeep;

// Initializing the above variables...

// The live results of `regionBranchOp` definitely need to be kept.

markLiveResults(resultsToKeep);

// Similarly, the live arguments of the regions in `regionBranchOp` definitely

// need to be kept.

markLiveArgs(argsToKeep);

// The non-forwarded operands of `regionBranchOp` definitely need to be kept.

// A live forwarded operand can be removed but no non-forwarded operand can be

// removed since it "controls" the flow of data in this control flow op.

markNonForwardedOperands(operandsToKeep);

// Similarly, the non-forwarded terminator operands of the regions in

// `regionBranchOp` definitely need to be kept.

markNonForwardedReturnValues(terminatorOperandsToKeep);

// Mark the values (results, arguments, operands, and terminator operands)

// that we want to keep.

markValuesToKeep(resultsToKeep, argsToKeep, operandsToKeep,

terminatorOperandsToKeep);

// Do (1).

regionBranchOp->eraseOperands(operandsToKeep.flip());

// Do (2.a) and (2.b).

for (Region &region : regionBranchOp->getRegions()) {

assert(!region.empty() && "expected a non-empty region in an op "

"implementing `RegionBranchOpInterface`");

for (auto [index, arg] : llvm::enumerate(region.front().getArguments())) {

if (argsToKeep[&region][index])

continue;

if (arg)

arg.dropAllUses();

}

region.front().eraseArguments(argsToKeep[&region].flip());

}

jcai19: Maybe rename it to module to make it clearer.

// Do (2.c).

for (Region &region : regionBranchOp->getRegions()) {

Operation *terminator = region.front().getTerminator();

terminator->eraseOperands(terminatorOperandsToKeep[terminator].flip());

}

// Do (3) and (4).

dropUsesAndEraseResults(regionBranchOp.getOperation(), resultsToKeep.flip());

}

Mogball: Please make this an error. This indicates a phase-ordering problem in the pipeline. Also, please improve the error message.

srishti-pm (Author): Actually, this isn't a phase-ordering issue but rather a demonstration of a weakness of the pass. It is potentially possible that an IR cannot be converted to something that will be "acceptable". The pass needs to handle such "unacceptable" IRs, but it doesn't as of now. It can be made to, by making more incremental changes to this pass, but since those changes would be quite complex (and lengthy), they weren't all done in this patch.

Based on this, do you still think it should be an error rather than a warning?

srishti-pm (Author): Have improved the message.

struct RemoveNonLiveValues

: public impl::RemoveNonLiveValuesBase<RemoveNonLiveValues> {

void runOnOperation() override;

};

} // namespace

void RemoveNonLiveValues::runOnOperation() {

auto &la = getAnalysis<RunLivenessAnalysis>();

Operation *module = getOperation();

// The removal of non-live values is performed iff there are no branch ops,

// all symbol ops present in the IR are function-like, and all symbol user ops

// present in the IR are call-like.

WalkResult acceptableIR = module->walk([&](Operation *op) {

if (isa<BranchOpInterface>(op) ||

(isa<SymbolOpInterface>(op) && !isa<FunctionOpInterface>(op)) ||

(isa<SymbolUserOpInterface>(op) && !isa<CallOpInterface>(op))) {

op->emitWarning() << "cannot optimize an IR with non-function symbol "

"ops, non-call symbol user ops or branch ops\n";

return WalkResult::interrupt();

}

return WalkResult::advance();

});

if (acceptableIR.wasInterrupted())

return;

module->walk([&](Operation *op) {

if (auto funcOp = dyn_cast<FunctionOpInterface>(op)) {

cleanFuncOp(funcOp, module, la);

} else if (auto regionBranchOp = dyn_cast<RegionBranchOpInterface>(op)) {

cleanRegionBranchOp(regionBranchOp, la);

} else if (op->hasTrait<OpTrait::ReturnLike>()) {

// Nothing to do because this terminator is associated with either a

// function op or a region branch op and gets cleaned when these ops are

// cleaned.

} else if (isa<RegionBranchTerminatorOpInterface>(op)) {

// Nothing to do because this terminator is associated with a region

// branch op and gets cleaned when the latter is cleaned.

} else if (isa<CallOpInterface>(op)) {

// Nothing to do because this op is associated with a function op and gets

// cleaned when the latter is cleaned.

} else {

cleanSimpleOp(op, la);

}

});

}

std::unique_ptr<Pass> mlir::createRemoveNonLiveValuesPass() {

return std::make_unique<RemoveNonLiveValues>();

}
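
The update/recompute loop in `markValuesToKeep` above is a fixed-point iteration over forwarding edges. The following is a minimal, standalone sketch of that idea only, not the patch's implementation: it assumes a simplified model in which every forwarded operand and every successor input (result or block argument) is just an index, and the names `KeepSets` and `propagateKeep` are hypothetical and do not appear in MLIR.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified model (not MLIR API): `operands` is seeded with
// the non-forwarded operands and `inputs` with the live results/arguments.
struct KeepSets {
  std::vector<bool> operands;
  std::vector<bool> inputs;
};

// Each edge (operand, input) means "this operand is forwarded to that
// result or block argument". An operand is kept if any input it feeds is
// kept; an input is kept if any operand feeding it is kept. Iterate until
// the `inputs` set stops growing.
inline void propagateKeep(
    const std::vector<std::pair<std::size_t, std::size_t>> &edges,
    KeepSets &keep) {
  bool changed = true;
  while (changed) {
    changed = false;
    // "Update" step: kept inputs force the operands that feed them.
    for (const auto &[operand, input] : edges) {
      assert(operand < keep.operands.size() && input < keep.inputs.size());
      if (keep.inputs[input])
        keep.operands[operand] = true;
    }
    // "Recompute" step: kept operands force the inputs they feed. As in
    // the loop above, only this step decides whether to iterate again.
    for (const auto &[operand, input] : edges) {
      if (keep.operands[operand] && !keep.inputs[input]) {
        keep.inputs[input] = true;
        changed = true;
      }
    }
  }
}
```

In this model, the loop terminates because each iteration that continues flips at least one `inputs` bit from false to true, and bits are never cleared.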

mlir/test/Transforms/remove-non-live-values.mlir

This file was added.

				// RUN: mlir-opt %s -remove-non-live-values -split-input-file -verify-diagnostics | FileCheck %s

				// The IR remains untouched because of the presence of a non-function-like
				// symbol op (module @dont_touch_unacceptable_ir).
				//
				// CHECK-LABEL: module @dont_touch_unacceptable_ir
				// CHECK-LABEL: func.func @has_cleanable_simple_op
				// CHECK-NEXT: arith.addi
				// expected-warning @+1 {{cannot optimize an IR with non-function symbol ops, non-call symbol user ops or branch ops}}
				module @dont_touch_unacceptable_ir {
				Mogball: Optimization passes should fail either with an error or be conservative and silently pass anyways.
				srishti-pm (Author): Okay. So in this case, should I fail with an error or silently pass, what do you think? I think I'd prefer silently passing. What is your suggestion?
				Mogball: I think because this pass has pretty strong invariants as to when it works or does not work, an error would be preferable.
				srishti-pm (Author): Understood. Will mark it as an error. Thanks!
				func.func @has_cleanable_simple_op(%arg0 : i32) {
				%non_live = arith.addi %arg0, %arg0 : i32
				return
				}
				}

				// -----

				// The IR remains untouched because of the presence of a branch op `cf.cond_br`.
				//
				// CHECK-LABEL: func.func @dont_touch_unacceptable_ir_has_cleanable_simple_op_with_branch_op
				// CHECK-NEXT: arith.constant 0
				// CHECK-NEXT: cf.cond_br
				func.func @dont_touch_unacceptable_ir_has_cleanable_simple_op_with_branch_op(%arg0: i1) {
				%non_live = arith.constant 0 : i32
				// expected-warning @+1 {{cannot optimize an IR with non-function symbol ops, non-call symbol user ops or branch ops}}
				cf.cond_br %arg0, ^bb1(%non_live : i32), ^bb2(%non_live : i32)
				^bb1(%non_live_0 : i32):
				cf.br ^bb3
				^bb2(%non_live_1 : i32):
				cf.br ^bb3
				^bb3:
				return
				}

				// -----

				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK-LABEL: func.func private @clean_func_op_remove_argument_and_return_value() {
				// CHECK-NEXT: return
				// CHECK-NEXT: }
				// CHECK: func.func @main(%[[arg0:.*]]: i32) {
				// CHECK-NEXT: call @clean_func_op_remove_argument_and_return_value() : () -> ()
				// CHECK-NEXT: return
				mehdi_amini: Canonicalize nukes everything here right now
				srishti-pm (Author): Comment made obsolete since the tests have been updated (canonicalize doesn't do it).
				// CHECK-NEXT: }
				func.func private @clean_func_op_remove_argument_and_return_value(%arg0: i32) -> (i32) {
				return %arg0 : i32
				}
				func.func @main(%arg0 : i32) {
				%non_live = func.call @clean_func_op_remove_argument_and_return_value(%arg0) : (i32) -> (i32)
				return
				}

				// -----

				// %arg0 is not live because it is never used. %arg1 is not live because its
				// user `arith.addi` doesn't have any uses and the value that it is forwarded to
				// (%non_live_0) also doesn't have any uses.
				//
				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK-LABEL: func.func private @clean_func_op_remove_arguments() -> i32 {
				// CHECK-NEXT: %[[c0:.*]] = arith.constant 0
				// CHECK-NEXT: return %[[c0]]
				// CHECK-NEXT: }
				// CHECK: func.func @main(%[[arg2:.*]]: memref<i32>, %[[arg3:.*]]: i32) -> (i32, memref<i32>) {
				// CHECK-NEXT: %[[live:.*]] = call @clean_func_op_remove_arguments() : () -> i32
				// CHECK-NEXT: return %[[live]], %[[arg2]]
				// CHECK-NEXT: }
				func.func private @clean_func_op_remove_arguments(%arg0 : memref<i32>, %arg1 : i32) -> (i32, i32) {
				%c0 = arith.constant 0 : i32
				%non_live = arith.addi %arg1, %arg1 : i32
				return %c0, %arg1 : i32, i32
				mehdi_amini: FYI `canonicalize` simplifies down to:
				  func.func @clean_simple_ops(%arg0: i32, %arg1: memref<i32>, %arg2: i32) -> i32 {
				    %c6_i32 = arith.constant 6 : i32
				    %0 = arith.addi %arg0, %arg0 : i32
				    memref.store %c6_i32, %arg1[] : memref<i32>
				    return %0 : i32
				  }
				srishti-pm (Author): Yup, thanks. I'm doing the comparison with `canonicalize` right now.
				srishti-pm (Author): Done.
				}
				func.func @main(%arg2 : memref<i32>, %arg3 : i32) -> (i32, memref<i32>) {
				%live, %non_live_0 = func.call @clean_func_op_remove_arguments(%arg2, %arg3) : (memref<i32>, i32) -> (i32, i32)
				return %live, %arg2 : i32, memref<i32>
				}

				// -----

				// Even though %non_live_0 is not live, the first return value of
				// @clean_func_op_remove_return_values isn't removed because %live is live
				// (liveness is checked across all callers).
				//
				// Also, the second return value of @clean_func_op_remove_return_values is
				// removed despite %c0 being live because neither %non_live nor %non_live_1 were
				// live (removal doesn't depend on the liveness of the operand itself but on the
				// liveness of where it is forwarded).
				//
				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK: func.func private @clean_func_op_remove_return_values(%[[arg0:.*]]: memref<i32>) -> i32 {
				// CHECK-NEXT: %[[c0:.*]] = arith.constant 0
				// CHECK-NEXT: memref.store %[[c0]], %[[arg0]][]
				// CHECK-NEXT: return %[[c0]]
				// CHECK-NEXT: }
				// CHECK: func.func @main(%[[arg1:.*]]: memref<i32>) -> i32 {
				// CHECK-NEXT: %[[live:.*]] = call @clean_func_op_remove_return_values(%[[arg1]]) : (memref<i32>) -> i32
				// CHECK-NEXT: %[[non_live_0:.*]] = call @clean_func_op_remove_return_values(%[[arg1]]) : (memref<i32>) -> i32
				// CHECK-NEXT: return %[[live]] : i32
				// CHECK-NEXT: }
				func.func private @clean_func_op_remove_return_values(%arg0 : memref<i32>) -> (i32, i32) {
				%c0 = arith.constant 0 : i32
				memref.store %c0, %arg0[] : memref<i32>
				return %c0, %c0 : i32, i32
				}
				func.func @main(%arg1 : memref<i32>) -> (i32) {
				%live, %non_live = func.call @clean_func_op_remove_return_values(%arg1) : (memref<i32>) -> (i32, i32)
				%non_live_0, %non_live_1 = func.call @clean_func_op_remove_return_values(%arg1) : (memref<i32>) -> (i32, i32)
				return %live : i32
				}

				// -----

				// None of the return values of @clean_func_op_dont_remove_return_values can be
				// removed because the first one is forwarded to a live value %live and the
				// second one is forwarded to a live value %live_0.
				//
				// CHECK-LABEL: func.func private @clean_func_op_dont_remove_return_values() -> (i32, i32) {
				// CHECK-NEXT: %[[c0:.*]] = arith.constant 0 : i32
				// CHECK-NEXT: return %[[c0]], %[[c0]] : i32, i32
				// CHECK-NEXT: }
				// CHECK-LABEL: func.func @main() -> (i32, i32) {
				// CHECK-NEXT: %[[live_and_non_live:.*]]:2 = call @clean_func_op_dont_remove_return_values() : () -> (i32, i32)
				// CHECK-NEXT: %[[non_live_0_and_live_0:.*]]:2 = call @clean_func_op_dont_remove_return_values() : () -> (i32, i32)
				// CHECK-NEXT: return %[[live_and_non_live]]#0, %[[non_live_0_and_live_0]]#1 : i32, i32
				// CHECK-NEXT: }
				func.func private @clean_func_op_dont_remove_return_values() -> (i32, i32) {
				%c0 = arith.constant 0 : i32
				return %c0, %c0 : i32, i32
				}
				func.func @main() -> (i32, i32) {
				%live, %non_live = func.call @clean_func_op_dont_remove_return_values() : () -> (i32, i32)
				%non_live_0, %live_0 = func.call @clean_func_op_dont_remove_return_values() : () -> (i32, i32)
				return %live, %live_0 : i32, i32
				}

				// -----

				// Values kept:
				// (1) %non_live is not live. Yet, it is kept because %arg4 in `scf.condition`
				// forwards to it, which has to be kept. %arg4 in `scf.condition` has to be
				// kept because it forwards to %arg6 which is live.
				//
				// (2) %arg5 is not live. Yet, it is kept because %live_0 forwards to it, which
				// also forwards to %live, which is live.
				//
				// Values not kept:
				// (1) %arg1 is not kept as an operand of `scf.while` because it only forwards
				// to %arg3, which is not kept. %arg3 is not kept because %arg3 is not live and
				// only %arg1 and %arg7 forward to it, such that neither of them forward
				// anywhere else. Thus, %arg7 is also not kept in the `scf.yield` op.
				//
				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK: func.func @clean_region_branch_op_dont_remove_first_2_results_but_remove_first_operand(%[[arg0:.*]]: i1, %[[arg1:.*]]: i32, %[[arg2:.*]]: i32) -> i32 {
				// CHECK-NEXT: %[[live_and_non_live:.*]]:2 = scf.while (%[[arg4:.*]] = %[[arg2]]) : (i32) -> (i32, i32) {
				// CHECK-NEXT: %[[live_0:.*]] = arith.addi %[[arg4]], %[[arg4]]
				// CHECK-NEXT: scf.condition(%arg0) %[[live_0]], %[[arg4]] : i32, i32
				// CHECK-NEXT: } do {
				// CHECK-NEXT: ^bb0(%[[arg5:.*]]: i32, %[[arg6:.*]]: i32):
				// CHECK-NEXT: %[[live_1:.*]] = arith.addi %[[arg6]], %[[arg6]]
				// CHECK-NEXT: scf.yield %[[live_1]] : i32
				// CHECK-NEXT: }
				// CHECK-NEXT: return %[[live_and_non_live]]#0
				// CHECK-NEXT: }
				func.func @clean_region_branch_op_dont_remove_first_2_results_but_remove_first_operand(%arg0: i1, %arg1: i32, %arg2: i32) -> (i32) {
				%live, %non_live, %non_live_0 = scf.while (%arg3 = %arg1, %arg4 = %arg2) : (i32, i32) -> (i32, i32, i32) {
				%live_0 = arith.addi %arg4, %arg4 : i32
				%non_live_1 = arith.addi %arg3, %arg3 : i32
				scf.condition(%arg0) %live_0, %arg4, %non_live_1 : i32, i32, i32
				} do {
				^bb0(%arg5: i32, %arg6: i32, %arg7: i32):
				%live_1 = arith.addi %arg6, %arg6 : i32
				scf.yield %arg7, %live_1 : i32, i32
				}
				return %live : i32
				}

				// -----

				// Values kept:
				// (1) %live is kept because it is live.
				//
				// (2) %non_live is not live. Yet, it is kept because %arg3 in `scf.condition`
				// forwards to it and this %arg3 has to be kept. This %arg3 in `scf.condition`
				// has to be kept because it forwards to %arg6, which forwards to %arg4, which
				// forwards to %live, which is live.
				//
				// Values not kept:
				// (1) %non_live_0 is not kept because %non_live_2 in `scf.condition` forwards
				// to it, which forwards to only %non_live_0 and %arg7, where both these are
				// not live and have no other value forwarding to them.
				//
				// (2) %non_live_1 is not kept because %non_live_3 in `scf.condition` forwards
				// to it, which forwards to only %non_live_1 and %arg8, where both these are
				// not live and have no other value forwarding to them.
				//
				// (3) %c2 is not kept because it only forwards to %arg10, which is not kept.
				//
				// (4) %arg10 is not kept because only %c2 and %non_live_4 forward to it,
				// neither of them forwards anywhere else, and %arg10 is not live.
				//
				// (5) %arg7 and %arg8 are not kept because they are not live, %non_live_2 and
				// %non_live_3 forward to them, and both only otherwise forward to %non_live_0
				// and %non_live_1 which are not live and have no other predecessors.
				//
				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK: func.func @clean_region_branch_op_remove_last_2_results_last_2_arguments_and_last_operand(%[[arg2:.*]]: i1) -> i32 {
				// CHECK-NEXT: %[[c0:.*]] = arith.constant 0
				// CHECK-NEXT: %[[c1:.*]] = arith.constant 1
				// CHECK-NEXT: %[[live_and_non_live:.*]]:2 = scf.while (%[[arg3:.*]] = %[[c0]], %[[arg4:.*]] = %[[c1]]) : (i32, i32) -> (i32, i32) {
				// CHECK-NEXT: scf.condition(%[[arg2]]) %[[arg4]], %[[arg3]] : i32, i32
				// CHECK-NEXT: } do {
				// CHECK-NEXT: ^bb0(%[[arg5:.*]]: i32, %[[arg6:.*]]: i32):
				// CHECK-NEXT: scf.yield %[[arg5]], %[[arg6]] : i32, i32
				// CHECK-NEXT: }
				// CHECK-NEXT: return %[[live_and_non_live]]#0 : i32
				// CHECK-NEXT: }
				func.func @clean_region_branch_op_remove_last_2_results_last_2_arguments_and_last_operand(%arg2: i1) -> (i32) {
				%c0 = arith.constant 0 : i32
				%c1 = arith.constant 1 : i32
				%c2 = arith.constant 2 : i32
				%live, %non_live, %non_live_0, %non_live_1 = scf.while (%arg3 = %c0, %arg4 = %c1, %arg10 = %c2) : (i32, i32, i32) -> (i32, i32, i32, i32) {
				%non_live_2 = arith.addi %arg10, %arg10 : i32
				%non_live_3 = arith.muli %arg10, %arg10 : i32
				scf.condition(%arg2) %arg4, %arg3, %non_live_2, %non_live_3 : i32, i32, i32, i32
				} do {
				^bb0(%arg5: i32, %arg6: i32, %arg7: i32, %arg8: i32):
				%non_live_4 = arith.addi %arg7, %arg8 :i32
				scf.yield %arg5, %arg6, %non_live_4 : i32, i32, i32
				}
				return %live : i32
				}

				// -----

				// The op isn't erased because it has memory effects but its unnecessary result
				// is removed.
				//
				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK: func.func @clean_region_branch_op_remove_result(%[[arg0:.*]]: index, %[[arg1:.*]]: memref<i32>) {
				// CHECK-NEXT: scf.index_switch %[[arg0]]
				// CHECK-NEXT: case 1 {
				// CHECK-NEXT: %[[c10:.*]] = arith.constant 10
				// CHECK-NEXT: memref.store %[[c10]], %[[arg1]][]
				// CHECK-NEXT: scf.yield
				// CHECK-NEXT: }
				// CHECK-NEXT: default {
				// CHECK-NEXT: }
				// CHECK-NEXT: return
				// CHECK-NEXT: }
				func.func @clean_region_branch_op_remove_result(%arg0 : index, %arg1 : memref<i32>) {
				%non_live = scf.index_switch %arg0 -> i32
				case 1 {
				%c10 = arith.constant 10 : i32
				memref.store %c10, %arg1[] : memref<i32>
				scf.yield %c10 : i32
				}
				default {
				%c11 = arith.constant 11 : i32
				mehdi_amini: Canonicalize already does this simplification.
				srishti-pm (Author): Comment made obsolete since the tests have been updated (canonicalize doesn't do it).
				scf.yield %c11 : i32
				}
				return
				}

				// -----

				// The simple ops which don't have memory effects or live results get removed.
				// %arg5 doesn't get removed from the @main even though it isn't live because
				// the signature of a public function is always left untouched.
				//
				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK: func.func private @clean_simple_ops(%[[arg0:.*]]: i32, %[[arg1:.*]]: memref<i32>)
				// CHECK-NEXT: %[[live_0:.*]] = arith.addi %[[arg0]], %[[arg0]]
				// CHECK-NEXT: %[[c2:.*]] = arith.constant 2
				// CHECK-NEXT: %[[live_1:.*]] = arith.muli %[[live_0]], %[[c2]]
				// CHECK-NEXT: %[[c3:.*]] = arith.constant 3
				// CHECK-NEXT: %[[live_2:.*]] = arith.addi %[[arg0]], %[[c3]]
				// CHECK-NEXT: memref.store %[[live_2]], %[[arg1]][]
				// CHECK-NEXT: return %[[live_1]]
				// CHECK-NEXT: }
				// CHECK: func.func @main(%[[arg3:.*]]: i32, %[[arg4:.*]]: memref<i32>, %[[arg5:.*]]
				// CHECK-NEXT: %[[live:.*]] = call @clean_simple_ops(%[[arg3]], %[[arg4]])
				// CHECK-NEXT: return %[[live]]
				// CHECK-NEXT: }
				func.func private @clean_simple_ops(%arg0 : i32, %arg1 : memref<i32>, %arg2 : i32) -> (i32, i32, i32, i32) {
				%live_0 = arith.addi %arg0, %arg0 : i32
				mehdi_amini: Canonicalization already almost does that, in any case nothing requires a liveness analysis here I think
				srishti-pm (Author): I think maybe it does require liveness analysis! Check out the modified tests and let me know what you think :)
				%c2 = arith.constant 2 : i32
				%live_1 = arith.muli %live_0, %c2 : i32
				%non_live_1 = arith.addi %live_1, %live_0 : i32
				%non_live_2 = arith.constant 7 : i32
				%non_live_3 = arith.subi %arg0, %non_live_1 : i32
				%c3 = arith.constant 3 : i32
				%live_2 = arith.addi %arg0, %c3 : i32
				memref.store %live_2, %arg1[] : memref<i32>
				return %live_1, %non_live_1, %non_live_2, %non_live_3 : i32, i32, i32, i32
				}

				func.func @main(%arg3 : i32, %arg4 : memref<i32>, %arg5 : i32) -> (i32) {
				%live, %non_live_1, %non_live_2, %non_live_3 = func.call @clean_simple_ops(%arg3, %arg4, %arg5) : (i32, memref<i32>, i32) -> (i32, i32, i32, i32)
				return %live : i32
				}

				// -----

				// The scf.while op has no memory effects and its result isn't live.
				//
				// Note that this cleanup cannot be done by the `canonicalize` pass.
				//
				// CHECK-LABEL: func.func private @clean_region_branch_op_erase_it() {
				// CHECK-NEXT: return
				// CHECK-NEXT: }
				// CHECK: func.func @main(%[[arg3:.*]]: i32, %[[arg4:.*]]: i1) {
				// CHECK-NEXT: call @clean_region_branch_op_erase_it() : () -> ()
				// CHECK-NEXT: return
				// CHECK-NEXT: }
				func.func private @clean_region_branch_op_erase_it(%arg0 : i32, %arg1 : i1) -> (i32) {
				%non_live = scf.while (%arg2 = %arg0) : (i32) -> (i32) {
				scf.condition(%arg1) %arg2 : i32
				} do {
				^bb0(%arg2: i32):
				scf.yield %arg2 : i32
				}
				return %non_live : i32
				}

				func.func @main(%arg3 : i32, %arg4 : i1) {
				%non_live_0 = func.call @clean_region_branch_op_erase_it(%arg3, %arg4) : (i32, i1) -> (i32)
				return
				}
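
The test comments for `@clean_func_op_remove_return_values` and `@clean_func_op_dont_remove_return_values` state that the liveness of a return value is checked across all callers. A minimal standalone sketch of that rule follows. It assumes per-call-site liveness flags have already been computed; the helper name `returnValuesToKeep` and its parameters are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not MLIR API): `callerResultLiveness[c][i]` says
// whether result `i` of call site `c` is live. Return value `i` of the
// callee is kept if at least one call site has a live result `i`.
inline std::vector<bool> returnValuesToKeep(
    const std::vector<std::vector<bool>> &callerResultLiveness,
    std::size_t numResults) {
  std::vector<bool> keep(numResults, false);
  for (const std::vector<bool> &liveness : callerResultLiveness) {
    assert(liveness.size() == numResults && "one flag per callee result");
    for (std::size_t i = 0; i < numResults; ++i)
      keep[i] = keep[i] || liveness[i];
  }
  return keep;
}
```

With the two call sites of `@clean_func_op_remove_return_values` modeled as `{true, false}` and `{false, false}`, only the first return value is kept. With the call sites of `@clean_func_op_dont_remove_return_values` modeled as `{true, false}` and `{false, true}`, both are kept.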


Revision Contents

Diff 550419

mlir/include/mlir/Transforms/Passes.h

mlir/include/mlir/Transforms/Passes.td

mlir/lib/Transforms/CMakeLists.txt

mlir/lib/Transforms/RemoveNonLiveValues.cpp

mlir/test/Transforms/remove-non-live-values.mlir
