This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][transforms] Add an optimization pass to remove dead values
ClosedPublic

Authored by srishti-pm on Aug 3 2023, 3:47 PM.

Details

Summary

Large deep learning models rely on heavy computations. However, not
every computation is necessary. And, even when a computation is
necessary, it helps if the values needed for the computation are
available in registers (which have low-latency) rather than being in
memory (which has high-latency).

Compilers can use liveness analysis to:-
(1) Remove extraneous computations from a program before it executes on
hardware, and,
(2) Optimize register allocation.

Both these tasks help achieve one very important goal: reducing runtime.

Recently, liveness analysis was added to MLIR. Thus, this commit uses
the recently added liveness analysis utility to try to accomplish task
(1).

It adds a pass called remove-dead-values whose goal is
optimization (reducing runtime) by removing unnecessary instructions.
Unlike other passes that rely on local information gathered from
patterns to accomplish optimization, this pass uses a full analysis of
the IR, specifically, liveness analysis, and is thus more powerful.

Currently, this pass performs the following optimizations:
(A) Removes function arguments that are not live,
(B) Removes function return values that are not live across all callers of
the function,
(C) Removes unneccesary operands, results, region arguments, region
terminator operands of region branch ops, and,
(D) Removes simple and region branch ops that have all non-live results and
don't affect memory in any way,

iff

the IR doesn't have any non-function symbol ops, non-call symbol user ops
and branch ops.

Here, a "simple op" refers to an op that isn't a symbol op, symbol-user op,
region branch op, branch op, region branch terminator op, or return-like.

It is noteworthy that we do not refer to non-live values as "dead" in this
file to avoid confusing it with dead code analysis's "dead", which refers to
unreachable code (code that never executes on hardware) while "non-live"
refers to code that executes on hardware but is unnecessary. Thus, while the
removal of dead code helps little in reducing runtime, removing non-live
values should theoretically have significant impact (depending on the amount
removed).

It is also important to note that unlike other passes (like canonicalize)
that apply op-specific optimizations through patterns, this pass uses
different interfaces to handle various types of ops and tries to cover all
existing ops through these interfaces.

It is because of its reliance on (a) liveness analysis and (b) interfaces
that makes it so powerful that it can optimize ops that don't have a
canonicalizer and even when an op does have a canonicalizer, it can perform
more aggressive optimizations, as observed in the test files associated with
this pass.

Example of optimization (A):-

int add_2_to_y(int x, int y) {
  return 2 + y
}

print(add_2_to_y(3, 4))
print(add_2_to_y(5, 6))

becomes

int add_2_to_y(int y) {
  return 2 + y
}

print(add_2_to_y(4))
print(add_2_to_y(6))

Example of optimization (B):-

int, int get_incremented_values(int y) {
  store y somewhere in memory
  return y + 1, y + 2
}

y1, y2 = get_incremented_values(4)
y3, y4 = get_incremented_values(6)
print(y2)

becomes

int get_incremented_values(int y) {
  store y somewhere in memory
  return y + 2
}

y2 = get_incremented_values(4)
y4 = get_incremented_values(6)
print(y2)

Example of optimization (C):-

Assume only %result1 is live here. Then,

%result1, %result2, %result3 = scf.while (%arg1 = %operand1, %arg2 = %operand2) {
  %terminator_operand2 = add %arg2, %arg2
  %terminator_operand3 = mul %arg2, %arg2
  %terminator_operand4 = add %arg1, %arg1
  scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3, %terminator_operand4
} do {
^bb0(%arg3, %arg4, %arg5):
  %terminator_operand6 = add %arg4, %arg4
  %terminator_operand5 = add %arg5, %arg5
  scf.yield %terminator_operand5, %terminator_operand6
}

becomes

%result1, %result2 = scf.while (%arg2 = %operand2) {
  %terminator_operand2 = add %arg2, %arg2
  %terminator_operand3 = mul %arg2, %arg2
  scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3
} do {
^bb0(%arg3, %arg4):
  %terminator_operand6 = add %arg4, %arg4
  scf.yield %terminator_operand6
}

It is interesting to see that %result2 won't be removed even though it is
not live because %terminator_operand3 forwards to it and cannot be
removed. And, that is because it also forwards to %arg4, which is live.

Example of optimization (D):-

int square_and_double_of_y(int y) {
  square = y ^ 2
  double = y * 2
  return square, double
}

sq, do = square_and_double_of_y(5)
print(do)

becomes

int square_and_double_of_y(int y) {
  double = y * 2
  return double
}

do = square_and_double_of_y(5)
print(do)

Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com>

Diff Detail

Event Timeline

srishti-pm created this revision.Aug 3 2023, 3:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2023, 3:47 PM
srishti-pm requested review of this revision.Aug 3 2023, 3:47 PM
srishti-pm updated this revision to Diff 547042.Aug 3 2023, 4:13 PM

Make nit comment changes.

srishti-pm edited the summary of this revision. (Show Details)Aug 3 2023, 4:13 PM
srishti-pm updated this revision to Diff 547046.Aug 3 2023, 4:34 PM

Update commit summary.

srishti-pm edited the summary of this revision. (Show Details)Aug 3 2023, 4:35 PM

Can you add a bit more example that showcase how this pass catches cases that canonicalization cannot handle?

mlir/lib/Transforms/RemoveNonLiveValues.cpp
63

Nit: ArrayRef<T> should always be a replacement for const SmallVector<T> & I believe

(Same elsewhere)

mlir/test/Transforms/remove-non-live-values.mlir
45

Canonicalize nukes everything here right now

265

Canonicalize already does this simplification.

293

Canonicalization already almost does that, in any case nothing requires a liveness analysis here I think

srishti-pm updated this revision to Diff 547052.Aug 3 2023, 4:55 PM

Enhance unit tests of cleanSimpleOp().

Can you add a bit more example that showcase how this pass catches cases that canonicalization cannot handle?

Sure, I'll look into this.

mehdi_amini added inline comments.Aug 3 2023, 4:58 PM
mlir/test/Transforms/remove-non-live-values.mlir
74

FYI canonicalize simplifies down to:

func.func @clean_simple_ops(%arg0: i32, %arg1: memref<i32>, %arg2: i32) -> i32 {
  %c6_i32 = arith.constant 6 : i32
  %0 = arith.addi %arg0, %arg0 : i32
  memref.store %c6_i32, %arg1[] : memref<i32>
  return %0 : i32
}
srishti-pm added inline comments.Aug 3 2023, 5:41 PM
mlir/test/Transforms/remove-non-live-values.mlir
74

Yup, thanks. I'm doing the comparison with canonicalize right now.

srishti-pm updated this revision to Diff 547070.Aug 3 2023, 6:05 PM

Add notes about comparisons with the canonicalize pass. More tests and notes to be added.

More tests and notes to be added.

I'm working on this.

srishti-pm updated this revision to Diff 547626.Aug 6 2023, 6:48 PM

Enhance test cases to reveal functionality difference from the canonicalize pass.

More tests and notes to be added.

I'm working on this.

This is done. @mehdi_amini, can you have a look at it and share your views?

srishti-pm updated this revision to Diff 547628.Aug 6 2023, 7:12 PM

Nit changes

Do you have any case of non-interprocedural effect of this?

Do you have any case of non-interprocedural effect of this?

Yes. Was trying to find some. Hadn't found one earlier but I have found some now. Adding them. Will update you once added.

srishti-pm retitled this revision from [MLIR][TRANSFORMS] Add an optimization pass to remove non-live values to [MLIR][transforms] Add an optimization pass to remove non-live values.Aug 7 2023, 11:01 AM
srishti-pm updated this revision to Diff 547940.Aug 7 2023, 2:08 PM

Update unit tests.

@mehdi_amini, I have updated the unit tests. Here's how you can navigate them:-

Non-interprocedural optimizations (positive tests) shown in:-
@clean_region_branch_op_dont_remove_first_2_results_but_remove_first_operand
@clean_region_branch_op_remove_last_2_results_last_2_arguments_and_last_operand
@clean_region_branch_op_remove_result

Interprocedural optimizations (positive tests) shown in:-
@clean_func_op_remove_argument_and_return_value
@clean_func_op_remove_arguments
@clean_func_op_remove_return_values
@clean_simple_ops
@clean_region_branch_op_erase_it

Negative tests:-
@dont_touch_unacceptable_ir
@dont_touch_unacceptable_ir_has_cleanable_simple_op_with_branch_op
@clean_func_op_dont_remove_return_values

All positive tests show effects that are different from the canonicalize pass.

Can you try to elaborate with how you see the difference with canonicalize in the intra-procedural case?

Can you try to elaborate with how you see the difference with canonicalize in the intra-procedural case?

They're just optimizations that don't happen in canonicalize because patterns to perform those canonicalizations haven't been added to that particular op (like, say, the scf.while op). On the tests that I have added for the intra-procedural cases, we see signification optimization that this pass is able to do but if we run canonicalize on the same tests, it does nothing to the IR. Is this what you're asking or did I misunderstand your question?

Naming nit: Can this be called removeDeadValues?

Mogball added inline comments.Aug 9 2023, 11:39 AM
mlir/include/mlir/Transforms/Passes.td
94

Please list the required invariants for the pass to succeed on the IR. There are invariants enforced in the pass definition but they are not listed here.

mlir/lib/Transforms/RemoveNonLiveValues.cpp
21

Please use " includes for MLIR/LLVM files

570

Please make this an error. This indicates a phase-ordering problem in the pipeline. Also, please improve the error message

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware. So, theoretically, we could say that their removal should not influence the runtime much because these instructions were never executing on the hardware anyways. An example of dead code is lines 2-4 here:

1 x = 28
2 if (false) {
3   x = 99
4 }
5 print(x)

Liveness is different. An instruction could be producing a non-live result but still not be dead code. An example of such an instruction is the one on line 6:

6 x = 28
7 x = 99

Here, both the instructions execute on hardware. So, we cannot say that either of them is dead code. But, x is non-live after the first instruction, which means that the first instruction is unnecessarily executing on the hardware and increasing runtime. Again, it is nice to observe that the x = 99 on line 3 in the first codeblock is producing a live result; x is live; but the instruction is still dead.


The goal of this pass is to try to reduce the runtime by removing "non-live" values.

These could be function arguments that are present in a function but aren't live, in which case we modify the function signature to remove them. For example,

int add_2_to_y(int x, int y) {
  return 2 + y
}

print(add_2_to_y(3, 4))
print(add_2_to_y(5, 6))

This gets transformed to

int add_2_to_y(int y) {
  return 2 + y
}

print(add_2_to_y(4))
print(add_2_to_y(6))

Note that here x could be non-live even if it has uses, if ultimately, all its uses didn't affect memory or the final output of the program.

They could be function return values that are returned by a function but never live in any of its callers, in which case we again modify the function signature to remove them.

int, int get_incremented_values(int y) {
  return y + 1, y + 2
}

y1, y2 = get_incremented_values(4)
y3, y4 = get_incremented_values(6)
print(y2)

This gets transformed to

int get_incremented_values(int y) {
  return y + 2
}

y2 = get_incremented_values(4)
y4 = get_incremented_values(6)
print(y2)

Similarly, these could be unnecessary operands of an operation that say, forwards these operands to different regions but these values are not live in any of the regions (look at tests @clean_region_branch_op_dont_remove_first_2_results_but_remove_first_operand, @clean_region_branch_op_remove_last_2_results_last_2_arguments_and_last_operand, @clean_region_branch_op_remove_result for such examples).

Does this make things clearer?

jcai19 accepted this revision.Aug 9 2023, 3:43 PM
jcai19 added inline comments.
mlir/lib/Transforms/RemoveNonLiveValues.cpp
560

Maybe rename it to module to make it clearer.

This revision is now accepted and ready to land.Aug 9 2023, 3:43 PM
jcai19 removed a reviewer: jcai19.Aug 9 2023, 3:44 PM
This revision now requires review to proceed.Aug 9 2023, 3:44 PM
srishti-pm added inline comments.Aug 9 2023, 5:44 PM
mlir/lib/Transforms/RemoveNonLiveValues.cpp
63

I actually tried this but seems like this cannot be done. When I replace const SmallVector<T> & with ArrayRef<T>, I get the error: no known conversion from 'mlir::Operation::result_range' (aka 'mlir::ResultRange') to 'ArrayRef<mlir::Value>' for 1st argument. Am I doing something wrong?

srishti-pm updated this revision to Diff 548845.Aug 9 2023, 6:36 PM
srishti-pm marked an inline comment as done.

Address comments.

mlir/lib/Transforms/RemoveNonLiveValues.cpp
21

Done.

They're just optimizations that don't happen in canonicalize because patterns to perform those canonicalizations haven't been added to that particular op (like, say, the scf.while op). On the tests that I have added for the intra-procedural cases, we see signification optimization that this pass is able to do but if we run canonicalize on the same tests, it does nothing to the IR. Is this what you're asking or did I misunderstand your question?

Yes: basically you are proposing to introduce a new pass, if it is just "this pass does not do more than canonicalization", the value proposal isn't clear. Right now there is nothing in either the review description or the pass description that provides any context on this.
So, my questions are all about prompting on the value proposition of this work, by contrasting it to something we all know (canonicalize).

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

srishti-pm added inline comments.Aug 9 2023, 6:42 PM
mlir/lib/Transforms/RemoveNonLiveValues.cpp
570

Actually this isn't a phase ordering issue but rather a demonstration of a weakness of the pass. It is potentially possible that an IR cannot be converted to something that will be "acceptable". The pass needs to handle such "unacceptable" IRs but it doesn't, as of now. It can be made to, by making more incremental changes to this pass but since those changes would be quite complex (and lengthy), they weren't all done in this patch.

Based on this, do you still think it should be an error rather than a warning?

srishti-pm marked 4 inline comments as done.Aug 9 2023, 6:46 PM
srishti-pm added inline comments.
mlir/test/Transforms/remove-non-live-values.mlir
45

Comment made obsolete since the tests have been updated (canonicalize doesn't do it).

74

Done.

265

Comment made obsolete since the tests have been updated (canonicalize doesn't do it).

293

I think maybe it does require liveness analysis! Checkout the modified tests and let me know what you think :)

srishti-pm marked 3 inline comments as done.Aug 9 2023, 6:52 PM

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

They're just optimizations that don't happen in canonicalize because patterns to perform those canonicalizations haven't been added to that particular op (like, say, the scf.while op). On the tests that I have added for the intra-procedural cases, we see signification optimization that this pass is able to do but if we run canonicalize on the same tests, it does nothing to the IR. Is this what you're asking or did I misunderstand your question?

Yes: basically you are proposing to introduce a new pass, if it is just "this pass does not do more than canonicalization", the value proposal isn't clear. Right now there is nothing in either the review description or the pass description that provides any context on this.
So, my questions are all about prompting on the value proposition of this work, by contrasting it to something we all know (canonicalize).

I understand the issue. Working on making the "value added by this pass" clearer.

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

It is about idempotence of canonicalization + this pass.
So If I run: --pass-pipeline=builtin.module(remove-non-live-values, canonicalize) ; do I always get the same result as --pass-pipeline=builtin.module(remove-non-live-values, canonicalize, remove-non-live-values, canonicalize)

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

It is about idempotence of canonicalization + this pass.
So If I run: --pass-pipeline=builtin.module(remove-non-live-values, canonicalize) ; do I always get the same result as --pass-pipeline=builtin.module(remove-non-live-values, canonicalize, remove-non-live-values, canonicalize)

Yes, you'll get the same result, assuming that canonicalize is idempotent. Without that assumption, all we can say is that the result of -remove-non-live-values -canonicalize will be the same as the result of -remove-non-live-values -canonicalize -remove-non-live-values.

Basically, I think any run of remove-non-live-values after its first run will essentially do nothing to the IR, irrespective of what other passes execute in between.

This is because: This pass uses an analysis called liveness analysis. And, the liveness of a value doesn't change after some (any) pass is applied to the IR because no pass changes the semantics of an IR (to my understanding). And since the only thing that this pass relies on is liveness analysis, it will not, for example, perform better optimizations after an IR has been canonicalized or inlined or anything. It will remove non-live values, no matter what form the IR is in. Again, it should be noted here that since this patch introduces this pass, it may not be perfect. The pass relies on liveness analysis which isn't perfect and liveness analysis relies on sparse backward dataflow which isn't perfect. But, theoretically speaking, if we assume that sparse backward dataflow analysis is perfect, liveness analysis is perfect, and this pass is perfect (as in, no forms of IR have been missed), the above holds true. I am also parallely trying to make these things perfect. Some patches that work towards that are here: https://reviews.llvm.org/D156376, https://reviews.llvm.org/D157261. It is potentially a long way to go. Hopefully we can achieve that 😅

Does this make sense?

srishti-pm marked an inline comment as done.

Address Jian's comment.

Rebase to main.

srishti-pm marked an inline comment as done.

Address @Mogball and @mehdi_amini's comments.

mlir/include/mlir/Transforms/Passes.td
94

Done.

mlir/lib/Transforms/RemoveNonLiveValues.cpp
570

Have improved the message.

srishti-pm edited the summary of this revision. (Show Details)Aug 14 2023, 2:03 PM
srishti-pm edited the summary of this revision. (Show Details)
srishti-pm edited the summary of this revision. (Show Details)Aug 14 2023, 2:07 PM

I understand the issue. Working on making the "value added by this pass" clearer.

@mehdi_amini, I have done this now.

All comments here have been addressed and/or answered to. Awaiting a re-review @mehdi_amini, @Mogball, and @jcai19.

clang-format

matthiaskramm accepted this revision.Aug 15 2023, 11:02 AM
This revision is now accepted and ready to land.Aug 15 2023, 11:02 AM

Fix handling of region branch and region branch terminator ops based on the recent changes to their interface methods.

Nit changes.

srishti-pm edited the summary of this revision. (Show Details)Aug 15 2023, 1:57 PM
mehdi_amini requested changes to this revision.Aug 15 2023, 10:36 PM

One specific question that came to mind right now is: is this pass idempotent or would rerunning it a second time (recomputing the analysis) provide a different result? What about the interleaving with canonicalization?

The pass is idempotent. The analysis is idempotent too. What do you mean by "what about the interleaving with canonicalization"?

It is about idempotence of canonicalization + this pass.
So If I run: --pass-pipeline=builtin.module(remove-non-live-values, canonicalize) ; do I always get the same result as --pass-pipeline=builtin.module(remove-non-live-values, canonicalize, remove-non-live-values, canonicalize)

Yes, you'll get the same result, assuming that canonicalize is idempotent. [...]

This is because: This pass uses an analysis called liveness analysis. And, the liveness of a value doesn't change after some (any) pass is applied to the IR because no pass changes the semantics of an IR (to my understanding)
[...]
Does this make sense?

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I understand the issue. Working on making the "value added by this pass" clearer.

Any update on this?

This revision now requires changes to proceed.Aug 15 2023, 10:36 PM

@mehdi_amini

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I don't understand your question. Can you kindly re-state it?

Any update on this?

Yes, I have updated the commit summary, revision summary, and the pass description to make the "value added" clear.

@mehdi_amini , theoretically, do you agree liveness analysis gets no benefit from canonicalization?

@mehdi_amini

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I don't understand your question. Can you kindly re-state it?

I mean that liveness must say that a value is live when there is a doubt, otherwise we're kill a useful value for example. That's the conservative part.
Now I would think that the analysis cannot provide good results without canonicalization, that is it would most often say "I don't know so I'll say it is live".

@mehdi_amini

I can understand the theoretical view, but that seems a bit simplistic to me: liveness analysis has to be conservative, and it's not clear to me that canonicalization can't just improve the capability of the analysis.

I don't understand your question. Can you kindly re-state it?

I mean that liveness must say that a value is live when there is a doubt, otherwise we're kill a useful value for example. That's the conservative part.
Now I would think that the analysis cannot provide good results without canonicalization, that is it would most often say "I don't know so I'll say it is live".

That is true. This would be rare at the current state that liveness analysis is in; because it is pretty strong. But yes, I agree to this. I actually already stated this as well. The idempotence is true only if we assume liveness analysis is perfect. But, it is not. But it is also true that it is very close to perfect. And, there are many instances where it can be perceived as stronger than any existing canonicalization pattern.

! In D157049#4577762, @srishti-pm wrote:
Again, it should be noted here that since this patch introduces this pass, it may not be perfect. The pass relies on liveness analysis which isn't perfect and liveness analysis relies on sparse backward dataflow which isn't perfect. But, theoretically speaking, if we assume that sparse backward dataflow analysis is perfect, liveness analysis is perfect, and this pass is perfect (as in, no forms of IR have been missed), the above holds true. I am also parallely trying to make these things perfect. Some patches that work towards that are here: https://reviews.llvm.org/D156376, https://reviews.llvm.org/D157261. It is potentially a long way to go. Hopefully we can achieve that 😅

Does this make sense?

Highlighting this comment again.

matthiaskramm accepted this revision.Aug 18 2023, 11:50 AM
srishti-pm requested review of this revision.Aug 18 2023, 3:16 PM

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware.

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

jcai19 accepted this revision.Aug 21 2023, 10:28 AM
Mogball added inline comments.Aug 21 2023, 10:53 AM
mlir/lib/Transforms/RemoveNonLiveValues.cpp
58–59

But please use angle includes for C++ stdlib includes...

63

In this scenario, I believe you are in fact copying the values into a small vector and then passing that by const reference. It would be better to use a ValueRange

80

Please use ValueRange instead

96

Same here. It is unusual to pass SmallVector by const reference

131–135

Please drop trivial braces

140

same here

140
144

please use structured bindings

151

same here

168

This seems like a hack. If an API returned an OperandRange that is not mutable, const-casting it to a mutable range can break all sorts of invariants. If you need an API that returns a mutable range, please add it. Also, OperandRange should not be passed by reference.

210

please use structured bindings

mlir/test/Transforms/remove-non-live-values.mlir
10

Optimization passes should fail either with an error or be conservative and silently pass anyways.

Thanks for your comments, @Mogball. Working on addressing them.

mlir/test/Transforms/remove-non-live-values.mlir
10

Okay. So in this case, should I fail with an error or silently pass, what do you think? I think I'd prefer silently passing. What is your suggestion?

Mogball added inline comments.Aug 21 2023, 11:02 AM
mlir/test/Transforms/remove-non-live-values.mlir
10

I think because this pass has pretty strong invariants as to when it works or does not work, an error would be preferable.

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware.

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

There are counter examples to this: "Dead-store elimination" for example is not about "stores that are unreachable", but ones that don't have observable effects.

srishti-pm marked an inline comment as done.Aug 21 2023, 11:38 AM

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware.

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

There are counter examples to this: "Dead-store elimination" for example is not about "stores that are unreachable", but ones that don't have observable effects.

Interesting. @matthiaskramm, what are your views here? I think you have more knowledge that me on this.

mlir/test/Transforms/remove-non-live-values.mlir
10

Understood. Will mark it as an error. Thanks!

srishti-pm marked an inline comment as done.Aug 21 2023, 11:38 AM

There are counter examples to this: "Dead-store elimination" for example is not about "stores that are unreachable", but ones that don't have observable effects.

Yeah, that's fair!

Here, the distinction between "dead" and "non-live" is mainly to avoid confusion between the two analyses (DCE and Liveness) that are both, at the same time, loaded into the same dataflow solver.

Matt added a subscriber: Matt.Aug 21 2023, 1:14 PM
srishti-pm marked 14 inline comments as done.

Address all comments given so far.

Naming nit: Can this be called removeDeadValues?

Thanks for bringing this up! This is actually an important discussion. We can't call it removeDeadValues because "non-live" doesn't mean "dead". Something could be "non-live" and not "dead". Also, something could be "dead" and not "non-live". Both these are orthogonal to each other.

"dead" (like in "dead code elimination") refers to instructions that should theoretically never execute on the hardware.

I think we should add comments here clarifying "dead" in MLIR means unreachable code and therefore we have to call this analysis "non-live".

Done.

@Mogball and @jcai19, I have addressed all your comments. Thanks for the review!
@matthiaskramm, thank you for the clarification on the naming!
@mehdi_amini, awaiting your re-review and/or response, thanks! :)

Update commit summary to reflect recent changes.

srishti-pm edited the summary of this revision. (Show Details)Aug 21 2023, 5:02 PM

Nit changes.

mehdi_amini resigned from this revision.Aug 22 2023, 9:13 PM

I haven't carefully reviewed the code, I don't find the explanation very substantiated, nor the positioning of this work very clear in the current stack right now, but I won't just block it either, so feel free to go ahead if everyone is satisfied here.

This revision is now accepted and ready to land.Aug 22 2023, 9:13 PM

Rebase to main

Mogball accepted this revision.Aug 23 2023, 2:19 PM

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass dead -- dead argument elimination, dead result elimination, etc.

I also have concerns regarding the term nonlive. If someone searches for "dead value elimination", would this pass shows up as top results, or would it show up at all? Currently nowhere in the pass mentions

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass dead -- dead argument elimination, dead result elimination, etc.

I agree that NonLive may cause unintended consequences, e.g. if someone wants to look for this pass, they probably won't know to search with NonLive. That said, I'm not sure "dead value" is widely used either. I googled "llvm dead value" and the results were mostly about DCE in a glimpse. Maybe we could call this "removeUnusedValues", since we plan to substitute this pass for https://github.com/tensorflow/tensorflow/blob/ffe5d2c2723adf969b3da78fb76ce1ceb9302f92/tensorflow/compiler/mlir/tensorflow/transforms/tf_passes.td#L2539?

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass dead -- dead argument elimination, dead result elimination, etc.

I think either name works, for the pass. It's only within the pass that I'd be careful about using "dead" when referring to values that aren't live. Since the pass also hooks in DeadCodeAnalysis, that might cause confusion.

matthiaskramm accepted this revision.Aug 23 2023, 2:58 PM

Call op arg operands handling.

jcai19 accepted this revision.Aug 23 2023, 3:11 PM

I would still prefer this pass be named "removeDeadValues". It's in line with what the pass dead -- dead argument elimination, dead result elimination, etc.

I think either name works, for the pass. It's only within the pass that I'd be careful about using "dead" when referring to values that aren't live. Since the pass also hooks in DeadCodeAnalysis, that might cause confusion.

I think "removeUnusedValues" might make it clearer that it's different from DCE but it's just my preference. Whatever name you use, please make sure your commit message and comments are consistent.

Thank you all for your comments. Based on the above comments from @Mogball, @jcai19, and @matthiaskramm, I'm renaming the pass to removeDeadValues.

Rename pass to RemoveDeadValues and propoagate this change to code comments and commit summary.

srishti-pm retitled this revision from [MLIR][transforms] Add an optimization pass to remove non-live values to [MLIR][transforms] Add an optimization pass to remove dead values.Aug 23 2023, 3:45 PM
srishti-pm edited the summary of this revision. (Show Details)Aug 23 2023, 3:48 PM

Also, thanks @matthiaskramm, @Mogball, and @jcai19 for reviewing and approving this patch. I'm waiting for the build and tests to pass here and then will land this.