This is an archive of the discontinued LLVM Phabricator instance.

[Reassociation] Only form CSE expressions for local operands
ClosedPublic

Authored by qcolombet on Apr 3 2023, 10:24 AM.

Details

Summary
  1. TL;DR #

This patch constrains how much freedom the heuristic that tries to from CSE
expressions has. The added constrain is that the CSE-able expressions must be
within the same basic block as the expressions they get moved before.

  1. Details #

The reassociation pass currently tweaks the rewrite of the final expression
towards surfacing pairs of operands that would be CSE-able.

This heuristic applies after the regular ordering of the expression.
The regular ordering uses the program structure to choose in which order each
subexpression is materialized. That order follows the topological order.

Now, to expose more CSE opportunities, this heurisitc effectively bypasses the
previous ordering normally defined by the program and pushes up sub-expressions
that are arbitrary deep in the CFG.
E.g., let's say the program order (top to bottom) gives ((a*b)*c)*d)*e and
b*e appears the most in the program. The expression will be reordered in
(((b*e)*a)*c)*d

This reordering implies that all the sub expressions (in this example xx*a,
then yy*c, etc.) will need to appear after the CSE-able expression.

This may over-constrain where the (sub) expressions may live and in particular
it may create loop-dependent expressions.

This patch only allows to move expressions up the expression chain when the
related values are definied in the same basic block as the ones they
"push-down".

This constrain is far for being perfect but at least it avoids accidentally
creating loop dependent variables.

If we really want to expose CSE-able expressions in a proper way, we would need
a profitability metric and also make the decision globally as opposed to one
chain at a time.

I've put the new constrain behind an option to make comparing the old and new
versions easy. However, I believe that even if we find cases where the old
version performs better it is probably by accident. What I am aiming for with
this change is more predictability, then we can improve if need be.

This fixes www.llvm.org/PR61458

@dsanders Could you try to see if this patch keeps the benefits that Fiona had with b8a330c42ab?

Diff Detail

Event Timeline

qcolombet created this revision.Apr 3 2023, 10:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2023, 10:24 AM
qcolombet requested review of this revision.Apr 3 2023, 10:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2023, 10:24 AM
qcolombet edited the summary of this revision. (Show Details)Apr 3 2023, 10:25 AM

Adding more reviews.

I was hoping this patch could address my problem reported here: https://github.com/llvm/llvm-project/issues/62736
Sadly, it isn't. Maybe it's worth to extend it?

I was hoping this patch could address my problem reported here: https://github.com/llvm/llvm-project/issues/62736
Sadly, it isn't. Maybe it's worth to extend it?

Let me take a look. Thanks for the pointer.

I was hoping this patch could address my problem reported here: https://github.com/llvm/llvm-project/issues/62736
Sadly, it isn't. Maybe it's worth to extend it?

Let me take a look. Thanks for the pointer.

Hello @qcolombet

I'd like to kindly ask you whether you had a chance to look at it and to have any conclusions.

I was hoping this patch could address my problem reported here: https://github.com/llvm/llvm-project/issues/62736
Sadly, it isn't. Maybe it's worth to extend it?

Let me take a look. Thanks for the pointer.

Hello @qcolombet

I'd like to kindly ask you whether you had a chance to look at it and to have any conclusions.

Hi @pawosm01,

Thanks for the reminder.
I had a quick look and this patch is solving a different problem and is not expected to help this particular issue.

I've put slightly more details in https://github.com/llvm/llvm-project/issues/62736.

Cheers,
-Quentin

ab added inline comments.Jun 22 2023, 3:00 PM
llvm/lib/Transforms/Scalar/Reassociate.cpp
2439

Why the entry block, would it make sense to continue here and skip the value in this case?

qcolombet added inline comments.Jun 23 2023, 1:17 AM
llvm/lib/Transforms/Scalar/Reassociate.cpp
2439

Good point, I should probably add a comment here.
The reason I do that is because while we can skip the first value like that, we can't skip them if there are more than one that fall into this category.

To step back a little bit, these values are either arguments or global variables. Forget about global variables, there're not interesting for reassociation.

By default the order of the values not having a block is going to be very early in the chain (left most expression computed first):
reassoc_expr = arg1 op arg2 op arg3 op ... op real_val1 op ...

Therefore the beginning of the chain can be scheduled freely and the result will feed the rest of the chain.

Now, if we skip this value, we're effectively saying that it is okay to put them wherever in the chain. This means that we're going to potentially constraint where the sub expressions for these arguments will live and in particular we may be pushing their computation in loops.
E.g.,
reassoc_expr = real_val1 op arg1 op arg2 op arg3 op ... op real_val2 op ...
If real_val1 and real_val2 are in a loop, now all the intermediate computations must be in a loop as well.

We could be cleverer about anchoring these expression, but I thought it was not worth the complexity.

qcolombet updated this revision to Diff 533898.Jun 23 2023, 1:40 AM
  • Add a comment explaining why we anchor args in the entry block
qcolombet marked an inline comment as done.Jun 23 2023, 1:40 AM
ab accepted this revision.Jun 23 2023, 3:17 PM
ab added inline comments.
llvm/lib/Transforms/Scalar/Reassociate.cpp
2439

Makes sense, thanks for the explanation!

This revision is now accepted and ready to land.Jun 23 2023, 3:17 PM
This revision was landed with ongoing or failed builds.Jun 26 2023, 3:00 AM
This revision was automatically updated to reflect the committed changes.