This is an archive of the discontinued LLVM Phabricator instance.

[NaryReassociate] Transform expression (X << C1) + C2 to (X + C3) << C1,
Changes PlannedPublic

Authored by AlexM on Mar 26 2023, 4:37 PM.

Details

Summary

where C1, C2 and C3 are integer constants, such that C2 = C3 << C1.

Some transformations in Instruction Combining may block GVN later.
For example,

(X + const1) << 22 || (X + const1) >> 10

may become

(X << 22 + const2) || (X + const1) >> 10

which will disable the GVN on x + const1.

This patch implements reverse transformation of such expressions thus, unfolding a common code to be eliminated by either GVN or CSE. In the default compilation pipeline GVN/CSE are run several times: after Instruction Combining, but before NaryReassociate and after NaryReassociate. Therefore, with this patch we try both patterns for GVN/CSE.

Diff Detail

Event Timeline

AlexM created this revision.Mar 26 2023, 4:37 PM
AlexM edited the summary of this revision. (Show Details)Mar 26 2023, 4:45 PM
AlexM added reviewers: pavelkopyl, mkazantsev, ebrevnov.
AlexM published this revision for review.Mar 26 2023, 4:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 26 2023, 4:46 PM
mkazantsev added inline comments.Mar 27 2023, 7:44 PM
llvm/include/llvm/Transforms/Scalar/NaryReassociate.h
152
// Tries to match (X << C1) + C2...

?

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
510

If the initial operations had nuw/nsw flags, will they now be lost? Can we preserve them somehow?

llvm/test/Transforms/NaryReassociate/NVPTX/const-shl.ll
1

Could you please precommit the test to show what your patch is changing?

Do you know why at all InstCombine would want to transform (X + const1) << 22 into (X << 22 + const2)? By itself it doesn't seem profitable. Does it enable any useful further transforms?

As I see it, you are trying to solve pass ordering issue (InstCombine prevents GVN/CSE). But your solution assumes there is GVN/CSE follows NaryReassociation what creates another potential pass ordering dependence.
Another point is the way the transformation is implemented right now doesn't fit intent of NaryReassociation pass. In the heart of NaryReassociation pass is findClosestMatchingDominator call to find and delete common expressions after reassociation. Since you are not checking for existence of common expressions you can easily do the transformation in some other place (ordinary Reassociate pass, or InstSimplify/InstCombine)
In order to solve both mentioned problems and cover your case you will need to use findClosestMatchingDominator and possibly extend it to look for post dominating expressions as well.

nikic requested changes to this revision.Mar 30 2023, 3:12 AM
nikic added a subscriber: nikic.

As the first step, please add a phase ordering test demonstrating the issue. Shouldn't EarlyCSE remove these already?

This revision now requires changes to proceed.Mar 30 2023, 3:12 AM
AlexM planned changes to this revision.Mar 30 2023, 3:55 PM

Do you know why at all InstCombine would want to transform (X + const1) << 22 into (X << 22 + const2)?

This functionality was added in 44bd392cbff238e75d84ee757189d5f62cf91679 which says simply that it helps by "opening opportunities for future improvements."

As I see it, you are trying to solve pass ordering issue (InstCombine prevents GVN/CSE). But your solution assumes there is GVN/CSE follows NaryReassociation what creates another potential pass ordering dependence.
Another point is the way the transformation is implemented right now doesn't fit intent of NaryReassociation pass. In the heart of NaryReassociation pass is findClosestMatchingDominator call to find and delete common expressions after reassociation. Since you are not checking for existence of common expressions you can easily do the transformation in some other place (ordinary Reassociate pass, or InstSimplify/InstCombine)
In order to solve both mentioned problems and cover your case you will need to use findClosestMatchingDominator and possibly extend it to look for post dominating expressions as well.

Hmm, this is a pretty fundamental issue, let me take some time to figure out how I can integrate this more deeply into the pass or move it somewhere more appropriate.

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
510

This dose seem like an issue, might require a bit more code but certainly seems feasible. I did check InstCombine, which does the opposite transformation and these flags are completely discarded there. I'm very unfamiliar with the semantics of these flags but if dropping them is a problem here is it also a problem in InstCombine? Is dropping them a correctness issue or does it potentially prevent other optimizations?

llvm/test/Transforms/NaryReassociate/NVPTX/const-shl.ll
1

This is an embarrassingly ignorant question but what do you mean by this?