This is an archive of the discontinued LLVM Phabricator instance.

[Reassociate] Add negated value of negative constant to the Duplicates list.
ClosedPublic

Authored by mcrosier on Feb 21 2017, 2:59 PM.

Details

Summary

In OptimizeAdd, we scan the operand list to see if there are any common factors between operands that can be factored out to reduce the number of multiplies (e.g., 'A*A+A*B*C+D' -> 'A*(A+B*C)+D'). For each operand of the operand list, we only consider unique factors (which is tracked by the Duplicate set). Now if we find a factor that is a negative constant, we add the negated value as a factor as well because we can percolate the negate out. However, mistakenly don't add this negated constant to the Duplicates set.

Consider the expression A*2*-2 + B. Obviously, nothing to factor.

For the added value A*2*-2 we over count 2 as a factor without this patch, which causes the assert reported in PR30256.

Chad

Diff Detail

Repository
rL LLVM

Event Timeline

mcrosier created this revision.Feb 21 2017, 2:59 PM
efriedma added inline comments.Feb 21 2017, 4:23 PM
lib/Transforms/Scalar/Reassociate.cpp
1523 ↗(On Diff #89280)

Okay, now I understand the issue: the problem is that this code is assuming that all the multiply operands of the add are already reassociated. We break that assumption with the way we optimize shl->mul: we transform the shl into a mul, and stick the mul into RedoInsts, but don't revisit it until it's "too late".

Your first two patches dance around the issue in slightly different ways; basically, you dodge the issue by changing the visitation order. This seems like a landmine, even if it does fix the immediate problem.

This patch avoids the issue by making OptimizeAdd tolerate multiplies which haven't been completely optimized; this sort of works, but we're doing wasted work: we'll end up revisiting the add later anyway.

Another possible approach would be to enforce RPO iteration order more strongly. If we have RedoInsts, we process them immediately in RPO order, rather than waiting until we've finished processing the whole function. Intuitively, it seems like the natural approach: reassociation works on expression trees, so the optimization only works in one direction. That said, I'm not sure how practical that is given the current Reassociate; the "optimal" form for an expression depends on its use list (see all the uses of "user_back()"), so Reassociate is really an iterative optimization of sorts, so any changes here would probably get messy.

mcrosier added inline comments.
lib/Transforms/Scalar/Reassociate.cpp
1523 ↗(On Diff #89280)

Yes, this is a good synopsis of what is going on and I agree my previous two approaches (D29777 and D30223) skate around the issue. You're also correct in that this patch results in wasted work (which is pervasive throughout this pass, unfortunately), but I don't think the RPO suggestion is practical for fixing this issue for the 4.0 release.

Would you be okay accepting this patch?

hans added a subscriber: hans.Feb 22 2017, 10:11 AM
efriedma edited edge metadata.Feb 22 2017, 11:32 AM

Yes, I'm fine with this approach for now.

test/Transforms/Reassociate/basictest.ll
235 ↗(On Diff #89280)

Please add CHECK lines to make sure the end result is what we expect, even if that isn't really the point of the test.

mcrosier updated this revision to Diff 89405.Feb 22 2017, 12:32 PM

Update test with CHECKs, per Eli's feedback.

mcrosier accepted this revision.Feb 23 2017, 10:46 AM

Yes, I'm fine with this approach for now.

Accepting, per Eli's.

This revision is now accepted and ready to land.Feb 23 2017, 10:46 AM
This revision was automatically updated to reflect the committed changes.