This is an archive of the discontinued LLVM Phabricator instance.

[mips] Always check that `shift and add` optimization is efficient
ClosedPublic

Authored by atanasyan on May 20 2019, 3:41 PM.

Details

Summary

The D45316 introduced the shouldTransformMulToShiftsAddsSubs function to check that breaking down constant multiplications into a series of shifts, adds, and subs is efficient. Unfortunately, this function does not check maximum number of steps on all paths of the algorithm. This patch fixes this bug.

Fix for PR41929.

Diff Detail

Event Timeline

atanasyan created this revision.May 20 2019, 3:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2019, 3:41 PM
  • Make the code more compact
  • Remove magical constant from the loop condition
mstojanovic added inline comments.May 21 2019, 5:04 AM
llvm/lib/Target/Mips/MipsSEISelLowering.cpp
771–772

You could also move this as the first thing in the while block which removes the Steps <= MaxSteps check.

atanasyan updated this revision to Diff 200510.May 21 2019, 7:39 AM
  • Simplify the loop condition
atanasyan marked an inline comment as done.May 21 2019, 7:40 AM
Petar.Avramovic accepted this revision.May 21 2019, 7:50 AM

LGTM.

llvm/lib/Target/Mips/MipsSEISelLowering.cpp
723–737

This comment can be misleading. The number is approximately equal to the minimal number of powers of two that constant can be broken down to by adding or subtracting them.

This revision is now accepted and ready to land.May 21 2019, 7:50 AM

I've raised some points inline, I think (a) is most relevant to this patch but shouldn't stop it going forward but should be addressed quickly, (b) and (c) could be noted as TODO:s.

llvm/lib/Target/Mips/MipsSEISelLowering.cpp
724–737

Looking at this and the original version, some things come to mind:

a) MaxSteps needs to consider the VT of the constant for the target. I.E. for a 32 bit multiplication on a N32/N64 target the maximum number of steps is incorrect. I believe the calculation of the maximum number of steps needs to consider the case of <=i32 and >i32 && <=i64 cases for natively supported types.

I would suggest looking at starting with the maximum number of steps as equal to the number of cycles that it takes to perform a constant materialization sequence worst case, then applying a "legalization penalty" to the number of steps if the type of the operands is not natively supported.

b) This optimization can occur before type legalization, it may be worth considering restricting this optimization to after type legalization so that there is no fudge on the legalization penalty of an illegal type for some constant value (lines 771-780).

c) This optimization should be account for -Os, -Oz, as the optimization can increase code size.

atanasyan updated this revision to Diff 200941.May 23 2019, 6:02 AM
  • Add TODO comment

I've raised some points inline, I think (a) is most relevant to this patch but shouldn't stop it going forward but should be addressed quickly, (b) and (c) could be noted as TODO:s.

Thanks. I will implement it.

sdardis accepted this revision.May 23 2019, 2:56 PM

LGTM.

This revision was automatically updated to reflect the committed changes.

Thanks for review.