If the multiplicand is a constant with following formats:
- mul with (2^N * int16_imm) -> mulli + rldicr
- mul with (2^N + 2^M) -> rldicr + add + rldicr
- mul with (2^N - 2^M) -> rldicr + sub + rldicr
Scenario 2 and 3 are moved to DAGCombiner D88201.
Differential D87384
[PowerPC] Add ISEL patterns for Mul with Imm. Esme on Sep 9 2020, 8:12 AM. Authored by
Details
If the multiplicand is a constant with following formats:
Scenario 2 and 3 are moved to DAGCombiner D88201.
Diff Detail
Unit Tests Event TimelineComment Actions LGTM. Thank you for doing this. I assume that you have run it with bmk to make sure that the functionality is fine. Comment Actions Do you have evidence of profitability of (mul (shl %a, N) M) being preferable over (mul %a, C) on other targets? If so, I suppose extending the DAG combine that handles: mul x, (2^N + 1) --> add (shl x, N), x mul x, (2^N - 1) --> sub (shl x, N), x Perhaps it wold be good to gauge interest from other RISC targets. Scenarios 2 and 3 are quite similar to the existing DAG combine so this would be a straightforward implementation. However I am not sure how likely it is that two shifts and an add/sub are better than a multiply on other targets.
Comment Actions Thank you for your comments! @jsji @nemanjai For scenarios 2 and 3, I will modified it as @nemanjai 's hint, and move it to DAGCombiner since it is really similar to existing patterns and then implement our PPCTargetLowering::decomposeMulByConstant. If other targets are interested about this, they can add conditions for this in their decomposeMulByConstant. But for scenario 1, I am not sure if there are benefits to other targets, since it is inspired by our HW instruction MULLI. So I prefer to keep it in ISEL. Comment Actions Keep the scenario 1 implemented in ISEL. Comment Actions I think we already have similar pattern for scenario 1 as well: // Change (mul (shl X, C), Y) -> (shl (mul X, Y), C) when the shift has one // use. Just that (shl X,C) is not constant there.. So I would assume dealing with similar situation controlled by TLI.decomposeMulByConstantwill be easy and also no harm to other targets. BTW: looks like x86 also has imul.
Comment Actions Hi @jsji An infinite loop will occur if we handle the scenario 1 (mul X, c2 << c1) -> (shl (mul X, c2), c1) in DAGCombiner, because there exists a reverse conversion (shl (mul x, c1), c2) -> (mul x, c1 << c2). Comment Actions Hmm.. then OK to keep this in ISEL, but please add comments about DAGcombiner prefer (mul X, c2 << c1). Thanks |
Maybe it would be clearer if we use DAG expression in comments.
eg: (mul X, c2 << c1) -> (rldicr (mulli X, c2 >> c1) c1)