This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue.
ClosedPublic

Authored by congh on Sep 9 2015, 4:35 PM.

Details

Summary

In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), there is a precision issue after the computation. For example, for the following code from ARMBaseInstrInfo::isProfitableToIfCvt():

unsigned TUnpredCost = Probability.scale(TCycles);
unsigned FUnpredCost = Probability.getCompl().scale(FCycles);
unsigned UnpredCost = TUnpredCost + FUnpredCost;

Assume Probability is 0.5 and both TCycles and FCycles are 1, the resulted UnpredCost is 0 while 1 should be a more reasonable result. To avoid this issue, this patch scales both TCycles and FCycles up by 1024 (chosen to make the multiplication a litter faster) before they are scaled by the probability. Other variables also need to be scaled up for the final comparison.

Several test cases are adjusted due to this change.

Diff Detail

Event Timeline

congh updated this revision to Diff 34384.Sep 9 2015, 4:35 PM
congh retitled this revision from to [ARM] Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue..
congh updated this object.
congh added reviewers: rengolin, t.p.northover, davidxl.
congh added a subscriber: llvm-commits.
rengolin added inline comments.Sep 16 2015, 5:54 AM
lib/Target/ARM/ARMBaseInstrInfo.cpp
1677

You don't really need the 1 * here. The comment makes it clear enough.

1701

scale() returns an uint64, so I'm guessing we were never be using more than 22 bits for the UnpredCost.

test/CodeGen/ARM/2013-10-11-select-stalls.ll
2

Is this intentional? I understand that pipeline stalls may be less bad than branch misprediction, but we don't want to have too many either. If this is a side effect of the change, I guess we should have some concrete numbers to rely on.

congh added inline comments.Sep 16 2015, 2:23 PM
lib/Target/ARM/ARMBaseInstrInfo.cpp
1677

OK. Thanks!

1701

As long as the input can fit in a 32-bit integer, so can the output.

test/CodeGen/ARM/2013-10-11-select-stalls.ll
2

Previously, the machine code sinking pass moves those two vector loads into different blocks to prevent pipeline stalls by intention. The if converter is scheduled after this pass and with this patch it will if-convert those two vector loads from two blocks to two loads in the same block. This may be a side effect of this patch. Ideally, the if converter should consider the cost of pipeline stall. So I think the solution is improving if converter's cost model to take pipeline stalls into consideration. What do you think?

rengolin added inline comments.Sep 17 2015, 10:33 AM
test/CodeGen/ARM/2013-10-11-select-stalls.ll
2

I think it's a good idea.

congh added a comment.Sep 17 2015, 1:37 PM

With further investigation, I found that pipeline stalls both occur w/ and w/o this patch. The stalls reported by LLVM are from waiting for vld being finished before using its loaded values. This is also true when two vld are put in two braches. LLVM doesn't report any stall just because the use of the loaded values and vld are in different basic blocks and the scheduler (which print stalls) only considers instructions in the same block. Therefore, I think the if-converter doesn't have to take care of this case specially.

rengolin edited edge metadata.Sep 18 2015, 5:33 AM

Ok, that makes sense. But just grep won't cut, as if future patches add another one on a different case, we won't notice. Please pipe to FileCheck and use a CHECK line where appropriate. Otherwise, LGTM.

congh updated this revision to Diff 35114.Sep 18 2015, 11:06 AM
congh edited edge metadata.

Update the patch by disabling if-conversion in 2013-10-11-select-stalls.ll.

rengolin accepted this revision.Sep 18 2015, 11:07 AM
rengolin edited edge metadata.

LGTM, thanks!

This revision is now accepted and ready to land.Sep 18 2015, 11:07 AM

Ok, that makes sense. But just grep won't cut, as if future patches add another one on a different case, we won't notice. Please pipe to FileCheck and use a CHECK line where appropriate. Otherwise, LGTM.

I have changed that test file by using FileCheck. I also disabled the if-conversion so that the original intention of the test remains unchanged. Thank you for the review!

This revision was automatically updated to reflect the committed changes.