This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Swap to using TargetTransformInfo for cost analysis.
ClosedPublic

Authored by jmolloy on Feb 9 2015, 9:08 AM.

Details

Summary

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:

  • combine-comparisons-by-cse.ll: Removed unneeded branch check.
  • 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  • coalesce-subregs.ll: Superfluous block check.
  • 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  • PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  • select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

Diff Detail

Repository
rL LLVM

Event Timeline

jmolloy updated this revision to Diff 19588.Feb 9 2015, 9:08 AM
jmolloy retitled this revision from to [SimplifyCFG] Swap to using TargetTransformInfo for cost analysis..
jmolloy updated this object.
jmolloy edited the test plan for this revision. (Show Details)
jmolloy added reviewers: hfinkel, t.p.northover.
jmolloy set the repository for this revision to rL LLVM.
jmolloy added a subscriber: Unknown Object (MLST).

I'm assuming that this patch should help to make sure vmin/vmax/fmin/fmax instructions gets produced for C code similar to
if (a<b) b=a;

If that is indeed one of the intents, it may be good to write a test to check that? After having quickly looked into existing regression tests, it seems there are plenty of tests to check correct MC-level handling of vmin/vmax/fmin/fmax instructions and also quite a lot of tests checking it gets generated when the operation is encoded as a select in llvm-ir, but I couldn't find a test that checks that a multi-basic-block representation of the operation is turned into an vmin/vmax/fmin/fmax.

Hi Kristof,

This revision doesn't have any deliberate heuristic changes, so clamp() will not be optimized in this revision to min/max. I've added such a test to D7507, which is where I expect this to happen.

The AArch64 backend does not, at the moment, produce fmin/fmax for this, it produces fcsel. That is an extra improvement we need to make.

Cheers,

James

Hi James,

With this patch, intrinsic calls with no side effects are now considered to be viable candidates for speculation. Before your patch, SimplifyCFG would have conservatively returned a high cost for intrinsic calls. My question is: was this functional change intended? In general, I like the idea of flattening the CFG as long as CodeGenPrepare knows how to undo a speculation if it is not profitable for the target.

For example, your patch would allow SimplifyCFG to speculate calls to cttz/ctlz. This would be done without knowing if it is cheap for the target. This may cause performance degradation on x86 targets with no LZCNT/BMI. My opinion is that, if we want to speculate intrinsic calls, then we also have to teach CodeGenPrepare how to revert the change if it turns out that the speculation was not profitable (cheap) for the target. At the moment, CodeGenPrepare doesn't know how to undo a wrong speculation on cttz/ctlz calls (I guess, we can teach CodeGenPrepare how to do it in a separate patch..).

Have you collected some performance numbers after this change?

Thanks,
Andrea

hfinkel accepted this revision.Feb 10 2015, 9:43 AM
hfinkel edited edge metadata.

LGTM. As Andrea points out, this is going to cause some potentially-significant behavioral swings, and we'll need to watch for performance regressions carefully.

This revision is now accepted and ready to land.Feb 10 2015, 9:43 AM
andreadb accepted this revision.Feb 11 2015, 3:47 AM
andreadb edited edge metadata.

LGTM too.
I plan to send a follow-up patch to improve the cost heuristic of intrinsic cttz/ctlz for targets that don't have "cheap" count leading/trailing zeroes.

Thanks all - landed in r228826. I haven't committed the followup D7507.

jmolloy closed this revision.Mar 18 2015, 3:27 AM