Page MenuHomePhabricator

[MachineCombiner] add a hook to allow some extension for resource length - NFC
ClosedPublic

Authored by shchenz on May 26 2020, 7:47 PM.

Details

Summary

Currently preservesResourceLen in machine combiner pass requires that new instructions resource length is not bigger than old instructions resource length.

I think this is ok if size of new instructions is not bigger than size of old instructions. For now I think all reassociations in machine combiner pass meets this condition.

But in patch https://reviews.llvm.org/D80175, we add a new reassociation that combines fma + fma + fma to (fmul + fma + fma + fadd), this will cause callee in getResourceLength, getCycles sometimes return same resource length for new and old instructions and sometimes new resource length is 1 bigger than old resource length.

Add this hook to make it target specific to accept some extension in resource length but we may still benefit from reducing in data depth.

Diff Detail

Event Timeline

shchenz created this revision.May 26 2020, 7:47 PM
shchenz updated this revision to Diff 266404.May 26 2020, 8:11 PM

debug message handling

Harbormaster failed remote builds in B57981: Diff 266404!
spatel accepted this revision.May 30 2020, 9:43 AM

LGTM with minor fix.

These might also be interesting FMA discussions/examples to look at:
D80801
D18751

llvm/include/llvm/CodeGen/TargetInstrInfo.h
1099

Is there a reason to make this int16_t? If yes, it should be specified in the code comment. If not, just use plain 'int' or 'unsigned'.

This revision is now accepted and ready to land.May 30 2020, 9:43 AM
shchenz marked an inline comment as done.May 31 2020, 9:08 PM

Thanks for your review. @spatel

On Powerpc, at least on Power9, FMA, FADD, FMUL all have the same latency. so we always prefer FMA to FADD + FMUL in DAGCombiner for now.

Making the combining of FMA in MachineCombiner instead of DAGCombiner like ARM64 in https://reviews.llvm.org/D18751 is not a good idea for PowerPC. As the comment https://reviews.llvm.org/D18751#402906 indicates, there are many patterns to be compared with to get the optimal sequence and MachineCombiner is not designed to get the best pattern. Also extending the current implementation on ARM64 from 1 FMA tuning to a group of FMA's tuning may introduce big complexity.

So now on Powerpc, we decide to make the combining of FMA in DAGCombiner and do specific pattern breaking in Machinecombiner to get good ILP.

llvm/include/llvm/CodeGen/TargetInstrInfo.h
1099

No specific reason to use int16_t, I just want to use a small type. int16_t and int should use the same type register as a return value at least on PowerPC target. will be changed to int in the commit patch.