This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Fix invalid size request in combineRepeatedFPDivisors
Closed, Public

Authored by c-rhodes on Jan 27 2022, 3:15 AM.

Details

Summary

If we have a vector FP division with a splatted divisor, use
getVectorMinNumElements when scaling the number of uses by the splat factor,
since requesting a fixed element count is invalid for scalable vectors.

For AArch64 the combine kicks in for the <vscale x 4 x float> case, since the
number of uses scaled by the splat factor is above the fdiv threshold (3), but
the codegen is worse (splat + vector fdiv + vector fmul) than in the
<vscale x 2 x double> case (splat + vector fdiv), where the combine does not fire.
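
To make the threshold arithmetic concrete, here is a minimal standalone sketch of the check described above. It is not the LLVM source: the function names are hypothetical, the ">=" comparison is an assumption, and a single use of the divisor is assumed in both cases. Only the threshold of 3 and the minimum element counts (4 and 2) come from the summary.

```cpp
// Hypothetical model of the use-count check; names are illustrative only.

// Scale the number of uses of the divisor by the splat factor, i.e. the
// known minimum element count of the (possibly scalable) vector type.
constexpr unsigned scaledUses(unsigned numUses, unsigned minNumElts,
                              bool divisorIsSplat) {
  return divisorIsSplat ? numUses * minNumElts : numUses;
}

// Assumption: the combine fires when the scaled use count reaches the
// target's threshold, and a threshold of 0 disables the combine.
constexpr bool combineFires(unsigned numUses, unsigned minNumElts,
                            bool divisorIsSplat, unsigned threshold) {
  return threshold != 0 &&
         scaledUses(numUses, minNumElts, divisorIsSplat) >= threshold;
}

// <vscale x 4 x float>: 1 use * 4 = 4, which reaches the threshold of 3.
static_assert(combineFires(1, 4, true, 3), "nxv4f32 case fires");

// <vscale x 2 x double>: 1 use * 2 = 2, which stays below the threshold.
static_assert(!combineFires(1, 2, true, 3), "nxv2f64 case does not fire");
```

Under these assumptions, a single splatted fdiv already crosses the threshold for the float case, which is why it ends up with the extra fmul.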

If the division could be converted into a scalar FP division by
scalarizeBinOpOfSplats it may be cheaper, but it looks like this is predicated
on the isExtractVecEltCheap TLI function, which is implemented for x86 but not
AArch64. Perhaps for now combineRepeatedFPDivisors should only scale the number
of uses by the splat factor if the division can be converted into a scalar op.

Event Timeline

c-rhodes created this revision. Jan 27 2022, 3:15 AM
c-rhodes requested review of this revision. Jan 27 2022, 3:15 AM
Herald added a project: Restricted Project. Jan 27 2022, 3:15 AM
c-rhodes updated this revision to Diff 403586. Jan 27 2022, 4:30 AM
c-rhodes edited the summary of this revision.

Combine test with existing llvm/test/CodeGen/AArch64/fdiv-combine.ll

c-rhodes updated this revision to Diff 403622. Jan 27 2022, 5:45 AM

Added some more splat tests (including NEON). Will post a follow-up patch to prevent scaling the number of uses by the splat factor unless the division can be converted into a scalar op.

This revision is now accepted and ready to land. Jan 28 2022, 5:17 AM
This revision was landed with ongoing or failed builds. Jan 28 2022, 9:01 AM
This revision was automatically updated to reflect the committed changes.