This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner][X86] Teach SimplifyVBinOp to fold VBinOp (concat X, undef/constant), (concat Y, undef/constant) -> concat (VBinOp X, Y), VecC
ClosedPublic

Authored by craig.topper on Aug 23 2019, 1:30 PM.

Details

Summary

This improves the combine I included in D66504 to handle constants in the upper operands of the concat. If we can constant fold them away, we can pull the concat after the bin op. This helps with chains of madd reductions on X86 from loop unrolling. The loop madd reduction pattern creates a pmaddwd with half the width of the add that follows it, using zeroes to fill the upper bits. If we have two of these added together, we can pull the zeroes through the accumulating add and then shrink it.
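
The identity behind the fold can be checked on plain lists. The following is only a toy model, not LLVM code: vectors are Python lists, `concat` is list concatenation, and the names `vadd` and `fold_add_of_concats` are hypothetical helpers invented for this illustration. It covers only the constant case (undef is not modeled) and assumes both concats split at the same element boundary.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit lanes with wrapping arithmetic


def concat(lo, hi):
    """Model of a vector concat: low half followed by high half."""
    return lo + hi


def vadd(a, b):
    """Lane-wise wrapping add of two equal-length vectors."""
    assert len(a) == len(b)
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def fold_add_of_concats(x, cx, y, cy):
    """Hypothetical folded form: add the narrow halves and constant-fold
    the upper constants, then concat once at the end."""
    return concat(vadd(x, y), vadd(cx, cy))


x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
zeros = [0, 0, 0, 0]

# Wide add on the concatenated vectors (the form before the combine).
wide = vadd(concat(x, zeros), concat(y, zeros))
# Narrow add followed by a single concat (the form after the combine).
narrow = fold_add_of_concats(x, zeros, y, zeros)
```

In this model both forms produce the same lanes, and the upper half of the folded form is a constant that no longer depends on the add, which is what lets the add itself be narrowed.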

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Aug 23 2019, 1:30 PM
Herald added a project: Restricted Project. Aug 23 2019, 1:30 PM
craig.topper marked an inline comment as done. Aug 23 2019, 1:32 PM
craig.topper added inline comments.
llvm/test/CodeGen/X86/madd.ll
2761 ↗(On Diff #216937)

We were just barely able to prove the bits were all 0 and therefore disjoint, but weren't able to remove the ADD/OR completely because the disjoint bits code gets to go one level deeper in computeKnownBits than SimplifyDemandedBits does. This is because the disjoint check starts at a depth of 0 for the operands, whereas SimplifyDemandedBits starts at a depth of 0 for the OR node itself.
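
The off-by-one in the depth budget can be sketched with a toy recursion. This is a simplified model and not the LLVM implementation: `Node`, `known_zero`, `wrap`, and the cap `MAX_DEPTH` are all hypothetical names and values chosen for the illustration. The only point is that starting the count at the operand rather than at the OR node leaves one extra level of recursion.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DEPTH = 6  # toy recursion cap; an assumption made for this sketch


@dataclass
class Node:
    value: Optional[int] = None      # set on constant leaves
    child: Optional["Node"] = None   # set on pass-through nodes


def known_zero(node: Node, depth: int) -> bool:
    """Toy known-bits query: True only if the node is provably zero
    before the depth budget runs out."""
    if depth >= MAX_DEPTH:
        return False  # budget exhausted: nothing is known
    if node.child is None:
        return node.value == 0
    return known_zero(node.child, depth + 1)


def wrap(node: Node, levels: int) -> Node:
    """Bury a node under `levels` pass-through nodes."""
    for _ in range(levels):
        node = Node(child=node)
    return node


# An operand of the OR whose zero constant sits MAX_DEPTH - 1 levels down.
operand = wrap(Node(value=0), MAX_DEPTH - 1)

# Disjoint-style check: the operand itself is queried at depth 0.
seen_by_disjoint_check = known_zero(operand, 0)
# Simplify-style walk: the OR node is depth 0, so the operand starts at 1.
seen_from_or_node = known_zero(operand, 1)
```

With these assumptions the first query reaches the constant and the second gives up one level short, which mirrors why the bits could be proven disjoint while the ADD/OR itself was not removed.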

RKSimon accepted this revision. Aug 26 2019, 7:13 AM

LGTM - cheers

This revision is now accepted and ready to land. Aug 26 2019, 7:13 AM
This revision was automatically updated to reflect the committed changes.