This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering] try harder to determine undef elements of vector binops
ClosedPublic

Authored by spatel on Jan 22 2019, 2:25 PM.

Details

Summary

This might be the start of tracking all vector element constants generally if we take it to its logical conclusion, but I thought I'd better stop here and see if there's support for the general direction.

The affected tests require a convoluted path before they get simplified currently because we don't call SimplifyDemandedVectorElts() from binops directly and don't modify the binop operands directly in SimplifyDemandedVectorElts().

That's why the tests all have a trailing shuffle to induce a chain reaction of transforms. So something like this is happening:

  1. Improve the knowledge of undefs in the binop via a SimplifyDemandedVectorElts() call that originates from a shuffle.
  2. Transfer that undef knowledge back to the shuffle mask user as more undef lanes.
  3. Combine the modified shuffle by calling SimplifyDemandedVectorElts() again.
  4. Translate the improved shuffle mask as undemanded lanes of build vector constants causing those to become full undef constants.
  5. Simplify the binop now that it has a full undef operand.

As we can see from the unchanged 'and' and 'or' tests, tracking undefs alone isn't a full solution. We would need to track zero and all-ones constants to improve those opcodes. We'd probably need to track NaN for FP ops too (assuming we don't have fast-math-flags set).

Diff Detail

Event Timeline

spatel created this revision.Jan 22 2019, 2:25 PM
RKSimon added inline comments.Jan 23 2019, 4:19 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1872

(off topic) Can ISD::MUL be supported here?

1885

Do we need to add getKnownUndefForVectorBinop support here?

spatel marked 2 inline comments as done.Jan 23 2019, 5:31 AM
spatel added inline comments.
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1872

Yes - I figured we'd do that in a follow-up. We also want to add DAG-only binops such as saturating ops, (f)min/max?

1885

Yes, and I started that patch, but I'm struggling to find a case where it makes a difference because 'and C, undef --> 0'. Ie, I think we need to adjust/add the logic for calculating KnownZero to make this worthwhile for 'and'.

RKSimon added inline comments.Jan 29 2019, 3:57 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1472

What's the policy on creating nodes on the fly like this?

spatel marked an inline comment as done.Jan 29 2019, 6:28 AM
spatel added inline comments.
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1472

I don't think it's stated anywhere, but we don't want to do it in general.

I've seen cases in x86 lowering where a temp *variable* node is created speculatively and that inhibits other folds because it changes the number of uses of other values.

In this case, I've tried to ensure that we are only creating a constant or undef, so I could assert that to see if we escaped that assumption.

I may be missing some alternate means, but if we use FoldConstantArithmetic() directly, then we'll have to deal with FP separately because there's no equivalent/unified call for folding FP math. AFAIK, that's done directly within getNode().

RKSimon accepted this revision.Jan 30 2019, 1:05 AM

LGTM

lib/CodeGen/SelectionDAG/TargetLowering.cpp
1472

OK, please can you add a TODO comment.

This revision is now accepted and ready to land.Jan 30 2019, 1:05 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2019, 7:35 AM