This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering] Add support for non-uniform vectors to BuildUDIV
ClosedPublic

Authored by RKSimon on Jul 12 2018, 9:24 AM.

Details

Summary

This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators.

This does require a change to the function arguments, which would affect any target that has overridden this (none in mainline though).

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Jul 12 2018, 9:24 AM
RKSimon updated this revision to Diff 155205.Jul 12 2018, 9:59 AM

Bail out if the build vector has implicit truncations.

lebedev.ri added inline comments.Jul 14 2018, 6:30 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17829

isKnownNeverZero() should also probably be wary of undef?

RKSimon added inline comments.Jul 14 2018, 6:45 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17829

If its a constant build vector then isKnownNeverZero() will return false if any elements contain undef.

lebedev.ri added inline comments.Jul 14 2018, 6:52 AM
efriedma added inline comments.Jul 18 2018, 12:03 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
3568

I'm pretty sure this isn't the most efficient way to handle this case. Would it be more efficient to use a variable-amount shift to compute NPQ, rather than shuffle afterwards? Or is it possible to fudge the magic number to avoid the blend?

Please make sure you have test coverage for both UseNPQ=true and UseNPQ=false.

test/CodeGen/X86/combine-udiv.ll
383

I guess it's sort of orthogonal to this patch, but this should probably be using pmulhuw for the shift.

RKSimon added inline comments.Jul 19 2018, 11:00 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
3568

We do have full coverage for scalars and uniform vectors only - I'll try to add all the necessary non-uniform cases as well.

test/CodeGen/X86/combine-udiv.ll
383

I raised this on PR38151 but haven't had time to get to it.

RKSimon added inline comments.Jul 19 2018, 11:27 AM
test/CodeGen/X86/combine-udiv.ll
383

D49562 is an initial attempt at this.

RKSimon updated this revision to Diff 158725.Aug 2 2018, 4:23 AM

Rebased - now with new tests and better SRL->MULHU codegen

RKSimon marked 3 inline comments as done.Aug 2 2018, 4:24 AM
RKSimon added inline comments.
lib/CodeGen/SelectionDAG/TargetLowering.cpp
3568

Tests now ensure we have vector cases with all/none/some NPQ path

RKSimon added inline comments.Aug 3 2018, 3:21 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
3568

I'm pretty sure this isn't the most efficient way to handle this case. Would it be more efficient to use a variable amount shift to compute NPQ, rather than shuffle afterwards? Or is it possible to fudge the magic number to avoid the blend?

@efriedma I haven't found anything that works any better then the current shuffle - the additional SUB and ADD seem to get in the way - do you have a codegen example of what you have in mind?

efriedma added inline comments.Aug 3 2018, 12:52 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
3568

By "variable amount shift", I was thinking of something like VPSUB+VPSRLVD+VPADD: if the shift amount is zero, it's a no-op.

The other thing I was thinking is that it might be possible to transform the magic number for an element so that SelNPQ is true for all the elements by shifting magics.a left? Not sure if that actually works.

RKSimon added inline comments.Aug 5 2018, 6:09 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
3568

ADD( Q, SRL( SUB( N0, Q ), 0 ) ) --> N0

what I think you have in mind is if we can use MULHU/UMUL_LOHI, instead of SRL, so this becomes

ADD( Q, MULHU( SUB( N0, Q ), C ) ) --> Q (C == 0 for non-NPQ path)

Does that sound about right? DAGCombiner::visitMULHU might need some improvements as well.

RKSimon updated this revision to Diff 159269.Aug 6 2018, 3:47 AM

Use an extra MULHU to support the shift-by-one and shift-by-zero NPQ/no-NPQ case.

efriedma accepted this revision.Aug 6 2018, 5:26 PM

LGTM

This revision is now accepted and ready to land.Aug 6 2018, 5:26 PM
This revision was automatically updated to reflect the committed changes.