This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use UADDSAT/USUBSAT instead of ADDUS/SUBUS
ClosedPublic

Authored by nikic on Dec 17 2018, 1:31 PM.

Details

Summary

Part of https://bugs.llvm.org/show_bug.cgi?id=40056.

This removes the ADDUS/SUBUS X86ISD opcodes in favor of the generic UADDSAT/USUBSAT opcodes, without touching anything else.

I've marked a bunch of types like v8i8 as custom legalized, so they are widened (as now) rather than promoted (as default). Is there any more elegant way to do this, by reusing the generic vector widening code, rather than the custom code in X86 ReplaceNodeResults?

Diff Detail

Repository
rL LLVM

Event Timeline

nikic created this revision.Dec 17 2018, 1:31 PM
RKSimon added inline comments.Dec 17 2018, 1:34 PM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
605 ↗(On Diff #178521)

Op1.getScalarValueSizeInBits();

lib/Target/X86/X86ISelLowering.cpp
26275 ↗(On Diff #178521)

X86ISD::AVG/VPMADDWD or ISD::UADDSAT/USUBSAT

nikic updated this revision to Diff 178524.Dec 17 2018, 1:42 PM

Use getScalarValueSizeInBits(), update comment.

craig.topper accepted this revision.Dec 17 2018, 1:52 PM

LGTM. I'll rebase D55780 after this goes in.

This revision is now accepted and ready to land.Dec 17 2018, 1:52 PM
RKSimon requested changes to this revision.Dec 17 2018, 2:07 PM
RKSimon added inline comments.
lib/Target/X86/X86ISelLowering.cpp
1217 ↗(On Diff #178524)

These need to use the HasInt256 ? Legal : Custom and you need to add a split256IntArith lowering for AVX1 targets (see LowerADD_SUB).

This revision now requires changes to proceed.Dec 17 2018, 2:07 PM

You’ll need to use an intrinsic to test the changes Simon asked for. The existing combines for X86 do their own split for AVX1.

You’ll need to use an intrinsic to test the changes Simon asked for. The existing combines for X86 do their own split for AVX1.

If that is going to be tricky you can just add a Subtarget->hasInt256() limit and a TODO comment.

nikic marked 2 inline comments as done.Dec 17 2018, 2:56 PM

You’ll need to use an intrinsic to test the changes Simon asked for. The existing combines for X86 do their own split for AVX1.

If that is going to be tricky you can just add a Subtarget->hasInt256() limit and a TODO comment.

I plan to add tests for llvm.uadd.sat/llvm.usub.sat with vectors, which are also going to exercise this case. We need those tests anyway :)

nikic updated this revision to Diff 178550.Dec 17 2018, 3:52 PM

Handle 256-bit vector split, add vector tests for llvm.uadd.sub, llvm.usub.sat.

nikic marked an inline comment as done.Dec 17 2018, 3:55 PM
nikic added inline comments.
test/CodeGen/X86/uadd_sat_vec.ll
614 ↗(On Diff #178550)

This expansion is pretty crazy...

craig.topper added inline comments.Dec 17 2018, 4:26 PM
test/CodeGen/X86/uadd_sat_vec.ll
614 ↗(On Diff #178550)

Yeah I wouldn't expect vXi1 to be good, but this is even worse than I imagined. Its definitely being scalarized and then promoted. But the scalarization looks weird. But I'm not sure how much we care.

We can pretty easily replace with vXi1 ADDUSAT with OR X,Y and SUBUSAT with X & ~Y if we care.

619 ↗(On Diff #178550)

Can you add nounwind to the test case to drop the .cfi directives

RKSimon added inline comments.Dec 17 2018, 11:49 PM
test/CodeGen/X86/uadd_sat_vec.ll
614 ↗(On Diff #178550)

You can either add vXi1 handling to LowerUADDSAT_USUBSAT in a similar way to LowerADD_SUB or you can add a DAGCombine transform just to convert them - I don't think we're losing any info converting to bool logic ops?

test/CodeGen/X86/usub_sat_vec.ll
2 ↗(On Diff #178550)

Might as well use --check-prefixes

--check-prefixes=SSE,SSE2
nikic updated this revision to Diff 178634.Dec 18 2018, 3:34 AM

Custom lower vNi1, use --check-prefixes.

nikic marked 5 inline comments as done.Dec 18 2018, 3:39 AM
nikic added inline comments.
test/CodeGen/X86/uadd_sat_vec.ll
614 ↗(On Diff #178550)

I've handled this in LowerUADDSAT_USUBSAT for now, with the suggested OR and AND+NOT expansions. If/when we move the X86 ADDUS/SUBUS combines into generic DAGCombine we can move this one as well.

Thanks - please add the new test files to trunk with current codegen so the patch shows the diff.

nikic marked an inline comment as done.Dec 18 2018, 4:05 AM

Thanks - please add the new test files to trunk with current codegen so the patch shows the diff.

I can't commit these as-is, because the fixes for UADDSAT/USUBSAT legalization at least are needed (there will be assertion failures otherwise). Should I commit the tests together with those changes?

Thanks - please add the new test files to trunk with current codegen so the patch shows the diff.

I can't commit these as-is, because the fixes for UADDSAT/USUBSAT legalization at least are needed (there will be assertion failures otherwise).

Phab has this feature of being able to specify dependencies between the differentials. It is sometimes useful to do that.

Should I commit the tests together with those changes?

You could at least split the baseline tests into a separate 'parent' differential, and base this differential on that new one.

RKSimon accepted this revision.Dec 18 2018, 4:15 AM

Thanks - please add the new test files to trunk with current codegen so the patch shows the diff.

I can't commit these as-is, because the fixes for UADDSAT/USUBSAT legalization at least are needed (there will be assertion failures otherwise). Should I commit the tests together with those changes?

LGTM - yes please. The legalization + (current codegen) tests, followed by the improved x86 support and tests codegen.

This revision is now accepted and ready to land.Dec 18 2018, 4:15 AM
This revision was automatically updated to reflect the committed changes.

@nikic I forgot to ask - are you intending to do the signed add/sub equivalents as well?

nikic added a comment.Dec 18 2018, 7:57 AM

@RKSimon Yeah, I plan to migrate the signed ones as well.

llvm/trunk/lib/Target/X86/X86InstrSSE.td