Page MenuHomePhabricator

[DAG] Move basic USUBSAT pattern matches from X86 to DAGCombine
ClosedPublic

Authored by RKSimon on Feb 10 2021, 6:04 AM.

Details

Summary

Begin transitioning the X86 vector code to recognise sub(umax(a,b) ,b) or sub(a,umin(a,b)) USUBSAT patterns more generic and available to all targets.

This initial patch just moves the basic umin/umax patterns to DAG, removing some vector-only checks on the way - these are some of the patterns that the legalizer will try to expand back to so we can be reasonably relaxed about matching these pre-legalization.

We can handle the trunc(sub(..))) variants as well, which helps with patterns where we were promoting to a wider type to detect overflow/saturation.

The remaining x86 code requires some cleanup first - some of it isn't actually tested etc. I also need to resurrect D25987.

Diff Detail

Event Timeline

RKSimon created this revision.Feb 10 2021, 6:04 AM
RKSimon requested review of this revision.Feb 10 2021, 6:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2021, 6:04 AM
Herald added a subscriber: wdng. · View Herald Transcript
nikic added inline comments.Feb 10 2021, 11:49 AM
llvm/test/CodeGen/AArch64/usub_sat.ll
35

Why are tests that directly use usub.sat intrinsics affected by this? Are we doing something weird like first expanding them to min/max sub and then combining them back to usubsat on the extended type?

craig.topper added inline comments.Feb 10 2021, 7:45 PM
llvm/test/CodeGen/AArch64/usub_sat.ll
35

I think type legalization expands it when it promotes the type. For USUBSAT, the expansion isn't necessary. Promoting the operands by zero extending would have been enough since the saturating value is 0 so is not affected by the promoted type. This is different than UADDSAT where the saturation value is UINT_MAX of the original type.

RKSimon added inline comments.Feb 11 2021, 2:44 AM
llvm/test/CodeGen/AArch64/usub_sat.ll
35

Its a (nice?) sideeffect of the args actually being passed as zeroext i32 - we start off with a usubsat i16, which gets promoted to a i32 expanded sequence, but we're still before legalops, and as we know we have zero'd upper bits the new combine reforms the usubsat i32 pattern, which then expands to the shorter i32 codegen.

craig.topper added inline comments.Feb 11 2021, 8:43 AM
llvm/test/CodeGen/AArch64/usub_sat.ll
35

I thought the zero upper bits is only checked if we look through a truncate.

If we change the type legalizer to preserve USUBSAT we pick up an improvement on at least one additional X86 test.

RKSimon added inline comments.Feb 11 2021, 10:17 AM
llvm/test/CodeGen/AArch64/usub_sat.ll
35

Sorry my mistake - I was getting mixed and was using other targets tests against aarch64.

Are you proposing we emit the promoted USUBSAT inside DAGTypeLegalizer::PromoteIntRes_ADDSUBSHLSAT ?

craig.topper added inline comments.Feb 11 2021, 11:45 AM
llvm/test/CodeGen/AArch64/usub_sat.ll
35

Yes. It already zero extended the operands so it should be safe I think. It picks up a couple improvements to
X86/usub_sat_plus.ll that get broken by SimplifyDemandedBits before the combine in this patch has a chance to kick in.

RKSimon added inline comments.Feb 11 2021, 12:52 PM
llvm/test/CodeGen/AArch64/usub_sat.ll
35

Yes I can do that - do you have any objection to me doing that as a follow-up patch for review after this one has landed?

craig.topper added inline comments.Feb 11 2021, 1:09 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3149

Is this equivalent to something like

if (!DAG.MaskedValueIsZero(LHS, APInt::getBitsSetFrom(SubVT.getScalarSizeInBits(), DstVT.getScalarSizeInBits())

llvm/test/CodeGen/AArch64/usub_sat.ll
35

No objection

RKSimon added inline comments.Feb 11 2021, 1:22 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3149

Yes - missed that one when I moved this from x86......

RKSimon updated this revision to Diff 323273.Feb 12 2021, 2:28 AM

Use SelectionDAG::MaskedValueIsZero

This revision is now accepted and ready to land.Feb 12 2021, 9:24 AM