This is an archive of the discontinued LLVM Phabricator instance.

[DAG] Add SimplifyDemandedVectorElts binop SimplifyMultipleUseDemandedBits handling
ClosedPublic

Authored by RKSimon on Apr 28 2020, 6:41 AM.

Details

Summary

For the supported binops (basic arithmetic, logicals + shifts), if we fail to simplify the demanded vector elts, then call SimplifyMultipleUseDemandedBits and try to peek through ops to remove unnecessary dependencies.

This helps with PR40502.

Diff Detail

Event Timeline

RKSimon created this revision.Apr 28 2020, 6:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2020, 6:41 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
RKSimon added inline comments.Apr 28 2020, 6:47 AM
llvm/test/CodeGen/AArch64/mul_by_elt.ll
145

Not sure about this change - it strips the first shufflevector entirely losing the multiply by scalar, but instcombine would have done something similar anyhow:

define <4 x float> @splat0_before_fmul_fmul_constant(<4 x float> %a) {
  %1 = fmul <4 x float> %a, <float 3.000000e+00, float undef, float undef, float undef>
  %2 = fmul <4 x float> %1, <float 6.000000e+00, float undef, float undef, float undef>
  %mul2 = shufflevector <4 x float> %2, <4 x float> undef, <4 x i32> zeroinitializer
  ret <4 x float> %mul2
}

Seems reasonable to me in general.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
2676–2694

Just wondering, should we not demand elts that are KnownUndef?

RKSimon added inline comments.May 4 2020, 7:11 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
2676–2694

That's a good idea - decent tests might be tricky but should be doable. A slight thing to watch out for is KnownUndef/KnownZero bits are only guaranteed to be correct for demanded elts.

Would it be OK just to add this as a TODO comment for now? It might need some investigation (and improvements to KnownUndef/KnownZero emission which isn't great yet as we don't use it enough).

lebedev.ri added inline comments.May 4 2020, 7:53 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
2676–2694

That's a good idea - decent tests might be tricky but should be doable.

Aha. That might even be more generic than this if-block.

Would it be OK just to add this as a TODO comment for now?

Yes, absolutely.

craig.topper added inline comments.May 4 2020, 12:50 PM
llvm/test/CodeGen/AArch64/mul_by_elt.ll
145

Why does "multiple use" come into play here?

RKSimon marked an inline comment as done.May 4 2020, 2:02 PM
RKSimon added inline comments.
llvm/test/CodeGen/AArch64/mul_by_elt.ll
145

Because SimplifyMultipleUseDemandedBits will peek through a shuffle entirely but SimplifyDemandedVectorElts is (overly) careful at not simplifying to an identity shuffle - which tbh is something we should get rid of.

I'm not sure why the constant splat isn't being used as the scalar operand as an alternative though.

RKSimon updated this revision to Diff 262052.May 5 2020, 3:15 AM

add KnownUndef TODO comments

ping - any more comments?

The code duplication is making me itchy. Add a helper like:

if (simplifyDemandedVectorEltsBinop(Op, DemandedElts, TLO, Depth))
  return true;

That could include just the new block that's being created here, or we can dispatch directly on all of the binop cases in the top-level and then switch for the KnownZero/Undef/DemandedBits differences within there.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
2681

Is it correct to recycle the demanded bits on both operands here? That seems wrong for FP ops IIUC.

RKSimon updated this revision to Diff 265932.May 24 2020, 9:03 AM

Cheers @spatel - I've pulled out the SimplifyDemandedVectorEltsBinOp helper which makes separate DemandedBits 'all bits' maskes for each op which should also answer the query about the fpops.

spatel accepted this revision.May 24 2020, 9:50 AM

LGTM.
I'd sink the isAllOnesValue() check and/or hoist the CombineTo() into the lambda too for still less repeated code, but maybe there's more differentiation from known undef that could happen here?

This revision is now accepted and ready to land.May 24 2020, 9:50 AM

LGTM.
I'd sink the isAllOnesValue() check and/or hoist the CombineTo() into the lambda too for still less repeated code, but maybe there's more differentiation from known undef that could happen here?

I'll move the CombineTo handling into the lambda as well - the isAllOnes() I'm going to leave out until I can work out how to best merge it with KnownUndef as mentioned in the TODOs.

This revision was automatically updated to reflect the committed changes.