This is an archive of the discontinued LLVM Phabricator instance.

[X86 DAG] Lower vselects with a constant condition as blend+imm if possible
ClosedPublic

Authored by filcab on May 13 2014, 8:09 PM.

Details

Summary

I set the operation actions for VSELECT of appropriate vector types to be
Custom, and am taking care that the ones that were Legal are still treated
as that after the Custom step. Same goes for the ones previously marked as
Expand.

LowerVSELECT will, if possible, generate a X86ISD::BLENDI DAG node if the
condition is constant and we can emit that instruction, given the
subtarget.

This is not enough for all cases. If we're doing a vselect on vectors that
are not legal on this target they will be split. The best option I could
find, and which implies less code (and bugs) was to add a case to
PerformSELECTCombine, in order to match this operand split (we add some
size extensions when we legalize the types).

I also updated all the tests that failed because they were expecting
variable byte blends (blendvb) or similar and they had a vector of
constant conditions.

Diff Detail

Event Timeline

filcab updated this revision to Diff 9372.May 13 2014, 8:09 PM
filcab retitled this revision from to [X86 DAG] Lower vselects with a constant condition as blend+imm if possible.
filcab updated this object.
filcab edited the test plan for this revision. (Show Details)
filcab added reviewers: grosbach, delena, nadav.
filcab added a subscriber: Unknown Object (MLST).

Hi Filipe,

I have a minor comment/suggestion (see below).

lib/Target/X86/X86ISelLowering.cpp
7974

This function could be simplified a bit if
BuildVector is known to always be a build_vector dag node of all ConstantSDNode (or UNDEF) elements.
I think you can easily add it as a precondition; the caller can verify if a node is a build_vector of constants (or undef values) by simply calling function
ISD::isBuildVectorOfConstantSDNodes.
Also, given that argument 'BuildVector' can only be a build_vector dag node, you can change the signature of this function to something like:

static bool BUILD_VECTORtoBlendMask(BuildVectorSDNode *BuildVector, unsigned &MaskValue, SelectionDAG &DAG)

This is to make explicit that BuildVector can only be a BuildVectorSDNode.

Under the assumption that BuildVector is a build_vector of constants or UNDEFs, then
lines in the range [7992-8008] can be simplified and replaced by the folling 5 lines.

int Lane1Cond = -1, Lane2Cond = -1;
if (isa<ConstantSDNode>(EltCond))
  Lane1Cond = cast<ConstantSDNode>(EltCond)->isNullValue();
if (isa<ConstantSDNode>(SndLaneEltCond))
  Lane2Cond = cast<ConstantSDNode>(SndLaneEltCond)->isNullValue();

The function will return false only if Lane1Cond != Lane2Cond && Lane1Cond and Lane2Cond both are >= 0.

nadav edited edge metadata.May 14 2014, 1:22 PM

Overall this change makes sense. I did not review the patch carefully but I noticed that Andrea did. When you commit please split this patch to the different optimizations: shuffle blends, BLENDI, etc.

filcab updated this revision to Diff 9401.May 14 2014, 2:03 PM
filcab edited edge metadata.

Addresses Andrea's comments about the BUILD_VECTORtoBlendMask function.

Hi Nadav,

I'll separate the patch in three commits:

  • Actually lower VSELECTs, possibly with no additional functionality
    • Optimize lowered VSELECTs when possible
  • PerformSELECTCombine changes

    Filipe
nadav added a comment.May 14 2014, 2:49 PM

Excellent. Please update the cost model as well.

Thanks :)

Hi Nadav,

I'm not very familiar with the cost model part, but I couldn't find proper
places that I have to change:
I'm assuming you're talking about getCmpSelInstrCost() in
X86TargetTransformInfo, is this right?

If so, the function only seems to get the types of the instruction's
condition and result, not the proper arguments.
If we don't get the arguments, we can't be sure if this will be a cheap
vselect.

Even with that, the BaseTargetTransformInfo only cares about nodes set to
Expand. All other nodes' cost (Legal, Custom, Promote) are the type
legalization cost + 1 (BasicTargetTransformInfo.cpp:445).
X86TargetTransformInfo.cpp:576 has the X86 specialization, but this one
just adds what's on the cost table to the type legalization. I don't think
it's really needed to add cost table entries just to, in the end, have the
same behaviour as we now have.

I also added a test for the costs to lower a vselect, since we had none.
But I can commit it without the table changes.
What do you think? I have the tables ready to commit anyway, but I don't
think we need to.

Filipe

filcab accepted this revision.May 26 2014, 7:05 PM
filcab added a reviewer: filcab.
This revision is now accepted and ready to land.May 26 2014, 7:05 PM
filcab closed this revision.May 26 2014, 7:06 PM

Commited as r209042, r209043, r209044, and r209045.