This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Improve matching of SSE blend instructions with splatted vector inputs
Abandoned Public

Authored by RKSimon on Dec 14 2014, 5:11 AM.

Details

Summary

If the input to a shuffle is safely splatted (i.e. no undef) then the SSE blend instructions can use any lane from that input to match against.
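
As a rough illustration of the matching rule in the summary, here is a standalone C++ model. It is not the patch's code and does not use any LLVM API. It assumes the usual two-input shuffle mask convention (indices 0..N-1 select from the first input, N..2N-1 from the second, a negative index means undef). The name `matchBlendMask` and the `V1IsSplat`/`V2IsSplat` flags are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model: decide whether a two-input shuffle mask can be matched
// as a blend. On success, bit i of Imm is set when lane i is taken from the
// second input. A lane normally has to stay in place (Mask[i] == i or i + N),
// but if the source input is known to be a full splat, any lane of it will do.
bool matchBlendMask(const std::vector<int> &Mask, bool V1IsSplat,
                    bool V2IsSplat, uint32_t &Imm) {
  const int N = static_cast<int>(Mask.size());
  if (N > 32)
    return false; // this sketch only models immediates of up to 32 bits
  uint32_t Result = 0;
  for (int i = 0; i < N; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // undef lane: either input is acceptable
    if (M >= 2 * N)
      return false; // out-of-range index, not a valid mask
    const bool FromV1 = M < N;
    const bool InPlace = (FromV1 ? M : M - N) == i;
    const bool SourceIsSplat = FromV1 ? V1IsSplat : V2IsSplat;
    if (!InPlace && !SourceIsSplat)
      return false; // a real lane move from a non-splat input is not a blend
    if (!FromV1)
      Result |= 1u << i;
  }
  Imm = Result;
  return true;
}
```

With this model, the mask `{0, 4, 2, 4}` is rejected when neither input is a splat, but is accepted as a blend once the second input is known to be a splat, because lane 0 of that input holds the same value as lanes 1 and 3.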

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 17262. Dec 14 2014, 5:11 AM
RKSimon retitled this revision to [X86][SSE] Improve matching of SSE blend instructions with splatted vector inputs.
RKSimon updated this object.
RKSimon edited the test plan for this revision.
RKSimon added reviewers: chandlerc, andreadb.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: Unknown Object (MLST).
chandlerc edited edge metadata. Dec 15 2014, 11:07 AM

I generally think this entire thing should be handled in the target-independent dag combiner. We should always canonicalize a splat-like shuffle into a buildvector that is obviously a splat. And we should always canonicalize a shuffle of such a buildvector into a blend IMO.

At the very least, I'd prefer to see these as x86 DAG combines rather than doing it in the lowering unless (as mentioned below) you have test cases where the pattern only emerges while lowering.

lib/Target/X86/X86ISelLowering.cpp
7366–7369

Why are only constants safe here? Shouldn't it be any buildvector of a single scalar SDValue without undefs?

I also could have sworn there was already a predicate for this...
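
For reference, the broader check being asked for here could look roughly like the following standalone sketch. It does not use the real SelectionDAG types: `Operand` is a hypothetical stand-in for a build-vector operand, and `isSafeSplat` is an illustrative name, not an existing LLVM predicate.

```cpp
#include <vector>

// Hypothetical stand-in for one operand of a build vector: either undef, or
// some scalar value identified by an integer id (constant or not).
struct Operand {
  bool IsUndef;
  int ValueId;
};

// A "safe" splat in the sense discussed above: every operand is the same
// scalar and no operand is undef. Nothing requires the scalar to be constant.
bool isSafeSplat(const std::vector<Operand> &Ops) {
  if (Ops.empty())
    return false;
  for (const Operand &Op : Ops)
    if (Op.IsUndef || Op.ValueId != Ops.front().ValueId)
      return false;
  return true;
}
```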

7372–7376

If we fail to turn this kind of shuffle vector into a buildvector splat, we should fix that in the target independent dag combining, no?

The only time I've been unable to do this is when the pattern didn't emerge until *during* lowering. Do you have test cases showing that?

No problem - I'll look into moving this into the target-independent dag combiner.

lib/Target/X86/X86ISelLowering.cpp
7366–7369

I just added the most likely contenders to the safesplat detector. I can find several existing predicates that do some splat testing, mainly within various different targets' specific code, and none that appear to do all the obvious tests. I'll look into putting a more general predicate in the ISD namespace and convert some existing use cases (I've been meaning to do something similar for the zeroable shuffle tests as well).

7372–7376

I'll look at adding a dag combiner test to improve partial splats with undefs to a full splat.

No real-world test cases that I can recall. The only cases I can think of are ones where we are blending with zero: lowering often raises 'zeroable' lanes to definite zeros, but AFAIK we don't track that in x86 shuffle lowering. But even this could technically be raised to the target-independent DAG combiner.
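
To make the "partial splat with undefs to a full splat" idea concrete, here is a minimal standalone sketch. It models build-vector operands as non-negative integer ids with a hypothetical `Undef` marker; `promoteToFullSplat` is an illustrative name, and an actual combine would work on DAG nodes rather than on a plain vector.

```cpp
#include <vector>

// Hypothetical marker for an undef operand; real value ids are assumed >= 0.
const int Undef = -1;

// If every defined operand is the same scalar, overwrite the undef operands
// with that scalar so the vector becomes an obvious full splat. Returns false
// and leaves Ops untouched if the defined operands differ or all are undef.
bool promoteToFullSplat(std::vector<int> &Ops) {
  int Splat = Undef;
  for (int Op : Ops) {
    if (Op == Undef)
      continue;
    if (Splat != Undef && Op != Splat)
      return false; // two different scalars: not a partial splat
    Splat = Op;
  }
  if (Splat == Undef)
    return false; // nothing defined to splat
  for (int &Op : Ops)
    Op = Splat;
  return true;
}
```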

RKSimon abandoned this revision. Jan 8 2015, 10:38 AM

Abandoning patch - I will attempt to do this in DAG combine instead.