If the input to a shuffle is safely splatted (i.e. no undef) then the SSE blend instructions can use any lane from that input to match against.
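A minimal sketch of the idea, written as a standalone model and not taken from the patch. The function name `isBlendMask`, its flag parameters, and the mask convention (-1 for undef, `[0, N)` for the first input, `[N, 2N)` for the second) are assumptions for illustration. A blend normally requires each output lane to come from the same lane of one of the inputs; if an input is a splat with no undef lanes, every lane of it holds the same value, so any lane index into that input is acceptable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model, not the X86ISelLowering.cpp code.
// Mask convention (assumed): -1 = undef, [0, N) selects a lane of V1,
// [N, 2N) selects a lane of V2.
bool isBlendMask(const std::vector<int> &Mask, bool V1IsSafeSplat,
                 bool V2IsSafeSplat) {
  const int N = static_cast<int>(Mask.size());
  for (int i = 0; i < N; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // An undef lane matches anything.
    assert(M < 2 * N && "mask index out of range");
    if (M < N) {
      // Lane comes from V1: must be in place unless V1 is a safe splat.
      if (M != i && !V1IsSafeSplat)
        return false;
    } else {
      // Lane comes from V2: must be in place unless V2 is a safe splat.
      if (M != i + N && !V2IsSafeSplat)
        return false;
    }
  }
  return true;
}
```

Under this model the mask `{0, 4, 2, 4}` is not a blend in general, because lanes 1 and 3 read lane 0 of the second input, but it is accepted once the second input is known to be a safe splat.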
Diff Detail
- Repository
- rL LLVM
Event Timeline
I generally think this entire thing should be handled in the target-independent dag combiner. We should always canonicalize a splat-like shuffle into a buildvector that is obviously a splat. And we should always canonicalize a shuffle of such a buildvector into a blend IMO.
At the very least, I'd prefer to see these as x86 DAG combines rather than in the lowering, unless (as mentioned below) you have test cases where the pattern only emerges while lowering.
lib/Target/X86/X86ISelLowering.cpp | |
---|---|
7366–7369 | Why are only constants safe here? Shouldn't it be any buildvector of a single scalar SDValue without undefs? I also could have sworn there was already a predicate for this... |
7372–7376 | If we fail to turn this kind of shuffle vector into a buildvector splat, we should fix that in the target independent dag combining, no? The only time I've been unable to do this is when the pattern didn't emerge until *during* lowering. Do you have test cases showing that? |
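The predicate asked about in the first comment could look roughly like the sketch below. This is a standalone model, not LLVM code: operands are plain integers standing in for scalar values, `kUndef` is an assumed sentinel for an undef operand, and the name `isSafeSplat` is hypothetical. The point is that the check does not depend on the operand being a constant, only on every lane holding the same defined value.

```cpp
#include <vector>

// Hypothetical stand-in for an undef build-vector operand.
constexpr int kUndef = -1;

// Returns true if every operand is the same defined value, i.e. the
// vector is a splat with no undef lanes. Operands are modelled as
// non-negative integer IDs; any scalar, not just a constant, qualifies.
bool isSafeSplat(const std::vector<int> &Ops) {
  if (Ops.empty() || Ops.front() == kUndef)
    return false;
  for (int Op : Ops)
    if (Op != Ops.front())
      return false; // Covers both a different value and an undef lane.
  return true;
}
```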
No problem - I'll look into moving this into the target-independent dag combiner.
lib/Target/X86/X86ISelLowering.cpp | |
---|---|
7366–7369 | I just added the most likely contenders to the safesplat detector. I can find several existing predicates that do some splat testing, mainly in target-specific code, but none that appear to do all the obvious tests. I'll look into putting a more general predicate in the ISD namespace and converting some existing use cases (I've been meaning to do something similar for the zeroable shuffle tests as well). |
7372–7376 | I'll look at adding a dag combiner test to improve partial splats with undefs to a full splat. No real world test cases that I can recall. The only cases I can think of are ones where we are blending with zero: lowering often raises 'zeroable' lanes to definite zeros, but AFAIK we don't track that in x86 shuffle lowering. But even this could technically be raised to the target independent dag combiner. |
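As a rough illustration of improving a partial splat with undefs to a full splat, here is a standalone sketch. It is not the DAG combiner code; `kUndefLane` and `promoteToFullSplat` are hypothetical names, and operands are modelled as non-negative integer IDs. The rewrite relies on the fact that an undef lane may take any value, so filling it with the splatted scalar is a valid refinement.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an undef build-vector operand.
constexpr int kUndefLane = -1;

// If all defined operands agree on one value, returns a copy with every
// lane (including former undef lanes) set to that value. Returns an
// empty vector when the input is not a partial splat or has no defined
// lane to splat from.
std::vector<int> promoteToFullSplat(const std::vector<int> &Ops) {
  int Splat = kUndefLane;
  for (int Op : Ops) {
    if (Op == kUndefLane)
      continue; // Undef lanes place no constraint on the splat value.
    if (Splat == kUndefLane)
      Splat = Op; // First defined lane fixes the candidate value.
    else if (Op != Splat)
      return {}; // Two different defined values: not a splat.
  }
  if (Splat == kUndefLane)
    return {}; // All lanes undef (or no lanes): nothing to promote.
  return std::vector<int>(Ops.size(), Splat);
}
```

After such a promotion the result has no undef lanes, so it would satisfy a "safe splat" check of the kind discussed in this thread.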