This is an archive of the discontinued LLVM Phabricator instance.

Convert a vselect into a concat_vector if possible
Closed, Public

Authored by filcab on May 26 2014, 8:36 PM.

Details

Reviewers

grosbach
nadav
delena

Commits

rG82111f12fb0d: Convert a vselect into a concat_vector if possible
rL209929: Convert a vselect into a concat_vector if possible

Summary

If both vector args to vselect are concat_vectors and the condition is
constant and picks half a vector from each argument, convert the vselect
into a concat_vectors.

Added tests.
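
To make the summary concrete, here is a minimal standalone C++ sketch (not LLVM code) that models vselect and concat_vectors on plain std::vector<int>. The helpers vselect, concat, and demo_half_select_equals_concat are hypothetical names introduced only for illustration. The sketch assumes a true condition lane picks from the first value operand, as in LLVM's select.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Element-wise select: result[i] = cond[i] ? lhs[i] : rhs[i].
Vec vselect(const std::vector<bool> &cond, const Vec &lhs, const Vec &rhs) {
  Vec out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i)
    out[i] = cond[i] ? lhs[i] : rhs[i];
  return out;
}

// Concatenate two half-width vectors into one full-width vector.
Vec concat(const Vec &lo, const Vec &hi) {
  Vec out(lo);
  out.insert(out.end(), hi.begin(), hi.end());
  return out;
}

// Both value operands are concatenations and the constant condition is
// uniform within each half, so the select equals a concat of two halves.
bool demo_half_select_equals_concat() {
  Vec a0{1, 2}, a1{3, 4}, b0{5, 6}, b1{7, 8};
  Vec lhs = concat(a0, a1);
  Vec rhs = concat(b0, b1);
  std::vector<bool> cond{false, false, true, true};
  // Low half is all false (take rhs's low half), high half is all true
  // (take lhs's high half).
  return vselect(cond, lhs, rhs) == concat(b0, a1);
}
```

A condition that mixes true and false within one half has no such equivalent, which is why the patch bails out in that case.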

Diff Detail

Repository: rL LLVM

Event Timeline

filcab updated this revision to Diff 9818. May 26 2014, 8:36 PM

filcab retitled this revision to Convert a vselect into a concat_vector if possible.

filcab updated this object.

filcab edited the test plan for this revision.

filcab added reviewers: nadav, delena, grosbach.

filcab added a subscriber: Unknown Object (MLST).

delena added inline comments. May 26 2014, 11:56 PM

lib/Target/X86/X86ISelLowering.cpp
17876 ↗	(On Diff #9818)	You do not check that the types are legal. At what stage should this optimization happen? The <2 x float> type that you use in your test is illegal.

Actually, seeing as it's generic, maybe I should move it to a more generic part of DAGCombiner, instead of the X86 backend.

What do you think, Elena? Nadav?

Filipe

lib/Target/X86/X86ISelLowering.cpp
17876 ↗	(On Diff #9818)	This is a generic optimization, which allows us to eliminate a vselect and replace it with a simpler concat_vectors, that's why I'm doing it before legalize types.

Moving this to DAGCombine makes sense to me.

lib/Target/X86/X86ISelLowering.cpp
17845 ↗	(On Diff #9818)	Does this do the right thing if BottomHalf is undef? It seems like you'd want to pick the first non-undef operand (and similarly with TopHalf below).

Moved the transform to DAGCombine.cpp, since it's generic, not X86 specific.

Also added the asserts Nadav requested and addressed the bug pointed out by Hal.

Nadav: Merging the loops would make for a big, complicated mess, instead
of two small, simple loops. I'd prefer to keep them separated, if you
don't mind.

hfinkel added inline comments. May 28 2014, 12:43 AM

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4634 ↗	(On Diff #9860)	Are you sure this is always true on all possible code paths? If you are sure, please add a comment explaining where this is handled and why it will always happen first. Otherwise, please just make this a normal check.

If half is undef, and the other half is X (either true or false), this will
fall under the if statements before the call to this function in
visitSELECT. Around line 4707 (isBuildVectorAll{Zeros,Ones}). I also added
a comment at the call site of this function.

Filipe

----- Original Message -----

From: "Filipe Cabecinhas" <filcab+llvm.phabricator@gmail.com>
To: "filcab+llvm phabricator" <filcab+llvm.phabricator@gmail.com>, nrotem@apple.com, "elena demikhovsky" <elena.demikhovsky@intel.com>, grosbach@apple.com
Cc: hfinkel@anl.gov, llvm-commits@cs.uiuc.edu
Sent: Wednesday, May 28, 2014 2:48:17 AM
Subject: Re: [PATCH] Convert a vselect into a concat_vector if possible

If half is undef, and the other half is X (either true or false), this will
fall under the if statements before the call to this function in
visitSELECT. Around line 4707 (isBuildVectorAll{Zeros,Ones}). I also added
a comment at the call site of this function.

Ah, right. isBuildVectorAllZeros also skips undefs.

-Hal

Filipe

http://reviews.llvm.org/D3916
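
The undef handling discussed in this exchange can be sketched with a small standalone C++ model (not LLVM code). Here Lane, HalfKind, classifyHalf, and allZerosIgnoringUndef are hypothetical names for illustration; std::nullopt stands in for an undef lane. Rejecting an all-undef mask in allZerosIgnoringUndef is an assumption of this model, not something stated in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of a constant condition lane: nullopt means undef.
using Lane = std::optional<bool>;

enum class HalfKind { AllUndef, AllTrue, AllFalse, Mixed };

// Scan lanes [begin, end), skipping undefs, and report whether every
// defined lane carries the same value.
HalfKind classifyHalf(const std::vector<Lane> &cond, std::size_t begin,
                      std::size_t end) {
  std::optional<bool> seen;
  for (std::size_t i = begin; i < end; ++i) {
    if (!cond[i].has_value())
      continue; // undef lane: ignore it
    if (!seen.has_value())
      seen = *cond[i]; // first defined lane in this half
    else if (*seen != *cond[i])
      return HalfKind::Mixed; // two defined lanes disagree
  }
  if (!seen.has_value())
    return HalfKind::AllUndef;
  return *seen ? HalfKind::AllTrue : HalfKind::AllFalse;
}

// Model of an "all zeros, skipping undef lanes" test. Assumption: a mask
// with no defined lane at all is not treated as all zeros.
bool allZerosIgnoringUndef(const std::vector<Lane> &cond) {
  bool sawDefined = false;
  for (const Lane &lane : cond) {
    if (!lane.has_value())
      continue;
    if (*lane)
      return false;
    sawDefined = true;
  }
  return sawDefined;
}
```

Under this model, a mask such as <undef, undef, false, false> has one all-undef half, but the whole-mask all-zeros check already accepts it, so the earlier fold fires before the half-by-half scan would ever see it.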

hfinkel added inline comments. May 28 2014, 1:27 AM

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4637 ↗	(On Diff #9860)	Why does operand 0 derive from operand 1 of the inputs?

Fixed a bug pointed out by Hal Finkel, when returning the new concat_vectors.

The code was doing the exact opposite of what it should.
I also added the specific registers to the test, in a way that accounts for some order change from the instruction scheduling.

I'm having a problem in making this test generic, though. Should I do a test for some additional architectures? The test/CodeGen/Generic directory only seems to check that llc doesn't error out when running on it. But I see no way to test this optimization without actually looking at the code.

Should I do more tests? Any way to check this that I didn't think about?

Thanks,

Filipe
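
The operand-order fix mentioned in this update can be illustrated with a short standalone C++ sketch (not LLVM code). Half, Pair, and selectHalves are hypothetical names; a Pair models a concat_vectors node with a low and a high operand. The selection rule mirrors the final diff: a false (zero) half takes the operand from the third vselect argument (RHS), a true half takes it from the second (LHS), and the low result half always comes from a low operand.

```cpp
#include <array>
#include <cassert>
#include <utility>

using Half = std::array<int, 2>;
using Pair = std::pair<Half, Half>; // models concat_vectors(lo, hi)

// Build the result halves: the low half of the result comes from operand 0
// of whichever input the bottom condition selects, the high half from
// operand 1 of whichever input the top condition selects.
Pair selectHalves(bool bottomTrue, bool topTrue, const Pair &lhs,
                  const Pair &rhs) {
  return {bottomTrue ? lhs.first : rhs.first,
          topTrue ? lhs.second : rhs.second};
}
```

Swapping the roles of lhs and rhs, or of first and second, gives the kind of inverted result the earlier revision of the patch produced.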

Aside from fixing up the test, this LGTM.

test/CodeGen/X86/vselect.ll
269 ↗	(On Diff #9902)	I think this can be made a little more specific. You know what the input registers are (because they're dictated by the calling convention), so you can do something like this:

; CHECK-DAG: movlhps %xmm2, %xmm{{[01]}}
; CHECK-DAG: movlhps %xmm3, %xmm{{[01]}}

(where the CHECK-DAGs can match in either order). Because the add is commutative, it is hard to pattern match the resulting registers in a stable way (because they could be flipped). If you make the test use a subtract (so that the correct order is fixed), then we can be even more specific:

; CHECK-DAG: movlhps %xmm2, %[[REG1:[xmm[01]]]]
; CHECK-DAG: movlhps %xmm3, %[[REG2:[xmm[01]]]]
; CHECK-NEXT: subps [[REG1]], [[REG2]]

(or something like that).

Thanks, I didn't remember CHECK-DAG. Btw, if we use CHECK-DAG, we wouldn't
need the %xmm{{[01]}} regex either, unless I'm looking at it wrong.

I'll commit later today.

Thank you,

Filipe

Closed by commit rL209929 (authored by @filcab).

Revision Contents

Path                                                   Size
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp    61 lines
llvm/trunk/test/CodeGen/Generic/select.ll              1 line
llvm/trunk/test/CodeGen/X86/vselect.ll                 14 lines

Diff 9970

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[4,586 lines not shown; enclosing function: std::pair<SDValue, SDValue> SplitVSETCC(const SDNode *N, SelectionDAG &DAG)]
   std::tie(RL, RH) = DAG.SplitVectorOperand(N, 1);

   Lo = DAG.getNode(N->getOpcode(), DL, LoVT, LL, RL, N->getOperand(2));
   Hi = DAG.getNode(N->getOpcode(), DL, HiVT, LH, RH, N->getOperand(2));

   return std::make_pair(Lo, Hi);
 }

+// This function assumes all the vselect's arguments are CONCAT_VECTOR
+// nodes and that the condition is a BV of ConstantSDNodes (or undefs).
+static SDValue ConvertSelectToConcatVector(SDNode *N, SelectionDAG &DAG) {
+  SDLoc dl(N);
+  SDValue Cond = N->getOperand(0);
+  SDValue LHS = N->getOperand(1);
+  SDValue RHS = N->getOperand(2);
+  MVT VT = N->getSimpleValueType(0);
+  int NumElems = VT.getVectorNumElements();
+  assert(LHS.getOpcode() == ISD::CONCAT_VECTORS &&
+         RHS.getOpcode() == ISD::CONCAT_VECTORS &&
+         Cond.getOpcode() == ISD::BUILD_VECTOR);
+
+  // We're sure we have an even number of elements due to the
+  // concat_vectors we have as arguments to vselect.
+  // Skip BV elements until we find one that's not an UNDEF
+  // After we find an UNDEF element, keep looping until we get to half the
+  // length of the BV and see if all the non-undef nodes are the same.
+  ConstantSDNode *BottomHalf = nullptr;
+  for (int i = 0; i < NumElems / 2; ++i) {
+    if (Cond->getOperand(i)->getOpcode() == ISD::UNDEF)
+      continue;
+
+    if (BottomHalf == nullptr)
+      BottomHalf = cast<ConstantSDNode>(Cond.getOperand(i));
+    else if (Cond->getOperand(i).getNode() != BottomHalf)
+      return SDValue();
+  }
+
+  // Do the same for the second half of the BuildVector
+  ConstantSDNode *TopHalf = nullptr;
+  for (int i = NumElems / 2; i < NumElems; ++i) {
+    if (Cond->getOperand(i)->getOpcode() == ISD::UNDEF)
+      continue;
+
+    if (TopHalf == nullptr)
+      TopHalf = cast<ConstantSDNode>(Cond.getOperand(i));
+    else if (Cond->getOperand(i).getNode() != TopHalf)
+      return SDValue();
+  }
+
+  assert(TopHalf && BottomHalf &&
+         "One half of the selector was all UNDEFs and the other was all the "
+         "same value. This should have been addressed before this function.");
+  return DAG.getNode(
+      ISD::CONCAT_VECTORS, dl, VT,
+      BottomHalf->isNullValue() ? RHS->getOperand(0) : LHS->getOperand(0),
+      TopHalf->isNullValue() ? RHS->getOperand(1) : LHS->getOperand(1));
+}

 SDValue DAGCombiner::visitVSELECT(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
   SDValue N2 = N->getOperand(2);
   SDLoc DL(N);

   // Canonicalize integer abs.
   // vselect (setg[te] X, 0), X, -X ->
[56 lines not shown; enclosing function: SDValue DAGCombiner::visitVSELECT(SDNode *N)]

   // Fold (vselect (build_vector all_ones), N1, N2) -> N1
   if (ISD::isBuildVectorAllOnes(N0.getNode()))
     return N1;
   // Fold (vselect (build_vector all_zeros), N1, N2) -> N2
   if (ISD::isBuildVectorAllZeros(N0.getNode()))
     return N2;

+  // The ConvertSelectToConcatVector function is assuming both the above
+  // checks for (vselect (build_vector all{ones,zeros) ...) have been made
+  // and addressed.
+  if (N1.getOpcode() == ISD::CONCAT_VECTORS &&
+      N2.getOpcode() == ISD::CONCAT_VECTORS &&
+      ISD::isBuildVectorOfConstantSDNodes(N0.getNode())) {
+    SDValue CV = ConvertSelectToConcatVector(N, DAG);
+    if (CV.getNode())
+      return CV;
+  }

   return SDValue();
 }

 SDValue DAGCombiner::visitSELECT_CC(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
   SDValue N2 = N->getOperand(2);
   SDValue N3 = N->getOperand(3);
[6,889 lines not shown]

llvm/trunk/test/CodeGen/Generic/select.ll

[186 lines not shown]

 ; Test case for scalarising a 1 element vselect
 ;
 define <1 x i32> @checkScalariseVSELECT(<1 x i32> %a, <1 x i32> %b) {
   %cond = icmp uge <1 x i32> %a, %b
   %s = select <1 x i1> %cond, <1 x i32> %a, <1 x i32> %b
   ret <1 x i32> %s
 }

llvm/trunk/test/CodeGen/X86/vselect.ll

[256 lines not shown]
 }
 ; CHECK-LABEL: test25
 ; CHECK-NOT: psllw
 ; CHECK-NOT: psraw
 ; CHECK-NOT: xorps
 ; CHECK: movsd
 ; CHECK: ret

+define <4 x float> @select_of_shuffles_0(<2 x float> %a0, <2 x float> %b0, <2 x float> %a1, <2 x float> %b1) {
+; CHECK-LABEL: select_of_shuffles_0
+; CHECK-DAG: movlhps %xmm2, [[REGA:%xmm[0-9]+]]
+; CHECK-DAG: movlhps %xmm3, [[REGB:%xmm[0-9]+]]
+; CHECK: subps [[REGB]], [[REGA]]
+  %1 = shufflevector <2 x float> %a0, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
+  %2 = shufflevector <2 x float> %a1, <2 x float> undef, <4 x i32> <i32 undef, i32 undef, i32 0, i32 1>
+  %3 = select <4 x i1> <i1 false, i1 false, i1 true, i1 true>, <4 x float> %2, <4 x float> %1
+  %4 = shufflevector <2 x float> %b0, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
+  %5 = shufflevector <2 x float> %b1, <2 x float> undef, <4 x i32> <i32 undef, i32 undef, i32 0, i32 1>
+  %6 = select <4 x i1> <i1 false, i1 false, i1 true, i1 true>, <4 x float> %5, <4 x float> %4
+  %7 = fsub <4 x float> %3, %6
+  ret <4 x float> %7
+}
