This is an archive of the discontinued LLVM Phabricator instance.

[LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with vector output type and a vector input type that needs to be widened
Closed · Public

Authored by craig.topper on Oct 12 2018, 4:43 PM.

Details

Reviewers

efriedma
RKSimon

Commits

rGb293322cee17: [LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with…
rL345567: [LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with…

Summary

Previously, if a bitcast had a vector output type that needs promotion and a vector input type that needs widening, we would just do a stack store and load to handle the conversion. We can do a little better if we can widen the bitcast's output to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output type and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector, and the any_extend will just be removed. This avoids going through the stack and allows us to remove a custom version of this legalization from X86.
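The size arithmetic behind this can be tried outside of LLVM. The following is a minimal standalone sketch, not the SelectionDAG code itself: `VecTy`, `computeWideOutVT`, and `isLegal64Or128` are hypothetical names introduced only for illustration, and the legality predicate is a toy assumption (only 64-bit and 128-bit vectors are legal) rather than any real target's rules.

```cpp
#include <cassert>

// Hypothetical stand-in for a fixed-width vector type; this is NOT LLVM's EVT.
struct VecTy {
  unsigned NumElts; // number of elements
  unsigned EltBits; // bits per element
  unsigned sizeInBits() const { return NumElts * EltBits; }
};

// Toy legality predicate (an assumption for this sketch): only 64-bit and
// 128-bit vectors count as legal. Real targets decide this per type.
inline bool isLegal64Or128(VecTy VT) {
  unsigned Bits = VT.sizeInBits();
  return Bits == 64 || Bits == 128;
}

// Mirrors the size check described in the summary: if the widened input size
// is a multiple of the original output size, scale the output's element count
// so both types have the same width, then require the scaled type to be legal.
// Returns true and sets WideOutVT on success; otherwise the caller would fall
// back to the stack store/load path.
inline bool computeWideOutVT(VecTy OutVT, VecTy WidenedInVT,
                             bool (*IsLegal)(VecTy), VecTy &WideOutVT) {
  assert(OutVT.sizeInBits() != 0 && "output type must have a nonzero size");
  unsigned WidenInSize = WidenedInVT.sizeInBits();
  unsigned OutSize = OutVT.sizeInBits();
  if (WidenInSize % OutSize != 0)
    return false;
  unsigned Scale = WidenInSize / OutSize;
  VecTy Candidate = {OutVT.NumElts * Scale, OutVT.EltBits};
  if (!IsLegal(Candidate))
    return false;
  WideOutVT = Candidate;
  return true;
}
```

For example, assuming a <2 x half> input is widened to <4 x half> (64 bits), a <2 x i16> output (32 bits) gives a scale of 2, so the bitcast would be done to <4 x i16> before extracting the low <2 x i16> and extending it. A 48-bit output such as <3 x i16> does not divide 64 evenly, so the sketch reports failure for it.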

Diff Detail

Event Timeline

craig.topper created this revision. Oct 12 2018, 4:43 PM

RKSimon mentioned this in D53258: [LegalizeDAG] Add generic vector CTPOP expansion (PR32655). Oct 14 2018, 9:16 AM

I'd like to see more test coverage for this, I think. If I'm following correctly, this should affect something like the following on AArch64?

define <2 x i16> @foo(<2 x half> %x) {
  %y = bitcast <2 x half> %x to <2 x i16>
  ret <2 x i16> %y
}

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
326	Should we just generate an ANY_EXTEND_VECTOR_INREG directly here?

Added a test case for AArch64. I wasn't sure which file to put it in, so I made a new file and included the diff of the old vs. new code here.

Herald added a subscriber: javed.absar. Oct 20 2018, 10:49 PM

craig.topper added inline comments. Oct 20 2018, 10:53 PM

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
326	I don't think I can. I don't have any guarantee of vectors of the same size here.

Ping

LGTM

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
326	Oh, good point. It looks like they are in the cases you've shown, but maybe not in general.

This revision is now accepted and ready to land. Oct 29 2018, 4:10 PM

Closed by commit rL345567: [LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with… (authored by ctopper). Oct 29 2018, 8:29 PM

This revision was automatically updated to reflect the committed changes.

Diffusion mentioned this in rL345566: [AArch64] Add test case for D53229. NFC.

Revision Contents

Path                                                 Size
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp    20 lines
lib/Target/X86/X86ISelLowering.cpp                   10 lines
test/CodeGen/AArch64/bitcast-promote-widen.ll        14 lines

Diff 170327

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

(301 lines not shown; enclosing context: case TargetLowering::TypeSplitVector: {)
     return DAG.getNode(ISD::BITCAST, dl, NOutVT, InOp);
   }
   case TargetLowering::TypeWidenVector:
     // The input is widened to the same size. Convert to the widened value.
     // Make sure that the outgoing value is not a vector, because this would
     // make us bitcast between two vectors which are legalized in different ways.
     if (NOutVT.bitsEq(NInVT) && !NOutVT.isVector())
       return DAG.getNode(ISD::BITCAST, dl, NOutVT, GetWidenedVector(InOp));
+    // If the output type is also a vector and widening it to the same size
+    // as the widened input type would be a legal type, we can widen the bitcast
+    // and handle the promotion after.
+    if (NOutVT.isVector()) {
+      unsigned WidenInSize = NInVT.getSizeInBits();
+      unsigned OutSize = OutVT.getSizeInBits();
+      if (WidenInSize % OutSize == 0) {
+        unsigned Scale = WidenInSize / OutSize;
+        EVT WideOutVT = EVT::getVectorVT(*DAG.getContext(),
+                                         OutVT.getVectorElementType(),
+                                         OutVT.getVectorNumElements() * Scale);
+        if (isTypeLegal(WideOutVT)) {
+          InOp = DAG.getBitcast(WideOutVT, GetWidenedVector(InOp));
+          MVT IdxTy = TLI.getVectorIdxTy(DAG.getDataLayout());
+          InOp = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, OutVT, InOp,
+                             DAG.getConstant(0, dl, IdxTy));
+          return DAG.getNode(ISD::ANY_EXTEND, dl, NOutVT, InOp);
+        }
+      }
+    }
   }

   return DAG.getNode(ISD::ANY_EXTEND, dl, NOutVT,
                      CreateStackStoreLoad(InOp, OutVT));
 }

 // Helper for BSWAP/BITREVERSE promotion to ensure we can fit the shift amount
 // in the VT returned by getShiftAmountTy and to return a safe VT if we can't.
(3,342 lines not shown)

lib/Target/X86/X86ISelLowering.cpp

(26,355 lines not shown; enclosing context: if ((DstVT == MVT::v32i16 || DstVT == MVT::v64i8) &&)
       MVT CastVT = (DstVT == MVT::v32i16) ? MVT::v16i16 : MVT::v32i8;
       Lo = DAG.getBitcast(CastVT, Lo);
       Hi = DAG.getBitcast(CastVT, Hi);
       SDValue Res = DAG.getNode(ISD::CONCAT_VECTORS, dl, DstVT, Lo, Hi);
       Results.push_back(Res);
       return;
     }

-    if ((SrcVT != MVT::f64 && SrcVT != MVT::v2f32) ||
+    if (SrcVT != MVT::f64 ||
         (DstVT != MVT::v2i32 && DstVT != MVT::v4i16 && DstVT != MVT::v8i8) ||
         getTypeAction(*DAG.getContext(), DstVT) == TypeWidenVector)
       return;

     unsigned NumElts = DstVT.getVectorNumElements();
     EVT SVT = DstVT.getVectorElementType();
     EVT WiderVT = EVT::getVectorVT(*DAG.getContext(), SVT, NumElts * 2);
     SDValue Res;
-    if (SrcVT == MVT::f64)
-      Res = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl,
-                        MVT::v2f64, N->getOperand(0));
-    else
-      Res = DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v4f32, N->getOperand(0),
-                        DAG.getUNDEF(MVT::v2f32));
+    Res = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MVT::v2f64, N->getOperand(0));

     Res = DAG.getBitcast(WiderVT, Res);
     Res = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, DstVT, Res,
                       DAG.getIntPtrConstant(0, dl));
     Results.push_back(Res);
     return;
   }
   case ISD::MGATHER: {
     EVT VT = N->getValueType(0);
(15,296 lines not shown)

test/CodeGen/AArch64/bitcast-promote-widen.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=aarch64-unknown-linux-gnu | FileCheck %s

 ; Test cases of bitcasts where one type needs to be widened and one needs to be promoted.

 define <2 x i16> @bitcast_v2i16_v2f16(<2 x half> %x) {
 ; CHECK-LABEL: bitcast_v2i16_v2f16:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: sub sp, sp, #16 // =16
-; CHECK-NEXT: .cfi_def_cfa_offset 16
 ; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
-; CHECK-NEXT: str s0, [sp, #12]
-; CHECK-NEXT: ldrh w8, [sp, #12]
-; CHECK-NEXT: ldrh w9, [sp, #14]
-; CHECK-NEXT: fmov s0, w8
-; CHECK-NEXT: mov v0.s[1], w9
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $q0
-; CHECK-NEXT: add sp, sp, #16 // =16
+; CHECK-NEXT: umov w8, v0.h[0]
+; CHECK-NEXT: fmov s1, w8
+; CHECK-NEXT: umov w8, v0.h[1]
+; CHECK-NEXT: mov v1.s[1], w8
+; CHECK-NEXT: mov v0.16b, v1.16b
 ; CHECK-NEXT: ret
   %y = bitcast <2 x half> %x to <2 x i16>
   ret <2 x i16> %y
 }

 define <2 x half> @bitcast_v2f16_v2i16(<2 x i16> %x) {
 ; CHECK-LABEL: bitcast_v2f16_v2i16:
 ; CHECK: // %bb.0:
 ; CHECK-NEXT: uzp1 v0.4h, v0.4h, v0.4h
 ; CHECK-NEXT: ret
   %y = bitcast <2 x i16> %x to <2 x half>
   ret <2 x half> %y
 }