This is an archive of the discontinued LLVM Phabricator instance.

Differential D58270

[SystemZ] Load all vector and FP constants in Select()
Closed, Public

Authored by jonpa on Feb 14 2019, 6:51 PM.


Details

Reviewers

uweigand

Summary

(Continued from and replacing https://reviews.llvm.org/D58142 and https://reviews.llvm.org/D57926)

Hmm. Actually, I'm now wondering why we need to reject anything in the first place. Can't we improve isFPImmLegal to accept *anything* that can be constructed via any of the vector instructions (VGBM, VGM, VREPI)?

OK - did a new attempt with this patch to follow this broader principle.
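To make the broader principle concrete, here is a minimal standalone sketch of the two element-level tests the patch relies on: whether a splat value fits the signed 16-bit immediate of VREPI, and whether it is a (possibly wrapping) contiguous run of ones that VGM can generate. The helper names `signExtend`, `fitsVREPI`, `isShiftedMask` and `isRotatedRunOfOnes` are hypothetical and use plain `uint64_t` rather than LLVM's `APInt`. The run-of-ones test is only an illustration of the idea, not the actual `isRxSBGMask()` implementation.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend the low `bits` bits of v (1 <= bits <= 64).
// Relies on arithmetic right shift of signed values (guaranteed since C++20).
static int64_t signExtend(uint64_t v, unsigned bits) {
  return static_cast<int64_t>(v << (64 - bits)) >> (64 - bits);
}

// VREPI-style test: does the element value fit a signed 16-bit immediate?
static bool fitsVREPI(uint64_t v, unsigned bits) {
  int64_t s = signExtend(v, bits);
  return s >= -32768 && s <= 32767;
}

// True if x is a single non-wrapping run of ones (e.g. 0x0ff0).
static bool isShiftedMask(uint64_t x) {
  if (x == 0)
    return false;
  uint64_t filled = x | (x - 1);        // fill the zeros below the run
  return (filled & (filled + 1)) == 0;  // must now look like 0...01...1
}

// VGM-style test: is v, viewed as a `bits`-wide element, a contiguous run
// of ones that may wrap around from the lowest bit to the highest bit?
static bool isRotatedRunOfOnes(uint64_t v, unsigned bits) {
  uint64_t elemMask = bits == 64 ? ~0ULL : ((1ULL << bits) - 1);
  v &= elemMask;
  if (v == 0)
    return false;
  if (v == elemMask)
    return true;
  // Either the ones or the zeros form a single non-wrapping run.
  return isShiftedMask(v) || isShiftedMask(~v & elemMask);
}
```

For example, the bit pattern of the double 1.0 (`0x3FF0000000000000`) is a single run of ones, so under this sketch it passes the VGM-style test, whereas the pattern of 0.1 does not.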

So in the end, we just need two routines:

can a FP immediate or BUILD_VECTOR be loaded?

isVectorConstantLegal()

actually load a FP immediate or BUILD_VECTOR

loadVectorConstant()

Since the current code for VREP / VGM is based on finding the smallest splat, I wanted to reuse BuildVectorSDNode::isConstantSplat(). For an APFloat, it was not simple to build a temporary BVN (DAG pointer not available), but finding the splat without any undefined bits was not that much work to implement.
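The splat search for a scalar constant can be sketched as repeated halving: as long as the high and low halves of the value match, keep only one half. The following standalone version works on `uint64_t` instead of `APInt` and assumes a width of 8, 16, 32 or 64 bits. The `Splat` struct and `smallestSplat` name are hypothetical and not part of the patch.

```cpp
#include <cstdint>

// Hypothetical result type: the splat value and its width in bits.
struct Splat {
  uint64_t bits;
  unsigned width;
};

// Find the smallest element (at least 8 bits wide) that, when replicated,
// reproduces `value`. `width` is assumed to be a power of two <= 64.
static Splat smallestSplat(uint64_t value, unsigned width) {
  if (width < 64)
    value &= (1ULL << width) - 1;  // ignore anything above `width` bits
  while (width > 8) {
    unsigned half = width / 2;                  // half <= 32, so the shift is safe
    uint64_t lowMask = (1ULL << half) - 1;
    uint64_t low = value & lowMask;
    uint64_t high = (value >> half) & lowMask;
    if (high != low)                            // halves differ: cannot shrink further
      break;
    value = low;
    width = half;
  }
  return {value, width};
}
```

With this sketch, `0x4040404040404040` reduces to the 8-bit splat `0x40`, while a value such as `0x3FF0000000000000` stays at 64 bits because its halves differ.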

To try VGBM, the int bits are used, and they are either found with isConstantSplat() called with 128 as minimum splat size, or for APFloat, with a conversion to APInt.
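The VGBM test on the 128 integer bits can be illustrated as follows: every byte must be either 0x00 or 0xff, and each 0xff byte contributes one bit to a 16-bit mask. This is a sketch under stated assumptions: the 128 bits are passed as two `uint64_t` halves (`hi`, `lo`) instead of an `APInt`, mask bit I corresponds to byte I counted from the least significant byte, and the function name `byteMask` is hypothetical. It needs C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>

// Return the 16-bit byte mask if every byte of the 128-bit value (hi:lo)
// is either 0x00 or 0xff, otherwise std::nullopt.
static std::optional<uint16_t> byteMask(uint64_t hi, uint64_t lo) {
  uint16_t mask = 0;
  for (unsigned i = 0; i < 16; ++i) {
    uint64_t half = i < 8 ? lo : hi;                     // bytes 0..7 live in lo
    uint8_t byte = static_cast<uint8_t>(half >> ((i % 8) * 8));
    if (byte == 0xff)
      mask |= static_cast<uint16_t>(1u << i);            // byte i is all ones
    else if (byte != 0)
      return std::nullopt;                               // mixed byte: not a byte mask
  }
  return mask;
}
```

All-zero and all-one vectors are the two simplest cases: they yield the masks 0 and 0xffff respectively.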

Added a new struct type to wrap this called SystemZVectorConstantInfo.

This should handle any and all constants, and all previously removed tests have now been restored.

as a general principle if we have an instruction that can do something, we should be using it, if it's possible without a lot of overhead ...

There actually seem to be slightly fewer VGMGs now on SPEC after all, and this appears to be because a scalar FP constant can now reuse a vector splat constant of the same value. This is what the new test vec-const-19.ll checks.

Opcode differences on SPEC over all files:

opcode          master      patch     diff
vgmg              3982       3945      -37
mdbr              6957       6949       -8
vgbm              3885       3893       +8 ?
wfmdb            19561      19569       +8

Could we actually handle FP128 as well with a present FeatureVectorEnhancements1?

Still a little unsure about the way to do the actual selection in loadVectorConstant(). It seems that the new node must be in the DAG before it is selected, so ReplaceNode() needs to be called first, and then SelectCode(). The bitcast handling varies depending on if VecVT and VT are the same. The fact that VGBM is a machine node, but ROTATE_MASK and REPLICATE are not also makes for some more handling. I guess one could consider selecting them all as machine nodes directly, and one could do an explicit comparison of VT and VecVT perhaps

Diff Detail

Event Timeline

jonpa created this revision. Feb 14 2019, 6:51 PM

Herald added a subscriber: jdoerfert. Feb 14 2019, 6:51 PM

Could we actually handle FP128 as well with a present FeatureVectorEnhancements1?

Sure, why not ...

Still a little unsure about the way to do the actual selection in loadVectorConstant(). It seems that the new node must be in the DAG before it is selected, so ReplaceNode() needs to be called first, and then SelectCode(). The bitcast handling varies depending on if VecVT and VT are the same. The fact that VGBM is a machine node, but ROTATE_MASK and REPLICATE are not also makes for some more handling. I guess one could consider selecting them all as machine nodes directly, and one could do an explicit comparison of VT and VecVT perhaps

Maybe it would be simplest to add back SystemZISD::BYTE_MASK and then just handle everything as ISD nodes? I agree you should only create a BITCAST node if the types are actually different.

Another question: it seems odd to have to call getTargetLowering in ::Select, that seems a bit of a layering violation. Why does isVectorConstantLegal have to be a member function of SystemZTargetLowering in the first place? Can't this be just a stand-alone function?

Added handling for fp128 on z14, with some new tests in fp-const-11.ll that check that this is working. NFC on SPEC on z14.

Maybe it would be simplest to add back SystemZISD::BYTE_MASK and then just handle everything as ISD nodes? I agree you should only create a BITCAST node if the types are actually different.

Another question: it seems odd to have to call getTargetLowering in ::Select, that seems a bit of a layering violation. Why does isVectorConstantLegal have to be a member function of SystemZTargetLowering in the first place? Can't this be just a stand-alone function?

I followed those suggestions and it indeed looks better now, thanks.

This looks generally good to me. Some additional options to possibly make the code simpler occurred to me:

This looks a bit repetitive to me:

if (VCI.Opcode == SystemZISD::BYTE_MASK)
  Op = CurDAG->getNode(SystemZISD::BYTE_MASK, DL, VCI.VecVT,
                       CurDAG->getConstant(VCI.Mask, DL, MVT::i32));
else if (VCI.Opcode == SystemZISD::REPLICATE)
  Op = CurDAG->getNode(SystemZISD::REPLICATE, DL, VCI.VecVT,

Would it make sense to instead store in VCI just the Opcode and an Operands array? And then just create a DAG node with that Opcode and that list of (i32) operands, without checking specifically which Opcode it is?

Would it make sense to not just store FPImm or BVN in the VCI, but instead perform the equivalent of getIntBits() and getSplat() in the two constructors and store the result values of those in the VCI instead? This would eliminate those functions (which are completely different for FPImm and BVN anyway) and maybe just make the logic simpler overall.

lib/Target/SystemZ/SystemZISelDAGToDAG.cpp
1149	We're now so late that I don't think we need the isOpaque flag any more.
lib/Target/SystemZ/SystemZISelLowering.h
676	Do we really need to convert to APInt here just to determine the type?
683	This function (and possibly the type?) should probably be at least in the llvm::SystemZ:: namespace, to avoid polluting the common llvm:: namespace. Or else (maybe better?) make it a member function of SystemZVectorConstantInfo?

Patch updated per review, thanks.

Would it make sense to instead store in VCI just the Opcode and an Operands array? And then just create a DAG node with that Opcode and that list of (i32) operands, without checking specifically which Opcode it is?

Yes, that seems to simplify things. I however am somewhat wary to create DAG nodes during all the queries during legalization, so I instead first make a vector of unsigned and then make the actual DAG operands in loadVectorConstant().
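The design described here, recording only an opcode plus plain integer operand values during legality queries and building the real operands later, can be sketched outside of LLVM like this. Everything in the block is a hypothetical stand-in: `ConstantInfo` plays the role of SystemZVectorConstantInfo, `Node` stands in for a DAG node, and the string operands stand in for i32 constant nodes. It is not the patch's code.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the SystemZISD opcodes used by the patch.
enum class Opcode { None, ByteMask, Replicate, RotateMask };

// Filled in by the legality query; holds no graph nodes, only integers,
// so it is cheap to construct and discard many times during legalization.
struct ConstantInfo {
  Opcode opcode = Opcode::None;
  std::vector<unsigned> opVals;
};

// Hypothetical stand-in for a DAG node with its operand list.
struct Node {
  Opcode opcode;
  std::vector<std::string> operands;
};

// Called once, at selection time: turn the recorded integers into operands
// and create the node generically, without switching on the opcode.
static Node materialize(const ConstantInfo &info) {
  Node n{info.opcode, {}};
  for (unsigned v : info.opVals)
    n.operands.push_back("i32 " + std::to_string(v));
  return n;
}
```

A rotate-mask constant with start bit 2 and end bit 11 would then be recorded as `opVals = {2, 11}` and only turned into two operands when `materialize()` runs.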

Would it make sense to not just store FPImm or BVN in the VCI, but instead perform the equivalent of getIntBits() and getSplat() in the two constructors and store the result values of those in the VCI instead? This would eliminate those functions (which are completely different for FPImm and BVN anyway) and maybe just make the logic simpler overall.

That seems all good to me, except I am not sure if we would like to avoid computing the splat in the cases where it is not needed (BYTE_MASK case). If that was not a concern, it would probably be better to just have the constructors find the various data members common for both cases and then forget if it was a BVN or ConstantFP.

lib/Target/SystemZ/SystemZISelDAGToDAG.cpp
1149	right
lib/Target/SystemZ/SystemZISelLowering.h
676	Seems that we could do either (&FPImm.getSemantics() == &APFloat::IEEEquad()) or APFloat::getSizeInBits(FP.getSemantics()) > 64
683	Seems to work fine to make isVectorConstantLegal() a member function. I am guessing the rest is fine, since SystemZ is part of the name, or?

In D58270#1408154, @jonpa wrote:

Yes, that seems to simplify things. I however am somewhat wary to create DAG nodes during all the queries during legalization, so I instead first make a vector of unsigned and then make the actual DAG operands in loadVectorConstant().

Yes, that's what I meant. This is looking good to me now ...

That seems all good to me, except I am not sure if we would like to avoid computing the splat in the cases where it is not needed (BYTE_MASK case). If that was not a concern, it would probably be better to just have the constructors find the various data members common for both cases and then forget if it was a BVN or ConstantFP.

I don't think this will make much of a difference compile-time wise, so I'd prefer to go with the version where the code looks simpler ...

I don't think this will make much of a difference compile-time wise, so I'd prefer to go with the version where the code looks simpler ...

OK. It does seem to simplify things to just do the work in the constructors - patch updated.

OK, this version LGTM. Thanks!

This revision is now accepted and ready to land. Feb 26 2019, 1:23 AM

Thanks for help and review!

r354896

Revision Contents

Path                                                  Size
lib/Target/SystemZ/SystemZISelDAGToDAG.cpp            61 lines
lib/Target/SystemZ/SystemZISelLowering.h              25 lines
lib/Target/SystemZ/SystemZISelLowering.cpp            247 lines
lib/Target/SystemZ/SystemZInstrVector.td              2 lines
lib/Target/SystemZ/SystemZOperators.td                1 line
test/CodeGen/SystemZ/ (four files, names not shown)   30, 57, 40 and 18 lines

Diff 188220

lib/Target/SystemZ/SystemZISelDAGToDAG.cpp

[...] class SystemZDAGToDAGISel : public SelectionDAGISel {
// (Opcode UpperVal LowerVal)		// (Opcode UpperVal LowerVal)
//		//
// If Op0 is nonnull, then Node can be implemented using:		// If Op0 is nonnull, then Node can be implemented using:
//		//
// (Opcode (Opcode Op0 UpperVal) LowerVal)		// (Opcode (Opcode Op0 UpperVal) LowerVal)
void splitLargeImmediate(unsigned Opcode, SDNode *Node, SDValue Op0,		void splitLargeImmediate(unsigned Opcode, SDNode *Node, SDValue Op0,
uint64_t UpperVal, uint64_t LowerVal);		uint64_t UpperVal, uint64_t LowerVal);

		void loadVectorConstant(const SystemZVectorConstantInfo &VCI,
		SDNode *Node);

// Try to use gather instruction Opcode to implement vector insertion N.		// Try to use gather instruction Opcode to implement vector insertion N.
bool tryGather(SDNode *N, unsigned Opcode);		bool tryGather(SDNode *N, unsigned Opcode);

// Try to use scatter instruction Opcode to implement store Store.		// Try to use scatter instruction Opcode to implement store Store.
bool tryScatter(StoreSDNode *Store, unsigned Opcode);		bool tryScatter(StoreSDNode *Store, unsigned Opcode);

// Change a chain of {load; op; store} of the same value into a simple op		// Change a chain of {load; op; store} of the same value into a simple op
// through memory of that value, if the uses of the modified value and its		// through memory of that value, if the uses of the modified value and its
[...] void SystemZDAGToDAGISel::splitLargeImmediate(unsigned Opcode, SDNode *Node,
SDValue Lower = CurDAG->getConstant(LowerVal, DL, VT);		SDValue Lower = CurDAG->getConstant(LowerVal, DL, VT);
SDValue Or = CurDAG->getNode(Opcode, DL, VT, Upper, Lower);		SDValue Or = CurDAG->getNode(Opcode, DL, VT, Upper, Lower);

ReplaceNode(Node, Or.getNode());		ReplaceNode(Node, Or.getNode());

SelectCode(Or.getNode());		SelectCode(Or.getNode());
}		}

		void SystemZDAGToDAGISel::loadVectorConstant(
		const SystemZVectorConstantInfo &VCI, SDNode *Node) {
		assert((VCI.Opcode == SystemZISD::BYTE_MASK ||
		VCI.Opcode == SystemZISD::REPLICATE ||
		VCI.Opcode == SystemZISD::ROTATE_MASK) &&
		"Bad opcode!");
		assert(VCI.VecVT.getSizeInBits() == 128 && "Expected a vector type");
		EVT VT = Node->getValueType(0);
		SDLoc DL(Node);
		SmallVector<SDValue, 2> Ops;
		for (unsigned OpVal : VCI.OpVals)
		Ops.push_back(CurDAG->getConstant(OpVal, DL, MVT::i32));
		SDValue Op = CurDAG->getNode(VCI.Opcode, DL, VCI.VecVT, Ops);

		if (VCI.VecVT == VT.getSimpleVT())
		ReplaceNode(Node, Op.getNode());
		else if (VT.getSizeInBits() == 128) {
		SDValue BitCast = CurDAG->getNode(ISD::BITCAST, DL, VT, Op);
		ReplaceNode(Node, BitCast.getNode());
		SelectCode(BitCast.getNode());
		} else { // float or double
		unsigned SubRegIdx =
		(VT.getSizeInBits() == 32 ? SystemZ::subreg_h32 : SystemZ::subreg_h64);
		ReplaceNode(
		Node, CurDAG->getTargetExtractSubreg(SubRegIdx, DL, VT, Op).getNode());
		}
		SelectCode(Op.getNode());
		}

bool SystemZDAGToDAGISel::tryGather(SDNode *N, unsigned Opcode) {		bool SystemZDAGToDAGISel::tryGather(SDNode *N, unsigned Opcode) {
SDValue ElemV = N->getOperand(2);		SDValue ElemV = N->getOperand(2);
auto *ElemN = dyn_cast<ConstantSDNode>(ElemV);		auto *ElemN = dyn_cast<ConstantSDNode>(ElemV);
if (!ElemN)		if (!ElemN)
return false;		return false;

unsigned Elem = ElemN->getZExtValue();		unsigned Elem = ElemN->getZExtValue();
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
[...] if (ElemBitSize == 32) {
if (tryGather(Node, SystemZ::VGEG))		if (tryGather(Node, SystemZ::VGEG))
return;		return;
}		}
break;		break;
}		}

case ISD::BUILD_VECTOR: {		case ISD::BUILD_VECTOR: {
auto *BVN = cast<BuildVectorSDNode>(Node);		auto *BVN = cast<BuildVectorSDNode>(Node);
SDLoc DL(Node);		SystemZVectorConstantInfo VCI(BVN);
EVT VT = Node->getValueType(0);		if (VCI.isVectorConstantLegal(*Subtarget)) {
uint64_t Mask = 0;		loadVectorConstant(VCI, Node);
if (SystemZTargetLowering::tryBuildVectorByteMask(BVN, Mask)) {
SDNode *Res = CurDAG->getMachineNode(SystemZ::VGBM, DL, VT,
CurDAG->getTargetConstant(Mask, DL, MVT::i32));
ReplaceNode(Node, Res);
return;		return;
}		}
break;		break;
}		}

case ISD::ConstantFP: {		case ISD::ConstantFP: {
APFloat Imm = cast<ConstantFPSDNode>(Node)->getValueAPF();		APFloat Imm = cast<ConstantFPSDNode>(Node)->getValueAPF();
if (Imm.isZero() || Imm.isNegZero())		if (Imm.isZero() || Imm.isNegZero())
break;		break;
const SystemZInstrInfo *TII = getInstrInfo();		SystemZVectorConstantInfo VCI(Imm);
EVT VT = Node->getValueType(0);		bool Success = VCI.isVectorConstantLegal(*Subtarget); (void)Success;
unsigned Start, End;
unsigned BitWidth = VT.getSizeInBits();
bool Success = SystemZTargetLowering::analyzeFPImm(Imm, BitWidth, Start,
End, static_cast<const SystemZInstrInfo *>(TII)); (void)Success;
assert(Success && "Expected legal FP immediate");		assert(Success && "Expected legal FP immediate");
SDLoc DL(Node);		loadVectorConstant(VCI, Node);
unsigned Opcode = (BitWidth == 32 ? SystemZ::VGMF : SystemZ::VGMG);
SDNode *Res = CurDAG->getMachineNode(Opcode, DL, VT,
CurDAG->getTargetConstant(Start, DL, MVT::i32),
CurDAG->getTargetConstant(End, DL, MVT::i32));
unsigned SubRegIdx = (BitWidth == 32 ? SystemZ::subreg_h32
: SystemZ::subreg_h64);
Res = CurDAG->getTargetExtractSubreg(SubRegIdx, DL, VT, SDValue(Res, 0))
.getNode();
ReplaceNode(Node, Res);
return;		return;
}		}

case ISD::STORE: {		case ISD::STORE: {
if (tryFoldLoadStoreIntoMemOperand(Node))		if (tryFoldLoadStoreIntoMemOperand(Node))
return;		return;
auto *Store = cast<StoreSDNode>(Node);		auto *Store = cast<StoreSDNode>(Node);
unsigned ElemBitSize = Store->getValue().getValueSizeInBits();		unsigned ElemBitSize = Store->getValue().getValueSizeInBits();

lib/Target/SystemZ/SystemZISelLowering.h

[...] enum NodeType : unsigned {
// the TDB pointer, and the third the immediate control field.		// the TDB pointer, and the third the immediate control field.
// Returns CC value and chain.		// Returns CC value and chain.
TBEGIN,		TBEGIN,
TBEGIN_NOFLOAT,		TBEGIN_NOFLOAT,

// Transaction end. Just the chain operand. Returns CC value and chain.		// Transaction end. Just the chain operand. Returns CC value and chain.
TEND,		TEND,

		// Create a vector constant by filling byte N of the result with bit
		// 15-N of the single operand.
		BYTE_MASK,

// Create a vector constant by replicating an element-sized RISBG-style mask.		// Create a vector constant by replicating an element-sized RISBG-style mask.
// The first operand specifies the starting set bit and the second operand		// The first operand specifies the starting set bit and the second operand
// specifies the ending set bit. Both operands count from the MSB of the		// specifies the ending set bit. Both operands count from the MSB of the
// element.		// element.
ROTATE_MASK,		ROTATE_MASK,

// Replicate a GPR scalar value into all elements of a vector.		// Replicate a GPR scalar value into all elements of a vector.
REPLICATE,		REPLICATE,
[...] public:
ISD::NodeType getExtendForAtomicOps() const override {		ISD::NodeType getExtendForAtomicOps() const override {
return ISD::ANY_EXTEND;		return ISD::ANY_EXTEND;
}		}

bool supportSwiftError() const override {		bool supportSwiftError() const override {
return true;		return true;
}		}

static bool tryBuildVectorByteMask(BuildVectorSDNode *BVN, uint64_t &Mask);
static bool analyzeFPImm(const APFloat &Imm, unsigned BitWidth,
unsigned &Start, unsigned &End, const SystemZInstrInfo *TII);
private:		private:
const SystemZSubtarget &Subtarget;		const SystemZSubtarget &Subtarget;

// Implement LowerOperation for individual opcodes.		// Implement LowerOperation for individual opcodes.
SDValue getVectorCmp(SelectionDAG &DAG, unsigned Opcode,		SDValue getVectorCmp(SelectionDAG &DAG, unsigned Opcode,
const SDLoc &DL, EVT VT,		const SDLoc &DL, EVT VT,
SDValue CmpOp0, SDValue CmpOp1) const;		SDValue CmpOp0, SDValue CmpOp1) const;
SDValue lowerVectorSETCC(SelectionDAG &DAG, const SDLoc &DL,		SDValue lowerVectorSETCC(SelectionDAG &DAG, const SDLoc &DL,
[...] MachineBasicBlock *emitTransactionBegin(MachineInstr &MI,
MachineBasicBlock *MBB,		MachineBasicBlock *MBB,
unsigned Opcode, bool NoFloat) const;		unsigned Opcode, bool NoFloat) const;
MachineBasicBlock *emitLoadAndTestCmp0(MachineInstr &MI,		MachineBasicBlock *emitLoadAndTestCmp0(MachineInstr &MI,
MachineBasicBlock *MBB,		MachineBasicBlock *MBB,
unsigned Opcode) const;		unsigned Opcode) const;

const TargetRegisterClass *getRepRegClassFor(MVT VT) const override;		const TargetRegisterClass *getRepRegClassFor(MVT VT) const override;
};		};

		struct SystemZVectorConstantInfo {
		private:
		APInt IntBits; // The 128 bits as an integer.
		APInt SplatBits; // Smallest splat value.
		APInt SplatUndef; // Bits corresponding to undef operands of the BVN.
		unsigned SplatBitSize = 0;
		bool isFP128 = false;

		public:
		unsigned Opcode = 0;
		SmallVector<unsigned, 2> OpVals;
		MVT VecVT;
		SystemZVectorConstantInfo(APFloat FPImm);
		SystemZVectorConstantInfo(BuildVectorSDNode *BVN);
		bool isVectorConstantLegal(const SystemZSubtarget &Subtarget);
		};

} // end namespace llvm		} // end namespace llvm

#endif		#endif

lib/Target/SystemZ/SystemZISelLowering.cpp


[...] case MVT::f128:
return Subtarget.hasVectorEnhancements1();		return Subtarget.hasVectorEnhancements1();
default:		default:
break;		break;
}		}

return false;		return false;
}		}

		// Return true if the constant can be generated with a vector instruction,
		// such as VGM, VGMB or VREPI.
		bool SystemZVectorConstantInfo::isVectorConstantLegal(
		const SystemZSubtarget &Subtarget) {
		const SystemZInstrInfo *TII =
		static_cast<const SystemZInstrInfo *>(Subtarget.getInstrInfo());
		if (!Subtarget.hasVector() ||
		(isFP128 && !Subtarget.hasVectorEnhancements1()))
		return false;

// Return true if Imm can be generated with a vector instruction, such as VGM.		// Try using VECTOR GENERATE BYTE MASK. This is the architecturally-
bool SystemZTargetLowering::		// preferred way of creating all-zero and all-one vectors so give it
analyzeFPImm(const APFloat &Imm, unsigned BitWidth, unsigned &Start,		// priority over other methods below.
unsigned &End, const SystemZInstrInfo *TII) {		unsigned Mask = 0;
APInt IntImm = Imm.bitcastToAPInt();		unsigned I = 0;
if (IntImm.getActiveBits() > 64)		for (; I < SystemZ::VectorBytes; ++I) {
return false;		uint64_t Byte = IntBits.lshr(I * 8).trunc(8).getZExtValue();
		if (Byte == 0xff)
// See if this immediate could be generated with VGM.		Mask |= 1ULL << I;
bool Success = TII->isRxSBGMask(IntImm.getZExtValue(), BitWidth, Start, End);		else if (Byte != 0)
if (!Success)		break;
return false;		}
// isRxSBGMask returns the bit numbers for a full 64-bit value,		if (I == SystemZ::VectorBytes) {
// with 0 denoting 1 << 63 and 63 denoting 1. Convert them to		Opcode = SystemZISD::BYTE_MASK;
// bit numbers for an BitsPerElement value, so that 0 denotes		OpVals.push_back(Mask);
// 1 << (BitsPerElement-1).		VecVT = MVT::getVectorVT(MVT::getIntegerVT(8), 16);
Start -= 64 - BitWidth;
End -= 64 - BitWidth;
return true;		return true;
}		}

		if (SplatBitSize > 64)
		return false;

		auto tryValue = [&](uint64_t Value) -> bool {
		// Try VECTOR REPLICATE IMMEDIATE
		int64_t SignedValue = SignExtend64(Value, SplatBitSize);
		if (isInt<16>(SignedValue)) {
		OpVals.push_back(((unsigned) SignedValue));
		Opcode = SystemZISD::REPLICATE;
		VecVT = MVT::getVectorVT(MVT::getIntegerVT(SplatBitSize),
		SystemZ::VectorBits / SplatBitSize);
		return true;
		}
		// Try VECTOR GENERATE MASK
		unsigned Start, End;
		if (TII->isRxSBGMask(Value, SplatBitSize, Start, End)) {
		// isRxSBGMask returns the bit numbers for a full 64-bit value, with 0
		// denoting 1 << 63 and 63 denoting 1. Convert them to bit numbers for
		// an SplatBitSize value, so that 0 denotes 1 << (SplatBitSize-1).
		OpVals.push_back(Start - (64 - SplatBitSize));
		OpVals.push_back(End - (64 - SplatBitSize));
		Opcode = SystemZISD::ROTATE_MASK;
		VecVT = MVT::getVectorVT(MVT::getIntegerVT(SplatBitSize),
		SystemZ::VectorBits / SplatBitSize);
		return true;
		}
		return false;
		};

		// First try assuming that any undefined bits above the highest set bit
		// and below the lowest set bit are 1s. This increases the likelihood of
		// being able to use a sign-extended element value in VECTOR REPLICATE
		// IMMEDIATE or a wraparound mask in VECTOR GENERATE MASK.
		uint64_t SplatBitsZ = SplatBits.getZExtValue();
		uint64_t SplatUndefZ = SplatUndef.getZExtValue();
		uint64_t Lower =
		(SplatUndefZ & ((uint64_t(1) << findFirstSet(SplatBitsZ)) - 1));
		uint64_t Upper =
		(SplatUndefZ & ~((uint64_t(1) << findLastSet(SplatBitsZ)) - 1));
		if (tryValue(SplatBitsZ | Upper | Lower))
		return true;

		// Now try assuming that any undefined bits between the first and
		// last defined set bits are set. This increases the chances of
		// using a non-wraparound mask.
		uint64_t Middle = SplatUndefZ & ~Upper & ~Lower;
		return tryValue(SplatBitsZ | Middle);
		}

		SystemZVectorConstantInfo::SystemZVectorConstantInfo(APFloat FPImm) {
		IntBits = FPImm.bitcastToAPInt().zextOrSelf(128);
		isFP128 = (&FPImm.getSemantics() == &APFloat::IEEEquad());

		// Find the smallest splat.
		SplatBits = FPImm.bitcastToAPInt();
		unsigned Width = SplatBits.getBitWidth();
		while (Width > 8) {
		unsigned HalfSize = Width / 2;
		APInt HighValue = SplatBits.lshr(HalfSize).trunc(HalfSize);
		APInt LowValue = SplatBits.trunc(HalfSize);

		// If the two halves do not match, stop here.
		if (HighValue != LowValue || 8 > HalfSize)
		break;

		SplatBits = HighValue;
		Width = HalfSize;
		}
		SplatUndef = 0;
		SplatBitSize = Width;
		}

		SystemZVectorConstantInfo::SystemZVectorConstantInfo(BuildVectorSDNode *BVN) {
		assert(BVN->isConstant() && "Expected a constant BUILD_VECTOR");
		bool HasAnyUndefs;

		// Get IntBits by finding the 128 bit splat.
		BVN->isConstantSplat(IntBits, SplatUndef, SplatBitSize, HasAnyUndefs, 128,
		true);

		// Get SplatBits by finding the 8 bit or greater splat.
		BVN->isConstantSplat(SplatBits, SplatUndef, SplatBitSize, HasAnyUndefs, 8,
		true);
		}

bool SystemZTargetLowering::isFPImmLegal(const APFloat &Imm, EVT VT) const {		bool SystemZTargetLowering::isFPImmLegal(const APFloat &Imm, EVT VT) const {
// We can load zero using LZ?R and negative zero using LZ?R;LC?BR.		// We can load zero using LZ?R and negative zero using LZ?R;LC?BR.
if (Imm.isZero() || Imm.isNegZero())		if (Imm.isZero() || Imm.isNegZero())
return true;		return true;

if (!Subtarget.hasVector())		return SystemZVectorConstantInfo(Imm).isVectorConstantLegal(Subtarget);
return false;
const SystemZInstrInfo *TII =
static_cast<const SystemZInstrInfo *>(Subtarget.getInstrInfo());
unsigned Start, End;
return analyzeFPImm(Imm, VT.getSizeInBits(), Start, End, TII);
}		}

bool SystemZTargetLowering::isLegalICmpImmediate(int64_t Imm) const {		bool SystemZTargetLowering::isLegalICmpImmediate(int64_t Imm) const {
// We can use CGFI or CLGFI.		// We can use CGFI or CLGFI.
return isInt<32>(Imm) || isUInt<32>(Imm);		return isInt<32>(Imm) || isUInt<32>(Imm);
}		}

bool SystemZTargetLowering::isLegalAddImmediate(int64_t Imm) const {		bool SystemZTargetLowering::isLegalAddImmediate(int64_t Imm) const {
[...] else if (Op1.isUndef())
Op0 = Op1 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op0);		Op0 = Op1 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op0);
else {		else {
Op0 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op0);		Op0 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op0);
Op1 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op1);		Op1 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op1);
}		}
return DAG.getNode(SystemZISD::JOIN_DWORDS, DL, MVT::v2i64, Op0, Op1);		return DAG.getNode(SystemZISD::JOIN_DWORDS, DL, MVT::v2i64, Op0, Op1);
}		}

// Try to represent constant BUILD_VECTOR node BVN using a BYTE MASK style
// mask. Store the mask value in Mask on success.
bool SystemZTargetLowering::
tryBuildVectorByteMask(BuildVectorSDNode *BVN, uint64_t &Mask) {
EVT ElemVT = BVN->getValueType(0).getVectorElementType();
unsigned BytesPerElement = ElemVT.getStoreSize();
for (unsigned I = 0, E = BVN->getNumOperands(); I != E; ++I) {
SDValue Op = BVN->getOperand(I);
if (!Op.isUndef()) {
uint64_t Value;
if (Op.getOpcode() == ISD::Constant)
Value = cast<ConstantSDNode>(Op)->getZExtValue();
else if (Op.getOpcode() == ISD::ConstantFP)
Value = (cast<ConstantFPSDNode>(Op)->getValueAPF().bitcastToAPInt()
.getZExtValue());
else
return false;
for (unsigned J = 0; J < BytesPerElement; ++J) {
uint64_t Byte = (Value >> (J * 8)) & 0xff;
if (Byte == 0xff)
Mask |= 1ULL << ((E - I - 1) * BytesPerElement + J);
else if (Byte != 0)
return false;
}
}
}
return true;
}

// Try to load a vector constant in which BitsPerElement-bit value Value
// is replicated to fill the vector. VT is the type of the resulting
// constant, which may have elements of a different size from BitsPerElement.
// Return the SDValue of the constant on success, otherwise return
// an empty value.
static SDValue tryBuildVectorReplicate(SelectionDAG &DAG,
const SystemZInstrInfo *TII,
const SDLoc &DL, EVT VT, uint64_t Value,
unsigned BitsPerElement) {
// Signed 16-bit values can be replicated using VREPI.
// Mark the constants as opaque or DAGCombiner will convert back to
// BUILD_VECTOR.
int64_t SignedValue = SignExtend64(Value, BitsPerElement);
if (isInt<16>(SignedValue)) {
MVT VecVT = MVT::getVectorVT(MVT::getIntegerVT(BitsPerElement),
SystemZ::VectorBits / BitsPerElement);
SDValue Op = DAG.getNode(
SystemZISD::REPLICATE, DL, VecVT,
DAG.getConstant(SignedValue, DL, MVT::i32, false, true /*isOpaque*/));
return DAG.getNode(ISD::BITCAST, DL, VT, Op);
}
// See whether rotating the constant left some N places gives a value that
// is one less than a power of 2 (i.e. all zeros followed by all ones).
// If so we can use VGM.
unsigned Start, End;
if (TII->isRxSBGMask(Value, BitsPerElement, Start, End)) {
// isRxSBGMask returns the bit numbers for a full 64-bit value,
// with 0 denoting 1 << 63 and 63 denoting 1. Convert them to
// bit numbers for an BitsPerElement value, so that 0 denotes
// 1 << (BitsPerElement-1).
Start -= 64 - BitsPerElement;
End -= 64 - BitsPerElement;
MVT VecVT = MVT::getVectorVT(MVT::getIntegerVT(BitsPerElement),
SystemZ::VectorBits / BitsPerElement);
SDValue Op = DAG.getNode(
SystemZISD::ROTATE_MASK, DL, VecVT,
DAG.getConstant(Start, DL, MVT::i32, false, true /*isOpaque*/),
DAG.getConstant(End, DL, MVT::i32, false, true /*isOpaque*/));
return DAG.getNode(ISD::BITCAST, DL, VT, Op);
}
return SDValue();
}

// If a BUILD_VECTOR contains some EXTRACT_VECTOR_ELTs, it's usually		// If a BUILD_VECTOR contains some EXTRACT_VECTOR_ELTs, it's usually
// better to use VECTOR_SHUFFLEs on them, only using BUILD_VECTOR for		// better to use VECTOR_SHUFFLEs on them, only using BUILD_VECTOR for
// the non-EXTRACT_VECTOR_ELT elements. See if the given BUILD_VECTOR		// the non-EXTRACT_VECTOR_ELT elements. See if the given BUILD_VECTOR
// would benefit from this representation and return it if so.		// would benefit from this representation and return it if so.
static SDValue tryBuildVectorShuffle(SelectionDAG &DAG,		static SDValue tryBuildVectorShuffle(SelectionDAG &DAG,
BuildVectorSDNode *BVN) {		BuildVectorSDNode *BVN) {
EVT VT = BVN->getValueType(0);		EVT VT = BVN->getValueType(0);
unsigned NumElements = VT.getVectorNumElements();		unsigned NumElements = VT.getVectorNumElements();
[...] for (unsigned I = 0; I < NumElements; ++I)
if (!Done[I] && !Elems[I].isUndef() && Elems[I] != ReplicatedVal)		if (!Done[I] && !Elems[I].isUndef() && Elems[I] != ReplicatedVal)
Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VT, Result, Elems[I],		Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VT, Result, Elems[I],
DAG.getConstant(I, DL, MVT::i32));		DAG.getConstant(I, DL, MVT::i32));
return Result;		return Result;
}		}

SDValue SystemZTargetLowering::lowerBUILD_VECTOR(SDValue Op,		SDValue SystemZTargetLowering::lowerBUILD_VECTOR(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
const SystemZInstrInfo *TII =
static_cast<const SystemZInstrInfo *>(Subtarget.getInstrInfo());
auto *BVN = cast<BuildVectorSDNode>(Op.getNode());		auto *BVN = cast<BuildVectorSDNode>(Op.getNode());
SDLoc DL(Op);		SDLoc DL(Op);
EVT VT = Op.getValueType();		EVT VT = Op.getValueType();

if (BVN->isConstant()) {		if (BVN->isConstant()) {
// Try using VECTOR GENERATE BYTE MASK. This is the architecturally-		if (SystemZVectorConstantInfo(BVN).isVectorConstantLegal(Subtarget))
// preferred way of creating all-zero and all-one vectors so give it
// priority over other methods below.
uint64_t Mask;
if (ISD::isBuildVectorAllZeros(Op.getNode()) ||
ISD::isBuildVectorAllOnes(Op.getNode()) ||
(VT.isInteger() && tryBuildVectorByteMask(BVN, Mask)))
return Op;

// Try using some form of replication.
APInt SplatBits, SplatUndef;
unsigned SplatBitSize;
bool HasAnyUndefs;
if (BVN->isConstantSplat(SplatBits, SplatUndef, SplatBitSize, HasAnyUndefs,
8, true) &&
SplatBitSize <= 64) {
// First try assuming that any undefined bits above the highest set bit
// and below the lowest set bit are 1s. This increases the likelihood of
// being able to use a sign-extended element value in VECTOR REPLICATE
// IMMEDIATE or a wraparound mask in VECTOR GENERATE MASK.
uint64_t SplatBitsZ = SplatBits.getZExtValue();
uint64_t SplatUndefZ = SplatUndef.getZExtValue();
uint64_t Lower = (SplatUndefZ
& ((uint64_t(1) << findFirstSet(SplatBitsZ)) - 1));
uint64_t Upper = (SplatUndefZ
& ~((uint64_t(1) << findLastSet(SplatBitsZ)) - 1));
uint64_t Value = SplatBitsZ | Upper | Lower;
SDValue Op = tryBuildVectorReplicate(DAG, TII, DL, VT, Value,
SplatBitSize);
if (Op.getNode())
return Op;

// Now try assuming that any undefined bits between the first and
// last defined set bits are set. This increases the chances of
// using a non-wraparound mask.
uint64_t Middle = SplatUndefZ & ~Upper & ~Lower;
Value = SplatBitsZ | Middle;
Op = tryBuildVectorReplicate(DAG, TII, DL, VT, Value, SplatBitSize);
if (Op.getNode())
return Op;		return Op;
}

// Fall back to loading it from memory.		// Fall back to loading it from memory.
return SDValue();		return SDValue();
}		}

// See if we should use shuffles to construct the vector from other vectors.		// See if we should use shuffles to construct the vector from other vectors.
if (SDValue Res = tryBuildVectorShuffle(DAG, BVN))		if (SDValue Res = tryBuildVectorShuffle(DAG, BVN))
return Res;		return Res;
[...]
switch ((SystemZISD::NodeType)Opcode) {
OPCODE(STPCPY);		OPCODE(STPCPY);
OPCODE(STRCMP);		OPCODE(STRCMP);
OPCODE(SEARCH_STRING);		OPCODE(SEARCH_STRING);
OPCODE(IPM);		OPCODE(IPM);
OPCODE(MEMBARRIER);		OPCODE(MEMBARRIER);
OPCODE(TBEGIN);		OPCODE(TBEGIN);
OPCODE(TBEGIN_NOFLOAT);		OPCODE(TBEGIN_NOFLOAT);
OPCODE(TEND);		OPCODE(TEND);
		OPCODE(BYTE_MASK);
OPCODE(ROTATE_MASK);		OPCODE(ROTATE_MASK);
OPCODE(REPLICATE);		OPCODE(REPLICATE);
OPCODE(JOIN_DWORDS);		OPCODE(JOIN_DWORDS);
OPCODE(SPLAT);		OPCODE(SPLAT);
OPCODE(MERGE_HIGH);		OPCODE(MERGE_HIGH);
OPCODE(MERGE_LOW);		OPCODE(MERGE_LOW);
OPCODE(SHL_DOUBLE);		OPCODE(SHL_DOUBLE);
OPCODE(PERMUTE_DWORDS);		OPCODE(PERMUTE_DWORDS);

lib/Target/SystemZ/SystemZInstrVector.td

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	let Predicates = [FeatureVector] in {			let Predicates = [FeatureVector] in {
	let isAsCheapAsAMove = 1, isMoveImm = 1, isReMaterializable = 1 in {			let isAsCheapAsAMove = 1, isMoveImm = 1, isReMaterializable = 1 in {

	// Generate byte mask.			// Generate byte mask.
	def VZERO : InherentVRIa<"vzero", 0xE744, 0>;			def VZERO : InherentVRIa<"vzero", 0xE744, 0>;
	def VONE : InherentVRIa<"vone", 0xE744, 0xffff>;			def VONE : InherentVRIa<"vone", 0xE744, 0xffff>;
	def VGBM : UnaryVRIa<"vgbm", 0xE744, null_frag, v128b, imm32zx16>;			def VGBM : UnaryVRIa<"vgbm", 0xE744, z_byte_mask, v128b, imm32zx16>;

	// Generate mask.			// Generate mask.
	def VGM : BinaryVRIbGeneric<"vgm", 0xE746>;			def VGM : BinaryVRIbGeneric<"vgm", 0xE746>;
	def VGMB : BinaryVRIb<"vgmb", 0xE746, z_rotate_mask, v128b, 0>;			def VGMB : BinaryVRIb<"vgmb", 0xE746, z_rotate_mask, v128b, 0>;
	def VGMH : BinaryVRIb<"vgmh", 0xE746, z_rotate_mask, v128h, 1>;			def VGMH : BinaryVRIb<"vgmh", 0xE746, z_rotate_mask, v128h, 1>;
	def VGMF : BinaryVRIb<"vgmf", 0xE746, z_rotate_mask, v128f, 2>;			def VGMF : BinaryVRIb<"vgmf", 0xE746, z_rotate_mask, v128f, 2>;
	def VGMG : BinaryVRIb<"vgmg", 0xE746, z_rotate_mask, v128g, 3>;			def VGMG : BinaryVRIb<"vgmg", 0xE746, z_rotate_mask, v128g, 3>;


lib/Target/SystemZ/SystemZOperators.td


	def z_tdc : SDNode<"SystemZISD::TDC", SDT_ZTest>;			def z_tdc : SDNode<"SystemZISD::TDC", SDT_ZTest>;

	// Defined because the index is an i32 rather than a pointer.			// Defined because the index is an i32 rather than a pointer.
	def z_vector_insert : SDNode<"ISD::INSERT_VECTOR_ELT",			def z_vector_insert : SDNode<"ISD::INSERT_VECTOR_ELT",
	SDT_ZInsertVectorElt>;			SDT_ZInsertVectorElt>;
	def z_vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT",			def z_vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT",
	SDT_ZExtractVectorElt>;			SDT_ZExtractVectorElt>;
				def z_byte_mask : SDNode<"SystemZISD::BYTE_MASK", SDT_ZReplicate>;
	def z_rotate_mask : SDNode<"SystemZISD::ROTATE_MASK", SDT_ZRotateMask>;			def z_rotate_mask : SDNode<"SystemZISD::ROTATE_MASK", SDT_ZRotateMask>;
	def z_replicate : SDNode<"SystemZISD::REPLICATE", SDT_ZReplicate>;			def z_replicate : SDNode<"SystemZISD::REPLICATE", SDT_ZReplicate>;
	def z_join_dwords : SDNode<"SystemZISD::JOIN_DWORDS", SDT_ZJoinDwords>;			def z_join_dwords : SDNode<"SystemZISD::JOIN_DWORDS", SDT_ZJoinDwords>;
	def z_splat : SDNode<"SystemZISD::SPLAT", SDT_ZVecBinaryInt>;			def z_splat : SDNode<"SystemZISD::SPLAT", SDT_ZVecBinaryInt>;
	def z_merge_high : SDNode<"SystemZISD::MERGE_HIGH", SDT_ZVecBinary>;			def z_merge_high : SDNode<"SystemZISD::MERGE_HIGH", SDT_ZVecBinary>;
	def z_merge_low : SDNode<"SystemZISD::MERGE_LOW", SDT_ZVecBinary>;			def z_merge_low : SDNode<"SystemZISD::MERGE_LOW", SDT_ZVecBinary>;
	def z_shl_double : SDNode<"SystemZISD::SHL_DOUBLE", SDT_ZVecTernaryInt>;			def z_shl_double : SDNode<"SystemZISD::SHL_DOUBLE", SDT_ZVecTernaryInt>;
	def z_permute_dwords : SDNode<"SystemZISD::PERMUTE_DWORDS",			def z_permute_dwords : SDNode<"SystemZISD::PERMUTE_DWORDS",

test/CodeGen/SystemZ/fp-const-11.ll

	; CHECK: vl [[REG:%v[0-9]+]], 0([[REGISTER]])			; CHECK: vl [[REG:%v[0-9]+]], 0([[REGISTER]])
	; CHECK: vst [[REG]], 0(%r2)			; CHECK: vst [[REG]], 0(%r2)
	; CHECK: br %r14			; CHECK: br %r14
	; CONST: .quad 4611404543484231680			; CONST: .quad 4611404543484231680
	; CONST: .quad 0			; CONST: .quad 0
	store fp128 0xL00000000000000003fff000002000000, fp128 *%x			store fp128 0xL00000000000000003fff000002000000, fp128 *%x
	ret void			ret void
	}			}

				; Test that VGBM works.
				define void @f4(fp128 *%x) {
				; CHECK-LABEL: f4:
				; CHECK: vgbm %v0, 21845
				; CHECK-NEXT: vst %v0, 0(%r2)
				; CHECK-NEXT: br %r14
				store fp128 0xL00ff00ff00ff00ff00ff00ff00ff00ff, fp128 *%x
				ret void
				}

				; Test that VREPI works.
				define void @f5(fp128 *%x) {
				; CHECK-LABEL: f5:
				; CHECK: vrepib %v0, -8
				; CHECK-NEXT: vst %v0, 0(%r2)
				; CHECK-NEXT: br %r14
				store fp128 0xLf8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8, fp128 *%x
				ret void
				}

				; Test that VGM works.
				define void @f6(fp128 *%x) {
				; CHECK-LABEL: f6:
				; CHECK: vgmg %v0, 12, 31
				; CHECK-NEXT: vst %v0, 0(%r2)
				; CHECK-NEXT: br %r14
				store fp128 0xL000fffff00000000000fffff00000000, fp128 *%x
				ret void
				}
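
For reference, the immediate expected by the VREPI test (f5) can be reproduced with a small standalone sketch. It assumes that VREPI sign-extends its 16-bit immediate and places the low element-size bits in every element; replicateImm is a hypothetical helper written for illustration and is not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the patch's code: models one doubleword of the
// result of a VREPI-style replicate. The 16-bit immediate is sign-extended,
// truncated to ElemBits, and copied into every element of the doubleword.
uint64_t replicateImm(int16_t Imm, unsigned ElemBits) {
  assert(ElemBits == 8 || ElemBits == 16 || ElemBits == 32 || ElemBits == 64);
  uint64_t Elem = static_cast<uint64_t>(static_cast<int64_t>(Imm));
  if (ElemBits < 64)
    Elem &= (uint64_t(1) << ElemBits) - 1; // keep only the element's bits
  uint64_t Result = 0;
  for (unsigned Shift = 0; Shift < 64; Shift += ElemBits)
    Result |= Elem << Shift;
  return Result;
}
```

With byte elements and an immediate of -8, each byte becomes 0xf8, which matches both halves of the fp128 constant stored in f5.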

test/CodeGen/SystemZ/vec-const-05.ll

	; Test vector byte masks, v4f32 version. Only all-zero vectors are handled.			; Test vector byte masks, v4f32 version.
	;			;
	; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s			; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s

	; Test an all-zeros vector.			; Test an all-zeros vector.
	define <4 x float> @f0() {			define <4 x float> @f1() {
	; CHECK-LABEL: f0:			; CHECK-LABEL: f1:
	; CHECK: vgbm %v24, 0			; CHECK: vgbm %v24, 0
	; CHECK: br %r14			; CHECK: br %r14
	ret <4 x float> zeroinitializer			ret <4 x float> zeroinitializer
	}			}

	; Test that undefs are treated as zero.			; Test an all-ones vector.
	define <4 x float> @f1() {			define <4 x float> @f2() {
	; CHECK-LABEL: f1:			; CHECK-LABEL: f2:
	; CHECK: vgbm %v24, 0			; CHECK: vgbm %v24, 65535
	; CHECK: br %r14			; CHECK: br %r14
	ret <4 x float> <float zeroinitializer, float undef,			ret <4 x float> <float 0xffffffffe0000000, float 0xffffffffe0000000,
	float zeroinitializer, float undef>			float 0xffffffffe0000000, float 0xffffffffe0000000>
				}

				; Test a mixed vector (mask 0xc731).
				define <4 x float> @f3() {
				; CHECK-LABEL: f3:
				; CHECK: vgbm %v24, 50993
				; CHECK: br %r14
				ret <4 x float> <float 0xffffe00000000000, float 0x381fffffe0000000,
				float 0x379fffe000000000, float 0x371fe00000000000>
				}

				; Test that undefs are treated as zero (mask 0xc031).
				define <4 x float> @f4() {
				; CHECK-LABEL: f4:
				; CHECK: vgbm %v24, 49201
				; CHECK: br %r14
				ret <4 x float> <float 0xffffe00000000000, float undef,
				float 0x379fffe000000000, float 0x371fe00000000000>
				}

				; Test that we don't use VGBM if one of the bytes is not 0 or 0xff.
				define <4 x float> @f5() {
				; CHECK-LABEL: f5:
				; CHECK-NOT: vgbm
				; CHECK: br %r14
				ret <4 x float> <float 0xffffe00000000000, float 0x381fffffc0000000,
				float 0x379fffe000000000, float 0x371fe00000000000>
	}			}

	; Test an all-zeros v2f32 that gets promoted to v4f32.			; Test an all-zeros v2f32 that gets promoted to v4f32.
	define <2 x float> @f2() {			define <2 x float> @f6() {
	; CHECK-LABEL: f2:			; CHECK-LABEL: f6:
	; CHECK: vgbm %v24, 0			; CHECK: vgbm %v24, 0
	; CHECK: br %r14			; CHECK: br %r14
	ret <2 x float> zeroinitializer			ret <2 x float> zeroinitializer
	}			}

				; Test a mixed v2f32 that gets promoted to v4f32 (mask 0xc700).
				define <2 x float> @f7() {
				; CHECK-LABEL: f7:
				; CHECK: vgbm %v24, 50944
				; CHECK: br %r14
				ret <2 x float> <float 0xffffe00000000000, float 0x381fffffe0000000>
				}

test/CodeGen/SystemZ/vec-const-06.ll

	; Test vector byte masks, v2f64 version. Only all-zero vectors are handled.			; Test vector byte masks, v2f64 version.
	;			;
	; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s			; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s

	; Test an all-zeros vector.			; Test an all-zeros vector.
	define <2 x double> @f0() {			define <2 x double> @f1() {
	; CHECK-LABEL: f0:			; CHECK-LABEL: f1:
	; CHECK: vgbm %v24, 0			; CHECK: vgbm %v24, 0
	; CHECK: br %r14			; CHECK: br %r14
	ret <2 x double> zeroinitializer			ret <2 x double> zeroinitializer
	}			}

	; Test that undefs are treated as zero.			; Test an all-ones vector.
	define <2 x double> @f1() {			define <2 x double> @f2() {
	; CHECK-LABEL: f1:			; CHECK-LABEL: f2:
	; CHECK: vgbm %v24, 0			; CHECK: vgbm %v24, 65535
				; CHECK: br %r14
				ret <2 x double> <double 0xffffffffffffffff, double 0xffffffffffffffff>
				}

				; Test a mixed vector (mask 0x8c76).
				define <2 x double> @f3() {
				; CHECK-LABEL: f3:
				; CHECK: vgbm %v24, 35958
				; CHECK: br %r14
				ret <2 x double> <double 0xff000000ffff0000, double 0x00ffffff00ffff00>
				}

				; Test that undefs are treated as zero (mask 0x8c00).
				define <2 x double> @f4() {
				; CHECK-LABEL: f4:
				; CHECK: vgbm %v24, 35840
				; CHECK: br %r14
				ret <2 x double> <double 0xff000000ffff0000, double undef>
				}

				; Test that we don't use VGBM if one of the bytes is not 0 or 0xff.
				define <2 x double> @f5() {
				; CHECK-LABEL: f5:
				; CHECK-NOT: vgbm
	; CHECK: br %r14			; CHECK: br %r14
	ret <2 x double> <double zeroinitializer, double undef>			ret <2 x double> <double 0xfe000000ffff0000, double 0x00ffffff00ffff00>
	}			}
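
The VGBM immediates in the v2f64 tests above can be checked by hand with a sketch like the following. It assumes the 16-bit mask maps byte 0 of the vector (the most significant byte of element 0) to the most significant mask bit, and that a constant qualifies only if every byte is 0x00 or 0xff. The names tryByteMask and bytesOf are hypothetical and are not the patch's routines.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the patch's code: computes the VGBM immediate for
// a 128-bit constant given as 16 bytes in big-endian order. Returns false if
// any byte is neither 0x00 nor 0xff, in which case Mask is left unspecified.
bool tryByteMask(const std::array<uint8_t, 16> &Bytes, uint16_t &Mask) {
  Mask = 0;
  for (unsigned I = 0; I < 16; ++I) {
    if (Bytes[I] == 0xff)
      Mask = static_cast<uint16_t>(Mask | (1u << (15 - I))); // byte 0 -> MSB
    else if (Bytes[I] != 0x00)
      return false;
  }
  return true;
}

// Splits two 64-bit elements (element 0 first) into 16 big-endian bytes.
std::array<uint8_t, 16> bytesOf(uint64_t Elt0, uint64_t Elt1) {
  std::array<uint8_t, 16> B{};
  for (unsigned I = 0; I < 8; ++I) {
    B[I] = static_cast<uint8_t>(Elt0 >> (56 - 8 * I));
    B[8 + I] = static_cast<uint8_t>(Elt1 >> (56 - 8 * I));
  }
  return B;
}
```

For the f3 constant, bytesOf(0xff000000ffff0000, 0x00ffffff00ffff00) gives the mask 0x8c76 (35958). Treating the undef element of f4 as zero gives 0x8c00 (35840), and the 0xfe byte in f5 makes the helper return false.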

test/CodeGen/SystemZ/vec-const-19.ll

This file was added.

				; Test that a scalar FP constant can be reused from a vector splat constant
				; of the same value.
				;
				; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s

				define void @fun() {
				; CHECK-LABEL: fun:
				; CHECK: vgmg %v0, 2, 10
				; CHECK-NOT: vgmg %v0, 2, 10

				%tmp = fadd <2 x double> zeroinitializer, <double 1.000000e+00, double 1.000000e+00>
				%tmp1 = fmul <2 x double> %tmp, <double 5.000000e-01, double 5.000000e-01>
				store <2 x double> %tmp1, <2 x double>* undef
				%tmp2 = load double, double* undef
				%tmp3 = fmul double %tmp2, 5.000000e-01
				store double %tmp3, double* undef
				ret void
				}
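
To see why the test expects vgmg %v0, 2, 10, the sketch below models the mask VGM would generate and compares it with the bit pattern of the double 0.5. It assumes bits are numbered from the most significant end, that both endpoints are inclusive, and that the range wraps around when the start is greater than the end. generateMask and bitsToDouble are hypothetical helpers, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not the patch's code: models one element of a
// VGM-style mask. Bit 0 is the most significant bit of the element. Bits
// Start..End (inclusive) are set; if Start > End the range wraps around.
uint64_t generateMask(unsigned Start, unsigned End, unsigned ElemBits) {
  assert(ElemBits >= 1 && ElemBits <= 64 && Start < ElemBits && End < ElemBits);
  uint64_t Mask = 0;
  for (unsigned Pos = 0; Pos < ElemBits; ++Pos) {
    bool InRange = Start <= End ? (Pos >= Start && Pos <= End)
                                : (Pos >= Start || Pos <= End);
    if (InRange)
      Mask |= uint64_t(1) << (ElemBits - 1 - Pos);
  }
  return Mask;
}

// Reinterprets a 64-bit pattern as an IEEE double.
double bitsToDouble(uint64_t Bits) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit double");
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}
```

Under these assumptions, generateMask(2, 10, 64) is 0x3fe0000000000000, the encoding of 0.5. That is the value used both by the <2 x double> splat and by the scalar fmul in the test, so a single VGMG can serve both.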