This is an archive of the discontinued LLVM Phabricator instance.

[clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts
ClosedPublic

Authored by bsmith on Jul 27 2021, 3:56 AM.

Details

Reviewers

paulwalker-arm
peterwaller-arm
eli.friedman
junparser
efriedma
c-rhodes

Commits

rGe57e1e4e0026: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate…

Summary

For fixed SVE types, predicates are represented using vectors of i8,
whereas for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.
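
The size argument in the summary can be checked with a little arithmetic. The following is a minimal C++ sketch, not part of the patch: the helper names (minBits, vscaleFor, fixedPredicateI8Elts, scalableI8Elts) are hypothetical, and it assumes vscale equals the SVE vector length divided by 128. It shows that <vscale x 16 x i1> and <vscale x 2 x i8> have the same size, and that the <vscale x 2 x i8> view has as many i8 elements as the fixed predicate types used in the tests (<2 x i8>, <4 x i8> and <8 x i8> for 128, 256 and 512 bits).

```cpp
#include <cassert>

// Hypothetical helpers that only model the size arithmetic; no LLVM API is used.

// Known-minimum size in bits of <vscale x MinElts x iElemBits>, i.e. its size
// when vscale == 1.
constexpr unsigned minBits(unsigned MinElts, unsigned ElemBits) {
  return MinElts * ElemBits;
}

// Assumption: vscale is the SVE vector length divided by 128 bits.
inline unsigned vscaleFor(unsigned SveVectorBits) {
  assert(SveVectorBits >= 128 && SveVectorBits % 128 == 0);
  return SveVectorBits / 128;
}

// Number of i8 elements in a fixed-length predicate: one predicate bit per
// byte of the data vector, packed eight bits to an i8.
inline unsigned fixedPredicateI8Elts(unsigned SveVectorBits) {
  return SveVectorBits / 8 / 8;
}

// Number of i8 elements in <vscale x 2 x i8> at the same vector length.
inline unsigned scalableI8Elts(unsigned SveVectorBits) {
  return 2 * vscaleFor(SveVectorBits);
}

// <vscale x 16 x i1> and <vscale x 2 x i8> are both 16 bits per unit of
// vscale, so a bitcast between them does not change the size.
static_assert(minBits(16, 1) == minBits(2, 8),
              "predicate views must match in size");
```

Because the element counts agree at every vector length, the fixed i8 vector fills the whole <vscale x 2 x i8> value when inserted at index 0, and extracting at index 0 recovers it.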

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

bsmith created this revision. Jul 27 2021, 3:56 AM

Herald added a reviewer: efriedma. Jul 27 2021, 3:56 AM

Herald added subscribers: psnobl, kristof.beyls, tschuett.

bsmith requested review of this revision. Jul 27 2021, 3:56 AM

Herald added a project: Restricted Project. Jul 27 2021, 3:56 AM

Herald added a subscriber: cfe-commits.

paulwalker-arm added a reviewer: c-rhodes. Jul 27 2021, 4:23 AM

Harbormaster completed remote builds in B116389: Diff 361970. Jul 27 2021, 4:37 AM

Matt added a subscriber: Matt. Jul 27 2021, 4:54 AM

c-rhodes added inline comments. Jul 27 2021, 6:25 AM

clang/lib/CodeGen/CGExprScalar.cpp
2065–2076	should const be dropped now the type might change?
2110–2128	With the predicate casting now using the intrinsics I don't think this is needed any longer. Perhaps we should add an unreachable above if the element type doesn't match?

bsmith added inline comments. Jul 28 2021, 5:18 AM

clang/lib/CodeGen/CGExprScalar.cpp
2110–2128	Don't we still need this for casting between vectors with different element types, or are these guaranteed to not hit this code path?

c-rhodes added inline comments. Jul 28 2021, 8:28 AM

clang/lib/CodeGen/CGExprScalar.cpp
2110–2128	> Don't we still need this for casting between vectors with different element types, or are these guaranteed to not hit this code path?
Apologies, I was wrong. You're right, we'll still need it for casting between vectors with different element types to support lax vector conversions, although it's no longer tested. Might be worth adding a codegen test for something like: svint64_t lax_cast(fixed_int32_t t) { return t; } and the comment could also be updated.

junparser added inline comments. Jul 28 2021, 7:00 PM

clang/lib/CodeGen/CGExprScalar.cpp
2102	I think this may also work for casting between vectors with different element types.

Herald added a subscriber: ctetreau. Jul 28 2021, 7:00 PM

bsmith added inline comments. Jul 29 2021, 8:44 AM

clang/lib/CodeGen/CGExprScalar.cpp
2102	A similar argument applies here as for the other related ticket: in principle we could, but it's not clear that there is a good use case for writing code that would make use of this. So for now it's probably best to just deal with predicates, which are definitely a problem, and handle other cases as they arise.

junparser added inline comments. Jul 29 2021, 7:31 PM

clang/lib/CodeGen/CGExprScalar.cpp
2102	Although I believe this generates better code than using memory load/store. Thanks for explaining this.
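
To make the outcome of this thread concrete, here is a minimal C++ sketch of which path a fixed/scalable vector cast takes after the patch, as described in the comments above. It is an illustration only and not the code in CGExprScalar.cpp: VecTy, CastPath and classify are hypothetical names, and element types are modelled as just a kind and a bit width rather than as llvm::Type pointers.

```cpp
#include <cassert>

// Simplified stand-in for an LLVM vector type (hypothetical, for illustration).
struct VecTy {
  bool Scalable;     // true for <vscale x N x T>, false for <N x T>
  bool IsFloat;      // element kind: floating point or integer
  unsigned ElemBits; // element width in bits
  unsigned MinElts;  // N
};

enum class CastPath { InsertExtract, BitcastThenInsertExtract, ThroughMemory };

inline bool sameElementType(const VecTy &A, const VecTy &B) {
  return A.IsFloat == B.IsFloat && A.ElemBits == B.ElemBits;
}

// Classify a cast between one fixed and one scalable vector type.
inline CastPath classify(const VecTy &Src, const VecTy &Dst) {
  assert(Src.Scalable != Dst.Scalable && "expects exactly one scalable type");
  const VecTy &Scal = Src.Scalable ? Src : Dst;
  const VecTy &Fixed = Src.Scalable ? Dst : Src;

  // Predicate special case: <vscale x 16 x i1> paired with a fixed i8 vector
  // goes via a <vscale x 2 x i8> bitcast plus vector insert/extract.
  bool IsPred = !Scal.IsFloat && Scal.ElemBits == 1 && Scal.MinElts == 16;
  bool FixedIsI8 = !Fixed.IsFloat && Fixed.ElemBits == 8;
  if (IsPred && FixedIsI8)
    return CastPath::BitcastThenInsertExtract;

  // Matching element types can use vector insert/extract directly.
  if (sameElementType(Scal, Fixed))
    return CastPath::InsertExtract;

  // Everything else (for example lax vector conversions) still goes
  // through a temporary in memory.
  return CastPath::ThroughMemory;
}
```

Under this model the lax conversion suggested by c-rhodes, fixed_int32_t to svint64_t (for example <16 x i32> to <vscale x 2 x i64> at 512 bits), has mismatched element types and so still takes the memory path.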

Update comment to reflect changes
Add new test for lax casting via memory

Harbormaster completed remote builds in B117188: Diff 363093. Jul 30 2021, 8:32 AM

thanks @bsmith, just left one minor nit but otherwise LGTM

clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
10	nit: unused?

This revision is now accepted and ready to land. Aug 2 2021, 6:29 AM

This revision was landed with ongoing or failed builds. Aug 4 2021, 9:10 AM

Closed by commit rGe57e1e4e0026: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate… (authored by bsmith).

This revision was automatically updated to reflect the committed changes.

bsmith added a commit: rGe57e1e4e0026: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate….

Revision Contents

Path	Size
clang/lib/CodeGen/CGCall.cpp	29 lines
clang/lib/CodeGen/CGExprScalar.cpp	34 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c	42 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-call.c	60 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c	35 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c	34 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c	28 lines

Diff 364153

clang/lib/CodeGen/CGCall.cpp

@@ ... @@ if (!SrcSize.isScalable() && !DstSize.isScalable() &&
     return CGF.Builder.CreateLoad(Src);
   }

   // If coercing a fixed vector to a scalable vector for ABI compatibility, and
   // the types match, use the llvm.experimental.vector.insert intrinsic to
   // perform the conversion.
   if (auto *ScalableDst = dyn_cast<llvm::ScalableVectorType>(Ty)) {
     if (auto *FixedSrc = dyn_cast<llvm::FixedVectorType>(SrcTy)) {
+      // If we are casting a fixed i8 vector to a scalable 16 x i1 predicate
+      // vector, use a vector insert and bitcast the result.
+      bool NeedsBitcast = false;
+      auto PredType =
+          llvm::ScalableVectorType::get(CGF.Builder.getInt1Ty(), 16);
+      llvm::Type *OrigType = Ty;
+      if (ScalableDst == PredType &&
+          FixedSrc->getElementType() == CGF.Builder.getInt8Ty()) {
+        ScalableDst = llvm::ScalableVectorType::get(CGF.Builder.getInt8Ty(), 2);
+        NeedsBitcast = true;
+      }
       if (ScalableDst->getElementType() == FixedSrc->getElementType()) {
         auto *Load = CGF.Builder.CreateLoad(Src);
         auto *UndefVec = llvm::UndefValue::get(ScalableDst);
         auto *Zero = llvm::Constant::getNullValue(CGF.CGM.Int64Ty);
-        return CGF.Builder.CreateInsertVector(ScalableDst, UndefVec, Load, Zero,
-                                              "castScalableSve");
+        llvm::Value *Result = CGF.Builder.CreateInsertVector(
+            ScalableDst, UndefVec, Load, Zero, "castScalableSve");
+        if (NeedsBitcast)
+          Result = CGF.Builder.CreateBitCast(Result, OrigType);
+        return Result;
       }
     }
   }

   // Otherwise do coercion through memory. This is stupid, but simple.
   Address Tmp =
       CreateTempAllocaForCoercion(CGF, Ty, Src.getAlignment(), Src.getName());
   CGF.Builder.CreateMemCpy(
@@ ... @@ case ABIArgInfo::Direct: {
       }

       // VLST arguments are coerced to VLATs at the function boundary for
       // ABI consistency. If this is a VLST that was coerced to
       // a VLAT at the function boundary and the types match up, use
       // llvm.experimental.vector.extract to convert back to the original
       // VLST.
       if (auto *VecTyTo = dyn_cast<llvm::FixedVectorType>(ConvertType(Ty))) {
-        auto *Coerced = Fn->getArg(FirstIRArg);
+        llvm::Value *Coerced = Fn->getArg(FirstIRArg);
         if (auto *VecTyFrom =
                 dyn_cast<llvm::ScalableVectorType>(Coerced->getType())) {
+          // If we are casting a scalable 16 x i1 predicate vector to a fixed i8
+          // vector, bitcast the source and use a vector extract.
+          auto PredType =
+              llvm::ScalableVectorType::get(Builder.getInt1Ty(), 16);
+          if (VecTyFrom == PredType &&
+              VecTyTo->getElementType() == Builder.getInt8Ty()) {
+            VecTyFrom = llvm::ScalableVectorType::get(Builder.getInt8Ty(), 2);
+            Coerced = Builder.CreateBitCast(Coerced, VecTyFrom);
+          }
           if (VecTyFrom->getElementType() == VecTyTo->getElementType()) {
             llvm::Value *Zero = llvm::Constant::getNullValue(CGM.Int64Ty);

             assert(NumIRArgs == 1);
             Coerced->setName(Arg->getName() + ".coerce");
             ArgVals.push_back(ParamValue::forDirect(Builder.CreateExtractVector(
                 VecTyTo, Coerced, Zero, "castFixedSve")));
             break;

clang/lib/CodeGen/CGExprScalar.cpp

@@ ... @@ if (auto *CI = dyn_cast<llvm::CallBase>(Src)) {
                                          CE->getExprLoc());
       }
     }

     // If Src is a fixed vector and Dst is a scalable vector, and both have the
     // same element type, use the llvm.experimental.vector.insert intrinsic to
     // perform the bitcast.
     if (const auto *FixedSrc = dyn_cast<llvm::FixedVectorType>(SrcTy)) {
       if (const auto *ScalableDst = dyn_cast<llvm::ScalableVectorType>(DstTy)) {
+        // If we are casting a fixed i8 vector to a scalable 16 x i1 predicate
+        // vector, use a vector insert and bitcast the result.
+        bool NeedsBitCast = false;
+        auto PredType = llvm::ScalableVectorType::get(Builder.getInt1Ty(), 16);
+        llvm::Type *OrigType = DstTy;
+        if (ScalableDst == PredType &&
+            FixedSrc->getElementType() == Builder.getInt8Ty()) {
+          DstTy = llvm::ScalableVectorType::get(Builder.getInt8Ty(), 2);
+          ScalableDst = dyn_cast<llvm::ScalableVectorType>(DstTy);
+          NeedsBitCast = true;
+        }
[inline comment, c-rhodes, Not Done] should const be dropped now the type might change?
         if (FixedSrc->getElementType() == ScalableDst->getElementType()) {
           llvm::Value *UndefVec = llvm::UndefValue::get(DstTy);
           llvm::Value *Zero = llvm::Constant::getNullValue(CGF.CGM.Int64Ty);
-          return Builder.CreateInsertVector(DstTy, UndefVec, Src, Zero,
-                                            "castScalableSve");
+          llvm::Value *Result = Builder.CreateInsertVector(
+              DstTy, UndefVec, Src, Zero, "castScalableSve");
+          if (NeedsBitCast)
+            Result = Builder.CreateBitCast(Result, OrigType);
+          return Result;
         }
       }
     }

     // If Src is a scalable vector and Dst is a fixed vector, and both have the
     // same element type, use the llvm.experimental.vector.extract intrinsic to
     // perform the bitcast.
     if (const auto *ScalableSrc = dyn_cast<llvm::ScalableVectorType>(SrcTy)) {
       if (const auto *FixedDst = dyn_cast<llvm::FixedVectorType>(DstTy)) {
+        // If we are casting a scalable 16 x i1 predicate vector to a fixed i8
+        // vector, bitcast the source and use a vector extract.
+        auto PredType = llvm::ScalableVectorType::get(Builder.getInt1Ty(), 16);
+        if (ScalableSrc == PredType &&
+            FixedDst->getElementType() == Builder.getInt8Ty()) {
+          SrcTy = llvm::ScalableVectorType::get(Builder.getInt8Ty(), 2);
+          ScalableSrc = dyn_cast<llvm::ScalableVectorType>(SrcTy);
+          Src = Builder.CreateBitCast(Src, SrcTy);
+        }
[inline comment, junparser, Not Done] I think this may also work for casting between vectors with different element types.
[inline comment, bsmith (author), Done] A similar argument applies here as for the other related ticket: in principle we could, but it's not clear that there is a good use case for writing code that would make use of this. So for now it's probably best to just deal with predicates, which are definitely a problem, and handle other cases as they arise.
[inline comment, junparser, Not Done] Although I believe this generates better code than using memory load/store. Thanks for explaining this.
         if (ScalableSrc->getElementType() == FixedDst->getElementType()) {
           llvm::Value *Zero = llvm::Constant::getNullValue(CGF.CGM.Int64Ty);
           return Builder.CreateExtractVector(DstTy, Src, Zero, "castFixedSve");
         }
       }
     }

     // Perform VLAT <-> VLST bitcast through memory.
     // TODO: since the llvm.experimental.vector.{insert,extract} intrinsics
     // require the element types of the vectors to be the same, we
-    // need to keep this around for casting between predicates, or more
-    // generally for bitcasts between VLAT <-> VLST where the element
-    // types of the vectors are not the same, until we figure out a better
-    // way of doing these casts.
+    // need to keep this around for bitcasts between VLAT <-> VLST where
+    // the element types of the vectors are not the same, until we figure
+    // out a better way of doing these casts.
     if ((isa<llvm::FixedVectorType>(SrcTy) &&
          isa<llvm::ScalableVectorType>(DstTy)) ||
         (isa<llvm::ScalableVectorType>(SrcTy) &&
          isa<llvm::FixedVectorType>(DstTy))) {
       Address Addr = CGF.CreateDefaultAlignTempAlloca(SrcTy, "saved-value");
       LValue LV = CGF.MakeAddrLValue(Addr, E->getType());
       CGF.EmitStoreOfScalar(Src, LV);
       Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy),
                                           "castFixedSve");
       LValue DestLV = CGF.MakeAddrLValue(Addr, DestTy);
       DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());
       return EmitLoadOfLValue(DestLV, CE->getExprLoc());
     }
[inline comment, c-rhodes, Not Done] With the predicate casting now using the intrinsics I don't think this is needed any longer. Perhaps we should add an unreachable above if the element type doesn't match?
[inline comment, bsmith (author), Done] Don't we still need this for casting between vectors with different element types, or are these guaranteed to not hit this code path?
[inline comment, c-rhodes, Done] > Don't we still need this for casting between vectors with different element types, or are these guaranteed to not hit this code path?
Apologies, I was wrong. You're right, we'll still need it for casting between vectors with different element types to support lax vector conversions, although it's no longer tested. Might be worth adding a codegen test for something like: svint64_t lax_cast(fixed_int32_t t) { return t; } and the comment could also be updated.

     return Builder.CreateBitCast(Src, DstTy);
   }
   case CK_AddressSpaceConversion: {
     Expr::EvalResult Result;
     if (E->EvaluateAsRValue(Result, CGF.getContext()) &&
         Result.Val.isNullPointer()) {
       // If E has side effect, it is emitted even if its final result is a

clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c

@@ ... @@
 }

 //===----------------------------------------------------------------------===//
 // bool
 //===----------------------------------------------------------------------===//

 // CHECK-128-LABEL: @read_bool(
 // CHECK-128-NEXT: entry:
-// CHECK-128-NEXT: [[SAVED_VALUE:%.*]] = alloca <2 x i8>, align 2
 // CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
 // CHECK-128-NEXT: [[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
-// CHECK-128-NEXT: store <2 x i8> [[TMP0]], <2 x i8>* [[SAVED_VALUE]], align 2, !tbaa [[TBAA6]]
-// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <2 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*
-// CHECK-128-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 2, !tbaa [[TBAA6]]
+// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v2i8(<vscale x 2 x i8> undef, <2 x i8> [[TMP0]], i64 0)
+// CHECK-128-NEXT: [[TMP1:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
 // CHECK-128-NEXT: ret <vscale x 16 x i1> [[TMP1]]
 //
 // CHECK-256-LABEL: @read_bool(
 // CHECK-256-NEXT: entry:
-// CHECK-256-NEXT: [[SAVED_VALUE:%.*]] = alloca <4 x i8>, align 4
 // CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
 // CHECK-256-NEXT: [[TMP0:%.*]] = load <4 x i8>, <4 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
-// CHECK-256-NEXT: store <4 x i8> [[TMP0]], <4 x i8>* [[SAVED_VALUE]], align 4, !tbaa [[TBAA6]]
-// CHECK-256-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <4 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*
-// CHECK-256-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 4, !tbaa [[TBAA6]]
+// CHECK-256-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v4i8(<vscale x 2 x i8> undef, <4 x i8> [[TMP0]], i64 0)
+// CHECK-256-NEXT: [[TMP1:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
 // CHECK-256-NEXT: ret <vscale x 16 x i1> [[TMP1]]
 //
 // CHECK-512-LABEL: @read_bool(
 // CHECK-512-NEXT: entry:
-// CHECK-512-NEXT: [[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
 // CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
 // CHECK-512-NEXT: [[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
-// CHECK-512-NEXT: store <8 x i8> [[TMP0]], <8 x i8>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA6]]
-// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*
-// CHECK-512-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
+// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v8i8(<vscale x 2 x i8> undef, <8 x i8> [[TMP0]], i64 0)
+// CHECK-512-NEXT: [[TMP1:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
 // CHECK-512-NEXT: ret <vscale x 16 x i1> [[TMP1]]
 //
 svbool_t read_bool(struct struct_bool *s) {
   return s->y[0];
 }

 // CHECK-128-LABEL: @write_bool(
 // CHECK-128-NEXT: entry:
-// CHECK-128-NEXT: [[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 2
-// CHECK-128-NEXT: store <vscale x 16 x i1> [[X:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 2, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <2 x i8>*
-// CHECK-128-NEXT: [[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 2, !tbaa [[TBAA6]]
+// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1> %x to <vscale x 2 x i8>
+// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = call <2 x i8> @llvm.experimental.vector.extract.v2i8.nxv2i8(<vscale x 2 x i8> [[TMP0]], i64 0)
 // CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
-// CHECK-128-NEXT: store <2 x i8> [[TMP0]], <2 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
+// CHECK-128-NEXT: store <2 x i8> [[CASTFIXEDSVE]], <2 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
 // CHECK-128-NEXT: ret void
 //
 // CHECK-256-LABEL: @write_bool(
 // CHECK-256-NEXT: entry:
-// CHECK-256-NEXT: [[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 4
-// CHECK-256-NEXT: store <vscale x 16 x i1> [[X:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 4, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-256-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <4 x i8>*
-// CHECK-256-NEXT: [[TMP0:%.*]] = load <4 x i8>, <4 x i8>* [[CASTFIXEDSVE]], align 4, !tbaa [[TBAA6]]
+// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1> %x to <vscale x 2 x i8>
+// CHECK-256-NEXT: [[CASTFIXEDSVE:%.*]] = call <4 x i8> @llvm.experimental.vector.extract.v4i8.nxv2i8(<vscale x 2 x i8> [[TMP0]], i64 0)
 // CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
-// CHECK-256-NEXT: store <4 x i8> [[TMP0]], <4 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
+// CHECK-256-NEXT: store <4 x i8> [[CASTFIXEDSVE]], <4 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
 // CHECK-256-NEXT: ret void
 //
 // CHECK-512-LABEL: @write_bool(
 // CHECK-512-NEXT: entry:
-// CHECK-512-NEXT: [[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 8
-// CHECK-512-NEXT: store <vscale x 16 x i1> [[X:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA9:![0-9]+]]
-// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <8 x i8>*
-// CHECK-512-NEXT: [[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
+// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1> %x to <vscale x 2 x i8>
+// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = call <8 x i8> @llvm.experimental.vector.extract.v8i8.nxv2i8(<vscale x 2 x i8> [[TMP0]], i64 0)
 // CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
-// CHECK-512-NEXT: store <8 x i8> [[TMP0]], <8 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
+// CHECK-512-NEXT: store <8 x i8> [[CASTFIXEDSVE]], <8 x i8>* [[ARRAYIDX]], align 2, !tbaa [[TBAA6]]
 // CHECK-512-NEXT: ret void
 //
 void write_bool(struct struct_bool *s, svbool_t x) {
   s->y[0] = x;
 }

clang/test/CodeGen/attr-arm-sve-vector-bits-call.c

	Show First 20 Lines • Show All 71 Lines • ▼ Show 20 Lines
	// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]			// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
	//			//
	fixed_float64_t call_float64_ff(svbool_t pg, fixed_float64_t op1, fixed_float64_t op2) {			fixed_float64_t call_float64_ff(svbool_t pg, fixed_float64_t op1, fixed_float64_t op2) {
	return svsel(pg, op1, op2);			return svsel(pg, op1, op2);
	}			}

	// CHECK-LABEL: @call_bool_ff(			// CHECK-LABEL: @call_bool_ff(
	// CHECK-NEXT: entry:			// CHECK-NEXT: entry:
	// CHECK-NEXT: [[OP1:%.*]] = alloca <8 x i8>, align 8			// CHECK-NEXT: [[TMP0:%.]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i1> [[OP1_COERCE:%.]], <vscale x 16 x i1> [[OP2_COERCE:%.]])
	// CHECK-NEXT: [[OP2:%.*]] = alloca <8 x i8>, align 8			// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP0]]
	// CHECK-NEXT: [[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
	// CHECK-NEXT: [[SAVED_VALUE3:%.*]] = alloca <8 x i8>, align 8
	// CHECK-NEXT: [[SAVED_VALUE5:%.*]] = alloca <vscale x 16 x i1>, align 8
	// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 8
	// CHECK-NEXT: [[TMP0:%.]] = bitcast <8 x i8> [[OP1]] to <vscale x 16 x i1>*
	// CHECK-NEXT: store <vscale x 16 x i1> [[OP1_COERCE:%.]], <vscale x 16 x i1> [[TMP0]], align 8
	// CHECK-NEXT: [[OP11:%.]] = load <8 x i8>, <8 x i8> [[OP1]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[TMP1:%.]] = bitcast <8 x i8> [[OP2]] to <vscale x 16 x i1>*
	// CHECK-NEXT: store <vscale x 16 x i1> [[OP2_COERCE:%.]], <vscale x 16 x i1> [[TMP1]], align 8
	// CHECK-NEXT: [[OP22:%.]] = load <8 x i8>, <8 x i8> [[OP2]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: store <8 x i8> [[OP11]], <8 x i8>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[CASTFIXEDSVE:%.]] = bitcast <8 x i8> [[SAVED_VALUE]] to <vscale x 16 x i1>*
	// CHECK-NEXT: [[TMP2:%.]] = load <vscale x 16 x i1>, <vscale x 16 x i1> [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: store <8 x i8> [[OP22]], <8 x i8>* [[SAVED_VALUE3]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[CASTFIXEDSVE4:%.]] = bitcast <8 x i8> [[SAVED_VALUE3]] to <vscale x 16 x i1>*
	// CHECK-NEXT: [[TMP3:%.]] = load <vscale x 16 x i1>, <vscale x 16 x i1> [[CASTFIXEDSVE4]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[TMP4:%.]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1> [[TMP3]])
	// CHECK-NEXT: store <vscale x 16 x i1> [[TMP4]], <vscale x 16 x i1>* [[SAVED_VALUE5]], align 8, !tbaa [[TBAA9:![0-9]+]]
	// CHECK-NEXT: [[CASTFIXEDSVE6:%.]] = bitcast <vscale x 16 x i1> [[SAVED_VALUE5]] to <8 x i8>*
	// CHECK-NEXT: [[TMP5:%.]] = load <8 x i8>, <8 x i8> [[CASTFIXEDSVE6]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.]] = bitcast <vscale x 16 x i1> [[RETVAL_COERCE]] to <8 x i8>*
	// CHECK-NEXT: store <8 x i8> [[TMP5]], <8 x i8>* [[RETVAL_0__SROA_CAST]], align 8
	// CHECK-NEXT: [[TMP6:%.]] = load <vscale x 16 x i1>, <vscale x 16 x i1> [[RETVAL_COERCE]], align 8
	// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP6]]
	//			//
	fixed_bool_t call_bool_ff(svbool_t pg, fixed_bool_t op1, fixed_bool_t op2) {			fixed_bool_t call_bool_ff(svbool_t pg, fixed_bool_t op1, fixed_bool_t op2) {
	return svsel(pg, op1, op2);			return svsel(pg, op1, op2);
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// fixed, scalable			// fixed, scalable
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	Show All 15 Lines
	// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]			// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
	//			//
	fixed_float64_t call_float64_fs(svbool_t pg, fixed_float64_t op1, svfloat64_t op2) {			fixed_float64_t call_float64_fs(svbool_t pg, fixed_float64_t op1, svfloat64_t op2) {
	return svsel(pg, op1, op2);			return svsel(pg, op1, op2);
	}			}

	// CHECK-LABEL: @call_bool_fs(			// CHECK-LABEL: @call_bool_fs(
	// CHECK-NEXT: entry:			// CHECK-NEXT: entry:
	// CHECK-NEXT: [[OP1:%.*]] = alloca <8 x i8>, align 8			// CHECK-NEXT: [[TMP0:%.]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i1> [[OP1_COERCE:%.]], <vscale x 16 x i1> [[OP2_COERCE:%.]])
	// CHECK-NEXT: [[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8			// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP0]]
	// CHECK-NEXT: [[SAVED_VALUE2:%.*]] = alloca <vscale x 16 x i1>, align 8
	// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 8
	// CHECK-NEXT: [[TMP0:%.]] = bitcast <8 x i8> [[OP1]] to <vscale x 16 x i1>*
	// CHECK-NEXT: store <vscale x 16 x i1> [[OP1_COERCE:%.]], <vscale x 16 x i1> [[TMP0]], align 8
	// CHECK-NEXT: [[OP11:%.]] = load <8 x i8>, <8 x i8> [[OP1]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: store <8 x i8> [[OP11]], <8 x i8>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[CASTFIXEDSVE:%.]] = bitcast <8 x i8> [[SAVED_VALUE]] to <vscale x 16 x i1>*
	// CHECK-NEXT: [[TMP1:%.]] = load <vscale x 16 x i1>, <vscale x 16 x i1> [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[TMP2:%.]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i1> [[TMP1]], <vscale x 16 x i1> [[OP2:%.*]])
	// CHECK-NEXT: store <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1>* [[SAVED_VALUE2]], align 8, !tbaa [[TBAA9]]
	// CHECK-NEXT: [[CASTFIXEDSVE3:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE2]] to <8 x i8>*
	// CHECK-NEXT: [[TMP3:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE3]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to <8 x i8>*
	// CHECK-NEXT: store <8 x i8> [[TMP3]], <8 x i8>* [[RETVAL_0__SROA_CAST]], align 8
	// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 8
	// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP4]]
	//			//
	fixed_bool_t call_bool_fs(svbool_t pg, fixed_bool_t op1, svbool_t op2) {			fixed_bool_t call_bool_fs(svbool_t pg, fixed_bool_t op1, svbool_t op2) {
	return svsel(pg, op1, op2);			return svsel(pg, op1, op2);
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// scalable, scalable			// scalable, scalable
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	[15 lines not shown]
	// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]			// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
	//			//
	fixed_float64_t call_float64_ss(svbool_t pg, svfloat64_t op1, svfloat64_t op2) {			fixed_float64_t call_float64_ss(svbool_t pg, svfloat64_t op1, svfloat64_t op2) {
	return svsel(pg, op1, op2);			return svsel(pg, op1, op2);
	}			}

	// CHECK-LABEL: @call_bool_ss(			// CHECK-LABEL: @call_bool_ss(
	// CHECK-NEXT: entry:			// CHECK-NEXT: entry:
	// CHECK-NEXT: [[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 8			// CHECK-NEXT: [[TMP0:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i1> [[OP1_COERCE:%.*]], <vscale x 16 x i1> [[OP2_COERCE:%.*]])
	// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 8			// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP0]]
	// CHECK-NEXT: [[TMP0:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i1> [[OP1:%.*]], <vscale x 16 x i1> [[OP2:%.*]])
	// CHECK-NEXT: store <vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA9]]
	// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <8 x i8>*
	// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to <8 x i8>*
	// CHECK-NEXT: store <8 x i8> [[TMP1]], <8 x i8>* [[RETVAL_0__SROA_CAST]], align 8
	// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 8
	// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP2]]
	//			//
	fixed_bool_t call_bool_ss(svbool_t pg, svbool_t op1, svbool_t op2) {			fixed_bool_t call_bool_ss(svbool_t pg, svbool_t op1, svbool_t op2) {
	return svsel(pg, op1, op2);			return svsel(pg, op1, op2);
	}			}

clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c

	// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py			// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
	// REQUIRES: aarch64-registered-target			// REQUIRES: aarch64-registered-target
	// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s			// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s

	#include <arm_sve.h>			#include <arm_sve.h>

	#define N __ARM_FEATURE_SVE_BITS			#define N __ARM_FEATURE_SVE_BITS

	typedef svint32_t fixed_int32_t __attribute__((arm_sve_vector_bits(N)));			typedef svint32_t fixed_int32_t __attribute__((arm_sve_vector_bits(N)));
	typedef svfloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(N)));			typedef svfloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(N)));
				c-rhodes: nit: unused?
	typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));			typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));
	typedef int32_t gnu_int32_t __attribute__((vector_size(N / 8)));			typedef int32_t gnu_int32_t __attribute__((vector_size(N / 8)));

	// CHECK-LABEL: @to_svint32_t(			// CHECK-LABEL: @to_svint32_t(
	// CHECK-NEXT: entry:			// CHECK-NEXT: entry:
	// CHECK-NEXT: ret <vscale x 4 x i32> [[TYPE_COERCE:%.*]]			// CHECK-NEXT: ret <vscale x 4 x i32> [[TYPE_COERCE:%.*]]
	//			//
	svint32_t to_svint32_t(fixed_int32_t type) {			svint32_t to_svint32_t(fixed_int32_t type) {
	[21 lines not shown]
	// CHECK-NEXT: ret <vscale x 2 x double> [[TYPE:%.*]]			// CHECK-NEXT: ret <vscale x 2 x double> [[TYPE:%.*]]
	//			//
	fixed_float64_t from_svfloat64_t(svfloat64_t type) {			fixed_float64_t from_svfloat64_t(svfloat64_t type) {
	return type;			return type;
	}			}

	// CHECK-LABEL: @to_svbool_t(			// CHECK-LABEL: @to_svbool_t(
	// CHECK-NEXT: entry:			// CHECK-NEXT: entry:
	// CHECK-NEXT: [[TYPE:%.*]] = alloca <8 x i8>, align 8			// CHECK-NEXT: ret <vscale x 16 x i1> [[TYPE:%.*]]
	// CHECK-NEXT: [[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
	// CHECK-NEXT: [[TMP0:%.*]] = bitcast <8 x i8>* [[TYPE]] to <vscale x 16 x i1>*
	// CHECK-NEXT: store <vscale x 16 x i1> [[TYPE_COERCE:%.*]], <vscale x 16 x i1>* [[TMP0]], align 8
	// CHECK-NEXT: [[TYPE1:%.*]] = load <8 x i8>, <8 x i8>* [[TYPE]], align 8, !tbaa [[TBAA6:![0-9]+]]
	// CHECK-NEXT: store <8 x i8> [[TYPE1]], <8 x i8>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*
	// CHECK-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP1]]
	//			//
	svbool_t to_svbool_t(fixed_bool_t type) {			svbool_t to_svbool_t(fixed_bool_t type) {
	return type;			return type;
	}			}

	// CHECK-LABEL: @from_svbool_t(			// CHECK-LABEL: @from_svbool_t(
	// CHECK-NEXT: entry:			// CHECK-NEXT: entry:
	// CHECK-NEXT: [[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 8			// CHECK-NEXT: ret <vscale x 16 x i1> [[TYPE:%.*]]
	// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 8
	// CHECK-NEXT: store <vscale x 16 x i1> [[TYPE:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA9:![0-9]+]]
	// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <8 x i8>*
	// CHECK-NEXT: [[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
	// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to <8 x i8>*
	// CHECK-NEXT: store <8 x i8> [[TMP0]], <8 x i8>* [[RETVAL_0__SROA_CAST]], align 8
	// CHECK-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 8
	// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP1]]
	//			//
	fixed_bool_t from_svbool_t(svbool_t type) {			fixed_bool_t from_svbool_t(svbool_t type) {
	return type;			return type;
	}			}

				// CHECK-LABEL: @lax_cast(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[TMP0:%.*]] = alloca <16 x i32>, align 64
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = call <16 x i32> @llvm.experimental.vector.extract.v16i32.nxv4i32(<vscale x 4 x i32> [[TYPE_COERCE:%.*]], i64 0)
				// CHECK-NEXT: store <16 x i32> [[CASTFIXEDSVE]], <16 x i32>* [[TMP0:%.*]], align 64, !tbaa [[TBAA6:![0-9]+]]
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <16 x i32>* [[TMP0]] to <vscale x 2 x i64>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 2 x i64>, <vscale x 2 x i64>* [[TMP1]], align 64, !tbaa [[TBAA6]]
				// CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]
				//
				svint64_t lax_cast(fixed_int32_t type) {
				return type;
				}

	// CHECK-LABEL: @to_svint32_t__from_gnu_int32_t(			// CHECK-LABEL: @to_svint32_t__from_gnu_int32_t(
	// CHECK-NEXT: entry:			// CHECK-NEXT: entry:
	// CHECK-NEXT: [[TYPE:%.*]] = load <16 x i32>, <16 x i32>* [[TMP0:%.*]], align 16, !tbaa [[TBAA6]]			// CHECK-NEXT: [[TYPE:%.*]] = load <16 x i32>, <16 x i32>* [[TMP0:%.*]], align 16, !tbaa [[TBAA6:![0-9]+]]
	// CHECK-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TYPE]], i64 0)			// CHECK-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TYPE]], i64 0)
	// CHECK-NEXT: ret <vscale x 4 x i32> [[CASTSCALABLESVE]]			// CHECK-NEXT: ret <vscale x 4 x i32> [[CASTSCALABLESVE]]
	//			//
	svint32_t to_svint32_t__from_gnu_int32_t(gnu_int32_t type) {			svint32_t to_svint32_t__from_gnu_int32_t(gnu_int32_t type) {
	return type;			return type;
	}			}

	// CHECK-LABEL: @from_svint32_t__to_gnu_int32_t(			// CHECK-LABEL: @from_svint32_t__to_gnu_int32_t(
	[28 lines not shown]

clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c

[12 lines not shown]
fixed_int32_t global_vec;		fixed_int32_t global_vec;

// CHECK-LABEL: @foo(		// CHECK-LABEL: @foo(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[RETVAL:%.*]] = alloca <16 x i32>, align 16		// CHECK-NEXT: [[RETVAL:%.*]] = alloca <16 x i32>, align 16
// CHECK-NEXT: [[PRED_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 2		// CHECK-NEXT: [[PRED_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 2
// CHECK-NEXT: [[VEC_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16		// CHECK-NEXT: [[VEC_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16
// CHECK-NEXT: [[PG:%.*]] = alloca <vscale x 16 x i1>, align 2		// CHECK-NEXT: [[PG:%.*]] = alloca <vscale x 16 x i1>, align 2
// CHECK-NEXT: [[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
// CHECK-NEXT: [[SAVED_VALUE1:%.*]] = alloca <8 x i8>, align 8
// CHECK-NEXT: store <vscale x 16 x i1> [[PRED:%.*]], <vscale x 16 x i1>* [[PRED_ADDR]], align 2		// CHECK-NEXT: store <vscale x 16 x i1> [[PRED:%.*]], <vscale x 16 x i1>* [[PRED_ADDR]], align 2
// CHECK-NEXT: store <vscale x 4 x i32> [[VEC:%.*]], <vscale x 4 x i32>* [[VEC_ADDR]], align 16		// CHECK-NEXT: store <vscale x 4 x i32> [[VEC:%.*]], <vscale x 4 x i32>* [[VEC_ADDR]], align 16
// CHECK-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PRED_ADDR]], align 2		// CHECK-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PRED_ADDR]], align 2
// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2		// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2
// CHECK-NEXT: store <8 x i8> [[TMP1]], <8 x i8>* [[SAVED_VALUE]], align 8		// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v8i8(<vscale x 2 x i8> undef, <8 x i8> [[TMP1]], i64 0)
// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*		// CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 8
// CHECK-NEXT: [[TMP3:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2		// CHECK-NEXT: [[TMP3:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2
// CHECK-NEXT: store <8 x i8> [[TMP3]], <8 x i8>* [[SAVED_VALUE1]], align 8		// CHECK-NEXT: [[CASTFIXEDSVE2:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v8i8(<vscale x 2 x i8> undef, <8 x i8> [[TMP3]], i64 0)
// CHECK-NEXT: [[CASTFIXEDSVE2:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE1]] to <vscale x 16 x i1>*		// CHECK-NEXT: [[TMP4:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE2]] to <vscale x 16 x i1>
// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE2]], align 8
// CHECK-NEXT: [[TMP5:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.and.z.nxv16i1(<vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1> [[TMP4]])		// CHECK-NEXT: [[TMP5:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.and.z.nxv16i1(<vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1> [[TMP4]])
// CHECK-NEXT: store <vscale x 16 x i1> [[TMP5]], <vscale x 16 x i1>* [[PG]], align 2		// CHECK-NEXT: store <vscale x 16 x i1> [[TMP5]], <vscale x 16 x i1>* [[PG]], align 2
// CHECK-NEXT: [[TMP6:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PG]], align 2		// CHECK-NEXT: [[TMP6:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PG]], align 2
// CHECK-NEXT: [[TMP7:%.*]] = load <16 x i32>, <16 x i32>* @global_vec, align 16		// CHECK-NEXT: [[TMP7:%.*]] = load <16 x i32>, <16 x i32>* @global_vec, align 16
// CHECK-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TMP7]], i64 0)		// CHECK-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TMP7]], i64 0)
// CHECK-NEXT: [[TMP8:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[VEC_ADDR]], align 16		// CHECK-NEXT: [[TMP8:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[VEC_ADDR]], align 16
// CHECK-NEXT: [[TMP9:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[TMP6]])		// CHECK-NEXT: [[TMP9:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[TMP6]])
// CHECK-NEXT: [[TMP10:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP9]], <vscale x 4 x i32> [[CASTSCALABLESVE]], <vscale x 4 x i32> [[TMP8]])		// CHECK-NEXT: [[TMP10:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP9]], <vscale x 4 x i32> [[CASTSCALABLESVE]], <vscale x 4 x i32> [[TMP8]])
[45 lines not shown]	fixed_int32_t array_arg(fixed_int32_t arr[]) {
return arr[0];		return arr[0];
}		}

// CHECK-LABEL: @address_of_array_idx(		// CHECK-LABEL: @address_of_array_idx(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[RETVAL:%.*]] = alloca <8 x i8>, align 2		// CHECK-NEXT: [[RETVAL:%.*]] = alloca <8 x i8>, align 2
// CHECK-NEXT: [[ARR:%.*]] = alloca [3 x <8 x i8>], align 2		// CHECK-NEXT: [[ARR:%.*]] = alloca [3 x <8 x i8>], align 2
// CHECK-NEXT: [[PARR:%.*]] = alloca <8 x i8>*, align 8		// CHECK-NEXT: [[PARR:%.*]] = alloca <8 x i8>*, align 8
// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 2
// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [3 x <8 x i8>], [3 x <8 x i8>]* [[ARR]], i64 0, i64 0		// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [3 x <8 x i8>], [3 x <8 x i8>]* [[ARR]], i64 0, i64 0
// CHECK-NEXT: store <8 x i8>* [[ARRAYIDX]], <8 x i8>** [[PARR]], align 8		// CHECK-NEXT: store <8 x i8>* [[ARRAYIDX]], <8 x i8>** [[PARR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load <8 x i8>*, <8 x i8>** [[PARR]], align 8		// CHECK-NEXT: [[TMP0:%.*]] = load <8 x i8>*, <8 x i8>** [[PARR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* [[TMP0]], align 2		// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* [[TMP0]], align 2
// CHECK-NEXT: store <8 x i8> [[TMP1]], <8 x i8>* [[RETVAL]], align 2		// CHECK-NEXT: store <8 x i8> [[TMP1]], <8 x i8>* [[RETVAL]], align 2
// CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to i8*		// CHECK-NEXT: [[TMP2:%.*]] = load <8 x i8>, <8 x i8>* [[RETVAL]], align 2
// CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i8>* [[RETVAL]] to i8*		// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v8i8(<vscale x 2 x i8> undef, <8 x i8> [[TMP2]], i64 0)
// CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 2 [[TMP2]], i8* align 2 [[TMP3]], i64 8, i1 false)		// CHECK-NEXT: [[TMP3:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 2		// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP3]]
// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP4]]
//		//
fixed_bool_t address_of_array_idx() {		fixed_bool_t address_of_array_idx() {
fixed_bool_t arr[3];		fixed_bool_t arr[3];
fixed_bool_t *parr;		fixed_bool_t *parr;
parr = &arr[0];		parr = &arr[0];
return *parr;		return *parr;
}		}

// CHECK-LABEL: @test_cast(		// CHECK-LABEL: @test_cast(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[RETVAL:%.*]] = alloca <16 x i32>, align 16		// CHECK-NEXT: [[RETVAL:%.*]] = alloca <16 x i32>, align 16
// CHECK-NEXT: [[PRED_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 2		// CHECK-NEXT: [[PRED_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 2
// CHECK-NEXT: [[VEC_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16		// CHECK-NEXT: [[VEC_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16
// CHECK-NEXT: [[XX:%.*]] = alloca <8 x i8>, align 8		// CHECK-NEXT: [[XX:%.*]] = alloca <8 x i8>, align 8
// CHECK-NEXT: [[YY:%.*]] = alloca <8 x i8>, align 8		// CHECK-NEXT: [[YY:%.*]] = alloca <8 x i8>, align 8
// CHECK-NEXT: [[PG:%.*]] = alloca <vscale x 16 x i1>, align 2		// CHECK-NEXT: [[PG:%.*]] = alloca <vscale x 16 x i1>, align 2
// CHECK-NEXT: [[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
// CHECK-NEXT: [[SAVED_VALUE1:%.*]] = alloca <8 x i8>, align 8
// CHECK-NEXT: store <vscale x 16 x i1> [[PRED:%.*]], <vscale x 16 x i1>* [[PRED_ADDR]], align 2		// CHECK-NEXT: store <vscale x 16 x i1> [[PRED:%.*]], <vscale x 16 x i1>* [[PRED_ADDR]], align 2
// CHECK-NEXT: store <vscale x 4 x i32> [[VEC:%.*]], <vscale x 4 x i32>* [[VEC_ADDR]], align 16		// CHECK-NEXT: store <vscale x 4 x i32> [[VEC:%.*]], <vscale x 4 x i32>* [[VEC_ADDR]], align 16
// CHECK-NEXT: store <8 x i8> <i8 1, i8 2, i8 3, i8 4, i8 0, i8 0, i8 0, i8 0>, <8 x i8>* [[XX]], align 8		// CHECK-NEXT: store <8 x i8> <i8 1, i8 2, i8 3, i8 4, i8 0, i8 0, i8 0, i8 0>, <8 x i8>* [[XX]], align 8
// CHECK-NEXT: store <8 x i8> <i8 2, i8 5, i8 4, i8 6, i8 0, i8 0, i8 0, i8 0>, <8 x i8>* [[YY]], align 8		// CHECK-NEXT: store <8 x i8> <i8 2, i8 5, i8 4, i8 6, i8 0, i8 0, i8 0, i8 0>, <8 x i8>* [[YY]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PRED_ADDR]], align 2		// CHECK-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PRED_ADDR]], align 2
// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2		// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2
// CHECK-NEXT: store <8 x i8> [[TMP1]], <8 x i8>* [[SAVED_VALUE]], align 8		// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v8i8(<vscale x 2 x i8> undef, <8 x i8> [[TMP1]], i64 0)
// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*		// CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 8
// CHECK-NEXT: [[TMP3:%.*]] = load <8 x i8>, <8 x i8>* [[XX]], align 8		// CHECK-NEXT: [[TMP3:%.*]] = load <8 x i8>, <8 x i8>* [[XX]], align 8
// CHECK-NEXT: [[TMP4:%.*]] = load <8 x i8>, <8 x i8>* [[YY]], align 8		// CHECK-NEXT: [[TMP4:%.*]] = load <8 x i8>, <8 x i8>* [[YY]], align 8
// CHECK-NEXT: [[ADD:%.*]] = add <8 x i8> [[TMP3]], [[TMP4]]		// CHECK-NEXT: [[ADD:%.*]] = add <8 x i8> [[TMP3]], [[TMP4]]
// CHECK-NEXT: store <8 x i8> [[ADD]], <8 x i8>* [[SAVED_VALUE1]], align 8		// CHECK-NEXT: [[CASTFIXEDSVE2:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v8i8(<vscale x 2 x i8> undef, <8 x i8> [[ADD]], i64 0)
// CHECK-NEXT: [[CASTFIXEDSVE2:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE1]] to <vscale x 16 x i1>*		// CHECK-NEXT: [[TMP5:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE2]] to <vscale x 16 x i1>
// CHECK-NEXT: [[TMP5:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE2]], align 8
// CHECK-NEXT: [[TMP6:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.and.z.nxv16i1(<vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1> [[TMP5]])		// CHECK-NEXT: [[TMP6:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.and.z.nxv16i1(<vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1> [[TMP5]])
// CHECK-NEXT: store <vscale x 16 x i1> [[TMP6]], <vscale x 16 x i1>* [[PG]], align 2		// CHECK-NEXT: store <vscale x 16 x i1> [[TMP6]], <vscale x 16 x i1>* [[PG]], align 2
// CHECK-NEXT: [[TMP7:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PG]], align 2		// CHECK-NEXT: [[TMP7:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PG]], align 2
// CHECK-NEXT: [[TMP8:%.*]] = load <16 x i32>, <16 x i32>* @global_vec, align 16		// CHECK-NEXT: [[TMP8:%.*]] = load <16 x i32>, <16 x i32>* @global_vec, align 16
// CHECK-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TMP8]], i64 0)		// CHECK-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TMP8]], i64 0)
// CHECK-NEXT: [[TMP9:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[VEC_ADDR]], align 16		// CHECK-NEXT: [[TMP9:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[VEC_ADDR]], align 16
// CHECK-NEXT: [[TMP10:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[TMP7]])		// CHECK-NEXT: [[TMP10:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[TMP7]])
// CHECK-NEXT: [[TMP11:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP10]], <vscale x 4 x i32> [[CASTSCALABLESVE]], <vscale x 4 x i32> [[TMP9]])		// CHECK-NEXT: [[TMP11:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP10]], <vscale x 4 x i32> [[CASTSCALABLESVE]], <vscale x 4 x i32> [[TMP9]])
[12 lines not shown]

clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

	[43 lines not shown]
	// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = call <32 x bfloat> @llvm.experimental.vector.extract.v32bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)			// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = call <32 x bfloat> @llvm.experimental.vector.extract.v32bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)
	// CHECK-512-NEXT: store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]			// CHECK-512-NEXT: store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
	// CHECK-512-NEXT: ret void			// CHECK-512-NEXT: ret void
	//			//
	void write_global_bf16(svbfloat16_t v) { global_bf16 = v; }			void write_global_bf16(svbfloat16_t v) { global_bf16 = v; }

	// CHECK-128-LABEL: @write_global_bool(			// CHECK-128-LABEL: @write_global_bool(
	// CHECK-128-NEXT: entry:			// CHECK-128-NEXT: entry:
	// CHECK-128-NEXT: [[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 2			// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1> [[V:%.*]] to <vscale x 2 x i8>
	// CHECK-128-NEXT: store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 2, !tbaa [[TBAA9:![0-9]+]]			// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = call <2 x i8> @llvm.experimental.vector.extract.v2i8.nxv2i8(<vscale x 2 x i8> [[TMP0]], i64 0)
	// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <2 x i8>*			// CHECK-128-NEXT: store <2 x i8> [[CASTFIXEDSVE]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6:![0-9]+]]
	// CHECK-128-NEXT: [[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 2, !tbaa [[TBAA6]]
	// CHECK-128-NEXT: store <2 x i8> [[TMP0]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
	// CHECK-128-NEXT: ret void			// CHECK-128-NEXT: ret void
	//			//
	// CHECK-512-LABEL: @write_global_bool(			// CHECK-512-LABEL: @write_global_bool(
	// CHECK-512-NEXT: entry:			// CHECK-512-NEXT: entry:
	// CHECK-512-NEXT: [[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 8			// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1> [[V:%.*]] to <vscale x 2 x i8>
	// CHECK-512-NEXT: store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA9:![0-9]+]]			// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = call <8 x i8> @llvm.experimental.vector.extract.v8i8.nxv2i8(<vscale x 2 x i8> [[TMP0]], i64 0)
	// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <8 x i8>*			// CHECK-512-NEXT: store <8 x i8> [[CASTFIXEDSVE]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
	// CHECK-512-NEXT: [[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
	// CHECK-512-NEXT: store <8 x i8> [[TMP0]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
	// CHECK-512-NEXT: ret void			// CHECK-512-NEXT: ret void
	//			//
	void write_global_bool(svbool_t v) { global_bool = v; }			void write_global_bool(svbool_t v) { global_bool = v; }

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// READS			// READS
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	[22 lines not shown]
	// CHECK-512-NEXT: [[TMP0:%.*]] = load <32 x bfloat>, <32 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]			// CHECK-512-NEXT: [[TMP0:%.*]] = load <32 x bfloat>, <32 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
	// CHECK-512-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 8 x bfloat> @llvm.experimental.vector.insert.nxv8bf16.v32bf16(<vscale x 8 x bfloat> undef, <32 x bfloat> [[TMP0]], i64 0)			// CHECK-512-NEXT: [[CASTSCALABLESVE:%.*]] = call <vscale x 8 x bfloat> @llvm.experimental.vector.insert.nxv8bf16.v32bf16(<vscale x 8 x bfloat> undef, <32 x bfloat> [[TMP0]], i64 0)
	// CHECK-512-NEXT: ret <vscale x 8 x bfloat> [[CASTSCALABLESVE]]			// CHECK-512-NEXT: ret <vscale x 8 x bfloat> [[CASTSCALABLESVE]]
	//			//
	svbfloat16_t read_global_bf16() { return global_bf16; }			svbfloat16_t read_global_bf16() { return global_bf16; }

	// CHECK-128-LABEL: @read_global_bool(			// CHECK-128-LABEL: @read_global_bool(
	// CHECK-128-NEXT: entry:			// CHECK-128-NEXT: entry:
	// CHECK-128-NEXT: [[SAVED_VALUE:%.*]] = alloca <2 x i8>, align 2
	// CHECK-128-NEXT: [[TMP0:%.*]] = load <2 x i8>, <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]			// CHECK-128-NEXT: [[TMP0:%.*]] = load <2 x i8>, <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
	// CHECK-128-NEXT: store <2 x i8> [[TMP0]], <2 x i8>* [[SAVED_VALUE]], align 2, !tbaa [[TBAA6]]			// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v2i8(<vscale x 2 x i8> undef, <2 x i8> [[TMP0]], i64 0)
	// CHECK-128-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <2 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*			// CHECK-128-NEXT: [[TMP1:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
	// CHECK-128-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 2, !tbaa [[TBAA6]]
	// CHECK-128-NEXT: ret <vscale x 16 x i1> [[TMP1]]			// CHECK-128-NEXT: ret <vscale x 16 x i1> [[TMP1]]
	//			//
	// CHECK-512-LABEL: @read_global_bool(			// CHECK-512-LABEL: @read_global_bool(
	// CHECK-512-NEXT: entry:			// CHECK-512-NEXT: entry:
	// CHECK-512-NEXT: [[SAVED_VALUE:%.*]] = alloca <8 x i8>, align 8
	// CHECK-512-NEXT: [[TMP0:%.*]] = load <8 x i8>, <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]			// CHECK-512-NEXT: [[TMP0:%.*]] = load <8 x i8>, <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
	// CHECK-512-NEXT: store <8 x i8> [[TMP0]], <8 x i8>* [[SAVED_VALUE]], align 8, !tbaa [[TBAA6]]			// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = call <vscale x 2 x i8> @llvm.experimental.vector.insert.nxv2i8.v8i8(<vscale x 2 x i8> undef, <8 x i8> [[TMP0]], i64 0)
	// CHECK-512-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <8 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*			// CHECK-512-NEXT: [[TMP1:%.*]] = bitcast <vscale x 2 x i8> [[CASTFIXEDSVE]] to <vscale x 16 x i1>
	// CHECK-512-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 8, !tbaa [[TBAA6]]
	// CHECK-512-NEXT: ret <vscale x 16 x i1> [[TMP1]]			// CHECK-512-NEXT: ret <vscale x 16 x i1> [[TMP1]]
	//			//
	svbool_t read_global_bool() { return global_bool; }			svbool_t read_global_bool() { return global_bool; }