This is an archive of the discontinued LLVM Phabricator instance.

Differential D85743

[CodeGen][AArch64] Support arm_sve_vector_bits attribute
Closed, Public

Authored by c-rhodes on Aug 11 2020, 9:01 AM.

Details

Reviewers

efriedma
sdesmalen
david-arm
paulwalker-arm
rsandifo-arm
cameron.mcinally
rengolin

Commits

rG2ddf795e8cac: Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
rG42587345a3af: [CodeGen][AArch64] Support arm_sve_vector_bits attribute

Summary

This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

Implicit casting between VLA <-> VLS types.
Coercion of VLS types in function args/return.
Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.
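
As a rough illustration of what "casting through memory" means, the sketch below stores a value through one type and loads it back through another, using a temporary buffer in place of the alloca that codegen would emit. It is plain C++ rather than clang or LLVM code; `FixedInt32x16`, `OpaqueVec512` and `castThroughMemory` are hypothetical names introduced only for this example, and a 512-bit vector length is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins, not clang or LLVM types.
struct FixedInt32x16 { std::int32_t lanes[16]; };  // models <16 x i32>
struct OpaqueVec512 { unsigned char bytes[64]; };  // models the scalable value at 512 bits

// Reinterpret `src` as a `To` by going through a temporary buffer,
// instead of a direct bitcast between the two types.
template <typename To, typename From>
To castThroughMemory(const From &src) {
  static_assert(sizeof(To) == sizeof(From),
                "both views must cover the same bytes");
  static_assert(std::is_trivially_copyable<To>::value &&
                    std::is_trivially_copyable<From>::value,
                "memcpy requires trivially copyable types");
  unsigned char tmp[sizeof(From)];       // plays the role of the temporary alloca
  std::memcpy(tmp, &src, sizeof(From));  // "store" using the source type
  To dst;
  std::memcpy(&dst, tmp, sizeof(To));    // "load" using the destination type
  return dst;
}
```

A round trip such as `castThroughMemory<FixedInt32x16>(castThroughMemory<OpaqueVec512>(v))` returns the original bytes. The extra store and load are what later optimisations would be expected to remove.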

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.
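
The element counts involved in this coercion can be sketched as follows, for data vectors only (predicates are left out). The helper names are hypothetical, and the sketch assumes the usual SVE rule that a scalable vector is a multiple of a 128-bit granule. With -msve-vector-bits=512 and 32-bit elements it gives the `<16 x i32>` and `<vscale x 4 x i32>` pair seen in the test output further down.

```cpp
#include <cassert>

// Assumption: SVE vector lengths are multiples of a 128-bit granule.
constexpr unsigned kSveGranuleBits = 128;

// Element count of the fixed-length IR vector for a VLS data type,
// e.g. 512-bit vectors of i32 give <16 x i32>.
constexpr unsigned fixedElementCount(unsigned vectorBits, unsigned elementBits) {
  return vectorBits / elementBits;
}

// Minimum element count N of the matching scalable type <vscale x N x T>,
// e.g. i32 gives <vscale x 4 x i32>.
constexpr unsigned scalableMinElementCount(unsigned elementBits) {
  return kSveGranuleBits / elementBits;
}
```

Only the fixed-length count depends on the chosen vector length. The scalable type is the same for every value of -msve-vector-bits.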

The VLS and VLA types are defined by the ACLE to map to the same
machine-level SVE vectors. This patch implements mangling support for
this. The mangling scheme is defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

This is based on the prototype in D85128.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Diff Detail

Event Timeline

c-rhodes created this revision. Aug 11 2020, 9:01 AM

Herald added a reviewer: rengolin. Aug 11 2020, 9:01 AM

Herald added a project: Restricted Project.

Herald added subscribers: aaron.ballman, danielkiss, kristof.beyls, tschuett.

c-rhodes requested review of this revision. Aug 11 2020, 9:01 AM

Harbormaster completed remote builds in B67929: Diff 284766. Aug 11 2020, 10:36 AM

efriedma added inline comments. Aug 12 2020, 11:29 AM

clang/lib/AST/ItaniumMangle.cpp
3330	Mangling them the same way is going to cause practical issues; they're different types from a C++ perspective, so they need distinct manglings. For example, you'll crash the compiler if you refer to both foo<svint64_t> and foo<fixed_int64_t>.
clang/lib/CodeGen/CGCall.cpp
1238	getFixedSize()?
1254	getFixedSize()? (etc.; please go through the whole patch.)

efriedma added a parent revision: D85736: [Sema][AArch64] Support arm_sve_vector_bits attribute. Aug 12 2020, 11:31 AM

Changes:

s/getKnownMinSize/getFixedSize/g fixes.
Implemented new mangling scheme for VLS types.

c-rhodes marked 2 inline comments as done. Aug 14 2020, 1:48 AM

c-rhodes added inline comments.

clang/lib/AST/ItaniumMangle.cpp
3330	> Mangling them the same way is going to cause practical issues; they're different types from a C++ perspective, so they need distinct manglings. For example, you'll crash the compiler if you refer to both foo<svint64_t> and foo<fixed_int64_t>.

The ACLE is yet to define the mangling scheme for fixed-length SVE types so I kept the mangling the same, which is also what GCC currently does. After speaking with @rsandifo-arm yesterday we agreed to come up with a mangling scheme where the types are mangled in the same way as:

    __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. For example:

    #if __ARM_FEATURE_SVE_BITS==512
    // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
    typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
    // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
    typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
    #endif

Let us know if you have any feedback/concerns about this approach.
clang/lib/CodeGen/CGCall.cpp
1361	@efriedma If we're happy with the element bitcast above this can also be fixed but I wasn't sure if that was ok, although it's pretty much what was implemented in the original codegen patch.
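
The `__SVE_VLS<typename, unsigned>` scheme proposed above can be reproduced with a few lines of string building. This is only a sketch of how the two example manglings are assembled, not the code in ItaniumMangle.cpp; `mangleFixedSve` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Build the mangling of __SVE_VLS<typeName, bits> as described in the
// comment above: a length-prefixed template name, then the template
// arguments between 'I' and 'E'.
std::string mangleFixedSve(const std::string &typeName, unsigned bits) {
  const std::string templateName = "__SVE_VLS";
  std::string out = std::to_string(templateName.size()) + templateName;
  out += 'I';                                           // start of template arguments
  out += 'u' + std::to_string(typeName.size()) + typeName;  // vendor extended type
  out += "Lj" + std::to_string(bits) + 'E';             // unsigned integer literal
  out += 'E';                                           // end of template arguments
  return out;
}
```

For instance, `mangleFixedSve("__SVInt32_t", 512)` produces the first mangling quoted in the comment and `mangleFixedSve("__SVBool_t", 512)` the second.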

david-arm added inline comments. Aug 14 2020, 6:28 AM

clang/lib/CodeGen/CGCall.cpp
1342	I think if you restructure the code here you could do:

    if (isa<llvm::ScalableVectorType>(SrcTy) ||
        isa<llvm::ScalableVectorType>(DstTy) ||
        SrcSize.getFixedSize() <= DstSize.getFixedSize())

since you know that the scalable types have been eliminated by the time we do the "<=" comparison.
1361	Given the if statement above has eliminated scalable vector types I think it's safe to use DstSize.getFixedSize() here.
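
The point of that ordering is short-circuit evaluation: the fixed-size query is only reached once both sizes are known not to be scalable. A small standalone model is sketched below. `SizeModel` and `canStoreDirectly` are hypothetical names, not LLVM's TypeSize API, and the model checks a flag on the size where the suggestion checks the type.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical model of a type size that may be scalable.
struct SizeModel {
  std::uint64_t minBytes;
  bool scalable;

  // Only meaningful for non-scalable sizes; throws otherwise so that a
  // misuse is visible in this model.
  std::uint64_t getFixedSize() const {
    if (scalable)
      throw std::logic_error("getFixedSize() called on a scalable size");
    return minBytes;
  }
};

// Mirrors the shape of the suggested condition: the scalable checks come
// first, so getFixedSize() is never evaluated for a scalable operand.
bool canStoreDirectly(const SizeModel &src, const SizeModel &dst) {
  return src.scalable || dst.scalable ||
         src.getFixedSize() <= dst.getFixedSize();
}
```

Calling `canStoreDirectly` with a scalable source or destination returns true without touching `getFixedSize()`, while two fixed sizes fall through to the ordinary comparison.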

More getFixedSize fixes.

c-rhodes marked 2 inline comments as done. Aug 14 2020, 6:55 AM

LGTM

Like I mentioned on the review for the prototype, I still think we should try to implement a scheme that makes CK_BitCast between fixed and scalable types trivial. Doing coercion this way is going to have a significant performance cost. But there isn't any user-visible effect, so I'm fine with leaving that for a followup.

clang/lib/AST/ItaniumMangle.cpp
3330	Makes sense.

This revision is now accepted and ready to land. Aug 14 2020, 2:54 PM

Closed by commit rG42587345a3af: [CodeGen][AArch64] Support arm_sve_vector_bits attribute (authored by c-rhodes). Aug 27 2020, 8:12 AM

This revision was automatically updated to reflect the committed changes.

c-rhodes added a commit: rG42587345a3af: [CodeGen][AArch64] Support arm_sve_vector_bits attribute.

In D85743#2219215, @efriedma wrote:

LGTM

Like I mentioned on the review for the prototype, I still think we should try to implement a scheme that makes CK_BitCast between fixed and scalable types trivial. Doing coercion this way is going to have a significant performance cost. But there isn't any user-visible effect, so I'm fine with leaving that for a followup.

I agree the bitcast scheme certainly isn't optimal but it's a start at least and something we intend to address going forward. Thanks for reviewing!

Hi! The attr-arm-sve-vector-bits-call.c test seems to be failing on our clang builders:

FAIL: Clang :: CodeGen/attr-arm-sve-vector-bits-call.c (3020 of 25924)
******************** TEST 'Clang :: CodeGen/attr-arm-sve-vector-bits-call.c' FAILED ********************
Script:
--
: 'RUN: at line 3';   /b/s/w/ir/k/staging/llvm_build/bin/clang -cc1 -internal-isystem /b/s/w/ir/k/staging/llvm_build/lib/clang/12.0.0/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - /b/s/w/ir/k/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-call.c | /b/s/w/ir/k/staging/llvm_build/bin/FileCheck /b/s/w/ir/k/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
--
Exit Code: 1

Command Output (stderr):
--
warning: Compiler has made implicit assumption that TypeSize is not scalable. This may or may not lead to broken code.
warning: Compiler has made implicit assumption that TypeSize is not scalable. This may or may not lead to broken code.
/b/s/w/ir/k/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-call.c:67:16: error: CHECK-NEXT: is not on the line after the previous match
// CHECK-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16
               ^
<stdin>:52:2: note: 'next' match was here
 %retval.coerce.i = alloca <vscale x 4 x i32>, align 16
 ^
<stdin>:50:7: note: previous match ended here
entry:
      ^
<stdin>:51:1: note: non-matching line after previous match is here
 %x.i = alloca <16 x i32>, align 16
^

Input file: <stdin>
Check file: /b/s/w/ir/k/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-call.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
         .
         .
         .
        47: 
        48: ; Function Attrs: nounwind readnone
        49: define <vscale x 4 x i32> @sizeless_caller(<vscale x 4 x i32> %x) local_unnamed_addr #1 {
        50: entry:
        51:  %x.i = alloca <16 x i32>, align 16
        52:  %retval.coerce.i = alloca <vscale x 4 x i32>, align 16
next:67      !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: match on wrong line
        53:  %x.addr = alloca <vscale x 4 x i32>, align 16
        54:  %coerce.coerce = alloca <vscale x 4 x i32>, align 16
        55:  %coerce1 = alloca <16 x i32>, align 16
        56:  %saved-call-rvalue = alloca <16 x i32>, align 64
        57:  store <vscale x 4 x i32> %x, <vscale x 4 x i32>* %x.addr, align 16, !tbaa !5
         .
         .
         .
>>>>>>

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
********************
Failed Tests (1):
  Clang :: CodeGen/attr-arm-sve-vector-bits-call.c

Could you take a look? Thanks.

Builder: https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112?

c-rhodes added a reverting change: rG2e7041fdc223: Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute". Aug 27 2020, 2:33 PM

In D85743#2242931, @leonardchan wrote:

Hi! The attr-arm-sve-vector-bits-call.c test seems to be failing on our clang builders:

Could you take a look? Thanks.

Builder: https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112?

Sorry about that, I've reverted it (commit 2e7041f) whilst I investigate. Thanks for raising.

c-rhodes added a commit: rG2ddf795e8cac: Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute". Aug 28 2020, 8:57 AM

The IR differences were caused by the new pass manager which is on by default for the Fuchsia builder. I've re-landed the patch with a fix for CodeGen/attr-arm-sve-vector-bits-call.c to use the legacy pm with -fno-experimental-new-pass-manager.

Thanks for the update! We do have the new PM on by default, but I'm surprised that this wouldn't appear on clang-x86_64-debian-new-pass-manager-fast which also tests the new PM.

No problem, I checked and it did fail for that builder [1] but for some reason I didn't receive an email.

[1] http://lab.llvm.org:8011/builders/clang-x86_64-debian-new-pass-manager-fast/builds/14050

Revision Contents

Path                                                       Size
clang/lib/AST/ItaniumMangle.cpp                            81 lines
clang/lib/CodeGen/CGCall.cpp                               44 lines
clang/lib/CodeGen/CGExprScalar.cpp                         28 lines
clang/lib/CodeGen/TargetInfo.cpp                           123 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c      278 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-call.c         337 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c         109 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c      117 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c      120 lines
clang/test/CodeGen/attr-arm-sve-vector-bits-types.c        581 lines
clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp        75 lines

Diff 284766

clang/lib/AST/ItaniumMangle.cpp

[525 lines not shown; context: #include "clang/AST/TypeNodes.inc"]
   void mangleExtParameterInfo(FunctionProtoType::ExtParameterInfo info);
   void mangleExtFunctionInfo(const FunctionType *T);
   void mangleBareFunctionType(const FunctionProtoType *T, bool MangleReturnType,
                               const FunctionDecl *FD = nullptr);
   void mangleNeonVectorType(const VectorType *T);
   void mangleNeonVectorType(const DependentVectorType *T);
   void mangleAArch64NeonVectorType(const VectorType *T);
   void mangleAArch64NeonVectorType(const DependentVectorType *T);
+  void mangleAArch64FixedSveVectorType(const VectorType *T);
+  void mangleAArch64FixedSveVectorType(const DependentVectorType *T);

   void mangleIntegerLiteral(QualType T, const llvm::APSInt &Value);
   void mangleMemberExprBase(const Expr *base, bool isArrow);
   void mangleMemberExpr(const Expr *base, bool isArrow,
                         NestedNameSpecifier *qualifier,
                         NamedDecl *firstQualifierLookup,
                         DeclarationName name,
                         const TemplateArgumentLoc *TemplateArgs,
[2,774 lines not shown]
 void CXXNameMangler::mangleAArch64NeonVectorType(const DependentVectorType *T) {
   DiagnosticsEngine &Diags = Context.getDiags();
   unsigned DiagID = Diags.getCustomDiagID(
       DiagnosticsEngine::Error,
       "cannot mangle this dependent neon vector type yet");
   Diags.Report(T->getAttributeLoc(), DiagID);
 }

+// The AArch64 ACLE specifies that fixed-length SVE vector and predicate types
+// defined with the 'arm_sve_vector_bits' attribute map to the same AAPCS64
+// type as the sizeless variants. The mangling scheme is defined in the
+// appendices to the Procedure Call Standard for the Arm Architecture, see:
+// https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling
+void CXXNameMangler::mangleAArch64FixedSveVectorType(const VectorType *T) {
+  assert((T->getVectorKind() == VectorType::SveFixedLengthDataVector ||
+          T->getVectorKind() == VectorType::SveFixedLengthPredicateVector) &&
+         "expected fixed-length SVE vector!");
+  QualType EltType = T->getElementType();
+  assert(EltType->isBuiltinType() &&
+         "expected builtin type for fixed-length SVE vector!");
+
+  StringRef TypeName;
+  switch (cast<BuiltinType>(EltType)->getKind()) {
+  case BuiltinType::SChar:
+    TypeName = "__SVInt8_t";
+    break;
+  case BuiltinType::UChar: {
+    if (T->getVectorKind() == VectorType::SveFixedLengthDataVector)
+      TypeName = "__SVUint8_t";
+    else
+      TypeName = "__SVBool_t";
+    break;
+  }
+  case BuiltinType::Short:
+    TypeName = "__SVInt16_t";
+    break;
+  case BuiltinType::UShort:
+    TypeName = "__SVUint16_t";
+    break;
+  case BuiltinType::Int:
+    TypeName = "__SVInt32_t";
+    break;
+  case BuiltinType::UInt:
+    TypeName = "__SVUint32_t";
+    break;
+  case BuiltinType::Long:
+    TypeName = "__SVInt64_t";
+    break;
+  case BuiltinType::ULong:
+    TypeName = "__SVUint64_t";
+    break;
+  case BuiltinType::Float16:
+    TypeName = "__SVFloat16_t";
+    break;
+  case BuiltinType::Float:
+    TypeName = "__SVFloat32_t";
+    break;
+  case BuiltinType::Double:
+    TypeName = "__SVFloat64_t";
+    break;
+  case BuiltinType::BFloat16:
+    TypeName = "__SVBfloat16_t";
+    break;
+  default:
+    llvm_unreachable("unexpected element type for fixed-length SVE vector!");
+  }
+
+  Out << 'u' << TypeName.size() << TypeName;
+}
+
+void CXXNameMangler::mangleAArch64FixedSveVectorType(
+    const DependentVectorType *T) {
+  DiagnosticsEngine &Diags = Context.getDiags();
+  unsigned DiagID = Diags.getCustomDiagID(
+      DiagnosticsEngine::Error,
+      "cannot mangle this dependent fixed-length SVE vector type yet");
+  Diags.Report(T->getAttributeLoc(), DiagID);
+}
+
 // GNU extension: vector types
 // <type>                  ::= <vector-type>
 // <vector-type>           ::= Dv <positive dimension number> _
 //                                    <extended element type>
 //                         ::= Dv [<dimension expression>] _ <element type>
 // <extended element type> ::= <element type>
 //                         ::= p # AltiVec vector pixel
 //                         ::= b # Altivec vector bool
 void CXXNameMangler::mangleType(const VectorType *T) {
   if ((T->getVectorKind() == VectorType::NeonVector ||
        T->getVectorKind() == VectorType::NeonPolyVector)) {
     llvm::Triple Target = getASTContext().getTargetInfo().getTriple();
     llvm::Triple::ArchType Arch =
         getASTContext().getTargetInfo().getTriple().getArch();
     if ((Arch == llvm::Triple::aarch64 ||
          Arch == llvm::Triple::aarch64_be) && !Target.isOSDarwin())
       mangleAArch64NeonVectorType(T);
     else
       mangleNeonVectorType(T);
     return;
+  } else if (T->getVectorKind() == VectorType::SveFixedLengthDataVector ||
+             T->getVectorKind() == VectorType::SveFixedLengthPredicateVector) {
+    mangleAArch64FixedSveVectorType(T);
+    return;
   }
   Out << "Dv" << T->getNumElements() << '_';
   if (T->getVectorKind() == VectorType::AltiVecPixel)
     Out << 'p';
   else if (T->getVectorKind() == VectorType::AltiVecBool)
     Out << 'b';
   else
     mangleType(T->getElementType());
 }

 void CXXNameMangler::mangleType(const DependentVectorType *T) {
   if ((T->getVectorKind() == VectorType::NeonVector ||
        T->getVectorKind() == VectorType::NeonPolyVector)) {
     llvm::Triple Target = getASTContext().getTargetInfo().getTriple();
     llvm::Triple::ArchType Arch =
         getASTContext().getTargetInfo().getTriple().getArch();
     if ((Arch == llvm::Triple::aarch64 || Arch == llvm::Triple::aarch64_be) &&
         !Target.isOSDarwin())
       mangleAArch64NeonVectorType(T);
     else
       mangleNeonVectorType(T);
     return;
+  } else if (T->getVectorKind() == VectorType::SveFixedLengthDataVector ||
+             T->getVectorKind() == VectorType::SveFixedLengthPredicateVector) {
+    mangleAArch64FixedSveVectorType(T);
+    return;
   }

   Out << "Dv";
   mangleExpression(T->getSizeExpr());
   Out << '_';
   if (T->getVectorKind() == VectorType::AltiVecPixel)
     Out << 'p';
   else if (T->getVectorKind() == VectorType::AltiVecBool)
[2,008 lines not shown]

clang/lib/CodeGen/CGCall.cpp

[1,113 lines not shown; context: if (IRCallArgPos < IRFuncTy->getNumParams() &&]
       V = Builder.CreateBitCast(V, IRFuncTy->getParamType(IRCallArgPos));

     IRCallArgs[IRCallArgPos++] = V;
   }
 }

 /// Create a temporary allocation for the purposes of coercion.
 static Address CreateTempAllocaForCoercion(CodeGenFunction &CGF, llvm::Type *Ty,
-                                           CharUnits MinAlign) {
+                                           CharUnits MinAlign,
+                                           const Twine &Name = "tmp") {
   // Don't use an alignment that's worse than what LLVM would prefer.
   auto PrefAlign = CGF.CGM.getDataLayout().getPrefTypeAlignment(Ty);
   CharUnits Align = std::max(MinAlign, CharUnits::fromQuantity(PrefAlign));

-  return CGF.CreateTempAlloca(Ty, Align);
+  return CGF.CreateTempAlloca(Ty, Align, Name + ".coerce");
 }

 /// EnterStructPointerForCoercedAccess - Given a struct pointer that we are
 /// accessing some number of bytes out of it, try to gep into the struct to get
 /// at its inner goodness. Dive as deep as possible without entering an element
 /// with an in-memory size smaller than DstSize.
 static Address
 EnterStructPointerForCoercedAccess(Address SrcPtr,
[89 lines not shown]
 static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty,
                                       CodeGenFunction &CGF) {
   llvm::Type *SrcTy = Src.getElementType();

   // If SrcTy and Ty are the same, just do a load.
   if (SrcTy == Ty)
     return CGF.Builder.CreateLoad(Src);

-  uint64_t DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(Ty);
+  llvm::TypeSize DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(Ty);

   if (llvm::StructType *SrcSTy = dyn_cast<llvm::StructType>(SrcTy)) {
-    Src = EnterStructPointerForCoercedAccess(Src, SrcSTy, DstSize, CGF);
+    Src = EnterStructPointerForCoercedAccess(Src, SrcSTy,
+                                             DstSize.getKnownMinSize(), CGF);
     SrcTy = Src.getElementType();
   }

-  uint64_t SrcSize = CGF.CGM.getDataLayout().getTypeAllocSize(SrcTy);
+  llvm::TypeSize SrcSize = CGF.CGM.getDataLayout().getTypeAllocSize(SrcTy);

   // If the source and destination are integer or pointer types, just do an
   // extension or truncation to the desired type.
   if ((isa<llvm::IntegerType>(Ty) || isa<llvm::PointerType>(Ty)) &&
       (isa<llvm::IntegerType>(SrcTy) || isa<llvm::PointerType>(SrcTy))) {
     llvm::Value *Load = CGF.Builder.CreateLoad(Src);
     return CoerceIntOrPtrToIntOrPtr(Load, Ty, CGF);
   }

   // If load is legal, just bitcast the src pointer.
-  if (SrcSize >= DstSize) {
+  if ((!SrcSize.isScalable() && !DstSize.isScalable()) &&
+      SrcSize.getKnownMinSize() >= DstSize.getKnownMinSize()) {
     // Generally SrcSize is never greater than DstSize, since this means we are
     // losing bits. However, this can happen in cases where the structure has
     // additional padding, for example due to a user specified alignment.
     //
     // FIXME: Assert that we aren't truncating non-padding bits when have access
     // to that information.
     Src = CGF.Builder.CreateBitCast(Src,
                                     Ty->getPointerTo(Src.getAddressSpace()));
     return CGF.Builder.CreateLoad(Src);
   }

   // Otherwise do coercion through memory. This is stupid, but simple.
-  Address Tmp = CreateTempAllocaForCoercion(CGF, Ty, Src.getAlignment());
-  CGF.Builder.CreateMemCpy(Tmp.getPointer(), Tmp.getAlignment().getAsAlign(),
-                           Src.getPointer(), Src.getAlignment().getAsAlign(),
-                           llvm::ConstantInt::get(CGF.IntPtrTy, SrcSize));
+  Address Tmp =
+      CreateTempAllocaForCoercion(CGF, Ty, Src.getAlignment(), Src.getName());
+  CGF.Builder.CreateMemCpy(
+      Tmp.getPointer(), Tmp.getAlignment().getAsAlign(), Src.getPointer(),
+      Src.getAlignment().getAsAlign(),
+      llvm::ConstantInt::get(CGF.IntPtrTy, SrcSize.getKnownMinSize()));
   return CGF.Builder.CreateLoad(Tmp);
 }

 // Function to store a first-class aggregate into memory. We prefer to
 // store the elements rather than the aggregate to be more friendly to
 // fast-isel.
 // FIXME: Do we need to recurse here?
 void CodeGenFunction::EmitAggregateStore(llvm::Value *Val, Address Dest,
Show All 22 Lines	static void CreateCoercedStore(llvm::Value *Src,
CodeGenFunction &CGF) {		CodeGenFunction &CGF) {
llvm::Type *SrcTy = Src->getType();		llvm::Type *SrcTy = Src->getType();
llvm::Type *DstTy = Dst.getElementType();		llvm::Type *DstTy = Dst.getElementType();
if (SrcTy == DstTy) {		if (SrcTy == DstTy) {
CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);		CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);
return;		return;
}		}

uint64_t SrcSize = CGF.CGM.getDataLayout().getTypeAllocSize(SrcTy);		llvm::TypeSize SrcSize = CGF.CGM.getDataLayout().getTypeAllocSize(SrcTy);

if (llvm::StructType *DstSTy = dyn_cast<llvm::StructType>(DstTy)) {		if (llvm::StructType *DstSTy = dyn_cast<llvm::StructType>(DstTy)) {
Dst = EnterStructPointerForCoercedAccess(Dst, DstSTy, SrcSize, CGF);		Dst = EnterStructPointerForCoercedAccess(Dst, DstSTy,
		SrcSize.getKnownMinSize(), CGF);
DstTy = Dst.getElementType();		DstTy = Dst.getElementType();
}		}

llvm::PointerType *SrcPtrTy = llvm::dyn_cast<llvm::PointerType>(SrcTy);		llvm::PointerType *SrcPtrTy = llvm::dyn_cast<llvm::PointerType>(SrcTy);
llvm::PointerType *DstPtrTy = llvm::dyn_cast<llvm::PointerType>(DstTy);		llvm::PointerType *DstPtrTy = llvm::dyn_cast<llvm::PointerType>(DstTy);
if (SrcPtrTy && DstPtrTy &&		if (SrcPtrTy && DstPtrTy &&
SrcPtrTy->getAddressSpace() != DstPtrTy->getAddressSpace()) {		SrcPtrTy->getAddressSpace() != DstPtrTy->getAddressSpace()) {
Src = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(Src, DstTy);		Src = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(Src, DstTy);
CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);		CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);
return;		return;
}		}

// If the source and destination are integer or pointer types, just do an		// If the source and destination are integer or pointer types, just do an
// extension or truncation to the desired type.		// extension or truncation to the desired type.
if ((isa<llvm::IntegerType>(SrcTy) || isa<llvm::PointerType>(SrcTy)) &&		if ((isa<llvm::IntegerType>(SrcTy) || isa<llvm::PointerType>(SrcTy)) &&
(isa<llvm::IntegerType>(DstTy) || isa<llvm::PointerType>(DstTy))) {		(isa<llvm::IntegerType>(DstTy) || isa<llvm::PointerType>(DstTy))) {
Src = CoerceIntOrPtrToIntOrPtr(Src, DstTy, CGF);		Src = CoerceIntOrPtrToIntOrPtr(Src, DstTy, CGF);
CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);		CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);
return;		return;
}		}

uint64_t DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(DstTy);		llvm::TypeSize DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(DstTy);

// If store is legal, just bitcast the src pointer.		// If store is legal, just bitcast the src pointer.
if (SrcSize <= DstSize) {		// FIXME: does this check for scalable vectors need to be more conservative?
		if (SrcSize.getKnownMinSize() <= DstSize.getKnownMinSize() ||
		(isa<llvm::ScalableVectorType>(SrcTy) ||
		david-arm: I think if you restructure the code here you could do: if (isa<llvm::ScalableVectorType>(SrcTy) || isa<llvm::ScalableVectorType>(DstTy) || SrcSize.getFixedSize() <= DstSize.getFixedSize()) since you know that the scalable types have been eliminated by the time we do the "<=" comparison.
		isa<llvm::ScalableVectorType>(DstTy))) {
Dst = CGF.Builder.CreateElementBitCast(Dst, SrcTy);		Dst = CGF.Builder.CreateElementBitCast(Dst, SrcTy);
CGF.EmitAggregateStore(Src, Dst, DstIsVolatile);		CGF.EmitAggregateStore(Src, Dst, DstIsVolatile);
} else {		} else {
// Otherwise do coercion through memory. This is stupid, but		// Otherwise do coercion through memory. This is stupid, but
// simple.		// simple.

// Generally SrcSize is never greater than DstSize, since this means we are		// Generally SrcSize is never greater than DstSize, since this means we are
// losing bits. However, this can happen in cases where the structure has		// losing bits. However, this can happen in cases where the structure has
// additional padding, for example due to a user specified alignment.		// additional padding, for example due to a user specified alignment.
//		//
// FIXME: Assert that we aren't truncating non-padding bits when have access		// FIXME: Assert that we aren't truncating non-padding bits when have access
// to that information.		// to that information.
Address Tmp = CreateTempAllocaForCoercion(CGF, SrcTy, Dst.getAlignment());		Address Tmp = CreateTempAllocaForCoercion(CGF, SrcTy, Dst.getAlignment());
CGF.Builder.CreateStore(Src, Tmp);		CGF.Builder.CreateStore(Src, Tmp);
CGF.Builder.CreateMemCpy(Dst.getPointer(), Dst.getAlignment().getAsAlign(),		CGF.Builder.CreateMemCpy(
Tmp.getPointer(), Tmp.getAlignment().getAsAlign(),		Dst.getPointer(), Dst.getAlignment().getAsAlign(), Tmp.getPointer(),
llvm::ConstantInt::get(CGF.IntPtrTy, DstSize));		Tmp.getAlignment().getAsAlign(),
		llvm::ConstantInt::get(CGF.IntPtrTy, DstSize.getKnownMinSize()));
		c-rhodes (author): @efriedma If we're happy with the element bitcast above this can also be fixed, but I wasn't sure if that was OK, although it's pretty much what was implemented in the original codegen patch.
		david-arm: Given the if statement above has eliminated scalable vector types, I think it's safe to use DstSize.getFixedSize() here.
}		}
}		}
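The restructured size check suggested in the inline comments on this function can be sketched in isolation. This is only an illustration: `SizeInfo` and `canStoreDirectly` are hypothetical stand-ins (not LLVM APIs) for `llvm::TypeSize` and the condition in `CreateCoercedStore`, and the sketch assumes that a scalable operand on either side always takes the direct-store path.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::TypeSize: a known-minimum size in bytes and
// a flag saying whether the real size is a runtime multiple of that minimum.
struct SizeInfo {
  std::uint64_t MinSize;
  bool Scalable;
};

// Returns true when the store can go straight through a bitcast pointer.
// Scalable operands are accepted first, so the "<=" comparison below only
// ever sees fixed sizes.
bool canStoreDirectly(SizeInfo Src, SizeInfo Dst) {
  if (Src.Scalable || Dst.Scalable)
    return true;
  return Src.MinSize <= Dst.MinSize;
}
```

Ordering the checks this way is what would make a fixed-size accessor safe in the comparison and in the fallback branch, since neither is reached with a scalable size.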

static Address emitAddressAtOffset(CodeGenFunction &CGF, Address addr,		static Address emitAddressAtOffset(CodeGenFunction &CGF, Address addr,
const ABIArgInfo &info) {		const ABIArgInfo &info) {
if (unsigned offset = info.getDirectOffset()) {		if (unsigned offset = info.getDirectOffset()) {
addr = CGF.Builder.CreateElementBitCast(addr, CGF.Int8Ty);		addr = CGF.Builder.CreateElementBitCast(addr, CGF.Int8Ty);
addr = CGF.Builder.CreateConstInBoundsByteGEP(addr,		addr = CGF.Builder.CreateConstInBoundsByteGEP(addr,
[3,798 lines elided]

clang/lib/CodeGen/CGExprScalar.cpp

[2,073 lines elided]	if (auto *CI = dyn_cast<llvm::CallBase>(Src)) {
if (CI->getMetadata("heapallocsite") && isa<ExplicitCastExpr>(CE)) {		if (CI->getMetadata("heapallocsite") && isa<ExplicitCastExpr>(CE)) {
QualType PointeeType = DestTy->getPointeeType();		QualType PointeeType = DestTy->getPointeeType();
if (!PointeeType.isNull())		if (!PointeeType.isNull())
CGF.getDebugInfo()->addHeapAllocSiteMetadata(CI, PointeeType,		CGF.getDebugInfo()->addHeapAllocSiteMetadata(CI, PointeeType,
CE->getExprLoc());		CE->getExprLoc());
}		}
}		}

		// Perform VLAT <-> VLST bitcast through memory.
		if ((isa<llvm::FixedVectorType>(SrcTy) &&
		isa<llvm::ScalableVectorType>(DstTy)) ||
		(isa<llvm::ScalableVectorType>(SrcTy) &&
		isa<llvm::FixedVectorType>(DstTy))) {
		if (const CallExpr *CE = dyn_cast<CallExpr>(E)) {
		// Call expressions can't have a scalar return unless the return type
		// is a reference type so an lvalue can't be emitted. Create a temp
		// alloca to store the call, bitcast the address then load.
		QualType RetTy = CE->getCallReturnType(CGF.getContext());
		Address Addr =
		CGF.CreateDefaultAlignTempAlloca(SrcTy, "saved-call-rvalue");
		LValue LV = CGF.MakeAddrLValue(Addr, RetTy);
		CGF.EmitStoreOfScalar(Src, LV);
		Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy),
		"castFixedSve");
		LValue DestLV = CGF.MakeAddrLValue(Addr, DestTy);
		DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());
		return EmitLoadOfLValue(DestLV, CE->getExprLoc());
		}

		Address Addr = EmitLValue(E).getAddress(CGF);
		Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy));
		LValue DestLV = CGF.MakeAddrLValue(Addr, DestTy);
		DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());
		return EmitLoadOfLValue(DestLV, CE->getExprLoc());
		}

return Builder.CreateBitCast(Src, DstTy);		return Builder.CreateBitCast(Src, DstTy);
}		}
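As a rough analogy for the "bitcast through memory" path added above, the same idea can be shown in plain C++ without any SVE types. This is a sketch under assumptions: `castThroughMemory`, `TwoLanes`, and `FourLanes` are hypothetical names that do not appear in the patch, and the sketch requires equal sizes, which is only a stand-in for the VLAT/VLST pairing handled by the real code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical illustration only: reinterpret the bytes of Src as a To by
// copying through memory, rather than by a value-level cast.
template <typename To, typename From>
To castThroughMemory(const From &Src) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value &&
                    std::is_trivially_copyable<From>::value,
                "both types must be trivially copyable");
  To Dst;
  std::memcpy(&Dst, &Src, sizeof(To));
  return Dst;
}

// Two layouts of the same 128 bits, standing in for the two vector types.
struct TwoLanes {
  std::int64_t Lane[2];
};
struct FourLanes {
  std::int32_t Lane[4];
};
```

A round trip such as `castThroughMemory<TwoLanes>(castThroughMemory<FourLanes>(A))` returns the original bytes, which is the property the store, address bitcast, and load sequence relies on.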
case CK_AddressSpaceConversion: {		case CK_AddressSpaceConversion: {
Expr::EvalResult Result;		Expr::EvalResult Result;
if (E->EvaluateAsRValue(Result, CGF.getContext()) &&		if (E->EvaluateAsRValue(Result, CGF.getContext()) &&
Result.Val.isNullPointer()) {		Result.Val.isNullPointer()) {
// If E has side effect, it is emitted even if its final result is a		// If E has side effect, it is emitted even if its final result is a
// null pointer. In that case, a DCE pass should be able to		// null pointer. In that case, a DCE pass should be able to
[2,933 lines elided]

clang/lib/CodeGen/TargetInfo.cpp


[5,446 lines elided]	AArch64ABIInfo(CodeGenTypes &CGT, ABIKind Kind)
: SwiftABIInfo(CGT), Kind(Kind) {}		: SwiftABIInfo(CGT), Kind(Kind) {}

private:		private:
ABIKind getABIKind() const { return Kind; }		ABIKind getABIKind() const { return Kind; }
bool isDarwinPCS() const { return Kind == DarwinPCS; }		bool isDarwinPCS() const { return Kind == DarwinPCS; }

ABIArgInfo classifyReturnType(QualType RetTy, bool IsVariadic) const;		ABIArgInfo classifyReturnType(QualType RetTy, bool IsVariadic) const;
ABIArgInfo classifyArgumentType(QualType RetTy) const;		ABIArgInfo classifyArgumentType(QualType RetTy) const;
		ABIArgInfo coerceIllegalVector(QualType Ty) const;
bool isHomogeneousAggregateBaseType(QualType Ty) const override;		bool isHomogeneousAggregateBaseType(QualType Ty) const override;
bool isHomogeneousAggregateSmallEnough(const Type *Ty,		bool isHomogeneousAggregateSmallEnough(const Type *Ty,
uint64_t Members) const override;		uint64_t Members) const override;

bool isIllegalVectorType(QualType Ty) const;		bool isIllegalVectorType(QualType Ty) const;

void computeInfo(CGFunctionInfo &FI) const override {		void computeInfo(CGFunctionInfo &FI) const override {
if (!::classifyReturnType(getCXXABI(), FI, *this))		if (!::classifyReturnType(getCXXABI(), FI, *this))
[117 lines elided]	void WindowsAArch64TargetCodeGenInfo::setTargetAttributes(
const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &CGM) const {		const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &CGM) const {
AArch64TargetCodeGenInfo::setTargetAttributes(D, GV, CGM);		AArch64TargetCodeGenInfo::setTargetAttributes(D, GV, CGM);
if (GV->isDeclaration())		if (GV->isDeclaration())
return;		return;
addStackProbeTargetAttributes(D, GV, CGM);		addStackProbeTargetAttributes(D, GV, CGM);
}		}
}		}

ABIArgInfo AArch64ABIInfo::classifyArgumentType(QualType Ty) const {		ABIArgInfo AArch64ABIInfo::coerceIllegalVector(QualType Ty) const {
Ty = useFirstFieldIfTransparentUnion(Ty);		assert(Ty->isVectorType() && "expected vector type!");

		const auto *VT = Ty->castAs<VectorType>();
		if (VT->getVectorKind() == VectorType::SveFixedLengthPredicateVector) {
		assert(VT->getElementType()->isBuiltinType() && "expected builtin type!");
		assert(VT->getElementType()->castAs<BuiltinType>()->getKind() ==
		BuiltinType::UChar &&
		"unexpected builtin type for SVE predicate!");
		return ABIArgInfo::getDirect(llvm::ScalableVectorType::get(
		llvm::Type::getInt1Ty(getVMContext()), 16));
		}

		if (VT->getVectorKind() == VectorType::SveFixedLengthDataVector) {
		assert(VT->getElementType()->isBuiltinType() && "expected builtin type!");

		const auto *BT = VT->getElementType()->castAs<BuiltinType>();
		llvm::ScalableVectorType *ResType = nullptr;
		switch (BT->getKind()) {
		default:
		llvm_unreachable("unexpected builtin type for SVE vector!");
		case BuiltinType::SChar:
		case BuiltinType::UChar:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getInt8Ty(getVMContext()), 16);
		break;
		case BuiltinType::Short:
		case BuiltinType::UShort:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getInt16Ty(getVMContext()), 8);
		break;
		case BuiltinType::Int:
		case BuiltinType::UInt:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getInt32Ty(getVMContext()), 4);
		break;
		case BuiltinType::Long:
		case BuiltinType::ULong:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getInt64Ty(getVMContext()), 2);
		break;
		case BuiltinType::Float16:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getHalfTy(getVMContext()), 8);
		break;
		case BuiltinType::Float:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getFloatTy(getVMContext()), 4);
		break;
		case BuiltinType::Double:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getDoubleTy(getVMContext()), 2);
		break;
		case BuiltinType::BFloat16:
		ResType = llvm::ScalableVectorType::get(
		llvm::Type::getBFloatTy(getVMContext()), 8);
		break;
		}
		return ABIArgInfo::getDirect(ResType);
		}

// Handle illegal vector types here.
if (isIllegalVectorType(Ty)) {
uint64_t Size = getContext().getTypeSize(Ty);		uint64_t Size = getContext().getTypeSize(Ty);
// Android promotes <2 x i8> to i16, not i32		// Android promotes <2 x i8> to i16, not i32
if (isAndroid() && (Size <= 16)) {		if (isAndroid() && (Size <= 16)) {
llvm::Type *ResType = llvm::Type::getInt16Ty(getVMContext());		llvm::Type *ResType = llvm::Type::getInt16Ty(getVMContext());
return ABIArgInfo::getDirect(ResType);		return ABIArgInfo::getDirect(ResType);
}		}
if (Size <= 32) {		if (Size <= 32) {
llvm::Type *ResType = llvm::Type::getInt32Ty(getVMContext());		llvm::Type *ResType = llvm::Type::getInt32Ty(getVMContext());
return ABIArgInfo::getDirect(ResType);		return ABIArgInfo::getDirect(ResType);
}		}
if (Size == 64) {		if (Size == 64) {
auto *ResType =		auto *ResType =
llvm::FixedVectorType::get(llvm::Type::getInt32Ty(getVMContext()), 2);		llvm::FixedVectorType::get(llvm::Type::getInt32Ty(getVMContext()), 2);
return ABIArgInfo::getDirect(ResType);		return ABIArgInfo::getDirect(ResType);
}		}
if (Size == 128) {		if (Size == 128) {
auto *ResType =		auto *ResType =
llvm::FixedVectorType::get(llvm::Type::getInt32Ty(getVMContext()), 4);		llvm::FixedVectorType::get(llvm::Type::getInt32Ty(getVMContext()), 4);
return ABIArgInfo::getDirect(ResType);		return ABIArgInfo::getDirect(ResType);
}		}
return getNaturalAlignIndirect(Ty, /*ByVal=*/false);		return getNaturalAlignIndirect(Ty, /*ByVal=*/false);
}		}
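The lane counts hard-coded in the switch above follow a simple pattern: each scalable type covers a 128-bit minimum, so the count is 128 divided by the element width. A minimal sketch of that arithmetic, using hypothetical helper names that are not part of the patch:

```cpp
#include <cassert>

// Hypothetical helper (not part of the patch): minimum number of lanes in the
// scalable vector type used to pass an SVE data vector whose elements are
// ElementBits wide. The minimum SVE vector length is 128 bits.
constexpr unsigned minSveLanes(unsigned ElementBits) {
  return 128u / ElementBits;
}

// Predicates carry one bit per byte of the data vector, hence 16 lanes of i1.
constexpr unsigned minSvePredicateLanes() { return 128u / 8u; }
```

For example, `minSveLanes(16)` gives 8, matching the `Short`, `Float16`, and `BFloat16` cases, and `minSvePredicateLanes()` gives the 16 used for the predicate type.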

		ABIArgInfo AArch64ABIInfo::classifyArgumentType(QualType Ty) const {
		Ty = useFirstFieldIfTransparentUnion(Ty);

		// Handle illegal vector types here.
		if (isIllegalVectorType(Ty))
		return coerceIllegalVector(Ty);

if (!isAggregateTypeForABI(Ty)) {		if (!isAggregateTypeForABI(Ty)) {
// Treat an enum type as its underlying type.		// Treat an enum type as its underlying type.
if (const EnumType *EnumTy = Ty->getAs<EnumType>())		if (const EnumType *EnumTy = Ty->getAs<EnumType>())
Ty = EnumTy->getDecl()->getIntegerType();		Ty = EnumTy->getDecl()->getIntegerType();

if (const auto *EIT = Ty->getAs<ExtIntType>())		if (const auto *EIT = Ty->getAs<ExtIntType>())
if (EIT->getNumBits() > 128)		if (EIT->getNumBits() > 128)
return getNaturalAlignIndirect(Ty);		return getNaturalAlignIndirect(Ty);
[61 lines elided]	ABIArgInfo AArch64ABIInfo::classifyArgumentType(QualType Ty) const {
return getNaturalAlignIndirect(Ty, /*ByVal=*/false);		return getNaturalAlignIndirect(Ty, /*ByVal=*/false);
}		}

ABIArgInfo AArch64ABIInfo::classifyReturnType(QualType RetTy,		ABIArgInfo AArch64ABIInfo::classifyReturnType(QualType RetTy,
bool IsVariadic) const {		bool IsVariadic) const {
if (RetTy->isVoidType())		if (RetTy->isVoidType())
return ABIArgInfo::getIgnore();		return ABIArgInfo::getIgnore();

		if (const auto *VT = RetTy->getAs<VectorType>()) {
		if (VT->getVectorKind() == VectorType::SveFixedLengthDataVector ||
		VT->getVectorKind() == VectorType::SveFixedLengthPredicateVector)
		return coerceIllegalVector(RetTy);
		}

// Large vector types should be returned via memory.		// Large vector types should be returned via memory.
if (RetTy->isVectorType() && getContext().getTypeSize(RetTy) > 128)		if (RetTy->isVectorType() && getContext().getTypeSize(RetTy) > 128)
return getNaturalAlignIndirect(RetTy);		return getNaturalAlignIndirect(RetTy);

if (!isAggregateTypeForABI(RetTy)) {		if (!isAggregateTypeForABI(RetTy)) {
// Treat an enum type as its underlying type.		// Treat an enum type as its underlying type.
if (const EnumType *EnumTy = RetTy->getAs<EnumType>())		if (const EnumType *EnumTy = RetTy->getAs<EnumType>())
RetTy = EnumTy->getDecl()->getIntegerType();		RetTy = EnumTy->getDecl()->getIntegerType();
[39 lines elided]	ABIArgInfo AArch64ABIInfo::classifyReturnType(QualType RetTy,
}		}

return getNaturalAlignIndirect(RetTy);		return getNaturalAlignIndirect(RetTy);
}		}

/// isIllegalVectorType - check whether the vector type is legal for AArch64.		/// isIllegalVectorType - check whether the vector type is legal for AArch64.
bool AArch64ABIInfo::isIllegalVectorType(QualType Ty) const {		bool AArch64ABIInfo::isIllegalVectorType(QualType Ty) const {
if (const VectorType *VT = Ty->getAs<VectorType>()) {		if (const VectorType *VT = Ty->getAs<VectorType>()) {
		// Check whether VT is a fixed-length SVE vector. These types are
		// represented as scalable vectors in function args/return and must be
		// coerced from fixed vectors.
		if (VT->getVectorKind() == VectorType::SveFixedLengthDataVector ||
		VT->getVectorKind() == VectorType::SveFixedLengthPredicateVector)
		return true;

// Check whether VT is legal.		// Check whether VT is legal.
unsigned NumElements = VT->getNumElements();		unsigned NumElements = VT->getNumElements();
uint64_t Size = getContext().getTypeSize(VT);		uint64_t Size = getContext().getTypeSize(VT);
// NumElements should be power of 2.		// NumElements should be power of 2.
if (!llvm::isPowerOf2_32(NumElements))		if (!llvm::isPowerOf2_32(NumElements))
return true;		return true;

// arm64_32 has to be compatible with the ARM logic here, which allows huge		// arm64_32 has to be compatible with the ARM logic here, which allows huge
[5,359 lines elided]

clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
				// REQUIRES: aarch64-registered-target
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=128 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-128
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=256 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-256
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-512

				#include <arm_sve.h>

				#define N __ARM_FEATURE_SVE_BITS_EXPERIMENTAL

				typedef svint64_t fixed_int64_t __attribute__((arm_sve_vector_bits(N)));
				typedef svfloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(N)));
				typedef svbfloat16_t fixed_bfloat16_t __attribute__((arm_sve_vector_bits(N)));
				typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));

				#define DEFINE_STRUCT(ty) \
				struct struct_##ty { \
				fixed_##ty##_t x, y[3]; \
				} struct_##ty;

				DEFINE_STRUCT(int64)
				DEFINE_STRUCT(float64)
				DEFINE_STRUCT(bfloat16)
				DEFINE_STRUCT(bool)

				//===----------------------------------------------------------------------===//
				// int64
				//===----------------------------------------------------------------------===//

				// CHECK-128-LABEL: @read_int64(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_INT64:%.*]], %struct.struct_int64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <2 x i64>* [[ARRAYIDX]] to <vscale x 2 x i64>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <vscale x 2 x i64>, <vscale x 2 x i64>* [[TMP0]], align 16, !tbaa !2
				// CHECK-128-NEXT: ret <vscale x 2 x i64> [[TMP1]]
				//
				// CHECK-256-LABEL: @read_int64(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_INT64:%.*]], %struct.struct_int64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <4 x i64>* [[ARRAYIDX]] to <vscale x 2 x i64>*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load <vscale x 2 x i64>, <vscale x 2 x i64>* [[TMP0]], align 16, !tbaa !2
				// CHECK-256-NEXT: ret <vscale x 2 x i64> [[TMP1]]
				//
				// CHECK-512-LABEL: @read_int64(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_INT64:%.*]], %struct.struct_int64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <8 x i64>* [[ARRAYIDX]] to <vscale x 2 x i64>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <vscale x 2 x i64>, <vscale x 2 x i64>* [[TMP0]], align 16, !tbaa !2
				// CHECK-512-NEXT: ret <vscale x 2 x i64> [[TMP1]]
				//
				svint64_t read_int64(struct struct_int64 *s) {
				return s->y[0];
				}

				// CHECK-128-LABEL: @write_int64(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 2 x i64>, align 16
				// CHECK-128-NEXT: store <vscale x 2 x i64> [[X:%.*]], <vscale x 2 x i64>* [[X_ADDR]], align 16, !tbaa !5
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x i64>* [[X_ADDR]] to <2 x i64>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <2 x i64>, <2 x i64>* [[TMP0]], align 16, !tbaa !2
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_INT64:%.*]], %struct.struct_int64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: store <2 x i64> [[TMP1]], <2 x i64>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-128-NEXT: ret void
				//
				// CHECK-256-LABEL: @write_int64(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 2 x i64>, align 16
				// CHECK-256-NEXT: store <vscale x 2 x i64> [[X:%.*]], <vscale x 2 x i64>* [[X_ADDR]], align 16, !tbaa !5
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x i64>* [[X_ADDR]] to <4 x i64>*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load <4 x i64>, <4 x i64>* [[TMP0]], align 16, !tbaa !2
				// CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_INT64:%.*]], %struct.struct_int64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-256-NEXT: store <4 x i64> [[TMP1]], <4 x i64>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-256-NEXT: ret void
				//
				// CHECK-512-LABEL: @write_int64(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 2 x i64>, align 16
				// CHECK-512-NEXT: store <vscale x 2 x i64> [[X:%.*]], <vscale x 2 x i64>* [[X_ADDR]], align 16, !tbaa !5
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x i64>* [[X_ADDR]] to <8 x i64>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <8 x i64>, <8 x i64>* [[TMP0]], align 16, !tbaa !2
				// CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_INT64:%.*]], %struct.struct_int64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-512-NEXT: store <8 x i64> [[TMP1]], <8 x i64>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-512-NEXT: ret void
				//
				void write_int64(struct struct_int64 *s, svint64_t x) {
				s->y[0] = x;
				}
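The fixed vector types that appear in the CHECK lines of this test can be derived from the `-msve-vector-bits` value. The sketch below reproduces that arithmetic; the helper names are hypothetical, and the predicate rule (N/8 predicate bits stored as N/64 bytes of `i8`) is inferred from the `<2 x i8>` and `<4 x i8>` types in the bool checks rather than stated by the patch.

```cpp
#include <cassert>

// Hypothetical helper: lane count of the fixed IR vector that backs a
// fixed-length SVE data type when compiling with -msve-vector-bits=VectorBits.
constexpr unsigned fixedDataLanes(unsigned VectorBits, unsigned ElementBits) {
  return VectorBits / ElementBits;
}

// Hypothetical helper: number of i8 elements backing a fixed-length
// predicate. A predicate has one bit per data byte (VectorBits / 8 bits),
// which packs into VectorBits / 64 bytes.
constexpr unsigned fixedPredicateBytes(unsigned VectorBits) {
  return VectorBits / 64u;
}
```

With 64-bit elements this gives 2, 4, and 8 lanes for 128, 256, and 512 bits, which matches the `<2 x i64>`, `<4 x i64>`, and `<8 x i64>` types checked for `write_int64`.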

				//===----------------------------------------------------------------------===//
				// float64
				//===----------------------------------------------------------------------===//

				// CHECK-128-LABEL: @read_float64(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_FLOAT64:%.*]], %struct.struct_float64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <2 x double>* [[ARRAYIDX]] to <vscale x 2 x double>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[TMP0]], align 16, !tbaa !2
				// CHECK-128-NEXT: ret <vscale x 2 x double> [[TMP1]]
				//
				// CHECK-256-LABEL: @read_float64(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_FLOAT64:%.*]], %struct.struct_float64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <4 x double>* [[ARRAYIDX]] to <vscale x 2 x double>*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[TMP0]], align 16, !tbaa !2
				// CHECK-256-NEXT: ret <vscale x 2 x double> [[TMP1]]
				//
				// CHECK-512-LABEL: @read_float64(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_FLOAT64:%.*]], %struct.struct_float64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <8 x double>* [[ARRAYIDX]] to <vscale x 2 x double>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[TMP0]], align 16, !tbaa !2
				// CHECK-512-NEXT: ret <vscale x 2 x double> [[TMP1]]
				//
				svfloat64_t read_float64(struct struct_float64 *s) {
				return s->y[0];
				}

				// CHECK-128-LABEL: @write_float64(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-128-NEXT: store <vscale x 2 x double> [[X:%.*]], <vscale x 2 x double>* [[X_ADDR]], align 16, !tbaa !7
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x double>* [[X_ADDR]] to <2 x double>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <2 x double>, <2 x double>* [[TMP0]], align 16, !tbaa !2
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_FLOAT64:%.*]], %struct.struct_float64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: store <2 x double> [[TMP1]], <2 x double>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-128-NEXT: ret void
				//
				// CHECK-256-LABEL: @write_float64(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-256-NEXT: store <vscale x 2 x double> [[X:%.*]], <vscale x 2 x double>* [[X_ADDR]], align 16, !tbaa !7
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x double>* [[X_ADDR]] to <4 x double>*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load <4 x double>, <4 x double>* [[TMP0]], align 16, !tbaa !2
				// CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_FLOAT64:%.*]], %struct.struct_float64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-256-NEXT: store <4 x double> [[TMP1]], <4 x double>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-256-NEXT: ret void
				//
				// CHECK-512-LABEL: @write_float64(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-512-NEXT: store <vscale x 2 x double> [[X:%.*]], <vscale x 2 x double>* [[X_ADDR]], align 16, !tbaa !7
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x double>* [[X_ADDR]] to <8 x double>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <8 x double>, <8 x double>* [[TMP0]], align 16, !tbaa !2
				// CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_FLOAT64:%.*]], %struct.struct_float64* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-512-NEXT: store <8 x double> [[TMP1]], <8 x double>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-512-NEXT: ret void
				//
				void write_float64(struct struct_float64 *s, svfloat64_t x) {
				s->y[0] = x;
				}

				//===----------------------------------------------------------------------===//
				// bfloat16
				//===----------------------------------------------------------------------===//

				// CHECK-128-LABEL: @read_bfloat16(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BFLOAT16:%.*]], %struct.struct_bfloat16* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <8 x bfloat>* [[ARRAYIDX]] to <vscale x 8 x bfloat>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <vscale x 8 x bfloat>, <vscale x 8 x bfloat>* [[TMP0]], align 16, !tbaa !2
				// CHECK-128-NEXT: ret <vscale x 8 x bfloat> [[TMP1]]
				//
				// CHECK-256-LABEL: @read_bfloat16(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BFLOAT16:%.*]], %struct.struct_bfloat16* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <16 x bfloat>* [[ARRAYIDX]] to <vscale x 8 x bfloat>*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load <vscale x 8 x bfloat>, <vscale x 8 x bfloat>* [[TMP0]], align 16, !tbaa !2
				// CHECK-256-NEXT: ret <vscale x 8 x bfloat> [[TMP1]]
				//
				// CHECK-512-LABEL: @read_bfloat16(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BFLOAT16:%.*]], %struct.struct_bfloat16* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <32 x bfloat>* [[ARRAYIDX]] to <vscale x 8 x bfloat>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <vscale x 8 x bfloat>, <vscale x 8 x bfloat>* [[TMP0]], align 16, !tbaa !2
				// CHECK-512-NEXT: ret <vscale x 8 x bfloat> [[TMP1]]
				//
				svbfloat16_t read_bfloat16(struct struct_bfloat16 *s) {
				return s->y[0];
				}

				// CHECK-128-LABEL: @write_bfloat16(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 8 x bfloat>, align 16
				// CHECK-128-NEXT: store <vscale x 8 x bfloat> [[X:%.*]], <vscale x 8 x bfloat>* [[X_ADDR]], align 16, !tbaa !9
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 8 x bfloat>* [[X_ADDR]] to <8 x bfloat>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <8 x bfloat>, <8 x bfloat>* [[TMP0]], align 16, !tbaa !2
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BFLOAT16:%.*]], %struct.struct_bfloat16* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: store <8 x bfloat> [[TMP1]], <8 x bfloat>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-128-NEXT: ret void
				//
				// CHECK-256-LABEL: @write_bfloat16(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 8 x bfloat>, align 16
				// CHECK-256-NEXT: store <vscale x 8 x bfloat> [[X:%.*]], <vscale x 8 x bfloat>* [[X_ADDR]], align 16, !tbaa !9
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <vscale x 8 x bfloat>* [[X_ADDR]] to <16 x bfloat>*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load <16 x bfloat>, <16 x bfloat>* [[TMP0]], align 16, !tbaa !2
				// CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BFLOAT16:%.*]], %struct.struct_bfloat16* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-256-NEXT: store <16 x bfloat> [[TMP1]], <16 x bfloat>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-256-NEXT: ret void
				//
				// CHECK-512-LABEL: @write_bfloat16(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 8 x bfloat>, align 16
				// CHECK-512-NEXT: store <vscale x 8 x bfloat> [[X:%.*]], <vscale x 8 x bfloat>* [[X_ADDR]], align 16, !tbaa !9
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 8 x bfloat>* [[X_ADDR]] to <32 x bfloat>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <32 x bfloat>, <32 x bfloat>* [[TMP0]], align 16, !tbaa !2
				// CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BFLOAT16:%.*]], %struct.struct_bfloat16* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-512-NEXT: store <32 x bfloat> [[TMP1]], <32 x bfloat>* [[ARRAYIDX]], align 16, !tbaa !2
				// CHECK-512-NEXT: ret void
				//
				void write_bfloat16(struct struct_bfloat16 *s, svbfloat16_t x) {
				s->y[0] = x;
				}

				//===----------------------------------------------------------------------===//
				// bool
				//===----------------------------------------------------------------------===//

				// CHECK-128-LABEL: @read_bool(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <2 x i8>* [[ARRAYIDX]] to <vscale x 16 x i1>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP0]], align 2, !tbaa !2
				// CHECK-128-NEXT: ret <vscale x 16 x i1> [[TMP1]]
				//
				// CHECK-256-LABEL: @read_bool(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <4 x i8>* [[ARRAYIDX]] to <vscale x 16 x i1>*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP0]], align 2, !tbaa !2
				// CHECK-256-NEXT: ret <vscale x 16 x i1> [[TMP1]]
				//
				// CHECK-512-LABEL: @read_bool(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <8 x i8>* [[ARRAYIDX]] to <vscale x 16 x i1>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP0]], align 2, !tbaa !2
				// CHECK-512-NEXT: ret <vscale x 16 x i1> [[TMP1]]
				//
				svbool_t read_bool(struct struct_bool *s) {
				return s->y[0];
				}

				// CHECK-128-LABEL: @write_bool(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-128-NEXT: store <vscale x 16 x i1> [[X:%.*]], <vscale x 16 x i1>* [[X_ADDR]], align 16, !tbaa !11
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[X_ADDR]] to <2 x i8>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <2 x i8>, <2 x i8>* [[TMP0]], align 16, !tbaa !2
				// CHECK-128-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1, i64 0
				// CHECK-128-NEXT: store <2 x i8> [[TMP1]], <2 x i8>* [[ARRAYIDX]], align 2, !tbaa !2
				// CHECK-128-NEXT: ret void
				//
				// CHECK-256-LABEL: @write_bool(
				// CHECK-256-NEXT: entry:
				// CHECK-256-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-256-NEXT: store <vscale x 16 x i1> [[X:%.*]], <vscale x 16 x i1>* [[X_ADDR]], align 16, !tbaa !11
				// CHECK-256-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[X_ADDR]] to i32*
				// CHECK-256-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP0]], align 16, !tbaa !2
				// CHECK-256-NEXT: [[Y:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1
				// CHECK-256-NEXT: [[TMP2:%.*]] = bitcast [3 x <4 x i8>]* [[Y]] to i32*
				// CHECK-256-NEXT: store i32 [[TMP1]], i32* [[TMP2]], align 2, !tbaa !2
				// CHECK-256-NEXT: ret void
				//
				// CHECK-512-LABEL: @write_bool(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-512-NEXT: store <vscale x 16 x i1> [[X:%.*]], <vscale x 16 x i1>* [[X_ADDR]], align 16, !tbaa !11
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[X_ADDR]] to i64*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load i64, i64* [[TMP0]], align 16, !tbaa !2
				// CHECK-512-NEXT: [[Y:%.*]] = getelementptr inbounds [[STRUCT_STRUCT_BOOL:%.*]], %struct.struct_bool* [[S:%.*]], i64 0, i32 1
				// CHECK-512-NEXT: [[TMP2:%.*]] = bitcast [3 x <8 x i8>]* [[Y]] to i64*
				// CHECK-512-NEXT: store i64 [[TMP1]], i64* [[TMP2]], align 2, !tbaa !2
				// CHECK-512-NEXT: ret void
				//
				void write_bool(struct struct_bool *s, svbool_t x) {
				s->y[0] = x;
				}
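
The fixed-length IR types in the checks above follow from the vector length N that each CHECK prefix corresponds to: a data vector has N divided by the element width lanes, and a predicate holds one bit per vector byte, so it occupies N / 64 bytes. A minimal sketch of that arithmetic follows. The helper names are hypothetical and are not part of Clang or the ACLE; the expected values are taken from the CHECK lines in this diff.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not Clang or ACLE APIs. */

/* Lanes in a fixed-length data vector: N bits / element width in bits. */
static unsigned fixed_data_lanes(unsigned vector_bits, unsigned elt_bits) {
  return vector_bits / elt_bits;
}

/* Bytes in a fixed-length predicate: one predicate bit per vector byte,
   i.e. N / 8 bits, which is N / 64 bytes. */
static unsigned fixed_pred_bytes(unsigned vector_bits) {
  return vector_bits / 64;
}

/* Cross-check against the types that appear in the CHECK lines. */
static void check_examples(void) {
  assert(fixed_data_lanes(128, 16) == 8);  /* <8 x bfloat>  */
  assert(fixed_data_lanes(256, 16) == 16); /* <16 x bfloat> */
  assert(fixed_data_lanes(512, 16) == 32); /* <32 x bfloat> */
  assert(fixed_data_lanes(512, 32) == 16); /* <16 x i32>    */
  assert(fixed_data_lanes(512, 64) == 8);  /* <8 x double>  */
  assert(fixed_pred_bytes(128) == 2);      /* <2 x i8>      */
  assert(fixed_pred_bytes(256) == 4);      /* <4 x i8>      */
  assert(fixed_pred_bytes(512) == 8);      /* <8 x i8>      */
}
```

Calling `check_examples()` from a test driver should pass silently if the arithmetic matches the types listed in the comments.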

clang/test/CodeGen/attr-arm-sve-vector-bits-call.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
				// REQUIRES: aarch64-registered-target
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s

				#include <arm_sve.h>

				#define N __ARM_FEATURE_SVE_BITS_EXPERIMENTAL

				typedef svint32_t fixed_int32_t __attribute__((arm_sve_vector_bits(N)));
				typedef svfloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(N)));
				typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));

				//===----------------------------------------------------------------------===//
				// Test caller/callee with VLST <-> VLAT
				//===----------------------------------------------------------------------===//

				// CHECK-LABEL: @sizeless_callee(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: ret <vscale x 4 x i32> [[X:%.*]]
				//
				svint32_t sizeless_callee(svint32_t x) {
				return x;
				}

				// CHECK-LABEL: @fixed_caller(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[X:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[X_ADDR:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <16 x i32>* [[X]] to <vscale x 4 x i32>*
				// CHECK-NEXT: store <vscale x 4 x i32> [[X_COERCE:%.*]], <vscale x 4 x i32>* [[TMP0]], align 16
				// CHECK-NEXT: [[X1:%.*]] = load <16 x i32>, <16 x i32>* [[X]], align 16, !tbaa !2
				// CHECK-NEXT: store <16 x i32> [[X1]], <16 x i32>* [[X_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <16 x i32>* [[X_ADDR]] to <vscale x 4 x i32>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: store <vscale x 4 x i32> [[TMP2]], <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !5
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]] to <16 x i32>*
				// CHECK-NEXT: [[TMP3:%.*]] = load <16 x i32>, <16 x i32>* [[CASTFIXEDSVE]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to <16 x i32>*
				// CHECK-NEXT: store <16 x i32> [[TMP3]], <16 x i32>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP4]]
				//
				fixed_int32_t fixed_caller(fixed_int32_t x) {
				return sizeless_callee(x);
				}

				// CHECK-LABEL: @fixed_callee(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[X:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <16 x i32>* [[X]] to <vscale x 4 x i32>*
				// CHECK-NEXT: store <vscale x 4 x i32> [[X_COERCE:%.*]], <vscale x 4 x i32>* [[TMP0]], align 16
				// CHECK-NEXT: [[X1:%.*]] = load <16 x i32>, <16 x i32>* [[X]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to <16 x i32>*
				// CHECK-NEXT: store <16 x i32> [[X1]], <16 x i32>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP1:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP1]]
				//
				fixed_int32_t fixed_callee(fixed_int32_t x) {
				return x;
				}

				// CHECK-LABEL: @sizeless_caller(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[X_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[COERCE_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[COERCE1:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <16 x i32>, align 64
				// CHECK-NEXT: store <vscale x 4 x i32> [[X:%.*]], <vscale x 4 x i32>* [[X_ADDR]], align 16, !tbaa !5
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <vscale x 4 x i32>* [[X_ADDR]] to <16 x i32>*
				// CHECK-NEXT: [[TMP1:%.*]] = load <16 x i32>, <16 x i32>* [[TMP0]], align 16, !tbaa !2
				// CHECK-NEXT: [[COERCE_0__SROA_CAST:%.*]] = bitcast <vscale x 4 x i32>* [[COERCE_COERCE]] to <16 x i32>*
				// CHECK-NEXT: store <16 x i32> [[TMP1]], <16 x i32>* [[COERCE_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[COERCE_COERCE]], align 16
				// CHECK-NEXT: [[CALL:%.*]] = call <vscale x 4 x i32> @fixed_callee(<vscale x 4 x i32> [[TMP2]])
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <16 x i32>* [[COERCE1]] to <vscale x 4 x i32>*
				// CHECK-NEXT: store <vscale x 4 x i32> [[CALL]], <vscale x 4 x i32>* [[TMP3]], align 16
				// CHECK-NEXT: [[TMP4:%.*]] = load <16 x i32>, <16 x i32>* [[COERCE1]], align 16, !tbaa !2
				// CHECK-NEXT: store <16 x i32> [[TMP4]], <16 x i32>* [[SAVED_CALL_RVALUE]], align 64, !tbaa !2
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <16 x i32>* [[SAVED_CALL_RVALUE]] to <vscale x 4 x i32>*
				// CHECK-NEXT: [[TMP5:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[CASTFIXEDSVE]], align 64, !tbaa !2
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP5]]
				//
				svint32_t sizeless_caller(svint32_t x) {
				return fixed_callee(x);
				}

				//===----------------------------------------------------------------------===//
				// fixed, fixed
				//===----------------------------------------------------------------------===//

				// CHECK-LABEL: @call_int32_ff(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[OP1:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[OP2:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[OP1_ADDR:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[OP2_ADDR:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <16 x i32>* [[OP1]] to <vscale x 4 x i32>*
				// CHECK-NEXT: store <vscale x 4 x i32> [[OP1_COERCE:%.*]], <vscale x 4 x i32>* [[TMP0]], align 16
				// CHECK-NEXT: [[OP11:%.*]] = load <16 x i32>, <16 x i32>* [[OP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <16 x i32>* [[OP2]] to <vscale x 4 x i32>*
				// CHECK-NEXT: store <vscale x 4 x i32> [[OP2_COERCE:%.*]], <vscale x 4 x i32>* [[TMP1]], align 16
				// CHECK-NEXT: [[OP22:%.*]] = load <16 x i32>, <16 x i32>* [[OP2]], align 16, !tbaa !2
				// CHECK-NEXT: store <16 x i32> [[OP11]], <16 x i32>* [[OP1_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: store <16 x i32> [[OP22]], <16 x i32>* [[OP2_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <16 x i32>* [[OP1_ADDR]] to <vscale x 4 x i32>*
				// CHECK-NEXT: [[TMP3:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[TMP2]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP4:%.*]] = bitcast <16 x i32>* [[OP2_ADDR]] to <vscale x 4 x i32>*
				// CHECK-NEXT: [[TMP5:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[TMP4]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP6:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
				// CHECK-NEXT: [[TMP7:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.sel.nxv4i32(<vscale x 4 x i1> [[TMP6]], <vscale x 4 x i32> [[TMP3]], <vscale x 4 x i32> [[TMP5]])
				// CHECK-NEXT: store <vscale x 4 x i32> [[TMP7]], <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !5
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]] to <16 x i32>*
				// CHECK-NEXT: [[TMP8:%.*]] = load <16 x i32>, <16 x i32>* [[CASTFIXEDSVE]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to <16 x i32>*
				// CHECK-NEXT: store <16 x i32> [[TMP8]], <16 x i32>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP9:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP9]]
				//
				fixed_int32_t call_int32_ff(svbool_t pg, fixed_int32_t op1, fixed_int32_t op2) {
				return svsel(pg, op1, op2);
				}

				// CHECK-LABEL: @call_float64_ff(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[OP1:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[OP2:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[OP1_ADDR:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[OP2_ADDR:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <8 x double>* [[OP1]] to <vscale x 2 x double>*
				// CHECK-NEXT: store <vscale x 2 x double> [[OP1_COERCE:%.*]], <vscale x 2 x double>* [[TMP0]], align 16
				// CHECK-NEXT: [[OP11:%.*]] = load <8 x double>, <8 x double>* [[OP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <8 x double>* [[OP2]] to <vscale x 2 x double>*
				// CHECK-NEXT: store <vscale x 2 x double> [[OP2_COERCE:%.*]], <vscale x 2 x double>* [[TMP1]], align 16
				// CHECK-NEXT: [[OP22:%.*]] = load <8 x double>, <8 x double>* [[OP2]], align 16, !tbaa !2
				// CHECK-NEXT: store <8 x double> [[OP11]], <8 x double>* [[OP1_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: store <8 x double> [[OP22]], <8 x double>* [[OP2_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <8 x double>* [[OP1_ADDR]] to <vscale x 2 x double>*
				// CHECK-NEXT: [[TMP3:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[TMP2]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP4:%.*]] = bitcast <8 x double>* [[OP2_ADDR]] to <vscale x 2 x double>*
				// CHECK-NEXT: [[TMP5:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[TMP4]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP6:%.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
				// CHECK-NEXT: [[TMP7:%.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.sel.nxv2f64(<vscale x 2 x i1> [[TMP6]], <vscale x 2 x double> [[TMP3]], <vscale x 2 x double> [[TMP5]])
				// CHECK-NEXT: store <vscale x 2 x double> [[TMP7]], <vscale x 2 x double>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !7
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 2 x double>* [[SAVED_CALL_RVALUE]] to <8 x double>*
				// CHECK-NEXT: [[TMP8:%.*]] = load <8 x double>, <8 x double>* [[CASTFIXEDSVE]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 2 x double>* [[RETVAL_COERCE]] to <8 x double>*
				// CHECK-NEXT: store <8 x double> [[TMP8]], <8 x double>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP9:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 2 x double> [[TMP9]]
				//
				fixed_float64_t call_float64_ff(svbool_t pg, fixed_float64_t op1, fixed_float64_t op2) {
				return svsel(pg, op1, op2);
				}

				// CHECK-LABEL: @call_bool_ff(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[OP1:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[OP2:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[OP1_ADDR:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[OP2_ADDR:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <8 x i8>* [[OP1]] to <vscale x 16 x i1>*
				// CHECK-NEXT: store <vscale x 16 x i1> [[OP1_COERCE:%.*]], <vscale x 16 x i1>* [[TMP0]], align 16
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <8 x i8>* [[OP1]] to i64*
				// CHECK-NEXT: [[OP113:%.*]] = load i64, i64* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <8 x i8>* [[OP2]] to <vscale x 16 x i1>*
				// CHECK-NEXT: store <vscale x 16 x i1> [[OP2_COERCE:%.*]], <vscale x 16 x i1>* [[TMP2]], align 16
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i8>* [[OP2]] to i64*
				// CHECK-NEXT: [[OP224:%.*]] = load i64, i64* [[TMP3]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP4:%.*]] = bitcast <8 x i8>* [[OP1_ADDR]] to i64*
				// CHECK-NEXT: store i64 [[OP113]], i64* [[TMP4]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP5:%.*]] = bitcast <8 x i8>* [[OP2_ADDR]] to i64*
				// CHECK-NEXT: store i64 [[OP224]], i64* [[TMP5]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP6:%.*]] = bitcast <8 x i8>* [[OP1_ADDR]] to <vscale x 16 x i1>*
				// CHECK-NEXT: [[TMP7:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP6]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP8:%.*]] = bitcast <8 x i8>* [[OP2_ADDR]] to <vscale x 16 x i1>*
				// CHECK-NEXT: [[TMP9:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP8]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP10:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i1> [[TMP7]], <vscale x 16 x i1> [[TMP9]])
				// CHECK-NEXT: store <vscale x 16 x i1> [[TMP10]], <vscale x 16 x i1>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !9
				// CHECK-NEXT: [[TMP11:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_CALL_RVALUE]] to i64*
				// CHECK-NEXT: [[TMP12:%.*]] = load i64, i64* [[TMP11]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP13:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to i64*
				// CHECK-NEXT: store i64 [[TMP12]], i64* [[TMP13]], align 16
				// CHECK-NEXT: [[TMP14:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP14]]
				//
				fixed_bool_t call_bool_ff(svbool_t pg, fixed_bool_t op1, fixed_bool_t op2) {
				return svsel(pg, op1, op2);
				}

				//===----------------------------------------------------------------------===//
				// fixed, scalable
				//===----------------------------------------------------------------------===//

				// CHECK-LABEL: @call_int32_fs(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[OP1:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[OP1_ADDR:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <16 x i32>* [[OP1]] to <vscale x 4 x i32>*
				// CHECK-NEXT: store <vscale x 4 x i32> [[OP1_COERCE:%.*]], <vscale x 4 x i32>* [[TMP0]], align 16
				// CHECK-NEXT: [[OP11:%.*]] = load <16 x i32>, <16 x i32>* [[OP1]], align 16, !tbaa !2
				// CHECK-NEXT: store <16 x i32> [[OP11]], <16 x i32>* [[OP1_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <16 x i32>* [[OP1_ADDR]] to <vscale x 4 x i32>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP3:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
				// CHECK-NEXT: [[TMP4:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.sel.nxv4i32(<vscale x 4 x i1> [[TMP3]], <vscale x 4 x i32> [[TMP2]], <vscale x 4 x i32> [[OP2:%.*]])
				// CHECK-NEXT: store <vscale x 4 x i32> [[TMP4]], <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !5
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]] to <16 x i32>*
				// CHECK-NEXT: [[TMP5:%.*]] = load <16 x i32>, <16 x i32>* [[CASTFIXEDSVE]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to <16 x i32>*
				// CHECK-NEXT: store <16 x i32> [[TMP5]], <16 x i32>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP6:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP6]]
				//
				fixed_int32_t call_int32_fs(svbool_t pg, fixed_int32_t op1, svint32_t op2) {
				return svsel(pg, op1, op2);
				}

				// CHECK-LABEL: @call_float64_fs(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[OP1:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[OP1_ADDR:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <8 x double>* [[OP1]] to <vscale x 2 x double>*
				// CHECK-NEXT: store <vscale x 2 x double> [[OP1_COERCE:%.*]], <vscale x 2 x double>* [[TMP0]], align 16
				// CHECK-NEXT: [[OP11:%.*]] = load <8 x double>, <8 x double>* [[OP1]], align 16, !tbaa !2
				// CHECK-NEXT: store <8 x double> [[OP11]], <8 x double>* [[OP1_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <8 x double>* [[OP1_ADDR]] to <vscale x 2 x double>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP3:%.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
				// CHECK-NEXT: [[TMP4:%.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.sel.nxv2f64(<vscale x 2 x i1> [[TMP3]], <vscale x 2 x double> [[TMP2]], <vscale x 2 x double> [[OP2:%.*]])
				// CHECK-NEXT: store <vscale x 2 x double> [[TMP4]], <vscale x 2 x double>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !7
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 2 x double>* [[SAVED_CALL_RVALUE]] to <8 x double>*
				// CHECK-NEXT: [[TMP5:%.*]] = load <8 x double>, <8 x double>* [[CASTFIXEDSVE]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 2 x double>* [[RETVAL_COERCE]] to <8 x double>*
				// CHECK-NEXT: store <8 x double> [[TMP5]], <8 x double>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP6:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 2 x double> [[TMP6]]
				//
				fixed_float64_t call_float64_fs(svbool_t pg, fixed_float64_t op1, svfloat64_t op2) {
				return svsel(pg, op1, op2);
				}

				// CHECK-LABEL: @call_bool_fs(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[OP1:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[OP1_ADDR:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <8 x i8>* [[OP1]] to <vscale x 16 x i1>*
				// CHECK-NEXT: store <vscale x 16 x i1> [[OP1_COERCE:%.*]], <vscale x 16 x i1>* [[TMP0]], align 16
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <8 x i8>* [[OP1]] to i64*
				// CHECK-NEXT: [[OP112:%.*]] = load i64, i64* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <8 x i8>* [[OP1_ADDR]] to i64*
				// CHECK-NEXT: store i64 [[OP112]], i64* [[TMP2]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i8>* [[OP1_ADDR]] to <vscale x 16 x i1>*
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP3]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP5:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i1> [[TMP4]], <vscale x 16 x i1> [[OP2:%.*]])
				// CHECK-NEXT: store <vscale x 16 x i1> [[TMP5]], <vscale x 16 x i1>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !9
				// CHECK-NEXT: [[TMP6:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_CALL_RVALUE]] to i64*
				// CHECK-NEXT: [[TMP7:%.*]] = load i64, i64* [[TMP6]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP8:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to i64*
				// CHECK-NEXT: store i64 [[TMP7]], i64* [[TMP8]], align 16
				// CHECK-NEXT: [[TMP9:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP9]]
				//
				fixed_bool_t call_bool_fs(svbool_t pg, fixed_bool_t op1, svbool_t op2) {
				return svsel(pg, op1, op2);
				}

				//===----------------------------------------------------------------------===//
				// scalable, scalable
				//===----------------------------------------------------------------------===//

				// CHECK-LABEL: @call_int32_ss(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
				// CHECK-NEXT: [[TMP1:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.sel.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]])
				// CHECK-NEXT: store <vscale x 4 x i32> [[TMP1]], <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !5
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]] to <16 x i32>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <16 x i32>, <16 x i32>* [[CASTFIXEDSVE]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to <16 x i32>*
				// CHECK-NEXT: store <16 x i32> [[TMP2]], <16 x i32>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP3:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP3]]
				//
				fixed_int32_t call_int32_ss(svbool_t pg, svint32_t op1, svint32_t op2) {
				return svsel(pg, op1, op2);
				}

				// CHECK-LABEL: @call_float64_ss(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
				// CHECK-NEXT: [[TMP1:%.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.sel.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[OP2:%.*]])
				// CHECK-NEXT: store <vscale x 2 x double> [[TMP1]], <vscale x 2 x double>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !7
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 2 x double>* [[SAVED_CALL_RVALUE]] to <8 x double>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <8 x double>, <8 x double>* [[CASTFIXEDSVE]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 2 x double>* [[RETVAL_COERCE]] to <8 x double>*
				// CHECK-NEXT: store <8 x double> [[TMP2]], <8 x double>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP3:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 2 x double> [[TMP3]]
				//
				fixed_float64_t call_float64_ss(svbool_t pg, svfloat64_t op1, svfloat64_t op2) {
				return svsel(pg, op1, op2);
				}

				// CHECK-LABEL: @call_bool_ss(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.sel.nxv16i1(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i1> [[OP1:%.*]], <vscale x 16 x i1> [[OP2:%.*]])
				// CHECK-NEXT: store <vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1>* [[SAVED_CALL_RVALUE]], align 16, !tbaa !9
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_CALL_RVALUE]] to i64*
				// CHECK-NEXT: [[TMP2:%.*]] = load i64, i64* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to i64*
				// CHECK-NEXT: store i64 [[TMP2]], i64* [[TMP3]], align 16
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP4]]
				//
				fixed_bool_t call_bool_ss(svbool_t pg, svbool_t op1, svbool_t op2) {
				return svsel(pg, op1, op2);
				}
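
The checks above show the pattern the summary describes: a VLA <-> VLS conversion stores the value to a stack slot, reinterprets the slot's pointer, and reloads it as the other type, rather than bitcasting the value itself. A rough C analogue of that "cast through memory" idea is sketched below, assuming a 512-bit vector. The types `fixed512_i32` and `opaque_vec` and both helper functions are hypothetical stand-ins chosen for illustration; they are not the types or code Clang uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, assuming a 512-bit vector length. */
typedef struct { int32_t lanes[16]; } fixed512_i32;     /* plays the role of <16 x i32> */
typedef struct { unsigned char bytes[64]; } opaque_vec; /* plays the role of the sizeless value */

/* Both representations must occupy the same number of bytes. */
_Static_assert(sizeof(fixed512_i32) == sizeof(opaque_vec),
               "representations must have equal size");

/* Convert by copying the object representation through memory. */
static opaque_vec to_opaque(fixed512_i32 v) {
  opaque_vec out;
  memcpy(&out, &v, sizeof out);
  return out;
}

/* Inverse conversion, again through memory. */
static fixed512_i32 from_opaque(opaque_vec v) {
  fixed512_i32 out;
  memcpy(&out, &v, sizeof out);
  return out;
}
```

A round trip through `to_opaque` and `from_opaque` leaves every lane unchanged, which is the property the store/bitcast/load sequences in the tests rely on.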

clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
				// REQUIRES: aarch64-registered-target
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s

				#include <arm_sve.h>

				#define N __ARM_FEATURE_SVE_BITS_EXPERIMENTAL

				typedef svint32_t fixed_int32_t __attribute__((arm_sve_vector_bits(N)));
				typedef svfloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(N)));
				typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));

				// CHECK-LABEL: @to_svint32_t(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[TYPE:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[TYPE_ADDR:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <16 x i32>* [[TYPE]] to <vscale x 4 x i32>*
				// CHECK-NEXT: store <vscale x 4 x i32> [[TYPE_COERCE:%.*]], <vscale x 4 x i32>* [[TMP0]], align 16
				// CHECK-NEXT: [[TYPE1:%.*]] = load <16 x i32>, <16 x i32>* [[TYPE]], align 16, !tbaa !2
				// CHECK-NEXT: store <16 x i32> [[TYPE1]], <16 x i32>* [[TYPE_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <16 x i32>* [[TYPE_ADDR]] to <vscale x 4 x i32>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]
				//
				svint32_t to_svint32_t(fixed_int32_t type) {
				return type;
				}

				// CHECK-LABEL: @from_svint32_t(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[TYPE_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: store <vscale x 4 x i32> [[TYPE:%.*]], <vscale x 4 x i32>* [[TYPE_ADDR]], align 16, !tbaa !5
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <vscale x 4 x i32>* [[TYPE_ADDR]] to <16 x i32>*
				// CHECK-NEXT: [[TMP1:%.*]] = load <16 x i32>, <16 x i32>* [[TMP0]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to <16 x i32>*
				// CHECK-NEXT: store <16 x i32> [[TMP1]], <16 x i32>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]
				//
				fixed_int32_t from_svint32_t(svint32_t type) {
				return type;
				}

				// CHECK-LABEL: @to_svfloat64_t(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[TYPE:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[TYPE_ADDR:%.*]] = alloca <8 x double>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <8 x double>* [[TYPE]] to <vscale x 2 x double>*
				// CHECK-NEXT: store <vscale x 2 x double> [[TYPE_COERCE:%.*]], <vscale x 2 x double>* [[TMP0]], align 16
				// CHECK-NEXT: [[TYPE1:%.*]] = load <8 x double>, <8 x double>* [[TYPE]], align 16, !tbaa !2
				// CHECK-NEXT: store <8 x double> [[TYPE1]], <8 x double>* [[TYPE_ADDR]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <8 x double>* [[TYPE_ADDR]] to <vscale x 2 x double>*
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: ret <vscale x 2 x double> [[TMP2]]
				//
				svfloat64_t to_svfloat64_t(fixed_float64_t type) {
				return type;
				}

				// CHECK-LABEL: @from_svfloat64_t(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[TYPE_ADDR:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 2 x double>, align 16
				// CHECK-NEXT: store <vscale x 2 x double> [[TYPE:%.*]], <vscale x 2 x double>* [[TYPE_ADDR]], align 16, !tbaa !7
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x double>* [[TYPE_ADDR]] to <8 x double>*
				// CHECK-NEXT: [[TMP1:%.*]] = load <8 x double>, <8 x double>* [[TMP0]], align 16, !tbaa !2
				// CHECK-NEXT: [[RETVAL_0__SROA_CAST:%.*]] = bitcast <vscale x 2 x double>* [[RETVAL_COERCE]] to <8 x double>*
				// CHECK-NEXT: store <8 x double> [[TMP1]], <8 x double>* [[RETVAL_0__SROA_CAST]], align 16
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 2 x double>, <vscale x 2 x double>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 2 x double> [[TMP2]]
				//
				fixed_float64_t from_svfloat64_t(svfloat64_t type) {
				return type;
				}

				// CHECK-LABEL: @to_svbool_t(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[TYPE:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[TYPE_ADDR:%.*]] = alloca <8 x i8>, align 16
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <8 x i8>* [[TYPE]] to <vscale x 16 x i1>*
				// CHECK-NEXT: store <vscale x 16 x i1> [[TYPE_COERCE:%.*]], <vscale x 16 x i1>* [[TMP0]], align 16
				// CHECK-NEXT: [[TMP1:%.*]] = bitcast <8 x i8>* [[TYPE]] to i64*
				// CHECK-NEXT: [[TYPE12:%.*]] = load i64, i64* [[TMP1]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <8 x i8>* [[TYPE_ADDR]] to i64*
				// CHECK-NEXT: store i64 [[TYPE12]], i64* [[TMP2]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i8>* [[TYPE_ADDR]] to <vscale x 16 x i1>*
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP3]], align 16, !tbaa !2
				// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP4]]
				//
				svbool_t to_svbool_t(fixed_bool_t type) {
				return type;
				}

				// CHECK-LABEL: @from_svbool_t(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[TYPE_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: store <vscale x 16 x i1> [[TYPE:%.*]], <vscale x 16 x i1>* [[TYPE_ADDR]], align 16, !tbaa !9
				// CHECK-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[TYPE_ADDR]] to i64*
				// CHECK-NEXT: [[TMP1:%.*]] = load i64, i64* [[TMP0]], align 16, !tbaa !2
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to i64*
				// CHECK-NEXT: store i64 [[TMP1]], i64* [[TMP2]], align 16
				// CHECK-NEXT: [[TMP3:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP3]]
				//
				fixed_bool_t from_svbool_t(svbool_t type) {
				return type;
				}

clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s

				#include <arm_sve.h>

				#define N __ARM_FEATURE_SVE_BITS_EXPERIMENTAL

				typedef svint32_t fixed_int32_t __attribute__((arm_sve_vector_bits(N)));
				typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));

				fixed_bool_t global_pred;
				fixed_int32_t global_vec;

				// CHECK-LABEL: @foo(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[RETVAL:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[PRED_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 2
				// CHECK-NEXT: [[VEC_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[PG:%.*]] = alloca <vscale x 16 x i1>, align 2
				// CHECK-NEXT: [[SAVED_CALL_RVALUE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: store <vscale x 16 x i1> [[PRED:%.*]], <vscale x 16 x i1>* [[PRED_ADDR]], align 2
				// CHECK-NEXT: store <vscale x 4 x i32> [[VEC:%.*]], <vscale x 4 x i32>* [[VEC_ADDR]], align 16
				// CHECK-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PRED_ADDR]], align 2
				// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2
				// CHECK-NEXT: [[TMP2:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* bitcast (<8 x i8>* @global_pred to <vscale x 16 x i1>*), align 2
				// CHECK-NEXT: [[TMP3:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* bitcast (<8 x i8>* @global_pred to <vscale x 16 x i1>*), align 2
				// CHECK-NEXT: [[TMP5:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.and.z.nxv16i1(<vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1> [[TMP4]])
				// CHECK-NEXT: store <vscale x 16 x i1> [[TMP5]], <vscale x 16 x i1>* [[PG]], align 2
				// CHECK-NEXT: [[TMP6:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PG]], align 2
				// CHECK-NEXT: [[TMP7:%.*]] = load <16 x i32>, <16 x i32>* @global_vec, align 16
				// CHECK-NEXT: [[TMP8:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* bitcast (<16 x i32>* @global_vec to <vscale x 4 x i32>*), align 16
				// CHECK-NEXT: [[TMP9:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[VEC_ADDR]], align 16
				// CHECK-NEXT: [[TMP10:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[TMP6]])
				// CHECK-NEXT: [[TMP11:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP10]], <vscale x 4 x i32> [[TMP8]], <vscale x 4 x i32> [[TMP9]])
				// CHECK-NEXT: store <vscale x 4 x i32> [[TMP11]], <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]], align 16
				// CHECK-NEXT: [[CASTFIXEDSVE:%.*]] = bitcast <vscale x 4 x i32>* [[SAVED_CALL_RVALUE]] to <16 x i32>*
				// CHECK-NEXT: [[TMP12:%.*]] = load <16 x i32>, <16 x i32>* [[CASTFIXEDSVE]], align 16
				// CHECK-NEXT: store <16 x i32> [[TMP12]], <16 x i32>* [[RETVAL]], align 16
				// CHECK-NEXT: [[TMP13:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to i8*
				// CHECK-NEXT: [[TMP14:%.*]] = bitcast <16 x i32>* [[RETVAL]] to i8*
				// CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 [[TMP13]], i8* align 16 [[TMP14]], i64 64, i1 false)
				// CHECK-NEXT: [[TMP15:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP15]]
				//
				fixed_int32_t foo(svbool_t pred, svint32_t vec) {
				svbool_t pg = svand_z(pred, global_pred, global_pred);
				return svadd_m(pg, global_vec, vec);
				}

				// CHECK-LABEL: @test_ptr_to_global(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[RETVAL:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[GLOBAL_VEC_PTR:%.*]] = alloca <16 x i32>*, align 8
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: store <16 x i32>* @global_vec, <16 x i32>** [[GLOBAL_VEC_PTR]], align 8
				// CHECK-NEXT: [[TMP0:%.*]] = load <16 x i32>*, <16 x i32>** [[GLOBAL_VEC_PTR]], align 8
				// CHECK-NEXT: [[TMP1:%.*]] = load <16 x i32>, <16 x i32>* [[TMP0]], align 16
				// CHECK-NEXT: store <16 x i32> [[TMP1]], <16 x i32>* [[RETVAL]], align 16
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to i8*
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <16 x i32>* [[RETVAL]] to i8*
				// CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 [[TMP2]], i8* align 16 [[TMP3]], i64 64, i1 false)
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP4]]
				//
				fixed_int32_t test_ptr_to_global() {
				fixed_int32_t *global_vec_ptr;
				global_vec_ptr = &global_vec;
				return *global_vec_ptr;
				}

				//
				// Test casting pointer from fixed-length array to scalable vector.
				// CHECK-LABEL: @array_arg(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[RETVAL:%.*]] = alloca <16 x i32>, align 16
				// CHECK-NEXT: [[ARR_ADDR:%.*]] = alloca <16 x i32>*, align 8
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 4 x i32>, align 16
				// CHECK-NEXT: store <16 x i32>* [[ARR:%.*]], <16 x i32>** [[ARR_ADDR]], align 8
				// CHECK-NEXT: [[TMP0:%.*]] = load <16 x i32>*, <16 x i32>** [[ARR_ADDR]], align 8
				// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds <16 x i32>, <16 x i32>* [[TMP0]], i64 0
				// CHECK-NEXT: [[TMP1:%.*]] = load <16 x i32>, <16 x i32>* [[ARRAYIDX]], align 16
				// CHECK-NEXT: store <16 x i32> [[TMP1]], <16 x i32>* [[RETVAL]], align 16
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 4 x i32>* [[RETVAL_COERCE]] to i8*
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <16 x i32>* [[RETVAL]] to i8*
				// CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 [[TMP2]], i8* align 16 [[TMP3]], i64 64, i1 false)
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP4]]
				//
				fixed_int32_t array_arg(fixed_int32_t arr[]) {
				return arr[0];
				}

				// CHECK-LABEL: @address_of_array_idx(
				// CHECK-NEXT: entry:
				// CHECK-NEXT: [[RETVAL:%.*]] = alloca <8 x i8>, align 2
				// CHECK-NEXT: [[ARR:%.*]] = alloca [3 x <8 x i8>], align 2
				// CHECK-NEXT: [[PARR:%.*]] = alloca <8 x i8>*, align 8
				// CHECK-NEXT: [[RETVAL_COERCE:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [3 x <8 x i8>], [3 x <8 x i8>]* [[ARR]], i64 0, i64 0
				// CHECK-NEXT: store <8 x i8>* [[ARRAYIDX]], <8 x i8>** [[PARR]], align 8
				// CHECK-NEXT: [[TMP0:%.*]] = load <8 x i8>*, <8 x i8>** [[PARR]], align 8
				// CHECK-NEXT: [[TMP1:%.*]] = load <8 x i8>, <8 x i8>* [[TMP0]], align 2
				// CHECK-NEXT: store <8 x i8> [[TMP1]], <8 x i8>* [[RETVAL]], align 2
				// CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 16 x i1>* [[RETVAL_COERCE]] to i8*
				// CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i8>* [[RETVAL]] to i8*
				// CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 [[TMP2]], i8* align 2 [[TMP3]], i64 8, i1 false)
				// CHECK-NEXT: [[TMP4:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[RETVAL_COERCE]], align 16
				// CHECK-NEXT: ret <vscale x 16 x i1> [[TMP4]]
				//
				fixed_bool_t address_of_array_idx() {
				fixed_bool_t arr[3];
				fixed_bool_t *parr;
				parr = &arr[0];
				return *parr;
				}

clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
				// REQUIRES: aarch64-registered-target
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=128 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-128
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-512

				#include <arm_sve.h>

				#define N __ARM_FEATURE_SVE_BITS_EXPERIMENTAL

				typedef svint64_t fixed_int64_t __attribute__((arm_sve_vector_bits(N)));
				typedef svbfloat16_t fixed_bfloat16_t __attribute__((arm_sve_vector_bits(N)));
				typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));

				fixed_int64_t global_i64;
				fixed_bfloat16_t global_bf16;
				fixed_bool_t global_bool;

				//===----------------------------------------------------------------------===//
				// WRITES
				//===----------------------------------------------------------------------===//

				// CHECK-128-LABEL: @write_global_i64(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[V_ADDR:%.*]] = alloca <vscale x 2 x i64>, align 16
				// CHECK-128-NEXT: store <vscale x 2 x i64> [[V:%.*]], <vscale x 2 x i64>* [[V_ADDR]], align 16, !tbaa !2
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x i64>* [[V_ADDR]] to <2 x i64>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <2 x i64>, <2 x i64>* [[TMP0]], align 16, !tbaa !6
				// CHECK-128-NEXT: store <2 x i64> [[TMP1]], <2 x i64>* @global_i64, align 16, !tbaa !6
				// CHECK-128-NEXT: ret void
				//
				// CHECK-512-LABEL: @write_global_i64(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[V_ADDR:%.*]] = alloca <vscale x 2 x i64>, align 16
				// CHECK-512-NEXT: store <vscale x 2 x i64> [[V:%.*]], <vscale x 2 x i64>* [[V_ADDR]], align 16, !tbaa !2
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 2 x i64>* [[V_ADDR]] to <8 x i64>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <8 x i64>, <8 x i64>* [[TMP0]], align 16, !tbaa !6
				// CHECK-512-NEXT: store <8 x i64> [[TMP1]], <8 x i64>* @global_i64, align 16, !tbaa !6
				// CHECK-512-NEXT: ret void
				//
				void write_global_i64(svint64_t v) { global_i64 = v; }

				// CHECK-128-LABEL: @write_global_bf16(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[V_ADDR:%.*]] = alloca <vscale x 8 x bfloat>, align 16
				// CHECK-128-NEXT: store <vscale x 8 x bfloat> [[V:%.*]], <vscale x 8 x bfloat>* [[V_ADDR]], align 16, !tbaa !7
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 8 x bfloat>* [[V_ADDR]] to <8 x bfloat>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <8 x bfloat>, <8 x bfloat>* [[TMP0]], align 16, !tbaa !6
				// CHECK-128-NEXT: store <8 x bfloat> [[TMP1]], <8 x bfloat>* @global_bf16, align 16, !tbaa !6
				// CHECK-128-NEXT: ret void
				//
				// CHECK-512-LABEL: @write_global_bf16(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[V_ADDR:%.*]] = alloca <vscale x 8 x bfloat>, align 16
				// CHECK-512-NEXT: store <vscale x 8 x bfloat> [[V:%.*]], <vscale x 8 x bfloat>* [[V_ADDR]], align 16, !tbaa !7
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 8 x bfloat>* [[V_ADDR]] to <32 x bfloat>*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load <32 x bfloat>, <32 x bfloat>* [[TMP0]], align 16, !tbaa !6
				// CHECK-512-NEXT: store <32 x bfloat> [[TMP1]], <32 x bfloat>* @global_bf16, align 16, !tbaa !6
				// CHECK-512-NEXT: ret void
				//
				void write_global_bf16(svbfloat16_t v) { global_bf16 = v; }

				// CHECK-128-LABEL: @write_global_bool(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-128-NEXT: store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, !tbaa !9
				// CHECK-128-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to <2 x i8>*
				// CHECK-128-NEXT: [[TMP1:%.*]] = load <2 x i8>, <2 x i8>* [[TMP0]], align 16, !tbaa !6
				// CHECK-128-NEXT: store <2 x i8> [[TMP1]], <2 x i8>* @global_bool, align 2, !tbaa !6
				// CHECK-128-NEXT: ret void
				//
				// CHECK-512-LABEL: @write_global_bool(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
				// CHECK-512-NEXT: store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, !tbaa !9
				// CHECK-512-NEXT: [[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to i64*
				// CHECK-512-NEXT: [[TMP1:%.*]] = load i64, i64* [[TMP0]], align 16, !tbaa !6
				// CHECK-512-NEXT: store i64 [[TMP1]], i64* bitcast (<8 x i8>* @global_bool to i64*), align 2, !tbaa !6
				// CHECK-512-NEXT: ret void
				//
				void write_global_bool(svbool_t v) { global_bool = v; }

				//===----------------------------------------------------------------------===//
				// READS
				//===----------------------------------------------------------------------===//

				// CHECK-128-LABEL: @read_global_i64(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[TMP0:%.*]] = load <vscale x 2 x i64>, <vscale x 2 x i64>* bitcast (<2 x i64>* @global_i64 to <vscale x 2 x i64>*), align 16, !tbaa !6
				// CHECK-128-NEXT: ret <vscale x 2 x i64> [[TMP0]]
				//
				// CHECK-512-LABEL: @read_global_i64(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[TMP0:%.*]] = load <vscale x 2 x i64>, <vscale x 2 x i64>* bitcast (<8 x i64>* @global_i64 to <vscale x 2 x i64>*), align 16, !tbaa !6
				// CHECK-512-NEXT: ret <vscale x 2 x i64> [[TMP0]]
				//
				svint64_t read_global_i64() { return global_i64; }

				// CHECK-128-LABEL: @read_global_bf16(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[TMP0:%.*]] = load <vscale x 8 x bfloat>, <vscale x 8 x bfloat>* bitcast (<8 x bfloat>* @global_bf16 to <vscale x 8 x bfloat>*), align 16, !tbaa !6
				// CHECK-128-NEXT: ret <vscale x 8 x bfloat> [[TMP0]]
				//
				// CHECK-512-LABEL: @read_global_bf16(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[TMP0:%.*]] = load <vscale x 8 x bfloat>, <vscale x 8 x bfloat>* bitcast (<32 x bfloat>* @global_bf16 to <vscale x 8 x bfloat>*), align 16, !tbaa !6
				// CHECK-512-NEXT: ret <vscale x 8 x bfloat> [[TMP0]]
				//
				svbfloat16_t read_global_bf16() { return global_bf16; }

				// CHECK-128-LABEL: @read_global_bool(
				// CHECK-128-NEXT: entry:
				// CHECK-128-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* bitcast (<2 x i8>* @global_bool to <vscale x 16 x i1>*), align 2, !tbaa !6
				// CHECK-128-NEXT: ret <vscale x 16 x i1> [[TMP0]]
				//
				// CHECK-512-LABEL: @read_global_bool(
				// CHECK-512-NEXT: entry:
				// CHECK-512-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* bitcast (<8 x i8>* @global_bool to <vscale x 16 x i1>*), align 2, !tbaa !6
				// CHECK-512-NEXT: ret <vscale x 16 x i1> [[TMP0]]
				//
				svbool_t read_global_bool() { return global_bool; }

clang/test/CodeGen/attr-arm-sve-vector-bits-types.c

This file was added.

				// REQUIRES: aarch64-registered-target
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=128 -fallow-half-arguments-and-returns -S -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-128
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=256 -fallow-half-arguments-and-returns -S -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-256
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-512
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=1024 -fallow-half-arguments-and-returns -S -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-1024
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=2048 -fallow-half-arguments-and-returns -S -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-2048

				#include <arm_sve.h>

				#define N __ARM_FEATURE_SVE_BITS_EXPERIMENTAL

				typedef svint8_t fixed_int8_t __attribute__((arm_sve_vector_bits(N)));
				typedef svint16_t fixed_int16_t __attribute__((arm_sve_vector_bits(N)));
				typedef svint32_t fixed_int32_t __attribute__((arm_sve_vector_bits(N)));
				typedef svint64_t fixed_int64_t __attribute__((arm_sve_vector_bits(N)));

				typedef svuint8_t fixed_uint8_t __attribute__((arm_sve_vector_bits(N)));
				typedef svuint16_t fixed_uint16_t __attribute__((arm_sve_vector_bits(N)));
				typedef svuint32_t fixed_uint32_t __attribute__((arm_sve_vector_bits(N)));
				typedef svuint64_t fixed_uint64_t __attribute__((arm_sve_vector_bits(N)));

				typedef svfloat16_t fixed_float16_t __attribute__((arm_sve_vector_bits(N)));
				typedef svfloat32_t fixed_float32_t __attribute__((arm_sve_vector_bits(N)));
				typedef svfloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(N)));

				typedef svbfloat16_t fixed_bfloat16_t __attribute__((arm_sve_vector_bits(N)));

				typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));

				//===----------------------------------------------------------------------===//
				// Structs and unions
				//===----------------------------------------------------------------------===//
				#define DEFINE_STRUCT(ty) \
				struct struct_##ty { \
				fixed_##ty##_t x; \
				} struct_##ty;

				#define DEFINE_UNION(ty) \
				union union_##ty { \
				fixed_##ty##_t x; \
				} union_##ty;

				DEFINE_STRUCT(int8)
				DEFINE_STRUCT(int16)
				DEFINE_STRUCT(int32)
				DEFINE_STRUCT(int64)
				DEFINE_STRUCT(uint8)
				DEFINE_STRUCT(uint16)
				DEFINE_STRUCT(uint32)
				DEFINE_STRUCT(uint64)
				DEFINE_STRUCT(float16)
				DEFINE_STRUCT(float32)
				DEFINE_STRUCT(float64)
				DEFINE_STRUCT(bfloat16)
				DEFINE_STRUCT(bool)

				DEFINE_UNION(int8)
				DEFINE_UNION(int16)
				DEFINE_UNION(int32)
				DEFINE_UNION(int64)
				DEFINE_UNION(uint8)
				DEFINE_UNION(uint16)
				DEFINE_UNION(uint32)
				DEFINE_UNION(uint64)
				DEFINE_UNION(float16)
				DEFINE_UNION(float32)
				DEFINE_UNION(float64)
				DEFINE_UNION(bfloat16)
				DEFINE_UNION(bool)

				//===----------------------------------------------------------------------===//
				// Global variables
				//===----------------------------------------------------------------------===//
				fixed_int8_t global_i8;
				fixed_int16_t global_i16;
				fixed_int32_t global_i32;
				fixed_int64_t global_i64;

				fixed_uint8_t global_u8;
				fixed_uint16_t global_u16;
				fixed_uint32_t global_u32;
				fixed_uint64_t global_u64;

				fixed_float16_t global_f16;
				fixed_float32_t global_f32;
				fixed_float64_t global_f64;

				fixed_bfloat16_t global_bf16;

				fixed_bool_t global_bool;

				//===----------------------------------------------------------------------===//
				// Global arrays
				//===----------------------------------------------------------------------===//
				fixed_int8_t global_arr_i8[3];
				fixed_int16_t global_arr_i16[3];
				fixed_int32_t global_arr_i32[3];
				fixed_int64_t global_arr_i64[3];

				fixed_uint8_t global_arr_u8[3];
				fixed_uint16_t global_arr_u16[3];
				fixed_uint32_t global_arr_u32[3];
				fixed_uint64_t global_arr_u64[3];

				fixed_float16_t global_arr_f16[3];
				fixed_float32_t global_arr_f32[3];
				fixed_float64_t global_arr_f64[3];

				fixed_bfloat16_t global_arr_bf16[3];

				fixed_bool_t global_arr_bool[3];

				//===----------------------------------------------------------------------===//
				// Locals
				//===----------------------------------------------------------------------===//
				void f() {
				// Variables
				fixed_int8_t local_i8;
				fixed_int16_t local_i16;
				fixed_int32_t local_i32;
				fixed_int64_t local_i64;
				fixed_uint8_t local_u8;
				fixed_uint16_t local_u16;
				fixed_uint32_t local_u32;
				fixed_uint64_t local_u64;
				fixed_float16_t local_f16;
				fixed_float32_t local_f32;
				fixed_float64_t local_f64;
				fixed_bfloat16_t local_bf16;
				fixed_bool_t local_bool;

				// Arrays
				fixed_int8_t local_arr_i8[3];
				fixed_int16_t local_arr_i16[3];
				fixed_int32_t local_arr_i32[3];
				fixed_int64_t local_arr_i64[3];
				fixed_uint8_t local_arr_u8[3];
				fixed_uint16_t local_arr_u16[3];
				fixed_uint32_t local_arr_u32[3];
				fixed_uint64_t local_arr_u64[3];
				fixed_float16_t local_arr_f16[3];
				fixed_float32_t local_arr_f32[3];
				fixed_float64_t local_arr_f64[3];
				fixed_bfloat16_t local_arr_bf16[3];
				fixed_bool_t local_arr_bool[3];
				}

				//===----------------------------------------------------------------------===//
				// Structs and unions
				//===----------------------------------------------------------------------===//
				// CHECK-128: %struct.struct_int8 = type { <16 x i8> }
				// CHECK-128-NEXT: %struct.struct_int16 = type { <8 x i16> }
				// CHECK-128-NEXT: %struct.struct_int32 = type { <4 x i32> }
				// CHECK-128-NEXT: %struct.struct_int64 = type { <2 x i64> }
				// CHECK-128-NEXT: %struct.struct_uint8 = type { <16 x i8> }
				// CHECK-128-NEXT: %struct.struct_uint16 = type { <8 x i16> }
				// CHECK-128-NEXT: %struct.struct_uint32 = type { <4 x i32> }
				// CHECK-128-NEXT: %struct.struct_uint64 = type { <2 x i64> }
				// CHECK-128-NEXT: %struct.struct_float16 = type { <8 x half> }
				// CHECK-128-NEXT: %struct.struct_float32 = type { <4 x float> }
				// CHECK-128-NEXT: %struct.struct_float64 = type { <2 x double> }
				// CHECK-128-NEXT: %struct.struct_bfloat16 = type { <8 x bfloat> }
				// CHECK-128-NEXT: %struct.struct_bool = type { <2 x i8> }

				// CHECK-256: %struct.struct_int8 = type { <32 x i8> }
				// CHECK-256-NEXT: %struct.struct_int16 = type { <16 x i16> }
				// CHECK-256-NEXT: %struct.struct_int32 = type { <8 x i32> }
				// CHECK-256-NEXT: %struct.struct_int64 = type { <4 x i64> }
				// CHECK-256-NEXT: %struct.struct_uint8 = type { <32 x i8> }
				// CHECK-256-NEXT: %struct.struct_uint16 = type { <16 x i16> }
				// CHECK-256-NEXT: %struct.struct_uint32 = type { <8 x i32> }
				// CHECK-256-NEXT: %struct.struct_uint64 = type { <4 x i64> }
				// CHECK-256-NEXT: %struct.struct_float16 = type { <16 x half> }
				// CHECK-256-NEXT: %struct.struct_float32 = type { <8 x float> }
				// CHECK-256-NEXT: %struct.struct_float64 = type { <4 x double> }
				// CHECK-256-NEXT: %struct.struct_bfloat16 = type { <16 x bfloat> }
				// CHECK-256-NEXT: %struct.struct_bool = type { <4 x i8> }

				// CHECK-512: %struct.struct_int8 = type { <64 x i8> }
				// CHECK-512-NEXT: %struct.struct_int16 = type { <32 x i16> }
				// CHECK-512-NEXT: %struct.struct_int32 = type { <16 x i32> }
				// CHECK-512-NEXT: %struct.struct_int64 = type { <8 x i64> }
				// CHECK-512-NEXT: %struct.struct_uint8 = type { <64 x i8> }
				// CHECK-512-NEXT: %struct.struct_uint16 = type { <32 x i16> }
				// CHECK-512-NEXT: %struct.struct_uint32 = type { <16 x i32> }
				// CHECK-512-NEXT: %struct.struct_uint64 = type { <8 x i64> }
				// CHECK-512-NEXT: %struct.struct_float16 = type { <32 x half> }
				// CHECK-512-NEXT: %struct.struct_float32 = type { <16 x float> }
				// CHECK-512-NEXT: %struct.struct_float64 = type { <8 x double> }
				// CHECK-512-NEXT: %struct.struct_bfloat16 = type { <32 x bfloat> }
				// CHECK-512-NEXT: %struct.struct_bool = type { <8 x i8> }

				// CHECK-1024: %struct.struct_int8 = type { <128 x i8> }
				// CHECK-1024-NEXT: %struct.struct_int16 = type { <64 x i16> }
				// CHECK-1024-NEXT: %struct.struct_int32 = type { <32 x i32> }
				// CHECK-1024-NEXT: %struct.struct_int64 = type { <16 x i64> }
				// CHECK-1024-NEXT: %struct.struct_uint8 = type { <128 x i8> }
				// CHECK-1024-NEXT: %struct.struct_uint16 = type { <64 x i16> }
				// CHECK-1024-NEXT: %struct.struct_uint32 = type { <32 x i32> }
				// CHECK-1024-NEXT: %struct.struct_uint64 = type { <16 x i64> }
				// CHECK-1024-NEXT: %struct.struct_float16 = type { <64 x half> }
				// CHECK-1024-NEXT: %struct.struct_float32 = type { <32 x float> }
				// CHECK-1024-NEXT: %struct.struct_float64 = type { <16 x double> }
				// CHECK-1024-NEXT: %struct.struct_bfloat16 = type { <64 x bfloat> }
				// CHECK-1024-NEXT: %struct.struct_bool = type { <16 x i8> }

				// CHECK-2048: %struct.struct_int8 = type { <256 x i8> }
				// CHECK-2048-NEXT: %struct.struct_int16 = type { <128 x i16> }
				// CHECK-2048-NEXT: %struct.struct_int32 = type { <64 x i32> }
				// CHECK-2048-NEXT: %struct.struct_int64 = type { <32 x i64> }
				// CHECK-2048-NEXT: %struct.struct_uint8 = type { <256 x i8> }
				// CHECK-2048-NEXT: %struct.struct_uint16 = type { <128 x i16> }
				// CHECK-2048-NEXT: %struct.struct_uint32 = type { <64 x i32> }
				// CHECK-2048-NEXT: %struct.struct_uint64 = type { <32 x i64> }
				// CHECK-2048-NEXT: %struct.struct_float16 = type { <128 x half> }
				// CHECK-2048-NEXT: %struct.struct_float32 = type { <64 x float> }
				// CHECK-2048-NEXT: %struct.struct_float64 = type { <32 x double> }
				// CHECK-2048-NEXT: %struct.struct_bfloat16 = type { <128 x bfloat> }
				// CHECK-2048-NEXT: %struct.struct_bool = type { <32 x i8> }

				// CHECK-128: %union.union_int8 = type { <16 x i8> }
				// CHECK-128-NEXT: %union.union_int16 = type { <8 x i16> }
				// CHECK-128-NEXT: %union.union_int32 = type { <4 x i32> }
				// CHECK-128-NEXT: %union.union_int64 = type { <2 x i64> }
				// CHECK-128-NEXT: %union.union_uint8 = type { <16 x i8> }
				// CHECK-128-NEXT: %union.union_uint16 = type { <8 x i16> }
				// CHECK-128-NEXT: %union.union_uint32 = type { <4 x i32> }
				// CHECK-128-NEXT: %union.union_uint64 = type { <2 x i64> }
				// CHECK-128-NEXT: %union.union_float16 = type { <8 x half> }
				// CHECK-128-NEXT: %union.union_float32 = type { <4 x float> }
				// CHECK-128-NEXT: %union.union_float64 = type { <2 x double> }
				// CHECK-128-NEXT: %union.union_bfloat16 = type { <8 x bfloat> }
				// CHECK-128-NEXT: %union.union_bool = type { <2 x i8> }

				// CHECK-256: %union.union_int8 = type { <32 x i8> }
				// CHECK-256-NEXT: %union.union_int16 = type { <16 x i16> }
				// CHECK-256-NEXT: %union.union_int32 = type { <8 x i32> }
				// CHECK-256-NEXT: %union.union_int64 = type { <4 x i64> }
				// CHECK-256-NEXT: %union.union_uint8 = type { <32 x i8> }
				// CHECK-256-NEXT: %union.union_uint16 = type { <16 x i16> }
				// CHECK-256-NEXT: %union.union_uint32 = type { <8 x i32> }
				// CHECK-256-NEXT: %union.union_uint64 = type { <4 x i64> }
				// CHECK-256-NEXT: %union.union_float16 = type { <16 x half> }
				// CHECK-256-NEXT: %union.union_float32 = type { <8 x float> }
				// CHECK-256-NEXT: %union.union_float64 = type { <4 x double> }
				// CHECK-256-NEXT: %union.union_bfloat16 = type { <16 x bfloat> }
				// CHECK-256-NEXT: %union.union_bool = type { <4 x i8> }

				// CHECK-512: %union.union_int8 = type { <64 x i8> }
				// CHECK-512-NEXT: %union.union_int16 = type { <32 x i16> }
				// CHECK-512-NEXT: %union.union_int32 = type { <16 x i32> }
				// CHECK-512-NEXT: %union.union_int64 = type { <8 x i64> }
				// CHECK-512-NEXT: %union.union_uint8 = type { <64 x i8> }
				// CHECK-512-NEXT: %union.union_uint16 = type { <32 x i16> }
				// CHECK-512-NEXT: %union.union_uint32 = type { <16 x i32> }
				// CHECK-512-NEXT: %union.union_uint64 = type { <8 x i64> }
				// CHECK-512-NEXT: %union.union_float16 = type { <32 x half> }
				// CHECK-512-NEXT: %union.union_float32 = type { <16 x float> }
				// CHECK-512-NEXT: %union.union_float64 = type { <8 x double> }
				// CHECK-512-NEXT: %union.union_bfloat16 = type { <32 x bfloat> }
				// CHECK-512-NEXT: %union.union_bool = type { <8 x i8> }

				// CHECK-1024: %union.union_int8 = type { <128 x i8> }
				// CHECK-1024-NEXT: %union.union_int16 = type { <64 x i16> }
				// CHECK-1024-NEXT: %union.union_int32 = type { <32 x i32> }
				// CHECK-1024-NEXT: %union.union_int64 = type { <16 x i64> }
				// CHECK-1024-NEXT: %union.union_uint8 = type { <128 x i8> }
				// CHECK-1024-NEXT: %union.union_uint16 = type { <64 x i16> }
				// CHECK-1024-NEXT: %union.union_uint32 = type { <32 x i32> }
				// CHECK-1024-NEXT: %union.union_uint64 = type { <16 x i64> }
				// CHECK-1024-NEXT: %union.union_float16 = type { <64 x half> }
				// CHECK-1024-NEXT: %union.union_float32 = type { <32 x float> }
				// CHECK-1024-NEXT: %union.union_float64 = type { <16 x double> }
				// CHECK-1024-NEXT: %union.union_bfloat16 = type { <64 x bfloat> }
				// CHECK-1024-NEXT: %union.union_bool = type { <16 x i8> }

				// CHECK-2048: %union.union_int8 = type { <256 x i8> }
				// CHECK-2048-NEXT: %union.union_int16 = type { <128 x i16> }
				// CHECK-2048-NEXT: %union.union_int32 = type { <64 x i32> }
				// CHECK-2048-NEXT: %union.union_int64 = type { <32 x i64> }
				// CHECK-2048-NEXT: %union.union_uint8 = type { <256 x i8> }
				// CHECK-2048-NEXT: %union.union_uint16 = type { <128 x i16> }
				// CHECK-2048-NEXT: %union.union_uint32 = type { <64 x i32> }
				// CHECK-2048-NEXT: %union.union_uint64 = type { <32 x i64> }
				// CHECK-2048-NEXT: %union.union_float16 = type { <128 x half> }
				// CHECK-2048-NEXT: %union.union_float32 = type { <64 x float> }
				// CHECK-2048-NEXT: %union.union_float64 = type { <32 x double> }
				// CHECK-2048-NEXT: %union.union_bfloat16 = type { <128 x bfloat> }
				// CHECK-2048-NEXT: %union.union_bool = type { <32 x i8> }

				//===----------------------------------------------------------------------===//
				// Global variables
				//===----------------------------------------------------------------------===//
				// CHECK-128: @global_i8 = global <16 x i8> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_i16 = global <8 x i16> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_i32 = global <4 x i32> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_i64 = global <2 x i64> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_u8 = global <16 x i8> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_u16 = global <8 x i16> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_u32 = global <4 x i32> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_u64 = global <2 x i64> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_f16 = global <8 x half> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_f32 = global <4 x float> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_f64 = global <2 x double> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_bf16 = global <8 x bfloat> zeroinitializer, align 16
				// CHECK-128-NEXT: @global_bool = global <2 x i8> zeroinitializer, align 2

				// CHECK-256: @global_i8 = global <32 x i8> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_i16 = global <16 x i16> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_i32 = global <8 x i32> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_i64 = global <4 x i64> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_u8 = global <32 x i8> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_u16 = global <16 x i16> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_u32 = global <8 x i32> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_u64 = global <4 x i64> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_f16 = global <16 x half> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_f32 = global <8 x float> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_f64 = global <4 x double> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_bf16 = global <16 x bfloat> zeroinitializer, align 16
				// CHECK-256-NEXT: @global_bool = global <4 x i8> zeroinitializer, align 2

				// CHECK-512: @global_i8 = global <64 x i8> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_i16 = global <32 x i16> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_i32 = global <16 x i32> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_i64 = global <8 x i64> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_u8 = global <64 x i8> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_u16 = global <32 x i16> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_u32 = global <16 x i32> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_u64 = global <8 x i64> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_f16 = global <32 x half> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_f32 = global <16 x float> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_f64 = global <8 x double> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_bf16 = global <32 x bfloat> zeroinitializer, align 16
				// CHECK-512-NEXT: @global_bool = global <8 x i8> zeroinitializer, align 2

				// CHECK-1024: @global_i8 = global <128 x i8> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_i16 = global <64 x i16> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_i32 = global <32 x i32> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_i64 = global <16 x i64> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_u8 = global <128 x i8> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_u16 = global <64 x i16> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_u32 = global <32 x i32> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_u64 = global <16 x i64> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_f16 = global <64 x half> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_f32 = global <32 x float> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_f64 = global <16 x double> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_bf16 = global <64 x bfloat> zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_bool = global <16 x i8> zeroinitializer, align 2

				// CHECK-2048: @global_i8 = global <256 x i8> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_i16 = global <128 x i16> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_i32 = global <64 x i32> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_i64 = global <32 x i64> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_u8 = global <256 x i8> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_u16 = global <128 x i16> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_u32 = global <64 x i32> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_u64 = global <32 x i64> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_f16 = global <128 x half> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_f32 = global <64 x float> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_f64 = global <32 x double> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_bf16 = global <128 x bfloat> zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_bool = global <32 x i8> zeroinitializer, align 2

				//===----------------------------------------------------------------------===//
				// Global arrays
				//===----------------------------------------------------------------------===//
				// CHECK-128: @global_arr_i8 = global [3 x <16 x i8>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_i16 = global [3 x <8 x i16>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_i32 = global [3 x <4 x i32>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_i64 = global [3 x <2 x i64>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_u8 = global [3 x <16 x i8>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_u16 = global [3 x <8 x i16>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_u32 = global [3 x <4 x i32>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_u64 = global [3 x <2 x i64>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_f16 = global [3 x <8 x half>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_f32 = global [3 x <4 x float>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_f64 = global [3 x <2 x double>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_bf16 = global [3 x <8 x bfloat>] zeroinitializer, align 16
				// CHECK-128-NEXT: @global_arr_bool = global [3 x <2 x i8>] zeroinitializer, align 2

				// CHECK-256: @global_arr_i8 = global [3 x <32 x i8>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_i16 = global [3 x <16 x i16>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_i32 = global [3 x <8 x i32>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_i64 = global [3 x <4 x i64>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_u8 = global [3 x <32 x i8>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_u16 = global [3 x <16 x i16>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_u32 = global [3 x <8 x i32>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_u64 = global [3 x <4 x i64>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_f16 = global [3 x <16 x half>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_f32 = global [3 x <8 x float>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_f64 = global [3 x <4 x double>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_bf16 = global [3 x <16 x bfloat>] zeroinitializer, align 16
				// CHECK-256-NEXT: @global_arr_bool = global [3 x <4 x i8>] zeroinitializer, align 2

				// CHECK-512: @global_arr_i8 = global [3 x <64 x i8>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_i16 = global [3 x <32 x i16>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_i32 = global [3 x <16 x i32>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_i64 = global [3 x <8 x i64>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_u8 = global [3 x <64 x i8>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_u16 = global [3 x <32 x i16>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_u32 = global [3 x <16 x i32>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_u64 = global [3 x <8 x i64>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_f16 = global [3 x <32 x half>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_f32 = global [3 x <16 x float>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_f64 = global [3 x <8 x double>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_bf16 = global [3 x <32 x bfloat>] zeroinitializer, align 16
				// CHECK-512-NEXT: @global_arr_bool = global [3 x <8 x i8>] zeroinitializer, align 2

				// CHECK-1024: @global_arr_i8 = global [3 x <128 x i8>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_i16 = global [3 x <64 x i16>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_i32 = global [3 x <32 x i32>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_i64 = global [3 x <16 x i64>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_u8 = global [3 x <128 x i8>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_u16 = global [3 x <64 x i16>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_u32 = global [3 x <32 x i32>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_u64 = global [3 x <16 x i64>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_f16 = global [3 x <64 x half>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_f32 = global [3 x <32 x float>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_f64 = global [3 x <16 x double>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_bf16 = global [3 x <64 x bfloat>] zeroinitializer, align 16
				// CHECK-1024-NEXT: @global_arr_bool = global [3 x <16 x i8>] zeroinitializer, align 2

				// CHECK-2048: @global_arr_i8 = global [3 x <256 x i8>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_i16 = global [3 x <128 x i16>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_i32 = global [3 x <64 x i32>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_i64 = global [3 x <32 x i64>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_u8 = global [3 x <256 x i8>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_u16 = global [3 x <128 x i16>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_u32 = global [3 x <64 x i32>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_u64 = global [3 x <32 x i64>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_f16 = global [3 x <128 x half>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_f32 = global [3 x <64 x float>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_f64 = global [3 x <32 x double>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_bf16 = global [3 x <128 x bfloat>] zeroinitializer, align 16
				// CHECK-2048-NEXT: @global_arr_bool = global [3 x <32 x i8>] zeroinitializer, align 2

				//===----------------------------------------------------------------------===//
				// Local variables
				//===----------------------------------------------------------------------===//
				// CHECK-128: %local_i8 = alloca <16 x i8>, align 16
				// CHECK-128-NEXT: %local_i16 = alloca <8 x i16>, align 16
				// CHECK-128-NEXT: %local_i32 = alloca <4 x i32>, align 16
				// CHECK-128-NEXT: %local_i64 = alloca <2 x i64>, align 16
				// CHECK-128-NEXT: %local_u8 = alloca <16 x i8>, align 16
				// CHECK-128-NEXT: %local_u16 = alloca <8 x i16>, align 16
				// CHECK-128-NEXT: %local_u32 = alloca <4 x i32>, align 16
				// CHECK-128-NEXT: %local_u64 = alloca <2 x i64>, align 16
				// CHECK-128-NEXT: %local_f16 = alloca <8 x half>, align 16
				// CHECK-128-NEXT: %local_f32 = alloca <4 x float>, align 16
				// CHECK-128-NEXT: %local_f64 = alloca <2 x double>, align 16
				// CHECK-128-NEXT: %local_bf16 = alloca <8 x bfloat>, align 16
				// CHECK-128-NEXT: %local_bool = alloca <2 x i8>, align 2

				// CHECK-256: %local_i8 = alloca <32 x i8>, align 16
				// CHECK-256-NEXT: %local_i16 = alloca <16 x i16>, align 16
				// CHECK-256-NEXT: %local_i32 = alloca <8 x i32>, align 16
				// CHECK-256-NEXT: %local_i64 = alloca <4 x i64>, align 16
				// CHECK-256-NEXT: %local_u8 = alloca <32 x i8>, align 16
				// CHECK-256-NEXT: %local_u16 = alloca <16 x i16>, align 16
				// CHECK-256-NEXT: %local_u32 = alloca <8 x i32>, align 16
				// CHECK-256-NEXT: %local_u64 = alloca <4 x i64>, align 16
				// CHECK-256-NEXT: %local_f16 = alloca <16 x half>, align 16
				// CHECK-256-NEXT: %local_f32 = alloca <8 x float>, align 16
				// CHECK-256-NEXT: %local_f64 = alloca <4 x double>, align 16
				// CHECK-256-NEXT: %local_bf16 = alloca <16 x bfloat>, align 16
				// CHECK-256-NEXT: %local_bool = alloca <4 x i8>, align 2

				// CHECK-512: %local_i8 = alloca <64 x i8>, align 16
				// CHECK-512-NEXT: %local_i16 = alloca <32 x i16>, align 16
				// CHECK-512-NEXT: %local_i32 = alloca <16 x i32>, align 16
				// CHECK-512-NEXT: %local_i64 = alloca <8 x i64>, align 16
				// CHECK-512-NEXT: %local_u8 = alloca <64 x i8>, align 16
				// CHECK-512-NEXT: %local_u16 = alloca <32 x i16>, align 16
				// CHECK-512-NEXT: %local_u32 = alloca <16 x i32>, align 16
				// CHECK-512-NEXT: %local_u64 = alloca <8 x i64>, align 16
				// CHECK-512-NEXT: %local_f16 = alloca <32 x half>, align 16
				// CHECK-512-NEXT: %local_f32 = alloca <16 x float>, align 16
				// CHECK-512-NEXT: %local_f64 = alloca <8 x double>, align 16
				// CHECK-512-NEXT: %local_bf16 = alloca <32 x bfloat>, align 16
				// CHECK-512-NEXT: %local_bool = alloca <8 x i8>, align 2

				// CHECK-1024: %local_i8 = alloca <128 x i8>, align 16
				// CHECK-1024-NEXT: %local_i16 = alloca <64 x i16>, align 16
				// CHECK-1024-NEXT: %local_i32 = alloca <32 x i32>, align 16
				// CHECK-1024-NEXT: %local_i64 = alloca <16 x i64>, align 16
				// CHECK-1024-NEXT: %local_u8 = alloca <128 x i8>, align 16
				// CHECK-1024-NEXT: %local_u16 = alloca <64 x i16>, align 16
				// CHECK-1024-NEXT: %local_u32 = alloca <32 x i32>, align 16
				// CHECK-1024-NEXT: %local_u64 = alloca <16 x i64>, align 16
				// CHECK-1024-NEXT: %local_f16 = alloca <64 x half>, align 16
				// CHECK-1024-NEXT: %local_f32 = alloca <32 x float>, align 16
				// CHECK-1024-NEXT: %local_f64 = alloca <16 x double>, align 16
				// CHECK-1024-NEXT: %local_bf16 = alloca <64 x bfloat>, align 16
				// CHECK-1024-NEXT: %local_bool = alloca <16 x i8>, align 2

				// CHECK-2048: %local_i8 = alloca <256 x i8>, align 16
				// CHECK-2048-NEXT: %local_i16 = alloca <128 x i16>, align 16
				// CHECK-2048-NEXT: %local_i32 = alloca <64 x i32>, align 16
				// CHECK-2048-NEXT: %local_i64 = alloca <32 x i64>, align 16
				// CHECK-2048-NEXT: %local_u8 = alloca <256 x i8>, align 16
				// CHECK-2048-NEXT: %local_u16 = alloca <128 x i16>, align 16
				// CHECK-2048-NEXT: %local_u32 = alloca <64 x i32>, align 16
				// CHECK-2048-NEXT: %local_u64 = alloca <32 x i64>, align 16
				// CHECK-2048-NEXT: %local_f16 = alloca <128 x half>, align 16
				// CHECK-2048-NEXT: %local_f32 = alloca <64 x float>, align 16
				// CHECK-2048-NEXT: %local_f64 = alloca <32 x double>, align 16
				// CHECK-2048-NEXT: %local_bf16 = alloca <128 x bfloat>, align 16
				// CHECK-2048-NEXT: %local_bool = alloca <32 x i8>, align 2

				//===----------------------------------------------------------------------===//
				// Local arrays
				//===----------------------------------------------------------------------===//
				// CHECK-128: %local_arr_i8 = alloca [3 x <16 x i8>], align 16
				// CHECK-128-NEXT: %local_arr_i16 = alloca [3 x <8 x i16>], align 16
				// CHECK-128-NEXT: %local_arr_i32 = alloca [3 x <4 x i32>], align 16
				// CHECK-128-NEXT: %local_arr_i64 = alloca [3 x <2 x i64>], align 16
				// CHECK-128-NEXT: %local_arr_u8 = alloca [3 x <16 x i8>], align 16
				// CHECK-128-NEXT: %local_arr_u16 = alloca [3 x <8 x i16>], align 16
				// CHECK-128-NEXT: %local_arr_u32 = alloca [3 x <4 x i32>], align 16
				// CHECK-128-NEXT: %local_arr_u64 = alloca [3 x <2 x i64>], align 16
				// CHECK-128-NEXT: %local_arr_f16 = alloca [3 x <8 x half>], align 16
				// CHECK-128-NEXT: %local_arr_f32 = alloca [3 x <4 x float>], align 16
				// CHECK-128-NEXT: %local_arr_f64 = alloca [3 x <2 x double>], align 16
				// CHECK-128-NEXT: %local_arr_bf16 = alloca [3 x <8 x bfloat>], align 16
				// CHECK-128-NEXT: %local_arr_bool = alloca [3 x <2 x i8>], align 2

				// CHECK-256: %local_arr_i8 = alloca [3 x <32 x i8>], align 16
				// CHECK-256-NEXT: %local_arr_i16 = alloca [3 x <16 x i16>], align 16
				// CHECK-256-NEXT: %local_arr_i32 = alloca [3 x <8 x i32>], align 16
				// CHECK-256-NEXT: %local_arr_i64 = alloca [3 x <4 x i64>], align 16
				// CHECK-256-NEXT: %local_arr_u8 = alloca [3 x <32 x i8>], align 16
				// CHECK-256-NEXT: %local_arr_u16 = alloca [3 x <16 x i16>], align 16
				// CHECK-256-NEXT: %local_arr_u32 = alloca [3 x <8 x i32>], align 16
				// CHECK-256-NEXT: %local_arr_u64 = alloca [3 x <4 x i64>], align 16
				// CHECK-256-NEXT: %local_arr_f16 = alloca [3 x <16 x half>], align 16
				// CHECK-256-NEXT: %local_arr_f32 = alloca [3 x <8 x float>], align 16
				// CHECK-256-NEXT: %local_arr_f64 = alloca [3 x <4 x double>], align 16
				// CHECK-256-NEXT: %local_arr_bf16 = alloca [3 x <16 x bfloat>], align 16
				// CHECK-256-NEXT: %local_arr_bool = alloca [3 x <4 x i8>], align 2

				// CHECK-512: %local_arr_i8 = alloca [3 x <64 x i8>], align 16
				// CHECK-512-NEXT: %local_arr_i16 = alloca [3 x <32 x i16>], align 16
				// CHECK-512-NEXT: %local_arr_i32 = alloca [3 x <16 x i32>], align 16
				// CHECK-512-NEXT: %local_arr_i64 = alloca [3 x <8 x i64>], align 16
				// CHECK-512-NEXT: %local_arr_u8 = alloca [3 x <64 x i8>], align 16
				// CHECK-512-NEXT: %local_arr_u16 = alloca [3 x <32 x i16>], align 16
				// CHECK-512-NEXT: %local_arr_u32 = alloca [3 x <16 x i32>], align 16
				// CHECK-512-NEXT: %local_arr_u64 = alloca [3 x <8 x i64>], align 16
				// CHECK-512-NEXT: %local_arr_f16 = alloca [3 x <32 x half>], align 16
				// CHECK-512-NEXT: %local_arr_f32 = alloca [3 x <16 x float>], align 16
				// CHECK-512-NEXT: %local_arr_f64 = alloca [3 x <8 x double>], align 16
				// CHECK-512-NEXT: %local_arr_bf16 = alloca [3 x <32 x bfloat>], align 16
				// CHECK-512-NEXT: %local_arr_bool = alloca [3 x <8 x i8>], align 2

				// CHECK-1024: %local_arr_i8 = alloca [3 x <128 x i8>], align 16
				// CHECK-1024-NEXT: %local_arr_i16 = alloca [3 x <64 x i16>], align 16
				// CHECK-1024-NEXT: %local_arr_i32 = alloca [3 x <32 x i32>], align 16
				// CHECK-1024-NEXT: %local_arr_i64 = alloca [3 x <16 x i64>], align 16
				// CHECK-1024-NEXT: %local_arr_u8 = alloca [3 x <128 x i8>], align 16
				// CHECK-1024-NEXT: %local_arr_u16 = alloca [3 x <64 x i16>], align 16
				// CHECK-1024-NEXT: %local_arr_u32 = alloca [3 x <32 x i32>], align 16
				// CHECK-1024-NEXT: %local_arr_u64 = alloca [3 x <16 x i64>], align 16
				// CHECK-1024-NEXT: %local_arr_f16 = alloca [3 x <64 x half>], align 16
				// CHECK-1024-NEXT: %local_arr_f32 = alloca [3 x <32 x float>], align 16
				// CHECK-1024-NEXT: %local_arr_f64 = alloca [3 x <16 x double>], align 16
				// CHECK-1024-NEXT: %local_arr_bf16 = alloca [3 x <64 x bfloat>], align 16
				// CHECK-1024-NEXT: %local_arr_bool = alloca [3 x <16 x i8>], align 2

				// CHECK-2048: %local_arr_i8 = alloca [3 x <256 x i8>], align 16
				// CHECK-2048-NEXT: %local_arr_i16 = alloca [3 x <128 x i16>], align 16
				// CHECK-2048-NEXT: %local_arr_i32 = alloca [3 x <64 x i32>], align 16
				// CHECK-2048-NEXT: %local_arr_i64 = alloca [3 x <32 x i64>], align 16
				// CHECK-2048-NEXT: %local_arr_u8 = alloca [3 x <256 x i8>], align 16
				// CHECK-2048-NEXT: %local_arr_u16 = alloca [3 x <128 x i16>], align 16
				// CHECK-2048-NEXT: %local_arr_u32 = alloca [3 x <64 x i32>], align 16
				// CHECK-2048-NEXT: %local_arr_u64 = alloca [3 x <32 x i64>], align 16
				// CHECK-2048-NEXT: %local_arr_f16 = alloca [3 x <128 x half>], align 16
				// CHECK-2048-NEXT: %local_arr_f32 = alloca [3 x <64 x float>], align 16
				// CHECK-2048-NEXT: %local_arr_f64 = alloca [3 x <32 x double>], align 16
				// CHECK-2048-NEXT: %local_arr_bf16 = alloca [3 x <128 x bfloat>], align 16
				// CHECK-2048-NEXT: %local_arr_bool = alloca [3 x <32 x i8>], align 2
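
The element counts in the checks above follow a regular pattern: a data vector has N / (element width) lanes, and a predicate is stored as N / 64 lanes of i8, since an SVE predicate holds one bit per byte of vector data. A minimal sketch of that arithmetic, assuming hypothetical helper names `dataLanes` and `predicateLanes` that are not part of the patch:

```cpp
// Hypothetical helpers for illustration only; they are not part of the patch.

// Lanes in the fixed-length IR vector for a data type, given the value
// passed to -msve-vector-bits and the element width in bits.
constexpr unsigned dataLanes(unsigned vectorBits, unsigned elementBits) {
  return vectorBits / elementBits;
}

// A predicate has one bit per byte of vector data (vectorBits / 8 bits),
// and the checks show it stored as i8 lanes, so divide by 8 again.
constexpr unsigned predicateLanes(unsigned vectorBits) {
  return vectorBits / 64;
}

// Spot checks against the expected IR types listed in the test.
static_assert(dataLanes(512, 32) == 16, "512-bit int32 vector is <16 x i32>");
static_assert(dataLanes(2048, 8) == 256, "2048-bit int8 vector is <256 x i8>");
static_assert(dataLanes(128, 64) == 2, "128-bit float64 vector is <2 x double>");
static_assert(predicateLanes(128) == 2, "128-bit predicate is <2 x i8>");
static_assert(predicateLanes(2048) == 32, "2048-bit predicate is <32 x i8>");
```

The same arithmetic applies to the array cases, where each `[3 x <...>]` element has the lane count of the corresponding scalar variable.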

clang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp

This file was added.

				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu %s -emit-llvm -o - \
				// RUN: -target-feature +sve -target-feature +bf16 \
				// RUN: -D__ARM_FEATURE_SVE -msve-vector-bits=128 | FileCheck %s
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu %s -emit-llvm -o - \
				// RUN: -target-feature +sve -target-feature +bf16 \
				// RUN: -D__ARM_FEATURE_SVE -msve-vector-bits=256 | FileCheck %s
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu %s -emit-llvm -o - \
				// RUN: -target-feature +sve -target-feature +bf16 \
				// RUN: -D__ARM_FEATURE_SVE -msve-vector-bits=512 | FileCheck %s
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu %s -emit-llvm -o - \
				// RUN: -target-feature +sve -target-feature +bf16 \
				// RUN: -D__ARM_FEATURE_SVE -msve-vector-bits=1024 | FileCheck %s
				// RUN: %clang_cc1 -triple aarch64-none-linux-gnu %s -emit-llvm -o - \
				// RUN: -target-feature +sve -target-feature +bf16 \
				// RUN: -D__ARM_FEATURE_SVE -msve-vector-bits=2048 | FileCheck %s

				// This test verifies fixed-length vectors defined with the
				// 'arm_sve_vector_bits' attribute map to the same AAPCS64 ABI type as the
				// sizeless variants.

				#define N __ARM_FEATURE_SVE_BITS_EXPERIMENTAL

				namespace std {
				class type_info;
				};

				typedef __SVInt8_t fixed_int8_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVInt16_t fixed_int16_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVInt32_t fixed_int32_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVInt64_t fixed_int64_t __attribute__((arm_sve_vector_bits(N)));

				typedef __SVUint8_t fixed_uint8_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVUint16_t fixed_uint16_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVUint32_t fixed_uint32_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVUint64_t fixed_uint64_t __attribute__((arm_sve_vector_bits(N)));

				typedef __SVFloat16_t fixed_float16_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVFloat32_t fixed_float32_t __attribute__((arm_sve_vector_bits(N)));
				typedef __SVFloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(N)));

				typedef __SVBFloat16_t fixed_bfloat16_t __attribute__((arm_sve_vector_bits(N)));

				typedef __SVBool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));

				auto &fs8 = typeid(fixed_int8_t);
				auto &fs16 = typeid(fixed_int16_t);
				auto &fs32 = typeid(fixed_int32_t);
				auto &fs64 = typeid(fixed_int64_t);

				auto &fu8 = typeid(fixed_uint8_t);
				auto &fu16 = typeid(fixed_uint16_t);
				auto &fu32 = typeid(fixed_uint32_t);
				auto &fu64 = typeid(fixed_uint64_t);

				auto &ff16 = typeid(fixed_float16_t);
				auto &ff32 = typeid(fixed_float32_t);
				auto &ff64 = typeid(fixed_float64_t);

				auto &fbf16 = typeid(fixed_bfloat16_t);

				auto &fb8 = typeid(fixed_bool_t);

				// CHECK-DAG: @_ZTIu10__SVInt8_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu10__SVInt8_t
				// CHECK-DAG: @_ZTIu11__SVInt16_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu11__SVInt16_t
				// CHECK-DAG: @_ZTIu11__SVInt32_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu11__SVInt32_t
				// CHECK-DAG: @_ZTIu11__SVInt64_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu11__SVInt64_t
				// CHECK-DAG: @_ZTIu11__SVUint8_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu11__SVUint8_t
				// CHECK-DAG: @_ZTIu12__SVUint16_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu12__SVUint16_t
				// CHECK-DAG: @_ZTIu12__SVUint32_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu12__SVUint32_t
				// CHECK-DAG: @_ZTIu12__SVUint64_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu12__SVUint64_t
				// CHECK-DAG: @_ZTIu13__SVFloat16_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu13__SVFloat16_t
				// CHECK-DAG: @_ZTIu13__SVFloat32_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu13__SVFloat32_t
				// CHECK-DAG: @_ZTIu13__SVFloat64_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu13__SVFloat64_t
				// CHECK-DAG: @_ZTIu14__SVBfloat16_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu14__SVBfloat16_t
				// CHECK-DAG: @_ZTIu10__SVBool_t = {{.*}} @_ZTVN10__cxxabiv123__fundamental_type_infoE, {{.*}} @_ZTSu10__SVBool_t
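
The symbol names in these checks use the Itanium C++ ABI encoding for vendor-extended builtin types: `u`, then the length of the type name, then the name itself, prefixed by `_ZTI` for the type_info object and `_ZTS` for the type name string. A minimal sketch of how the expected names could be derived, assuming hypothetical helpers `mangleVendorType`, `typeInfoSymbol` and `typeNameSymbol` that are not part of the patch or of clang's mangler:

```cpp
#include <string>

// Hypothetical helpers for illustration only; clang's real logic lives in
// ItaniumMangle.cpp and handles far more than this.

// Vendor-extended builtin type: "u" <length of name> <name>.
std::string mangleVendorType(const std::string &name) {
  return "u" + std::to_string(name.size()) + name;
}

// Symbol of the std::type_info object for the type.
std::string typeInfoSymbol(const std::string &name) {
  return "_ZTI" + mangleVendorType(name);
}

// Symbol of the type name string for the type.
std::string typeNameSymbol(const std::string &name) {
  return "_ZTS" + mangleVendorType(name);
}
```

For example, `typeInfoSymbol("__SVInt8_t")` gives `_ZTIu10__SVInt8_t`, matching the first check. The checks carry no vector length in the name, which is how the test confirms that the fixed-length typedefs share type_info with the sizeless types. Note that the bfloat check uses the spelling `__SVBfloat16_t`, while the typedef in the test source is written `__SVBFloat16_t`.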