
[ARM] Follow AAPCS standard for volatile bit-fields
Needs ReviewPublic

Authored by dnsampaio on Sep 10 2019, 7:44 AM.

Details

Summary

Bug 43264
This is a first draft to understand what has to be done to fix volatile bit-field accesses so as to conform to the AAPCS.

Event Timeline

dnsampaio created this revision.Sep 10 2019, 7:44 AM
Herald added a project: Restricted Project.Sep 10 2019, 7:44 AM
dnsampaio added a comment.EditedSep 10 2019, 7:48 AM

This patch could hack clang to generate an extra load. However, my knowledge of the clang code base is not extensive. How could we ensure that the width of loads and stores is the size of the container, and that they don't overlap non-bit-fields?

lebedev.ri resigned from this revision.Sep 10 2019, 7:52 AM
lebedev.ri added a reviewer: jfb.

I'm not sure why I'm added as a reviewer here.

That being said, I don't see why that load is needed there.
As I read it, the document only states that if a load is needed,
then it must be a single load, and likewise for a store.
I.e. I'm not sure why it requires a load when one isn't needed.
Unlike atomics, volatiles don't enforce any ordering.

@ostannard might prove me wrong, but according to the AAPCS:

> When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
> container must be read exactly once and written exactly once using the access width appropriate to the
> type of the container. The two accesses are not atomic.

This rule does not say that the load is done only if required. It states that the container will be read exactly once. It even gives the example that an increment will always perform two reads and one write, regardless of bit-width. The document adds just after:

> Note: Note the volatile access rules apply even when the width and alignment of the bit-field imply that
> the access could be achieved more efficiently using a narrower type. For a write operation the read must
> always occur even if the entire contents of the container will be replaced.

The rationale is to provide uniform behavior for volatile bit-fields independent of their width (as long as they do not overlap with non-bit-fields).

> @ostannard might prove me wrong, but according to the AAPCS:
>
> > When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
> > container must be read exactly once and written exactly once using the access width appropriate to the
> > type of the container. The two accesses are not atomic.
>
> This rule does not say that the load is done only if required. It states that the container will be read exactly once. It even gives the example that an increment will always perform two reads and one write, regardless of bit-width. The document adds just after:

While it can be read as "just always read the volatile bit-field before overwriting it", I'm not sure that makes any sense.
Why exactly would you want to do that load, if you don't use its results? What side effects does it cause?

What it should be saying is: "when you have a bit-field, and you want to update several of its fields but not all,
first do one read to get the old contents, then merge the old and new data, and then perform a single store replacing the entire contents at once".
That is the only sane behavior it can require.

> > Note: Note the volatile access rules apply even when the width and alignment of the bit-field imply that
> > the access could be achieved more efficiently using a narrower type. For a write operation the read must
> > always occur even if the entire contents of the container will be replaced.
>
> The rationale is to provide uniform behavior for volatile bit-fields independent of their width (as long as they do not overlap with non-bit-fields).

jfb added a subscriber: rjmccall.Sep 10 2019, 8:59 AM
jfb added inline comments.Sep 10 2019, 9:01 AM
clang/test/CodeGen/aapcs-bitfield.c
541

These are just extra loads? Why?

552

Why isn't this load sufficient?

dnsampaio marked 4 inline comments as done.EditedSep 11 2019, 1:55 AM

Hi @jfb. In an example such as:

struct S { int a : 1; int b : 16; };
extern int expensive_computation(int v);
void foo(volatile struct S *s) {
  s->b = expensive_computation(s->b);
}

There is no guarantee that s->a is not modified during the expensive computation, so it must be re-read just before writing the s->b value, as a and b share the same memory location. This is already done by LLVM. Indeed, the exact output would be

define void @foo(%struct.S* %S) local_unnamed_addr #0 {
entry:
  %0 = bitcast %struct.S* %S to i32*
  %bf.load = load volatile i32, i32* %0, align 4
  %bf.shl = shl i32 %bf.load, 15
  %bf.ashr = ashr i32 %bf.shl, 16
  %call = tail call i32 @expensive_computation(i32 %bf.ashr) #2
  %bf.load1 = load volatile i32, i32* %0, align 4 ; <<<== Here it obtains the value of s->a to restore it.
  %bf.value = shl i32 %call, 1
  %bf.shl2 = and i32 %bf.value, 131070
  %bf.clear = and i32 %bf.load1, -131071
  %bf.set = or i32 %bf.clear, %bf.shl2
  store volatile i32 %bf.set, i32* %0, align 4
  ret void
}

These extra loads are required to make the number of times a volatile bit-field is read uniform, independent of whether it shares memory with other data.

We could have it under a flag, such as -faapcs-volatilebitfield, disabled by default.

The other point not conformant to the AAPCS is the width of the loads/stores used to access bit-fields. They should be the width of the container, provided that this would not overlap with non-bit-fields. Do you have any idea where that could be computed? I imagine it would be when computing the alignment of the elements of the structure, where I can check whether performing a load of the entire container width would overlap with other elements. Could you point me to where that is done?

clang/test/CodeGen/aapcs-bitfield.c
541

Yes, these are just extra loads. As the AAPCS describes, every write requires performing a load as well, even if all bits of the volatile bit-field are going to be replaced.

552

Technically speaking, that is the load for reading the bitfield, not the load required when writing it.

dnsampaio marked 2 inline comments as done.Sep 11 2019, 1:56 AM

The exact sequence of volatile accesses is observable behavior, and it's the ABI's right to define correct behavior for compliant implementations, so we do need to do this.

Diogo, IRGen breaks up bit-field storage units in CGRecordLowering::accumulateBitFields. I'm sure there's some reasonable way to add target-specific adjustments to that so that we break things up by storage unit.

rsmith added a subscriber: rsmith.Sep 12 2019, 6:10 PM

> The exact sequence of volatile accesses is observable behavior, and it's the ABI's right to define correct behavior for compliant implementations, so we do need to do this.

I'm not convinced; I think the AAPCS is overstepping its bounds here by trying to specify this. Whether you get a volatile load for a full-width bitfield access doesn't seem to have anything to do with ABI; implementations that differ on this detail will be able to produce code that links together and works together just fine.

I think this is instead a question of what guarantees the implementation wants to give about how it treats volatile operations and how it lowers them to hardware. That should be specified, sure (and in fact C++ at least requires us to document what we do), but it's not ABI any more than (for example) the choice of whether a volatile access to a local variable must actually issue load / store instructions -- or which instructions must be produced -- is ABI.

I have no particular opinion on whether we should make this change -- volatile bit-fields are pretty weird and broken regardless. But I don't think we should view it as being part of the ABI, and just as with the question of whether plain bit-fields are signed, we should tell the psABI folks that the question of what constitutes a volatile access and the semantics of such an access is not ABI and not up to them.

(Similar example: on Windows x86 targets, volatile accesses are atomic by default, controlled by a compiler switch. clang-cl follows cl in this regard, but we don't consider it to be part of the ABI, and in the regular clang driver we do not treat volatile accesses as atomic when targeting Windows x86.)

I have to say that I disagree. The ABI certainly doesn't get to control the language and define what constitutes a volatile access, but when the language has decided that a volatile access is actually performed, I think ABIs absolutely ought to define how they should be compiled.

Volatile accesses are quite unusual in that they are used for a purpose — memory-mapped I/O — that is extremely dependent for correctness on the exact instructions used to implement it. IIRC there are architectures that for whatever reason provide two different load instructions where only one of those instructions actually performs memory-mapped I/O correctly. Certainly the exact width of load or store is critical for correctness.

So while I have certainly seen ABI overreach before and pushed back on it, for this one case I think ABI involvement is totally appropriate, and in fact I wish more ABIs specified how they expect volatile accesses to be performed.

It is, of course, somewhat unfortunate that a corner-case use case like memory-mapped I/O has been privileged with such a core position in the language, but that's how it is.

I do think that AAPCS's specific request that the compiler emit spurious loads when storing to volatile bit-fields is a bad idea, and I think it would be fine to push back on that part, and perhaps also on the idea that compound assignments and increments should load twice.

Indeed, our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bit-fields are used to perform memory-mapped I/O, and some hardware only supports accesses of a specific width.
The spurious load I am more than glad to leave disabled behind a command-line flag, so it will only appear if the user requests it. Note that volatile accesses might have side effects; for example, an I/O read counter holding an odd number could indicate that the data is still being processed.

jfb added a comment.Sep 13 2019, 8:46 AM

> Indeed, our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bit-fields are used to perform memory-mapped I/O, and some hardware only supports accesses of a specific width.
> The spurious load I am more than glad to leave disabled behind a command-line flag, so it will only appear if the user requests it. Note that volatile accesses might have side effects; for example, an I/O read counter holding an odd number could indicate that the data is still being processed.

Are the cases being addressed in the PR actually relevant to real MMIO, or is this patch following the letter of the AAPCS, which doesn't actually matter?

In D67399#1669568, @jfb wrote:

> > Indeed, our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bit-fields are used to perform memory-mapped I/O, and some hardware only supports accesses of a specific width.
> > The spurious load I am more than glad to leave disabled behind a command-line flag, so it will only appear if the user requests it. Note that volatile accesses might have side effects; for example, an I/O read counter holding an odd number could indicate that the data is still being processed.
>
> Are the cases being addressed in the PR actually relevant to real MMIO, or is this patch following the letter of the AAPCS, which doesn't actually matter?

Again, I think AAPCS is well within its rights to say that certain volatile accesses should be performed with loads and stores of certain widths. If low-level programmers cannot use bit-fields today with memory-mapped I/O because they cannot trust compilers to produce reasonable accesses, that is a legitimate concern for ABI authors and a legitimate bug for compiler maintainers.

> In D67399#1669568, @jfb wrote:
>
> > > Indeed, our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bit-fields are used to perform memory-mapped I/O, and some hardware only supports accesses of a specific width.
> > > The spurious load I am more than glad to leave disabled behind a command-line flag, so it will only appear if the user requests it. Note that volatile accesses might have side effects; for example, an I/O read counter holding an odd number could indicate that the data is still being processed.
> >
> > Are the cases being addressed in the PR actually relevant to real MMIO, or is this patch following the letter of the AAPCS, which doesn't actually matter?
>
> Again, I think AAPCS is well within its rights to say that certain volatile accesses should be performed with loads and stores of certain widths. If low-level programmers cannot use bit-fields today with memory-mapped I/O because they cannot trust compilers to produce reasonable accesses, that is a legitimate concern for ABI authors and a legitimate bug for compiler maintainers.

I have no objection to the direction in this patch. I agree that it's important to have a specification that covers this, and while I still think that this has nothing to do with ABI as I believe the term is normally understood, treating it as platform-dependent and specifying it in the same place as the platform ABI might be the most reasonable option. Clearly the AAPCS is more than a Procedure Call Standard, which is fine.