This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
lib/CodeGen/
-
CodeGen/
-
CGExpr.cpp
-
test/CodeGen/
-
CodeGen/
4/4
aapcs-bitfield.c

Differential D67399

[ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields
ClosedPublic

Authored by dnsampaio on Sep 10 2019, 7:44 AM.

Download Raw Diff

Details

Reviewers

ostannard
jfb
eli.friedman
lebedev.ri

Commits

rG9d869180c4ad: [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields

Summary

Following the AAPCS, every store to a volatile bit-field requires to generate one load of that field, even if all the bits are going to be replaced.
This patch allows the user to opt-in in following such rule, whenever the a.

AAPCS Release 2019Q1.1 (https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, paragraph: Volatile bit-fields – preserving number and width of container accesses

When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 37957
Build 37956: arc lint + arc unit

Event Timeline

dnsampaio created this revision.Sep 10 2019, 7:44 AM

Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2019, 7:44 AM

Herald added subscribers: cfe-commits, jfb, kristof.beyls. · View Herald Transcript

Harbormaster completed remote builds in B37957: Diff 219542.Sep 10 2019, 7:46 AM

This patch could hack clang to generate an extra load. However, my knowledge in the clang code base is not extensive. How could we ensure that the width of loads and stores are the size of the container, and that they don't overlap non-bitfields?

I'm not sure why i'm added as a reviewer here.

That being said i don't see why that load is needed there.
As i read it, the document only states that if load is needed,
then it must be a single load, likewise for store.
I.e. i'm not sure why it requires a load when it isn't needed.
Unlike atomics, volatiles don't enforce any ordering.

Herald added a subscriber: dexonsmith. · View Herald TranscriptSep 10 2019, 7:52 AM

@ostannard might prove me wrong, but according to the AACPS:

When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.

This rule does not define that the load is done if required. It states that it will be read once. It even gives the example that an increment will always perform two reads and one write, bitwidth agnostic. It writes just after:

Note: Note the volatile access rules apply even when the width and alignment of the bit-field imply that
the access could be achieved more efficiently using a narrower type. For a write operation the read must
always occur even if the entire contents of the container will be replaced.

The rationale is to provide a uniform behavior for volatile bitfields independent of their width (as far they do not overlap with non-bitfields).

In D67399#1664785, @dnsampaio wrote:
@ostannard might prove me wrong, but according to the AACPS:
When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.
This rule does not define that the load is done if required. It states that it will be read once. It even gives the example that an increment will always perform two reads and one write, bitwidth agnostic. It writes just after:

While it can be read as "just always read volatile bitfield before overwriting it", i'm not sure that makes any sense.
Why exactly would you want do to that load, if you don't use it's results? What side-effects does it cause?

What it should be saying is: "when you have a bitfield, and you want to update several of it's fields but not all,
first do one read to get the old contents, then merge old-new data, and then perform a single store replacing the entire contents at once".
That is the only sane behavior it can require.

Note: Note the volatile access rules apply even when the width and alignment of the bit-field imply that
the access could be achieved more efficiently using a narrower type. For a write operation the read must
always occur even if the entire contents of the container will be replaced.
The rationale is to provide a uniform behavior for volatile bitfields independent of their width (as far they do not overlap with non-bitfields).

lebedev.ri removed a subscriber: lebedev.ri.Sep 10 2019, 8:30 AM

jfb added a subscriber: rjmccall.Sep 10 2019, 8:59 AM

jfb added inline comments.Sep 10 2019, 9:01 AM

clang/test/CodeGen/aapcs-bitfield.c
541	These are just extra loads? Why?
552	Why isn't this load sufficient?

Hi @jfb. In a example such as:

struct { int a : 1; int b : 16; } S;
extern int expensive_computaion(int v);
void foo(volatile S* s){
  s->b = expensive_computation(s->b);
}

There is no guarantee that s->a is not modified during a expensive computation, so it must be obtained just before writing the s->b value, as a and b share the same memory position. This is already done by llvm. Indeed, the exact output would be

define void @foo(%struct.S* %S) local_unnamed_addr #0 {
entry:
  %0 = bitcast %struct.S* %S to i32*
  %bf.load = load volatile i32, i32* %0, align 4
  %bf.shl = shl i32 %bf.load, 15
  %bf.ashr = ashr i32 %bf.shl, 16
  %call = tail call i32 @expensive_computation(i32 %bf.ashr) #2
  %bf.load1 = load volatile i32, i32* %0, align 4 ; <<<== Here it obtains the value to s->a to restore it.
  %bf.value = shl i32 %call, 1
  %bf.shl2 = and i32 %bf.value, 131070
  %bf.clear = and i32 %bf.load1, -131071
  %bf.set = or i32 %bf.clear, %bf.shl2
  store volatile i32 %bf.set, i32* %0, align 4
  ret void
}

These extra loads here are required to make uniform the number of times the volatile bitfield is read, independent if they share or not memory with other data.

We could have it under a flag, such as -faacps-volatilebitfield, disabled by default.

The other point not conformant to the AACPS is the width of the loads/stores used to obtain bitfields. They should be the width of the container, if that would not overlap with non-bitfields. Do you have any idea where that could be computed? I imagine that would be when computing the alignment of the elements of the structure, where I can check if the performing the entire container width load would overlap with other elements. Could you point me where that is done?

clang/test/CodeGen/aapcs-bitfield.c
541	Yes, these are just extra loads. As the AACPS describes, every write requires to perform a load as well, even if all bits of the volatile bitfield is going to be replaced.
552	Technically speaking, that is the load for reading the bitfield, not the load required when writing it.

dnsampaio marked 2 inline comments as done.Sep 11 2019, 1:56 AM

dnsampaio added a reviewer: eli.friedman.Sep 11 2019, 9:38 AM

The exact sequence of volatile accesses is observable behavior, and it's the ABI's right to define correct behavior for compliant implementations, so we do need to do this.

Diogo, IRGen breaks up bit-field storage units in CGRecordLowering::accumulateBitFields. I'm sure there's some reasonable way to add target-specific adjustments to that so that we break things up by storage unit.

In D67399#1668759, @rjmccall wrote:

The exact sequence of volatile accesses is observable behavior, and it's the ABI's right to define correct behavior for compliant implementations, so we do need to do this.

I'm not convinced; I think the AAPCS is overstepping its bounds here by trying to specify this. Whether you get a volatile load for a full-width bitfield access doesn't seem to have anything to do with ABI; implementations that differ on this detail will be able to produce code that links together and works together just fine. I think this is instead a question of what guarantees the implementation wants to give about how it treats volatile operations and how it lowers them to hardware. That should be specified, sure (and in fact C++ at least requires us to document what we do), but it's not ABI any more than (for example) the choice of whether a volatile access to a local variable must actually issue load / store instructions -- or which instructions must be produced -- is ABI. I have no particular opinion on whether we should make this change -- volatile bit-fields are pretty weird and broken regardless. But I don't think we should view it as being part of the ABI, and just as with the question of whether plain bit-fields are signed, we should tell the psABI folks that the question of what constitutes a volatile access and the semantics of such an access is not ABI and not up to them.

(Similar example: on Windows x86 targets, volatile accesses are atomic by default (controlled by compiler switch). clang-cl follows cl in this regard, but we don't consider it to be part of the ABI, and in the regular clang driver, we do not treat volatile accesses as atomic when targeting Windows x86.)

I have to say that I disagree. The ABI certainly doesn't get to control the language and define what constitutes a volatile access, but when the language has decided that a volatile access is actually performed, I think ABIs absolutely ought to define how they should be compiled. Volatile accesses are quite unusual in that they are used for a purpose — memory-mapped I/O — that is extremely dependent for correctness on the exact instructions used to implement it. IIRC there are architectures that for whatever reason provide two different load instructions where only one of those instructions actually performs memory-mapped I/O correctly. Certainly the exact width of load or store is critical for correctness. So while I have certainly seen ABI overreach before and pushed back on it, for this one case I think ABI involvement is totally appropriate, and in fact I wish more ABIs specified how they expect volatile accesses to be performed.

It is, of course, somewhat unfortunate that a corner-case use case like memory-mapped I/O has been privileged with such a core position in the language, but that's how it is.

I do think that AAPCS's specific request that the compiler emit spurious loads when storing to volatile bit-fields is a bad idea, and I think it would be fine to push back on that part, and perhaps also on the idea that compound assignments and increments should load twice.

Indeed our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bitfields are used to perform memory mapped I/O, and some hardware only support them with a specific access width.
The spurious load I am more than glad to leave it disable behind a command flag, so it will only appear if the user requests it. See that volatile accesses might have side effects, and for example, an I/O read counter holding an odd number could define that the data is still being processed.

In D67399#1669038, @dnsampaio wrote:

Indeed our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bitfields are used to perform memory mapped I/O, and some hardware only support them with a specific access width.
The spurious load I am more than glad to leave it disable behind a command flag, so it will only appear if the user requests it. See that volatile accesses might have side effects, and for example, an I/O read counter holding an odd number could define that the data is still being processed.

Are the cases being addressed in the PR actually relevant to real MMIO, or is this patch following the letter of AAPCS which doesn't actually matter?

In D67399#1669568, @jfb wrote:

In D67399#1669038, @dnsampaio wrote:

Indeed our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bitfields are used to perform memory mapped I/O, and some hardware only support them with a specific access width.
The spurious load I am more than glad to leave it disable behind a command flag, so it will only appear if the user requests it. See that volatile accesses might have side effects, and for example, an I/O read counter holding an odd number could define that the data is still being processed.

Are the cases being addressed in the PR actually relevant to real MMIO, or is this patch following the letter of AAPCS which doesn't actually matter?

Again, I think AAPCS is well within its rights to say that certain volatile accesses should be performed with loads and stores of certain widths. If low-level programmers cannot use bit-fields today with memory-mapped I/O because they cannot trust compilers to produce reasonable accesses, that is a legitimate concern for ABI authors and a legitimate bug for compiler maintainers.

In D67399#1669920, @rjmccall wrote:

In D67399#1669568, @jfb wrote:

In D67399#1669038, @dnsampaio wrote:

Indeed our main concern is regarding the access widths of loads. As mentioned by @rjmccall, most volatile bitfields are used to perform memory mapped I/O, and some hardware only support them with a specific access width.
The spurious load I am more than glad to leave it disable behind a command flag, so it will only appear if the user requests it. See that volatile accesses might have side effects, and for example, an I/O read counter holding an odd number could define that the data is still being processed.

Are the cases being addressed in the PR actually relevant to real MMIO, or is this patch following the letter of AAPCS which doesn't actually matter?

Again, I think AAPCS is well within its rights to say that certain volatile accesses should be performed with loads and stores of certain widths. If low-level programmers cannot use bit-fields today with memory-mapped I/O because they cannot trust compilers to produce reasonable accesses, that is a legitimate concern for ABI authors and a legitimate bug for compiler maintainers.

I have no objection to the direction in this patch. I agree that it's important to have a specification that covers this, and while I still think that this has nothing to do with ABI as I believe the term is normally understood, treating it as platform-dependent and specifying it in the same place as the platform ABI might be the most reasonable option. Clearly the AAPCS is more than a Procedure Call Standard, which is fine.

dnsampaio mentioned this in D72932: [ARM] Follow AACPS standard for volatile bit-fields access width.Jan 17 2020, 9:06 AM

dnsampaio mentioned this in rG6a24339a4524: [ARM] Follow AACPS standard for volatile bit-fields access width.Jan 21 2020, 7:35 AM

dnsampaio added a parent revision: D72932: [ARM] Follow AACPS standard for volatile bit-fields access width.Jan 30 2020, 8:39 AM

dnsampaio retitled this revision from [ARM] Follow AACPS standard for volatile bitfields to [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields.Jan 30 2020, 9:06 AM

dnsampaio edited the summary of this revision. (Show Details)

Added flag to allow user to opt-in

Harbormaster completed remote builds in B45358: Diff 241486.Jan 30 2020, 9:10 AM

I'm happy with this change since it's opt-in. Thanks!

This revision is now accepted and ready to land.Jan 30 2020, 9:39 AM

dnsampaio removed a parent revision: D72932: [ARM] Follow AACPS standard for volatile bit-fields access width.Feb 7 2020, 2:03 AM

Revordered some tests adding and definition of the "isAAPCS" function from patch D72932

Harbormaster completed remote builds in B45930: Diff 243102.Feb 7 2020, 2:10 AM

Closed by commit rG9d869180c4ad: [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields (authored by Diogo Sampaio <diogo.sampaio@arm.com>). · Explain WhyFeb 7 2020, 2:13 AM

This revision was automatically updated to reflect the committed changes.

stuij mentioned this in rG514df1b2bb1e: [ARM] Follow AACPS standard for volatile bit-fields access width.Sep 8 2020, 9:50 AM

stuij mentioned this in rG208987844ffa: [ARM] Follow AACPS standard for volatile bit-fields access width.Oct 13 2020, 2:32 AM

stuij mentioned this in D96784: Pass the cmdline aapcs bitfield options to cc1.Feb 16 2021, 6:48 AM

stuij mentioned this in rG5f7715d8780a: Pass the cmdline aapcs bitfield options to cc1.Feb 18 2021, 7:40 AM

Revision Contents

Path

Size

clang/

lib/

CodeGen/

CGExpr.cpp

7 lines

test/

CodeGen/

aapcs-bitfield.c

8 lines

Diff 219542

clang/lib/CodeGen/CGExpr.cpp

Show First 20 Lines • Show All 2,055 Lines • ▼ Show 20 Lines	Val = Builder.CreateAnd(Val,
Info.Offset,		Info.Offset,
Info.Offset + Info.Size),		Info.Offset + Info.Size),
"bf.clear");		"bf.clear");

// Or together the unchanged values and the source value.		// Or together the unchanged values and the source value.
SrcVal = Builder.CreateOr(Val, SrcVal, "bf.set");		SrcVal = Builder.CreateOr(Val, SrcVal, "bf.set");
} else {		} else {
assert(Info.Offset == 0);		assert(Info.Offset == 0);
		// Acording to the AACPS:
		// When a volatile bit-field is written, and its container does not overlap
		// with any non-bit-field member, its container must be read exactly once and
		// written exactly once using the access width appropriate to the type of the
		// container. The two accesses are not atomic.
		if ( (CGM.getTarget().getABI() == "aapcs") && Dst.isVolatileQualified() )
		Builder.CreateLoad(Ptr, Dst.isVolatileQualified(), "bf.load");
}		}

// Write the new value back out.		// Write the new value back out.
Builder.CreateStore(SrcVal, Ptr, Dst.isVolatileQualified());		Builder.CreateStore(SrcVal, Ptr, Dst.isVolatileQualified());

// Return the new value of the bit-field, if requested.		// Return the new value of the bit-field, if requested.
if (Result) {		if (Result) {
llvm::Value *ResultVal = MaskedVal;		llvm::Value *ResultVal = MaskedVal;
▲ Show 20 Lines • Show All 3,006 Lines • Show Last 20 Lines

clang/test/CodeGen/aapcs-bitfield.c

	Show First 20 Lines • Show All 525 Lines • ▼ Show 20 Lines
	//			//
	int read_st9(volatile struct st9 *m) {			int read_st9(volatile struct st9 *m) {
	return m->f;			return m->f;
	}			}

	// LE-LABEL: @store_st9(			// LE-LABEL: @store_st9(
	// LE-NEXT: entry:			// LE-NEXT: entry:
	// LE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0			// LE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0
				// LE-NEXT: [[BF_LOAD:%.]] = load volatile i8, i8 [[TMP0]], align 4
	// LE-NEXT: store volatile i8 1, i8* [[TMP0]], align 4			// LE-NEXT: store volatile i8 1, i8* [[TMP0]], align 4
	// LE-NEXT: ret void			// LE-NEXT: ret void
	//			//
	// BE-LABEL: @store_st9(			// BE-LABEL: @store_st9(
	// BE-NEXT: entry:			// BE-NEXT: entry:
	// BE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0			// BE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0
				// BE-NEXT: [[BF_LOAD:%.]] = load volatile i8, i8 [[TMP0]], align 4
				jfbUnsubmitted Done Reply Inline Actions These are just extra loads? Why? jfb: These are just extra loads? Why?
				dnsampaioAuthorUnsubmitted Done Reply Inline Actions Yes, these are just extra loads. As the AACPS describes, every write requires to perform a load as well, even if all bits of the volatile bitfield is going to be replaced. dnsampaio: Yes, these are just extra loads. As the AACPS describes, every write requires to perform a load…
	// BE-NEXT: store volatile i8 1, i8* [[TMP0]], align 4			// BE-NEXT: store volatile i8 1, i8* [[TMP0]], align 4
	// BE-NEXT: ret void			// BE-NEXT: ret void
	//			//
	void store_st9(volatile struct st9 *m) {			void store_st9(volatile struct st9 *m) {
	m->f = 1;			m->f = 1;
	}			}

	// LE-LABEL: @increment_st9(			// LE-LABEL: @increment_st9(
	// LE-NEXT: entry:			// LE-NEXT: entry:
	// LE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0			// LE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0
	// LE-NEXT: [[BF_LOAD:%.]] = load volatile i8, i8 [[TMP0]], align 4			// LE-NEXT: [[BF_LOAD:%.]] = load volatile i8, i8 [[TMP0]], align 4
				jfbUnsubmitted Done Reply Inline Actions Why isn't this load sufficient? jfb: Why isn't this load sufficient?
				dnsampaioAuthorUnsubmitted Done Reply Inline Actions Technically speaking, that is the load for reading the bitfield, not the load required when writing it. dnsampaio: Technically speaking, that is the load for reading the bitfield, not the load required when…
	// LE-NEXT: [[INC:%.*]] = add i8 [[BF_LOAD]], 1			// LE-NEXT: [[INC:%.*]] = add i8 [[BF_LOAD]], 1
				// LE-NEXT: [[BF_LOAD1:%.]] = load volatile i8, i8 [[TMP0]], align 4
	// LE-NEXT: store volatile i8 [[INC]], i8* [[TMP0]], align 4			// LE-NEXT: store volatile i8 [[INC]], i8* [[TMP0]], align 4
	// LE-NEXT: ret void			// LE-NEXT: ret void
	//			//
	// BE-LABEL: @increment_st9(			// BE-LABEL: @increment_st9(
	// BE-NEXT: entry:			// BE-NEXT: entry:
	// BE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0			// BE-NEXT: [[TMP0:%.]] = getelementptr inbounds [[STRUCT_ST9:%.]], %struct.st9* [[M:%.*]], i32 0, i32 0
	// BE-NEXT: [[BF_LOAD:%.]] = load volatile i8, i8 [[TMP0]], align 4			// BE-NEXT: [[BF_LOAD:%.]] = load volatile i8, i8 [[TMP0]], align 4
	// BE-NEXT: [[INC:%.*]] = add i8 [[BF_LOAD]], 1			// BE-NEXT: [[INC:%.*]] = add i8 [[BF_LOAD]], 1
				// BE-NEXT: [[BF_LOAD1:%.]] = load volatile i8, i8 [[TMP0]], align 4
	// BE-NEXT: store volatile i8 [[INC]], i8* [[TMP0]], align 4			// BE-NEXT: store volatile i8 [[INC]], i8* [[TMP0]], align 4
	// BE-NEXT: ret void			// BE-NEXT: ret void
	//			//
	void increment_st9(volatile struct st9 *m) {			void increment_st9(volatile struct st9 *m) {
	++m->f;			++m->f;
	}			}

	struct st10{			struct st10{
	▲ Show 20 Lines • Show All 94 Lines • ▼ Show 20 Lines
	//			//
	int read_st11(volatile struct st11 *m) {			int read_st11(volatile struct st11 *m) {
	return m->f;			return m->f;
	}			}

	// LE-LABEL: @store_st11(			// LE-LABEL: @store_st11(
	// LE-NEXT: entry:			// LE-NEXT: entry:
	// LE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1			// LE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1
				// LE-NEXT: [[BF_LOAD:%.]] = load volatile i16, i16 [[F]], align 1
	// LE-NEXT: store volatile i16 1, i16* [[F]], align 1			// LE-NEXT: store volatile i16 1, i16* [[F]], align 1
	// LE-NEXT: ret void			// LE-NEXT: ret void
	//			//
	// BE-LABEL: @store_st11(			// BE-LABEL: @store_st11(
	// BE-NEXT: entry:			// BE-NEXT: entry:
	// BE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1			// BE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1
				// BE-NEXT: [[BF_LOAD:%.]] = load volatile i16, i16 [[F]], align 1
	// BE-NEXT: store volatile i16 1, i16* [[F]], align 1			// BE-NEXT: store volatile i16 1, i16* [[F]], align 1
	// BE-NEXT: ret void			// BE-NEXT: ret void
	//			//
	void store_st11(volatile struct st11 *m) {			void store_st11(volatile struct st11 *m) {
	m->f = 1;			m->f = 1;
	}			}

	// LE-LABEL: @increment_st11(			// LE-LABEL: @increment_st11(
	// LE-NEXT: entry:			// LE-NEXT: entry:
	// LE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1			// LE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1
	// LE-NEXT: [[BF_LOAD:%.]] = load volatile i16, i16 [[F]], align 1			// LE-NEXT: [[BF_LOAD:%.]] = load volatile i16, i16 [[F]], align 1
	// LE-NEXT: [[INC:%.*]] = add i16 [[BF_LOAD]], 1			// LE-NEXT: [[INC:%.*]] = add i16 [[BF_LOAD]], 1
				// LE-NEXT: [[BF_LOAD1:%.]] = load volatile i16, i16 [[F]], align 1
	// LE-NEXT: store volatile i16 [[INC]], i16* [[F]], align 1			// LE-NEXT: store volatile i16 [[INC]], i16* [[F]], align 1
	// LE-NEXT: ret void			// LE-NEXT: ret void
	//			//
	// BE-LABEL: @increment_st11(			// BE-LABEL: @increment_st11(
	// BE-NEXT: entry:			// BE-NEXT: entry:
	// BE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1			// BE-NEXT: [[F:%.]] = getelementptr inbounds [[STRUCT_ST11:%.]], %struct.st11* [[M:%.*]], i32 0, i32 1
	// BE-NEXT: [[BF_LOAD:%.]] = load volatile i16, i16 [[F]], align 1			// BE-NEXT: [[BF_LOAD:%.]] = load volatile i16, i16 [[F]], align 1
	// BE-NEXT: [[INC:%.*]] = add i16 [[BF_LOAD]], 1			// BE-NEXT: [[INC:%.*]] = add i16 [[BF_LOAD]], 1
				// BE-NEXT: [[BF_LOAD1:%.]] = load volatile i16, i16 [[F]], align 1
	// BE-NEXT: store volatile i16 [[INC]], i16* [[F]], align 1			// BE-NEXT: store volatile i16 [[INC]], i16* [[F]], align 1
	// BE-NEXT: ret void			// BE-NEXT: ret void
	//			//
	void increment_st11(volatile struct st11 *m) {			void increment_st11(volatile struct st11 *m) {
	++m->f;			++m->f;
	}			}

	// LE-LABEL: @increment_e_st11(			// LE-LABEL: @increment_e_st11(
	▲ Show 20 Lines • Show All 121 Lines • Show Last 20 Lines