This is an archive of the discontinued LLVM Phabricator instance.

Make clang AAPCS compliant w.r.t volatile bitfield accesses
AbandonedPublic

Authored by rmaprath on Jan 26 2016, 9:01 AM.

Details

Summary

Lets consider the following plain struct:

struct S1 {
  char a;
  short b : 8;
};

According to the AAPCS, the bitfield 'b' _can_ be accessed by loading its
"container" (a short in this case) and then using bit-shifting operations to
extract the actual bitfield value. However, for plain (non-volatile) bitfields,
the AAPCS does not mandate this, so compilers are free to use whatever type of
access they think is best. armcc tends to use wider accesses even when narrower
accesses are possible (like in the above example, b can be byte-loaded), whereas
clang and gcc tend to issue narrow loads/stores where available.

But things get tricky when volatile bitfields are involved:

struct S2 {
  char a;
  volatile short b : 8;
};

Now the AAPCS mandates that 'b' must be accessed through loads/stores of widths
as appropriate to the container type (Section 7.1.7.5). To complicate matters
further, AAPCS allows overlapping of bitfields with regular fields
(Section 7.1.7.4). What this means is that the storage units of 'a' and 'b'
are overlapping, where 'a' occupies the first byte of the struct and 'b' lies
on the first half-word of the struct. Loading 'b' will load 'a' as well since
'b' is loaded as a short.

Currently, clang will byte-load 'b' and is therefore not AAPCS compliant. The
purpose of this patch is to make clang respect these access widths of volatile
bitfields.

One way to fix this problem is to teach clang the concept of overlapping storage
units, this would require an overhaul of clang's structure layout code, as the
idea of distinct storage units is deeply embedded in clang's structure layout
routines. In this patch, I take the view that this confusion is only a matter of
abstraction; whether the above struct's storage is viewed as two consecutive
bytes (this is how clang lays out the struct) or as a single half-word (this is
how armcc lays out the struct) is irrelevant, the actual data of the fields will
always be at the same offset from the base of the struct. This difference of
abstraction only matters when we generate code to access the relevant fields, we
can therefore intercept those loads/stores in clang and adjust the accesses so
that they are AAPCS compliant.

Diff Detail

Event Timeline

rmaprath updated this revision to Diff 46003.Jan 26 2016, 9:01 AM
rmaprath retitled this revision from to Make clang AAPCS compliant w.r.t volatile bitfield accesses.
rmaprath updated this object.
rmaprath added a subscriber: cfe-commits.
rjmccall edited edge metadata.Jan 26 2016, 11:36 AM

Well, that's certainly an interesting ABI rule.

A few high-level notes:

  1. AAPCS requires the bit-field to be loaded on a store, even if the store fills the entire container; that doesn't seem to be implemented in your patch.
  1. Especially because of #1, let's not do this unless the l-value is actually volatile. The ABI rule here is arguably actively wrong for non-volatile cases, e.g. struct { volatile char c; short s : 8; }.
  1. Instead of using string comparisons all over the place, please make this a flag on the CG TargetInfo or something.
lib/CodeGen/CGExpr.cpp
1761

This alignment computation is wrong; you need to be basing this on the alignment of the base. It would be easier to do that in the formation of the LValue in the first place in EmitLValueForField, and then you won't need to modify as many of these uses.

rmaprath added inline comments.Jan 27 2016, 9:49 AM
lib/CodeGen/CGExpr.cpp
1761

I'm wondering if it would be possible to do the all the adjustments (not just setting the bitfield LValue alignment) within EmitLValueForField(). It would certainly simplify the code a lot (e.g. No need to dissect the load/store GEP as I currently do in AdjustAAPCSBitfieldAccess(), I can intercept the original GEP when it is being constructed).

One problem I have is, as part of the adjustments, I have to override the CGBitFieldInfo of the bitfield LValue. But LValue holds a pointer to the CGBitFieldInfo and I'd have to allocate a new CGBitFieldInfo instance and pass it into LValue::MakeBitfield() method. Not sure if that is a good idea (who'd free it?). If you have any suggestions, please let me know.

Adjusting alignments in EmitLValueForField() and then again adjusting other bitfield access parameters in the load/store methods would be a bit messy, I imagine.

Thanks for all the feedback!

rmaprath updated this revision to Diff 46264.Jan 28 2016, 6:29 AM
rmaprath edited edge metadata.

Addressing review comments by @rjmccall:

Moved all the AAPCS specific tweaks to EmitLValueForField(), this simplified the patch a lot (now there is no mucking about a de-constructed GEP at load/store points). In order to do this, LValue definition had to be updated so that it gets initialized with is own copy of CGBitFieldInfo.

A few high-level notes:
AAPCS requires the bit-field to be loaded on a store, even if the store fills the entire container; that doesn't seem to be implemented in your patch.

Fixed.

Especially because of #1, let's not do this unless the l-value is actually volatile. The ABI rule here is arguably actively wrong for non-volatile cases, e.g. struct { volatile char c; short s : 8; }.

The AAPCS also say (Section 7.1.7.5):

"If the container of a non-volatile bit-field overlaps a volatile bit-field then it is undefined whether access to the non volatile field will cause the volatile field to be accessed."

So it looks like doing this for normal fields is still within the bounds of AAPCS. In fact, armcc loads the volatile 'c' in your example when 's' is accessed. But I agree that limiting this to volatile cases is a good idea; we are still AAPCS compliant and there's less things to break.

Instead of using string comparisons all over the place, please make this a flag on the CG TargetInfo or something.

Factored this out into a small utility function.

rmaprath marked 2 inline comments as done.Jan 28 2016, 6:30 AM
rsmith added a subscriber: rsmith.Feb 1 2016, 5:00 PM
rsmith added inline comments.
test/CodeGen/aapcs-bitfield.c
313–318

This violates the C and C++ object models by creating a data race on m->y.a that was not present in the source code. A store to a bit-field cannot write to bytes that are not part of the same sequence of bit-field members. If this ABI really requires that (and supports multi-threaded systems), it is not a correct ABI for C11 nor C++11. (This leaves open the question of which standard we should follow...)

rmaprath added inline comments.Feb 2 2016, 1:07 AM
test/CodeGen/aapcs-bitfield.c
313–318

Hi Richard,

Thank you for this, I didn't know about this restriction in the C11/C++11 standards. The AAPCS is indeed at odds with the standards in this case, for a simpler example, consider:

struct foo {
  char a;
  volatile short b : 8;
};

void foo(struct foo *p) {
  p->b = 0xFF;
}

This store will cause 'a' to be written as well according to the AAPCS. The conflicting sections of the standards are:

  • AAPCS: 7.1.7.4 Combining bit-field and non-bit-field members (+ 7.1.7.5 - volatile bitfield access)
  • C++11: 1.7 The C++ memory model
  • C11: 3.14 memory location

I will take this up with the AAPCS authors.

jmolloy resigned from this revision.Feb 15 2016, 6:57 AM
jmolloy removed a reviewer: jmolloy.

I'm not the right person to review this.

rmaprath abandoned this revision.May 4 2016, 6:49 AM