[ARM] Follow AAPCS standard for volatile bit-fields access width

Authored by stuij on Jan 17 2020, 9:06 AM.



This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the width of their
declared type. For example:

struct S1 {
  short a : 1;
};

should be accessed using loads and stores of the container width
(sizeof(short)), whereas the compiler currently loads only
the minimum required width (char in this case).
However, as discussed in D16586,
a wider access could overwrite adjacent non-volatile data, which
conflicts with the C and C++ object models by creating
data races on memory that is not part of the bit-field. For example:
struct S2 {
  short a;
  int  b : 16;
};

Accessing S2.b would also access S2.a.

The AAPCS Release 2020Q2
section 8.1 Data Types, page 36, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:

This ABI does not place any restrictions on the access widths of bit-fields where the container
overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field
placed between two other bit-fields. This is because the C/C++ memory model defines these as being
separate memory locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including splitting the access into
multiple instructions) to avoid writing to a different memory location. For example, in
struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };,
writes to a or b must not overwrite each other.

I've updated the patch D16586 to follow this behavior by verifying that we
only change the volatile bit-field access width when the wider access:

  • won't overlap with any non-bit-field member
  • only touches memory inside the bounds of the record
  • won't overlap a zero-length bit-field.
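
The three conditions above can be sketched as a small standalone check. This is only an illustration under stated assumptions, not the code from the patch: the Member type, can_use_container, and the example_* helpers are hypothetical names, and the member offsets in the examples assume an AAPCS-like layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical member description; offsets and sizes are in bits,
   measured from the start of the record. */
typedef struct {
    size_t offset_bits;
    size_t size_bits;    /* 0 marks a zero-length bit-field */
    bool   is_bitfield;
} Member;

/* Returns true when the container [start_bits, start_bits + width_bits)
   may be used to access members[self]. */
static bool can_use_container(const Member *members, size_t count, size_t self,
                              size_t start_bits, size_t width_bits,
                              size_t record_bits) {
    size_t end_bits = start_bits + width_bits;
    assert(self < count);

    /* The access must stay inside the bounds of the record. */
    if (end_bits > record_bits)
        return false;

    for (size_t i = 0; i < count; ++i) {
        const Member *m = &members[i];
        if (i == self)
            continue;
        if (m->is_bitfield && m->size_bits == 0) {
            /* A zero-length bit-field strictly inside the container
               separates two memory locations. */
            if (m->offset_bits > start_bits && m->offset_bits < end_bits)
                return false;
        } else if (!m->is_bitfield) {
            /* The container must not overlap a non-bit-field member. */
            size_t m_end = m->offset_bits + m->size_bits;
            if (m->offset_bits < end_bits && start_bits < m_end)
                return false;
        }
    }
    return true;
}

/* struct S1 { short a : 1; }; a short container fits the 16-bit record. */
static bool example_s1(void) {
    const Member m[] = { {0, 1, true} };
    return can_use_container(m, 1, 0, 0, 16, 16);
}

/* struct S2 { short a; int b : 16; }; the int container of b overlaps a. */
static bool example_s2(void) {
    const Member m[] = { {0, 16, false}, {16, 16, true} };
    return can_use_container(m, 2, 1, 0, 32, 32);
}

/* struct { int a : 8; char : 0; int b : 8; }; the zero-length bit-field
   at bit 8 (assumed layout) splits the int container of a. */
static bool example_zero_length(void) {
    const Member m[] = { {0, 8, true}, {8, 0, true}, {8, 8, true} };
    return can_use_container(m, 3, 0, 0, 32, 32);
}
```

With these assumed layouts, only S1 qualifies for the wider access; the other two records keep the narrower access the compiler would have chosen anyway.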

The number of memory accesses should also be preserved; that will
be implemented by D67399.

Event Timeline

This comment was removed by dnsampaio.
dnsampaio updated this revision to Diff 242823. Feb 6 2020, 12:56 AM
  • Removed test
  • Added clear at the end of run as well, to clear waste
  • Moved clearing to a more sensible position
Herald added a project: Restricted Project. Feb 6 2020, 12:56 AM
dnsampaio planned changes to this revision. Feb 6 2020, 1:20 AM

Updated wrong patch here.

dnsampaio updated this revision to Diff 243197. Feb 7 2020, 9:34 AM

Added opt-out flag

I think I've spotted a bug in the ABI spec, which you've faithfully implemented here. I don't know of any other compiler which has implemented this ABI change yet, so it's probably worth seeing if we can get the spec fixed.

The intention of the ABI is to avoid conflicting with the C/C++11 spec, which requires access to one "memory location" to not write to any adjacent memory locations. However, the wording of the ABI does not take into account zero-sized bitfields, which are defined in C/C++ to start a new memory location. For example:

struct foo {
  int a : 8;
  char  : 0;
  int b : 8;
};

Here, C/C++ says that a and b are separate memory locations, so it must be possible to read and write them from two threads. However, the ABI has no special case for zero-sized bitfields, so it requires accesses to both fields to use 32-bit loads and stores, which overlap.

I've raised a ticket (internal to Arm) to consider changing the ABI to match C/C++ in this case.
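
To make the overlap concrete, here is a small single-threaded simulation of the interleaving described above. It is a sketch under assumptions: a is taken to occupy byte 0 and b byte 1 of a 4-byte container, and b_after_interleaving is a hypothetical helper, not something from the patch or the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CONTAINER_BYTES 4

/* Simulates "thread 1" updating a with a read-modify-write of access_bytes
   bytes while "thread 2" stores to b between the load and the store.
   Returns the final value of b. */
static uint8_t b_after_interleaving(size_t access_bytes,
                                    uint8_t new_a, uint8_t new_b) {
    uint8_t mem[CONTAINER_BYTES] = {0};  /* byte 0 models a, byte 1 models b */
    uint8_t tmp[CONTAINER_BYTES];

    assert(access_bytes >= 1 && access_bytes <= CONTAINER_BYTES);

    memcpy(tmp, mem, access_bytes);  /* thread 1: load the container */
    mem[1] = new_b;                  /* thread 2: store to b */
    tmp[0] = new_a;                  /* thread 1: update a in the copy */
    memcpy(mem, tmp, access_bytes);  /* thread 1: store the container back */

    return mem[1];
}
```

With a 4-byte access the store from the second thread is lost (b ends up 0), while a 1-byte access leaves b intact. This is the data race the C/C++ memory model rules out for separate memory locations.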


Why is this a negative option, when the one above is positive?


Command-line options are in kebab-case, so this should be something like fno-aapcs-bitfield-width. This also applies to the fAAPCSBitfieldLoad option above, assuming it's not too late to change that.

Please also add a positive version of this option (i.e. faapcs-bitfield-width).





It doesn't look like this is ever called with different volatile/non-volatile values, so I don't think we need the extra parameters.


Space in "itscontainer".


Space in "itscontainer".


Space in "thetype".


_register_ alignment?


Comments end in a full stop (multiple times in this function).


Unnecessary formatting change.

dnsampaio updated this revision to Diff 243855. Feb 11 2020, 6:55 AM
  • Added a test that the volatile access does not cross a boundary created by a zero-length bit-field, as defined by C11, avoiding race conditions
dnsampaio updated this revision to Diff 243858. Feb 11 2020, 7:08 AM
dnsampaio marked 5 inline comments as done.
  • Fixed comments and removed some test changes that were not required

Hi @ostannard, thanks for your review.
I updated the patch so that it does not act when the computed volatile bit-field access would overlap a zero-length bit-field, avoiding the conflict. We can update it according to future versions of the AAPCS if required.

dnsampaio updated this revision to Diff 276034. Jul 7 2020, 6:17 AM

The AAPCS now explicitly avoids conflicts with C11 by not imposing any restriction
when the natural container would overlap a zero-length bit-field.
Both the 32-bit and 64-bit versions were updated in the same way.

dnsampaio edited the summary of this revision. Jul 17 2020, 2:25 AM
dnsampaio edited the summary of this revision.

You haven't addressed my earlier inline comments.

dnsampaio marked 6 inline comments as done. Jul 17 2020, 8:02 AM

Indeed not all of them. Fixed this time.


Enforcing the number of accesses would not be accepted unless it were an opt-in option. I expect this one to be acceptable with a single opt-out option.

dnsampaio updated this revision to Diff 278793. Jul 17 2020, 8:54 AM
dnsampaio marked an inline comment as done.

Fixed the remaining inline comments

Ping ... ping...

stuij added a subscriber: stuij. Aug 28 2020, 7:16 AM

@ostannard: pinging on behalf of @dnsampaio. The changes still apply cleanly.

ostannard added inline comments. Sep 7 2020, 6:07 AM

My problem is with the name of the option (adding an extra negative just makes things more confusing), not with the default value. This could just be called AAPCSBitfieldWidth (unless you think the Force is adding something), and default to true rather than false.


This still needs a positive version.


These doc comments are copied from the ones above, they need changing.


Returning void is confusing (yes I know it was already there), this should be a separate return; statement.


Same here.



stuij commandeered this revision. Sep 8 2020, 7:52 AM
stuij added a reviewer: dnsampaio.

Commandeering as I've made some changes to the patch.

stuij updated this revision to Diff 290485. Sep 8 2020, 7:53 AM

addressed review comment

stuij marked 6 inline comments as done. Sep 8 2020, 7:55 AM
stuij added inline comments.




This revision is now accepted and ready to land. Sep 8 2020, 8:26 AM
This revision was landed with ongoing or failed builds. Sep 8 2020, 9:50 AM
This revision was automatically updated to reflect the committed changes.
stuij marked 2 inline comments as done.
stuij reopened this revision. Oct 7 2020, 4:46 AM

Reopening as this commit made clang/test/CodeGen/volatile.c fail on Arm/AArch64 buildbot hosts.

This revision is now accepted and ready to land. Oct 7 2020, 4:46 AM
stuij updated this revision to Diff 296644. Oct 7 2020, 4:47 AM

After committing this patch, clang/test/CodeGen/volatile.c failed on Arm/AArch64 buildbot hosts. The reason is that %itanium_abi_triple, a lit target-triple substitution used in the RUN line at the top of the file, is filled in with the host arch triple, for example aarch64-unknown-linux-gnu. I've amended the tests to take into account the changes to code generation in this patch.

ostannard added inline comments. Oct 8 2020, 6:13 AM

I think it would be better to change this test to use explicit triples, so that we're always testing both the ARM and non-ARM behaviour, regardless of the default triple.

stuij updated this revision to Diff 297584. Oct 12 2020, 7:40 AM

addressed review comment to hardwire non-MS target platforms

stuij updated this revision to Diff 297585. Oct 12 2020, 7:45 AM

removed clang/test/CodeGen/

This revision was landed with ongoing or failed builds. Oct 13 2020, 2:32 AM
This revision was automatically updated to reflect the committed changes.