This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Define subtarget feature "strict-align"
ClosedPublic

Authored by ahatanak on Jul 23 2015, 12:51 PM.

Details

Summary

This patch defines a new subtarget feature "strict-align" which is used to decide whether unaligned memory accesses are allowed or not.

I removed all the code that checks the target architecture and OS to see if unaligned memory accesses are allowed as that decision will be made in the front-end (I'll send a patch to cfe-commits that includes the clang-side changes shortly). I wasn't sure if this will work for NaCl, which I think has a clang-based frontend that produces a target independent bitcode and a code-gen tool and a translator tool that produces target-specific code according to the following link:

https://www.chromium.org/nativeclient/pnacl/developing-pnacl

Would it be a problem it this patch required changing the translator tool (pnacl-translate) to pass subtarget feature "+strict-align" based on the target architecture?

Diff Detail

Event Timeline

ahatanak updated this revision to Diff 30514.Jul 23 2015, 12:51 PM
ahatanak retitled this revision from to [ARM] Define subtarget feature "strict-align".
ahatanak updated this object.
ahatanak added reviewers: echristo, dexonsmith, jfb, dschuff.

The clang-side changes are in this patch: http://reviews.llvm.org/D11472

jfb edited edge metadata.Jul 23 2015, 1:01 PM

This is fine for PNaCl because we assume unaligned accesses are allowed in the targets we support.

Waiting on the clang-side change to sign off on this patch. Seems fine, but I want to make sure I understand what happens in clang :-)

jfb added a comment.Jul 23 2015, 1:05 PM

You need to update tests in ./test/CodeGen/ARM/.

OK, thanks. I should also mention that a lot of test cases fail with this patch, but I'll update all of them once I know this patch is headed in the right direction.

jfb added a comment.Jul 23 2015, 1:11 PM

It looks like at least aarch64 has its own alignment flags too. Could you have it use these general flags instead? Do other targets have these flags as well?

dschuff edited edge metadata.Jul 23 2015, 1:12 PM

I don't have a strong opinion on whether this should be a subtarget feature vs a backend option (although a subtarget feature does seem right to me).
But otherwise this looks OK from the NaCl perspective. The clang patch looks like it has the right behavior wrt nacl-clang (i.e. the non-PNaCl clang toolchain), and it shouldn't be a problem for us to fix our out-of-tree pnacl-translate tool.

echristo edited edge metadata.Jul 23 2015, 1:17 PM

Subtarget feature is definitely correct. As far as other targets that implement a similar feature keeping the same name is appreciated, I haven't implemented "generic" subtarget feature support.

-eric

As long as we use the same feature name +strict-align for arm and aarch64, I believe we can make it a "generic" subtarget feature later if we decide we want to do so.

Defining a function attribute for strict-align is another option, but I think that requires more work than using a subtarget feature.

Agreed. This follows pretty much what I did with soft-float here.

I think you need some test case support here?

-eric

Yes. I need test cases to make sure clang driver passes subtarget feature "+strict-align" base on the target architecture and OS as the backend no longer does that check.

ahatanak updated this revision to Diff 30552.Jul 23 2015, 6:43 PM
ahatanak edited edge metadata.

Updated test cases.

jfb added inline comments.Jul 24 2015, 9:12 AM
test/CodeGen/ARM/2011-10-26-memset-inline.ll
2

Doesn't this test's triple put it in the hasV6Ops() && isTargetMachO() category, which supports unaligned memory accesses?

test/CodeGen/ARM/build-attributes.ll
61–62

This v6m needs strict-align?

61–62

Same?

129

The current code didn't affect v8, so this seems wrong. I think it should fix v8 to use the same flags, though.

136

Same.

ahatanak added inline comments.Jul 24 2015, 12:24 PM
test/CodeGen/ARM/2011-10-26-memset-inline.ll
2

I'm not sure why it is this way, but it looks like thumbv7 triple doesn't result in HasV7Ops being set to true (see function ARM_MC::ParseARMTriple).

test/CodeGen/ARM/build-attributes.ll
61–62

Don't we need strict-align since no v6M cores support unaligned memory access?

129

For the first test, I removed -arm-no-strict-align because v8 does allow unaligned access, and for the second test, I replaced -arm-strict-align with -mattr=+strict-align.

jfb added inline comments.Jul 24 2015, 1:16 PM
test/CodeGen/ARM/2011-10-26-memset-inline.ll
2

So this was a bug in the old code?

Does you clang change "fix" this bug?

test/CodeGen/ARM/build-attributes.ll
61–62

IIUC you should add -mattr=+strict-align here. After your fix clang will do it automatically, but llc won't so you need the flag.

129

The llc flag -arm-no-strict-align and -arm-strict-align don't do anything for v8.
With your current code, -mattr=+strict-align also doesn't do anything for v8.

v8 does have -aarch64-strict-align and -aarch64-no-strict-align, which I suggested you change to be consistent with the ARM change.

This line went from being misguided to being differently misguided. I'd rather fix v8 to act the same as ARM :-)

ahatanak added inline comments.Jul 24 2015, 2:43 PM
test/CodeGen/ARM/2011-10-26-memset-inline.ll
2

The clang patch does "fix" the bug: if the triple is thumbv7 and darwin, clang doesn't pass "+strict-align".

I didn't realize this was a bug, but it appears that the changes made to ARM_MC::ParseARMTriple in r213367 weren't intentional based on what I see in this thread:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140630/224415.html

test/CodeGen/ARM/build-attributes.ll
61–62

Ah, I was looking at a completely different place. We need -mattr=+strict-align here. The test was passing because it wasn't checking the value of Tag_CPU_unaligned_access.

129

I'm not sure I understood your comment, but I'm seeing a difference in the output when I use -arm-strict-align. Tag_CPU_unaligned_access is 1 when I use -arm-strict-align, but when I use -arm-no-strict-align or use no command line options for alignment, it is 0.

.eabi_attribute 34, 1 @ Tag_CPU_unaligned_access

jfb added inline comments.
test/CodeGen/ARM/2011-10-26-memset-inline.ll
2

Ah OK, it would be good to get @t.p.northover or @rengolin to confirm.

test/CodeGen/ARM/build-attributes.ll
61–62

So you'll update the patch?

129

Ah! armv8.1a is aarch32, not aarch64. So yes, this test is affected by your change.

ahatanak updated this revision to Diff 30620.Jul 24 2015, 4:09 PM

Made the following changes to address review comments:

  • Added "-mattr=+strict-align" to the command line if the target is v6m.
  • Added a line to check ".eabi_attribute 34, 0" if target cpu is cortex-m0.
  • Removed run lines that have become redundant.
jfb accepted this revision.Jul 24 2015, 5:14 PM
jfb edited edge metadata.

lgtm, though it would be good to have other folks chime in too.

This revision is now accepted and ready to land.Jul 24 2015, 5:14 PM
ahatanak added inline comments.Jul 27 2015, 9:32 AM
test/CodeGen/ARM/2011-10-26-memset-inline.ll
2

It turns out TargetRegistry::lookupTarget rewrites "thumb7" in the triple to "arm" when architecture name is specified using -march=arm. It isn't a bug in r213367 that is causing this behavior.

I plan to commit the clang and llvm patches today if there are no further comments and follow up with similar changes to aarch64-strict-align.

I'm wondering whether we should start looking into implementing support for generic subtarget features. Currently, there are only a few subtarget features that are not specific to one target (soft-float and strict-align), but we'll probably find a few more as we convert more front-end options to subtarget features that are written to the IR.

echristo accepted this revision.Jul 28 2015, 12:47 PM
echristo edited edge metadata.

The patch LGTM, the general comment... let's see where we are and go from there. I can dig it up, but generalized subtarget features usually end up being quite a bit more of a pain than you'd expect and it's probably easier to just implement the boiler plate in each back end.

-eric

OK, that works for me.

Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in rL243493.