This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Fix codegen of unaligned volatile load/store of i64
ClosedPublic

Authored by momo5502 on Jun 13 2023, 2:08 AM.

Details

Summary

Volatile loads/stores of i64 are lowered to LDRD/STRD on ARMv5TE.
However, these instructions require the addresses to be aligned.
Unaligned loads/stores therefore should be ignored by this handling.

Diff Detail

Event Timeline

momo5502 created this revision.Jun 13 2023, 2:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2023, 2:08 AM
momo5502 requested review of this revision.Jun 13 2023, 2:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2023, 2:08 AM

Hello

Should the condition be bases on Subtarget->allowsUnalignedMem too? I'm not sure exactly in which architectures LDRD required an alignment of 8 and when it required an alignment of 4. I believe in most cases it is 4, with some early architectures requiring 8 if Subtarget->allowsUnalignedMem was false.

efriedma added inline comments.Jun 13 2023, 10:41 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
10085

Using getABITypeAlign() like this is confusing; if you mean "4", just write 4.

I did have Align(4) initially, but saw that it was done differently here: https://github.com/llvm/llvm-project/blob/2a1716dec57e8b3dd668df17ecbedfc77a4112e5/llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp#L2294
I simply copied the expression. If that is not applicable here, I can just use 4 instead or respect allowsUnalignedMem.

I have no idea why the ARMLoadStoreOptimizer code was written that way; maybe fix that too, for the sake of clarity.

momo5502 updated this revision to Diff 531670.Jun 15 2023, 3:04 AM

I have no idea why the ARMLoadStoreOptimizer code was written that way; maybe fix that too, for the sake of clarity.

I adjusted it here to Align(4).

I also tried to adjust the ARMLoadStoreOptimizer and it turns out that getABITypeAlign of i64 is 8 and not 4,. That seems wrong in case of aligning loads. Therefore your comment is valid and it should be fixed there too. However, that breaks quite a bunch of tests and I feel this exceeds the scope of this change. I recommend creating an issue for that.

@dmgreen, concerning allowsUnalignedMem, I'm not sure how to proceed with that. Only applying the alignment check form this change when strict-alignment is enabled causes the instructions in this tests to be lowered to 16 one-byte loads and stores. However, 2 four-byte loads seem perfectly fine to me, at least on the system where I encountered this issue. To me, not taking allowsUnalignedMem into consideration seems fine. What do you think?

re: allowsUnalignedMem:

Certain memory (e.g. memmap'ed registers, uncached memory) have restrictions that go beyond the normal rules for memory. allowsUnalignedMem() is supposed to model that: all pointer accesses have to be naturally aligned to avoid faults, even if the CPU supports unaligned access to cached memory.

I'm not 100% sure what the interaction is between that and ldrd/strd off the top of my head. But the idea would be to narrow the check to Align(Subtarget->hasV6Ops() && Subtarget->allowsUnalignedMem() ? 4 : 8), so we don't emit a ldrd such that "addr % 8 == 4".

I haven't checked any hardware, but from reading the reference manuals it looked like the alignment requirement would be Subtarget->hasV6Ops() || Subtarget->allowsUnalignedMem() ? 4 : 8. As in for newer archs 4 is always fine. For v5te the ldrd requires an alignment of 4 with U=1 and 8 with U=0. I haven't tested that though.

Spent a bit of time digging in the armv7 reference manual. Apparently on armv7, it should always be fine to use ldrd for word alignment. On armv5, as you've noted, it's "UNPREDICTABLE". For armv6, you can switch between armv5 semantics and armv7 semantics. (sections D12.3.1 and D15.3.1.) On v6 targets, if allowsUnalignedMem() is true, we can safely assume we're using v7 semantics; if we're in strict alignment mode, I don't think we want to make any assumptions. (We could theoretically make it a flag specified by the user, but that seems like overkill.)

Note that we currently assume the frontend will set the "+strict-align" target attribute if we're compiling for v4 or v5, so compiling for -mtriple=armv5e-arm-none-eabi without also adding -mattr=+strict-align will produce weird results: we assume unaligned accesses work, but they clearly won't. This is probably a bug; the backend should do something more reasonable by default on v4/v5 targets.

Given that, I guess what we want is actually something like Align(Subtarget->hasV7Ops() || Subtarget->allowsUnalignedMem() ? 4 : 8). Maybe wrap that up in a helper and stick it in ARMSubtarget.h, so we can use it for ARMLoadStoreOptimizer in a followup.

momo5502 updated this revision to Diff 532567.Jun 19 2023, 2:00 AM

Thank you for investigating it. I have added a helper called getDualLoadStoreAlignment. Feel free to suggest a better name if you have one.

I'd like to see better test coverage for align-4 case: specifically, a test that checks we produce ldrd for targets where it's legal, but a pair of ldr where it isn't.

momo5502 updated this revision to Diff 532801.Jun 20 2023, 12:11 AM

I added an align 4 test for armv7 and armv6 with and without strict alignment.
I omitted armv5 here, as, due to the missing strict align, the test generates an ldrd which is probably not something one wants to explicitly test for.
Question is, should I add strict align to armv4/v5 test cases here and adjust the test accordingly?

momo5502 updated this revision to Diff 532805.Jun 20 2023, 12:17 AM

Added an armv7 strict case to make sure it is identical to non-strict for align 4

armv4/armv5 should have strict align, yes.

(Also, you might want to autogenerate the CHECK lines using update_llc_test_checks.py; it's a little more verbose, but a lot easier to update.)

momo5502 updated this revision to Diff 533179.Jun 21 2023, 1:45 AM

Thank you for the hint. Using update_llc_test_checks.py makes writing the tests much easier.

efriedma accepted this revision.Jun 21 2023, 10:19 AM

LGTM

(I'm guessing you don't have commit access? Please let me know the author name/email you want on the commit.)

This revision is now accepted and ready to land.Jun 21 2023, 10:19 AM

LGTM

(I'm guessing you don't have commit access? Please let me know the author name/email you want on the commit.)

Thank you. You're right, I don't have commit access. In my previous commits I used this: Maurice Heumann <MauriceHeumann@gmail.com>

as soon as this is commited,I will try to adjust the ARMLoadStoreOptimizer accordingly. @eli.friedman you have commit permissions, right? would you mind committing this change for me with the name and email provided above?

I'll get it merged in the next couple days; sorry about the delay.

This revision was landed with ongoing or failed builds.Jun 26 2023, 10:46 AM
This revision was automatically updated to reflect the committed changes.