Volatile loads/stores of i64 are lowered to LDRD/STRD on ARMv5TE.
However, these instructions require the addresses to be aligned.
Unaligned loads/stores therefore should be ignored by this handling.
Details
Diff Detail
Event Timeline
Hello
Should the condition be bases on Subtarget->allowsUnalignedMem too? I'm not sure exactly in which architectures LDRD required an alignment of 8 and when it required an alignment of 4. I believe in most cases it is 4, with some early architectures requiring 8 if Subtarget->allowsUnalignedMem was false.
llvm/lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
10085 | Using getABITypeAlign() like this is confusing; if you mean "4", just write 4. |
I did have Align(4) initially, but saw that it was done differently here: https://github.com/llvm/llvm-project/blob/2a1716dec57e8b3dd668df17ecbedfc77a4112e5/llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp#L2294
I simply copied the expression. If that is not applicable here, I can just use 4 instead or respect allowsUnalignedMem.
I have no idea why the ARMLoadStoreOptimizer code was written that way; maybe fix that too, for the sake of clarity.
I adjusted it here to Align(4).
I also tried to adjust the ARMLoadStoreOptimizer and it turns out that getABITypeAlign of i64 is 8 and not 4,. That seems wrong in case of aligning loads. Therefore your comment is valid and it should be fixed there too. However, that breaks quite a bunch of tests and I feel this exceeds the scope of this change. I recommend creating an issue for that.
@dmgreen, concerning allowsUnalignedMem, I'm not sure how to proceed with that. Only applying the alignment check form this change when strict-alignment is enabled causes the instructions in this tests to be lowered to 16 one-byte loads and stores. However, 2 four-byte loads seem perfectly fine to me, at least on the system where I encountered this issue. To me, not taking allowsUnalignedMem into consideration seems fine. What do you think?
re: allowsUnalignedMem:
Certain memory (e.g. memmap'ed registers, uncached memory) have restrictions that go beyond the normal rules for memory. allowsUnalignedMem() is supposed to model that: all pointer accesses have to be naturally aligned to avoid faults, even if the CPU supports unaligned access to cached memory.
I'm not 100% sure what the interaction is between that and ldrd/strd off the top of my head. But the idea would be to narrow the check to Align(Subtarget->hasV6Ops() && Subtarget->allowsUnalignedMem() ? 4 : 8), so we don't emit a ldrd such that "addr % 8 == 4".
I haven't checked any hardware, but from reading the reference manuals it looked like the alignment requirement would be Subtarget->hasV6Ops() || Subtarget->allowsUnalignedMem() ? 4 : 8. As in for newer archs 4 is always fine. For v5te the ldrd requires an alignment of 4 with U=1 and 8 with U=0. I haven't tested that though.
Spent a bit of time digging in the armv7 reference manual. Apparently on armv7, it should always be fine to use ldrd for word alignment. On armv5, as you've noted, it's "UNPREDICTABLE". For armv6, you can switch between armv5 semantics and armv7 semantics. (sections D12.3.1 and D15.3.1.) On v6 targets, if allowsUnalignedMem() is true, we can safely assume we're using v7 semantics; if we're in strict alignment mode, I don't think we want to make any assumptions. (We could theoretically make it a flag specified by the user, but that seems like overkill.)
Note that we currently assume the frontend will set the "+strict-align" target attribute if we're compiling for v4 or v5, so compiling for -mtriple=armv5e-arm-none-eabi without also adding -mattr=+strict-align will produce weird results: we assume unaligned accesses work, but they clearly won't. This is probably a bug; the backend should do something more reasonable by default on v4/v5 targets.
Given that, I guess what we want is actually something like Align(Subtarget->hasV7Ops() || Subtarget->allowsUnalignedMem() ? 4 : 8). Maybe wrap that up in a helper and stick it in ARMSubtarget.h, so we can use it for ARMLoadStoreOptimizer in a followup.
Thank you for investigating it. I have added a helper called getDualLoadStoreAlignment. Feel free to suggest a better name if you have one.
I'd like to see better test coverage for align-4 case: specifically, a test that checks we produce ldrd for targets where it's legal, but a pair of ldr where it isn't.
I added an align 4 test for armv7 and armv6 with and without strict alignment.
I omitted armv5 here, as, due to the missing strict align, the test generates an ldrd which is probably not something one wants to explicitly test for.
Question is, should I add strict align to armv4/v5 test cases here and adjust the test accordingly?
armv4/armv5 should have strict align, yes.
(Also, you might want to autogenerate the CHECK lines using update_llc_test_checks.py; it's a little more verbose, but a lot easier to update.)
Thank you for the hint. Using update_llc_test_checks.py makes writing the tests much easier.
LGTM
(I'm guessing you don't have commit access? Please let me know the author name/email you want on the commit.)
Thank you. You're right, I don't have commit access. In my previous commits I used this: Maurice Heumann <MauriceHeumann@gmail.com>
as soon as this is commited,I will try to adjust the ARMLoadStoreOptimizer accordingly. @eli.friedman you have commit permissions, right? would you mind committing this change for me with the name and email provided above?
Using getABITypeAlign() like this is confusing; if you mean "4", just write 4.