VLDRH needs to have an alignment of at least 2, including the widening/narrowing versions. This tightens up the ISel patterns for it and alters allowsMisalignedMemoryAccesses so that unaligned accesses are expanded through the stack. It also fixes some incorrect shift amounts, which seemed to be passing a multiple rather than a shift.
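To illustrate the "multiple rather than a shift" mix-up, here is a minimal sketch, not code from this patch. The helper name `isLegalScaledImm7` and the signed 7-bit scaled immediate range are assumptions made for the example. For halfword accesses the scale is 2, which corresponds to a shift of 1; passing 2 where a shift is expected silently turns the scale into 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): returns true if Offset can be
// encoded as a 7-bit magnitude immediate scaled by (1 << Shift).
// The +/-127 range is an assumption for illustration.
constexpr bool isLegalScaledImm7(int64_t Offset, unsigned Shift) {
  const int64_t Scale = int64_t{1} << Shift;
  if (Offset % Scale != 0)
    return false; // Offset is not a multiple of the access scale.
  const int64_t Imm = Offset / Scale;
  return Imm >= -127 && Imm <= 127;
}

// Correct call for halfwords: shift of 1, i.e. a scale of 2.
static_assert(isLegalScaledImm7(2, 1), "offset 2 is a legal halfword offset");

// Buggy call: passing the multiple (2) as the shift makes the scale 4,
// so a perfectly good halfword offset of 2 is rejected.
static_assert(!isLegalScaledImm7(2, 2), "multiple passed as shift rejects 2");
```

The two `static_assert` lines show how the same offset is accepted or rejected depending on whether the caller passes the shift or the multiple.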
Repository: rL LLVM
Event Timeline
llvm/test/CodeGen/Thumb2/mve-ldst-offset.ll | ||
---|---|---|
756 ↗ | (On Diff #212793) | I am so confused by this, can you explain it for me please? |
llvm/test/CodeGen/Thumb2/mve-ldst-offset.ll | ||
---|---|---|
756 ↗ | (On Diff #212793) | (Drive-by comment since this crossed my inbox) I think what's going on here is: VLDRH.S32 means: load 8 bytes of memory, regard them as 4 16-bit halfwords (H), and sign-extend each one into a 32-bit lane (S32) of the output vector register. But it requires alignment of at least 2 on the memory it's loading from. So in order to apply it to 8 bytes starting at an odd address, the generated code is copying the 8 source bytes to an aligned 8-byte stack slot, and then pointing the VLDRH.S32 at that instead. I assume this run of llc is in a mode where it assumes unaligned access support on the ordinary LDR instruction has been enabled in the hardware configuration. (If I remember, that's the default – to generate code compatible with a CPU that has that turned _off_ you have to say -mno-unaligned-access in clang, or whatever llc's equivalent option is.) |
llvm/test/CodeGen/Thumb2/mve-ldst-offset.ll | ||
---|---|---|
756 ↗ | (On Diff #212793) | Bah, thanks! For some reason I wasn't thinking about the need to widen, all the loads really threw me. |
Thanks!
llvm/test/CodeGen/Thumb2/mve-ldst-offset.ll | ||
---|---|---|
756 ↗ | (On Diff #212793) | Yep, this is the default fallback of "align it via the stack and load it again". It's obviously not very efficient, but I don't believe it will come up often (it's only for unaligned 16-bit loads). If it does, we may be able to do something better, perhaps by splitting out the extend. |
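For readers following the thread, the stack fallback discussed above can be modelled in plain C++. This is only a sketch of the semantics, not the generated code: the function names are hypothetical, and `widenHalfwordsAligned` stands in for a VLDRH.S32-style load that requires a 2-byte-aligned address.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of a widening halfword load: reads four 16-bit values
// from a suitably aligned address and sign-extends each into a 32-bit lane.
std::array<int32_t, 4> widenHalfwordsAligned(const int16_t *Src) {
  std::array<int32_t, 4> Lanes{};
  for (int I = 0; I < 4; ++I)
    Lanes[I] = static_cast<int32_t>(Src[I]); // sign-extend 16 -> 32 bits
  return Lanes;
}

// Fallback for an address that may be odd: copy the 8 source bytes into an
// aligned stack slot, then perform the aligned widening load from the slot.
std::array<int32_t, 4> widenHalfwordsUnaligned(const unsigned char *Src) {
  alignas(8) int16_t Slot[4];
  std::memcpy(Slot, Src, sizeof(Slot)); // byte copy works at any alignment
  return widenHalfwordsAligned(Slot);
}

// Usage: place four halfwords at an odd address and load them back.
inline void exampleUnalignedLoad() {
  alignas(8) unsigned char Buf[16] = {};
  const int16_t Vals[4] = {-1, 2, -32768, 32767};
  std::memcpy(Buf + 1, Vals, sizeof(Vals)); // Buf + 1 is an odd address
  const std::array<int32_t, 4> R = widenHalfwordsUnaligned(Buf + 1);
  assert(R[0] == -1 && R[3] == 32767);
}
```

The extra copy is the cost mentioned in the thread; splitting the extend from the load would be a different lowering and is not modelled here.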