Add a subtarget feature check to avoid using ds_read/write_b96/128 with
insufficient alignment when the hardware bug is present on the target.
Add this "feature" to GFX 10.1.1 as it is also affected.
Add global-isel test.
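As a rough illustration of the kind of check described in this summary, here is a minimal standalone sketch. It is not the code from the patch. `SubtargetModel`, `naturalAlignBytes` and `canUseWideLDSAccess` are hypothetical names, the 8-byte relaxed threshold is an assumption, and only the flag name `hasLDSMisalignedBug` comes from the review.

```cpp
#include <cstdint>

// Hypothetical stand-in for the subtarget. Only the idea of a
// "has LDS misaligned bug" flag comes from the review.
struct SubtargetModel {
  bool HasLDSMisalignedBug;
};

// Round an access size in bytes up to the next power of two
// (12 -> 16, 16 -> 16). Used here as the "natural" alignment.
inline uint32_t naturalAlignBytes(uint32_t SizeBytes) {
  uint32_t Align = 1;
  while (Align < SizeBytes)
    Align <<= 1;
  return Align;
}

// Returns true if a single 96-bit or 128-bit LDS access may be used
// for the given alignment under this simplified model.
inline bool canUseWideLDSAccess(const SubtargetModel &ST, uint32_t SizeBits,
                                uint32_t AlignBytes) {
  if (SizeBits != 96 && SizeBits != 128)
    return false;
  // Assumption: on unaffected hardware an 8-byte alignment is enough.
  // The threshold used by the real patch may differ.
  const uint32_t RelaxedAlignBytes = 8;
  const uint32_t Required = ST.HasLDSMisalignedBug
                                ? naturalAlignBytes(SizeBits / 8)
                                : RelaxedAlignBytes;
  return AlignBytes >= Required;
}
```

Under these assumptions, a 128-bit access with 8-byte alignment is accepted on an unaffected subtarget and rejected on an affected one, which would then have to fall back to narrower accesses.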
Event Timeline
Note that tests with flat instructions are not copied to GlobalISel/lds-misaligned-bug.ll. We could add a similar check to allowsMisalignedMemoryAccessesImpl for the flat address space as well, but that causes SDag to produce less optimal code: for some reason it breaks a 16-byte load with 8-byte alignment into four flat_load_dword instructions instead of two flat_load_dwordx2 instructions (similar stores are not affected). This patch should fix the problems mentioned in D84403 while I look into this.
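To make the difference in code quality concrete, the following sketch models the two ways of splitting the flat load mentioned above. It is purely illustrative and does not reflect how SelectionDAG legalizes loads. `splitAccess` is a hypothetical helper that cuts an access into pieces no wider than a given limit.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: split an access of SizeBytes into consecutive
// pieces, each at most MaxPieceBytes wide. Returns the piece sizes.
inline std::vector<uint32_t> splitAccess(uint32_t SizeBytes,
                                         uint32_t MaxPieceBytes) {
  std::vector<uint32_t> Pieces;
  if (MaxPieceBytes == 0)
    return Pieces; // Nothing sensible to do with a zero-width piece.
  while (SizeBytes > 0) {
    const uint32_t Piece = std::min(SizeBytes, MaxPieceBytes);
    Pieces.push_back(Piece);
    SizeBytes -= Piece;
  }
  return Pieces;
}
```

With a 16-byte load and 8-byte alignment, `splitAccess(16, 8)` yields two 8-byte pieces, which corresponds to two flat_load_dwordx2 instructions. `splitAccess(16, 4)` yields four 4-byte pieces, matching the four flat_load_dword instructions that were observed.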
Can you add a comment to hasLDSMisalignedBug with what specifically is broken? Is b64 broken too?
Is the updated description enough or do you prefer an explicit list of all instructions for hasLDSMisalignedBug? Something like:
// Hardware requires natural alignment for the following:
// ds_read/write_b64/96/128
// flat_load/store_dwordx2/3/4
b64 is affected, but we currently use ds_read2_b32/write2_b32 instructions for it. We could potentially relax the restrictions for b64 the same way it was done for b96/128, to make it more consistent.
Looks good to me if Matt has no further comments. I'm not sure whether gfx10.1.1 has the bug but it's certainly safe to assume it does, unless/until we know otherwise.