Alignment requirements for ds_read/write_b96/b128 for gfx9 and onward are
now the same as for other GCN subtargets. This way we can avoid any
unintentional use of these instructions on systems that do not support dword
alignment and instead require natural alignment.
This also makes 'SH_MEM_CONFIG.alignment_mode == STRICT' the default.
Repository: rG LLVM Github Monorepo
Event Timeline
From the description, I don't understand what this is trying to fix
llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  |
---|---|
1442–1443 | I think the logic for 2-byte alignment not being fast also applies here, but that's a separate change. |
1471–1472 | This looks wrong to me. Isn't 4-byte alignment still usable? |

llvm/test/CodeGen/AMDGPU/ds_write2.ll |  |
---|---|
2 | What is -dword-access-mode? |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  |
---|---|
1471–1472 | It was, when we allowed dword alignment. But now we want strict to be the default (because of Windows), so it's always 8 (because of read2/write2) unless +unaligned-access-mode is set. |

llvm/test/CodeGen/AMDGPU/ds_write2.ll |  |
---|---|
2 | Sorry, that should have been removed. That option was supposed to represent "alignment_mode = dword" (which was the default before), but we decided against that, at least in this patch. |
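
To make the reply on lines 1471–1472 concrete, here is a minimal sketch of the alignment policy as I read it. It is not the code from SIISelLowering.cpp: `LdsAlignPolicy` and `isWideLdsAccessAllowed` are hypothetical names, and treating "+unaligned-access-mode" as accepting any alignment is an assumption the thread does not state.

```cpp
// Hypothetical model of the policy described in the reply above.
// This is NOT the actual logic in SIISelLowering.cpp.
struct LdsAlignPolicy {
  // Stands in for -mattr=+unaligned-access-mode.
  bool UnalignedAccessMode;
};

// Returns true if a wide LDS access with the given alignment (in bytes)
// is acceptable under the policy.
// Assumptions: in strict mode the minimum is 8 bytes (the read2/write2
// case mentioned in the reply); with unaligned access enabled, any
// alignment is accepted.
constexpr bool isWideLdsAccessAllowed(LdsAlignPolicy P, unsigned AlignBytes) {
  return P.UnalignedAccessMode || AlignBytes >= 8;
}

// Strict mode (the new default): 4-byte alignment is no longer enough.
static_assert(!isWideLdsAccessAllowed(LdsAlignPolicy{false}, 4),
              "dword alignment rejected in strict mode");
static_assert(isWideLdsAccessAllowed(LdsAlignPolicy{false}, 8),
              "8-byte alignment accepted in strict mode");
// Unaligned access mode: 4-byte alignment is usable again.
static_assert(isWideLdsAccessAllowed(LdsAlignPolicy{true}, 4),
              "dword alignment accepted with unaligned access");
```

Under this reading, the reviewer's 4-byte case fails only because strict is now the default, which is why the test needs a run line with unaligned access enabled.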
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir |  |
---|---|
8 | This test should probably have checks with unaligned access enabled. |
- Added run line with -mattr=+unaligned-access-mode to test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir