Details
- Reviewers
  - arsenm, foad
- Commits
  - rG28eb9ed3bb5b: [AMDGPU] Fine tune LDS misaligned access speed
Event Timeline
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
1598 | What do the numbers mean? |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
1598 | More or less: 'it operates at a speed comparable to an N-bit wide load'. With full alignment, ds128 is slower than ds96, for example. If underaligned, it is comparable to the speed of a single dword access, which would then mean 32 < 128, and it is faster to issue a wide load regardless. 1 simply means 'slow, don't do it'. I.e., when comparing an aligned load to a wider load that will no longer be aligned, the latter is slower. But essentially it is just a rank; these values are not additive. |
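
To illustrate the "rank, not additive" reading of these numbers, here is a minimal sketch. It is not the code from SIISelLowering.cpp: `accessRank`, `shouldWiden`, and every threshold below are hypothetical, chosen only to mirror the wording of the comment above (0 = not allowed, 1 = slow, larger = comparable to a wider load). The real values in the patch may differ.

```cpp
// Hypothetical rank model; the names and thresholds are illustrative
// assumptions, not the values used in SIISelLowering.cpp.
// 0 = access not allowed, 1 = slow ("don't do it"),
// N > 1 = roughly as fast as an N-bit wide load.
constexpr unsigned accessRank(unsigned SizeInBits, unsigned AlignInBytes) {
  const unsigned NaturalAlign = SizeInBits / 8;
  if (AlignInBytes >= NaturalAlign)
    return SizeInBits; // fully aligned: comparable to an N-bit load
  if (AlignInBytes >= 4)
    return 32;         // underaligned (assumed): comparable to one dword access
  return 1;            // assumed: slow, don't do it
}

// Ranks are only compared with each other, never summed: widening is
// worthwhile only if the wide access ranks strictly higher than the
// narrow one at the same alignment.
constexpr bool shouldWiden(unsigned NarrowBits, unsigned WideBits,
                           unsigned AlignInBytes) {
  return accessRank(WideBits, AlignInBytes) >
         accessRank(NarrowBits, AlignInBytes);
}

// A 16-byte-aligned dword can profitably become a 128-bit access.
static_assert(shouldWiden(32, 128, 16), "aligned widening ranks higher");
// An aligned 64-bit access beats a 128-bit access that is only 8-byte aligned.
static_assert(!shouldWiden(64, 128, 8), "underaligned widening ranks lower");
```

Under these assumed thresholds, the second `static_assert` matches the case described in the comment: an aligned load compared with a wider load that is no longer aligned, where the wider one loses.
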
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
1598 | This needs to be commented |