LDS is allocated with 64-dword alignment. To prove that a memory address is 8-byte aligned, it is therefore sufficient to check that the offset is a multiple of 8.
This allows generating ds_read/write_b64 instead of ds_read/write2_b32.
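A minimal sketch of the check described above, assuming the base of the allocation really is 64-dword (256-byte) aligned. The helper names `isPowerOf2` and `isKnownAligned` are hypothetical and are not LLVM APIs; the sketch only illustrates the arithmetic, not the actual patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): true if v is a non-zero power of two.
constexpr bool isPowerOf2(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Hypothetical helper (not an LLVM API): returns true if (base + offset) is
// provably aligned to `required` bytes, given only that `base` is aligned to
// `baseAlign` bytes. Both alignments are in bytes and must be powers of two.
inline bool isKnownAligned(uint64_t baseAlign, uint64_t offset,
                           uint64_t required) {
  assert(isPowerOf2(baseAlign) && isPowerOf2(required));
  // If the base is less strictly aligned than required, nothing can be proven
  // from the offset alone.
  if (baseAlign < required)
    return false;
  // The base contributes a multiple of `required`, so only the offset matters.
  return (offset & (required - 1)) == 0;
}
```

With a 256-byte aligned base, `isKnownAligned(256, 16, 8)` holds and a 64-bit access could be selected, while `isKnownAligned(256, 12, 8)` does not. If the base is only known to be 4-byte aligned, the offset check alone proves nothing.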
Authored by FarhanaAleen on Mar 2 2018, 2:25 PM.
I think it is simpler than that. If a local symbol must be 64-dword aligned, it should be declared as such, not as 4-byte aligned as we have now.
Although I am not really sure it is true that it is always 64-dword aligned. Consider:
local int x;
Do you mean this allocation would take 128 dwords? I highly doubt it.
I suppose only the first symbol is 64-dword aligned, and everything after it is just naturally aligned with respect to its element type size. So logic that leverages the actual allocation alignment is useful only after all LDS is allocated and the allocation is flattened into a single LDS memory array.
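To illustrate the concern, here is a rough sketch of the flattened layout I am supposing. It assumes symbols are packed sequentially with only natural alignment padding; that packing rule is my assumption, not confirmed allocator behavior, and `LdsSymbol` and `layoutLds` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one LDS symbol; size and align are in bytes,
// and align must be a power of two.
struct LdsSymbol {
  uint64_t size;
  uint64_t align;
};

// Assigns each symbol an offset from the start of a single flattened LDS
// array, padding only as far as the symbol's own natural alignment requires.
inline std::vector<uint64_t> layoutLds(const std::vector<LdsSymbol> &syms) {
  std::vector<uint64_t> offsets;
  uint64_t cursor = 0;
  for (const LdsSymbol &s : syms) {
    // Round the cursor up to the next multiple of the symbol's alignment.
    cursor = (cursor + s.align - 1) & ~(s.align - 1);
    offsets.push_back(cursor);
    cursor += s.size;
  }
  return offsets;
}
```

Under this assumption, two 4-byte ints land at offsets 0 and 4, so the second one is only 4-byte aligned even if the array base is 64-dword aligned. Only once these final offsets are known can the base alignment be combined with them to prove 8-byte alignment.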