RA can insert something like a sub1_sub2 COPY of a wide VGPR
tuple, which results in an unaligned access with v_pk_mov_b32
after the copy is expanded. This is a regression after D97316.
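
The misalignment described above can be sketched with a small standalone model. This is not LLVM code: `VGPRTuple`, `isEvenAligned` and `subTuple` are hypothetical names introduced only for illustration, and the sketch assumes the alignment requirement means a multi-register tuple must start at an even register index.

```cpp
#include <cassert>

// Toy model, not LLVM code: a VGPR tuple is a run of consecutive 32-bit
// registers, identified by the index of its first register.
struct VGPRTuple {
  unsigned FirstReg; // e.g. 4 for v[4:7]
  unsigned NumRegs;  // number of 32-bit registers in the tuple
};

// Assumption: on targets with the alignment requirement, a tuple of more
// than one register must start at an even register index.
bool isEvenAligned(const VGPRTuple &T) {
  return T.NumRegs == 1 || T.FirstReg % 2 == 0;
}

// Hypothetical helper: pick NumRegs registers starting at sub-register
// index FirstSub, so subTuple(T, 1, 2) models a sub1_sub2 copy source.
VGPRTuple subTuple(const VGPRTuple &T, unsigned FirstSub, unsigned NumRegs) {
  assert(FirstSub + NumRegs <= T.NumRegs && "sub-tuple out of range");
  return VGPRTuple{T.FirstReg + FirstSub, NumRegs};
}
```

In this model, for an aligned tuple v[4:7], sub0_sub1 maps to v[4:5] and passes the check, while sub1_sub2 maps to v[5:6] and fails it, even though the parent tuple itself is aligned.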
Event Timeline
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp, lines 914–916:
I.e. I do not see an easy way to check that an RC is aligned.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp, lines 914–916:
Probably should add a utility function to check this. The verifier has a similarly confusing check here: `if (!RC || ((IsVGPR && !RC->hasSuperClassEq(RI.getVGPRClassForBitWidth(RI.getRegSizeInBits(*RC)))) ||`
It would be easier to follow if you directly referred to the aligned class. `getVGPR64Class` will always return that anyway on targets with the alignment requirement.
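
As a rough illustration of the suggested utility, the sketch below models the "is this class contained in the aligned class" test on a toy hierarchy. It is a standalone model and an assumption throughout: `RegClass` is not `llvm::TargetRegisterClass`, `isAlignedClass` is a hypothetical helper name, and the two class objects are illustrative stand-ins for an unaligned 64-bit VGPR class and its aligned subclass.

```cpp
#include <cassert>

// Toy stand-in for a register class; NOT llvm::TargetRegisterClass.
struct RegClass {
  const char *Name;
  unsigned SizeInBits;
  const RegClass *Super; // next larger class containing this one, or nullptr

  // True if Other is this class or one of its super-classes.
  bool hasSuperClassEq(const RegClass *Other) const {
    for (const RegClass *C = this; C; C = C->Super)
      if (C == Other)
        return true;
    return false;
  }
};

// Illustrative hierarchy (assumption): the aligned 64-bit class is a
// subset of the unaligned 64-bit class.
const RegClass VReg64{"VReg_64", 64, nullptr};
const RegClass VReg64Align2{"VReg_64_Align2", 64, &VReg64};

// Hypothetical utility: RC is aligned when it is the aligned class itself
// or a subclass of it. The caller names the aligned class directly instead
// of deriving it from the bit width of RC.
bool isAlignedClass(const RegClass &RC, const RegClass &AlignedClass) {
  return RC.hasSuperClassEq(&AlignedClass);
}
```

Passing the aligned class explicitly mirrors the review suggestion: the call site states which class it is comparing against, rather than hiding that choice behind a size lookup.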