Previously, constant-index insertelements would be turned into SI_INDIRECT_DST,
which is bound to prevent some optimization opportunities. Worse, it misled
the heuristic that decides whether immediates should be lowered to S_MOV_B32
or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes.
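As a rough illustration of the distinction drawn above, here is a toy C++ model of the choice between a direct subregister insert and the indirect pseudo-instruction. It is not LLVM code: `Lowering`, `chooseInsertLowering`, and `loweringName` are hypothetical names made up for this sketch, and it assumes that a compile-time-constant index can always be handled as an INSERT_SUBREG, which is what the review below asks to confirm.

```cpp
#include <optional>
#include <string>

// Toy model only; these names do not exist in LLVM.
enum class Lowering {
  InsertSubreg, // element position known at compile time
  IndirectDst   // element position only known at run time
};

// An empty optional stands for a dynamic (non-constant) index.
Lowering chooseInsertLowering(std::optional<unsigned> constantIndex) {
  // Assumption: any constant index can be lowered as a direct
  // subregister insert, so the indirect path is only a fallback.
  return constantIndex.has_value() ? Lowering::InsertSubreg
                                   : Lowering::IndirectDst;
}

// Maps the toy enum to the instruction names used in the summary.
std::string loweringName(Lowering L) {
  return L == Lowering::InsertSubreg ? "INSERT_SUBREG" : "SI_INDIRECT_DST";
}
```

In this sketch, `chooseInsertLowering(2u)` selects the direct insert, while `chooseInsertLowering(std::nullopt)` falls back to the indirect form.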
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Comment Actions
LGTM. Just to be sure: does the constant-indexed insertelement still always emit an INSERT_SUBREG? I think I wasn't getting this before, which is why the check was there.
Comment Actions
This breaks test/CodeGen/AMDGPU/llvm.SI.gather4.ll for me:
/home/daenzer/src/llvm-git/llvm/test/CodeGen/AMDGPU/llvm.SI.gather4.ll:472:9: error: expected string not found in input
;CHECK: v_readfirstlane_b32 s[[LO:[0-9]+]], v{{[0-9]+}}
        ^
<stdin>:1767:19: note: scanning from here
gather4_sgpr_bug: ; @gather4_sgpr_bug
                  ^
<stdin>:1782:2: note: possible intended match here
 v_add_f32_e32 v0, v0, v1
 ^
Comment Actions
Yeah, that's an interaction with the recent D22210 (I wrote the patches in the other order). I'll fix it before committing.