This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Unify MOVRELSOffset and MOVRELDOffset
ClosedPublic

Authored by nhaehnle on Jul 11 2016, 6:07 AM.

Details

Summary

Previously, constant index insertelements would be turned into SI_INDIRECT_DST,
which is bound to prevent some optimization opportunities. Worse, it misled
the heuristic that decides whether immediates should be lowered to S_MOV_B32
or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes.
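
To make the distinction above concrete, here is a minimal toy sketch in C++. It is not LLVM code: the enum, the function name, and its parameters are hypothetical and exist only for illustration. It models the intended outcome as suggested by the summary and the review comment below: an insert with a compile-time constant index is expected to become an INSERT_SUBREG, while only a dynamic index should need the SI_INDIRECT_DST path.

```cpp
#include <cassert>
#include <optional>

// Hypothetical names for illustration only; this is not LLVM's actual API.
enum class InsertLowering {
  InsertSubreg, // stands in for INSERT_SUBREG (index known at compile time)
  IndirectDst   // stands in for SI_INDIRECT_DST (index known only at run time)
};

// Chooses a lowering for inserting one element into a vector of NumElts
// elements. ConstIdx holds the index when it is a compile-time constant and
// is std::nullopt when the index is dynamic.
InsertLowering chooseInsertLowering(std::optional<unsigned> ConstIdx,
                                    unsigned NumElts) {
  if (ConstIdx) {
    // Assumption of this sketch: a constant index is always in range.
    assert(*ConstIdx < NumElts && "constant index out of range");
    return InsertLowering::InsertSubreg;
  }
  // A dynamic index cannot name a fixed subregister, so it takes the
  // indirect path.
  return InsertLowering::IndirectDst;
}
```

In this model, the behavior the summary describes as the old one corresponds to returning IndirectDst even when ConstIdx is set.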

Diff Detail

Repository
rL LLVM

Event Timeline

nhaehnle updated this revision to Diff 63498. Jul 11 2016, 6:07 AM
nhaehnle retitled this revision to AMDGPU: Unify MOVRELSOffset and MOVRELDOffset.
nhaehnle updated this object.
nhaehnle added reviewers: arsenm, tstellarAMD.
nhaehnle added a subscriber: llvm-commits.
arsenm accepted this revision. Jul 11 2016, 9:12 AM
arsenm edited edge metadata.

LGTM. Just to be sure: does the constant-indexed insert element still always emit an INSERT_SUBREG? I think I wasn't getting this before, which is why the check was there.

This revision is now accepted and ready to land. Jul 11 2016, 9:12 AM

This breaks test/CodeGen/AMDGPU/llvm.SI.gather4.ll for me:

/home/daenzer/src/llvm-git/llvm/test/CodeGen/AMDGPU/llvm.SI.gather4.ll:472:9: error: expected string not found in input
;CHECK: v_readfirstlane_b32 s[[LO:[0-9]+]], v{{[0-9]+}}
        ^
<stdin>:1767:19: note: scanning from here
gather4_sgpr_bug: ; @gather4_sgpr_bug
                  ^
<stdin>:1782:2: note: possible intended match here
 v_add_f32_e32 v0, v0, v1
 ^

Yeah, that's an interaction with the recent D22210 (I wrote the patches in the other order). I'll fix it before committing.

This revision was automatically updated to reflect the committed changes.