This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Manually select llvm.amdgcn.writelane
ClosedPublic

Authored by arsenm on Jul 23 2020, 10:34 AM.

Details

Summary

Fixup the special case constant bus handling pre-gfx10.

Diff Detail

Event Timeline

arsenm created this revision.Jul 23 2020, 10:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2020, 10:34 AM
foad added a comment.Aug 11 2020, 8:09 AM

Pre-gfx10 writelane v0, m0, s0 is legal, isn't it? Does your implementation allow for that? You don't seem to have any tests for that case.

Pre-gfx10 writelane v0, m0, s0 is legal, isn't it? Does your implementation allow for that? You don't seem to have any tests for that case.

This is covered by @test_writelane_m0_s_v, but it looks like it ends up swapping the registers:

; GFX7-NEXT:    s_mov_b32 s0, m0
; GFX7-NEXT:    s_mov_b32 m0, s2
; GFX7-NEXT:    v_writelane_b32 v0, s0, m0

Pre-gfx10 writelane v0, m0, s0 is legal, isn't it? Does your implementation allow for that? You don't seem to have any tests for that case.

This is covered by @test_writelane_m0_s_v, but it looks like it ends up swapping the registers:

; GFX7-NEXT:    s_mov_b32 s0, m0
; GFX7-NEXT:    s_mov_b32 m0, s2
; GFX7-NEXT:    v_writelane_b32 v0, s0, m0

I'm not sure there's a principled way to fold this during selection. I think handling this correctly would require us to finally stop treating m0 as a reserved register

foad accepted this revision.Aug 11 2020, 8:48 AM

Pre-gfx10 writelane v0, m0, s0 is legal, isn't it? Does your implementation allow for that? You don't seem to have any tests for that case.

This is covered by @test_writelane_m0_s_v, but it looks like it ends up swapping the registers:

; GFX7-NEXT:    s_mov_b32 s0, m0
; GFX7-NEXT:    s_mov_b32 m0, s2
; GFX7-NEXT:    v_writelane_b32 v0, s0, m0

I guess it's probably not worth worrying about this case? It all looks functionally correct anyway.

This revision is now accepted and ready to land.Aug 11 2020, 8:48 AM