This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Set sizes of spill pseudos
ClosedPublic

Authored by arsenm on Aug 10 2016, 12:30 PM.

Event Timeline

arsenm updated this revision to Diff 67576. Aug 10 2016, 12:30 PM
arsenm retitled this revision to AMDGPU: Set sizes of spill pseudos.
arsenm updated this object.
arsenm added a subscriber: llvm-commits.
nhaehnle added inline comments.
lib/Target/AMDGPU/SIInstructions.td
1962–1963

I just took a look, and for some reason the reloads tend to look like

buffer_load_dword v3, off, s[72:75], s70 offset:1444 ; 16-byte Folded Reload
                                ; encoding: [0xa4,0x05,0x30,0xe0,0x00,0x03,0x12,0x46]
s_waitcnt vmcnt(0)              ; encoding: [0x70,0x0f,0x8c,0xbf]
buffer_load_dword v4, off, s[72:75], s70 offset:1448 ; 16-byte Folded Reload
                                ; encoding: [0xa8,0x05,0x30,0xe0,0x00,0x04,0x12,0x46]
s_waitcnt vmcnt(0)              ; encoding: [0x70,0x0f,0x8c,0xbf]

etc., so you actually get 12 bytes per dword (an 8-byte buffer_load_dword plus a 4-byte s_waitcnt). Not sure if that's a problem, especially since those waits are really wrong anyway (perhaps the wait insertion gets confused by the register/subregister relationship?).
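
The arithmetic behind the 12-bytes-per-dword figure can be sketched as follows. This is only an illustration built from the encoding bytes in the listing above; the helper name `reload_size_bytes` and the assumption that a 16-byte reload expands to four dword loads are hypothetical and not taken from the patch.

```python
# Encoding bytes copied from the listing above.
BUFFER_LOAD_DWORD = [0xA4, 0x05, 0x30, 0xE0, 0x00, 0x03, 0x12, 0x46]  # 8 bytes
S_WAITCNT = [0x70, 0x0F, 0x8C, 0xBF]                                  # 4 bytes


def reload_size_bytes(spill_bytes, with_waitcnt=True):
    """Estimate the code size of a reload that is expanded one dword at a time.

    Hypothetical helper: assumes one buffer_load_dword per 4 bytes of spilled
    register, optionally followed by one s_waitcnt, as in the listing.
    """
    num_dwords = spill_bytes // 4
    per_dword = len(BUFFER_LOAD_DWORD)
    if with_waitcnt:
        per_dword += len(S_WAITCNT)
    return num_dwords * per_dword


# A 16-byte reload: four loads alone versus four load/wait pairs.
print(reload_size_bytes(16, with_waitcnt=False))  # 32
print(reload_size_bytes(16, with_waitcnt=True))   # 48
```

Under these assumptions, a size computed from the loads alone would undercount the emitted code by 4 bytes per dword whenever a wait follows each load.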

arsenm added inline comments. Aug 11 2016, 12:04 PM
lib/Target/AMDGPU/SIInstructions.td
1962–1963

I'm not really sure what to do about waitcnts. It doesn't really matter for correctness, since the branch relaxation pass currently runs after these should have been eliminated (these may be inserted during relaxation, but that isn't a concern yet).

nhaehnle accepted this revision. Sep 1 2016, 9:10 AM
nhaehnle added a reviewer: nhaehnle.

Fair enough. LGTM.

This revision is now accepted and ready to land. Sep 1 2016, 9:10 AM
arsenm closed this revision. Sep 3 2016, 10:34 AM

r280595