Page MenuHomePhabricator

[AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST

Authored by ronlieb on Oct 3 2018, 1:57 PM.


Group Reviewers
Restricted Project

Add a pass to fixup various vector ISel issues.
Currently we handle converting GLOBAL_{LOAD|STORE}_*
and GLOBAL_Atomic_* instructions into their _SADDR variants.
This involves feeding the sreg into the saddr field of the new instruction.

Diff Detail

Event Timeline

ronlieb created this revision.Oct 3 2018, 1:57 PM
arsenm added inline comments.Oct 4 2018, 2:27 AM

Put these on the following line and indent


I'd rather not have this option. We have an already excessively long list of band-aids to avoid updating tests


I think you could break this with a shader using an inreg/sgpr argument


This is assuming too much about the REG_SEQUENCE, so at least needs an assert. It could be a wider reg_sequence


You don't need these. We'll never emit a physical register operand like this. I might worry about seeing a subreg index on the use operand though


de Morgan this


Comment not needed


Range loop


Move the triple to the command line and drop this


This testcase is way too big. Most of the tests should be only a handful of instructions. There should be ones stressing the immediate limits. I also don't see ones for the atomics that the comments say work

ronlieb marked 10 inline comments as done.Oct 5 2018, 1:32 PM

Matt, thanks for review, I have addressed most of the issues although the inreg/sgpr one may require some more investigation on my end.
see my responses to each of your review comments.
A revision to this patch will be uploaded in the next hour or so...


option removed, and tests altered to not reference the now non existent flag.


I tried to construct a test that exercises this path, its in the next revision of this patch in a test named global-saddr-misc.ll


assert added


changed to check for subreg.


reduced test case to 47 lines total.

ronlieb updated this revision to Diff 168524.Oct 5 2018, 1:43 PM
ronlieb marked 5 inline comments as done.

Revised with various changes to reflect review comments.

I am planning to add another lit test for coverage of all the Global load/store opcodes.


assert seems like overkill, so I am going to change this to
if (DefInst->getNumOperands() != 5)

10 ↗(On Diff #168524)

Reduced version of this test in the works, next patch rev.

ronlieb updated this revision to Diff 168576.Oct 6 2018, 1:28 PM

added a load/store globals test case (.mir)
updated global-saddr-misc.ll per offline discussion.
changed assert to simpler if (check) continue.

I took a look and (1) seems OK, though Matt should opine as well (2) minor nit: you could run instnamer on global-saddr-misc.ll to tidy up the %var<numbering>

Thanks Mark , the instnamer suggestion is a good idea. Completed locally, waiting on further review comments from Matt.

ran instnamer on global-saddr-misc.ll

ronlieb updated this revision to Diff 168781.Oct 9 2018, 5:22 AM

updated global-saddr-misc.ll with instnamer changes

arsenm added inline comments.Oct 9 2018, 7:26 AM

Braces for both half


This is still too big. You only need not much more than load, a store, and a GEP in each function

7–11 ↗(On Diff #168781)

You don't need the complicated vector stuff (this is also broken since addrspace(1) is 64-bit). The pointer argument needs to be the direct argument here. It won't be since now there will end up being a zext to the pointer size, so this isn't testing what it's supposed to

arsenm added inline comments.Oct 9 2018, 7:33 AM
7–11 ↗(On Diff #168781)

I also realized the case I was worried about here probably can't happen

ronlieb added inline comments.Oct 9 2018, 7:45 AM

I thought the coding standard was no braces for single statement 'if' and 'else' , is it not ?
I personally like braces, however staying with convention seems best.
I could also refactor it a little bit and avoid the whole topic...

NewGlob = BuildMI(MBB, I, MI.getDebugLoc(), TII->get(NewOpcd));
if (HasVdst)
  NewGlob->addOperand(MF, MI.getOperand(0));
ronlieb added inline comments.Oct 9 2018, 8:09 AM
7–11 ↗(On Diff #168781)

seems like I should remove this test now. Originally I added it based on a concern you expressed.
Matt, do you have a preference regarding this test?

ronlieb added inline comments.Oct 9 2018, 9:13 AM

is this comment about only needing "load store gep per function? intended for a different test? This test, conv2d-saddr.ll only has one function. perhaps you were thinking of global-saddr-atomics.ll ?

ronlieb updated this revision to Diff 168823.Oct 9 2018, 10:18 AM

Cleaned up an if-else statement
Slightly reduced global-saddr-misc.ll

arsenm added inline comments.Oct 9 2018, 7:13 PM

It was spread over multiple lines but I don't se that in the diff now



7–11 ↗(On Diff #168781)

It would still be good to have to make sure this actually works, but you need to completely replace the test body to be simpler and avoid these issues. Also I wouldn't split these tests up arbitrarily like this and merge them into one

arsenm added inline comments.Oct 9 2018, 7:14 PM

This is probably just noise for general debugging

ronlieb added inline comments.Oct 9 2018, 7:50 PM

It is indeed for general debugging.
Is there a specific request implied by your comment?
would you prefer this be removed?


This is at the moment a single statement loop, I would expect it to become compound at some point at which braces would be inserted.
See example: line 163

7–11 ↗(On Diff #168781)

I think part of your your suggestion is to reduce the current number of new tests down to 1 or 2.

I would propose changing these
test/CodeGen/AMDGPU/conv2d-saddr.ll (47 lines)
test/CodeGen/AMDGPU/global-load_stores.mir (86 lines)
test/CodeGen/AMDGPU/global-saddr-atomics.ll (359 lines)
test/CodeGen/AMDGPU/global-saddr-misc.ll (13 lines)
test/CodeGen/AMDGPU/global-saddr-offsets.ll (60 lines)

  1. fold these test/CodeGen/AMDGPU/global-load_stores.mir (86 lines) test/CodeGen/AMDGPU/global-saddr-atomics.ll (359 lines)

into a single .mir coverage test
test/CodeGen/AMDGPU/global-load-stores-atomics.mir (86 lines)

  1. fold these test/CodeGen/AMDGPU/conv2d-saddr.ll (47 lines) test/CodeGen/AMDGPU/global-saddr-misc.ll (13 lines) test/CodeGen/AMDGPU/global-saddr-offsets.ll (60 lines)

into a single test

ronlieb updated this revision to Diff 169041.Oct 10 2018, 10:28 AM

Consolidated 5 tests into 2.
A few minor cleanups on some spacing and indentation.

ronlieb updated this revision to Diff 169141.Oct 10 2018, 6:19 PM

Cleaned up global-saddr.ll test, reducing number of functions, and use addrspace(1) i64

You probably need to extend the pass to support subregs, not necessarily in the same patch.
But you definitely need a mir test with subregs used with either operands.


The name of the pass is too opaque and gives no hint what pass actually doing. I think it needs to have something about "flat" in the name.


Can be subreg.


Can be subreg.

arsenm added inline comments.Oct 22 2018, 12:43 PM

It shouldn't. This should be a general pass that isn't specifically for this one thing. There may be other SelectionDAG workarounds we want to put in here

rampitec added inline comments.Oct 22 2018, 1:18 PM


ronlieb updated this revision to Diff 173850.Nov 13 2018, 7:18 AM

Rebased with latest llvm, pushed up to trigger some AMD internal Jenkins testing.
Will work on the subregs suggestions next.

ronlieb updated this revision to Diff 173972.Nov 13 2018, 6:13 PM

Added additional comments and some code related to subregs.

ronlieb marked 9 inline comments as done.Nov 13 2018, 6:18 PM
arsenm added inline comments.Nov 14 2018, 9:18 AM

All of these places listing SReg_64* should probably just avoid it since we have so many variants. Something like
AMDGPU::getRegBitWidth() == 64 && TRI->isSGPRClass(), although maybe we should have an isSGPR64 or something

ronlieb updated this revision to Diff 174085.Nov 14 2018, 12:45 PM

Per review comments, now using getRegBitWidth(),
which exposed a missing case in getRegBitWidth() for AMDGPU::SReg_64_XEXECRegClassID

ronlieb marked an inline comment as done.Nov 14 2018, 12:47 PM
This revision is now accepted and ready to land.Nov 15 2018, 10:21 AM
arsenm accepted this revision.Nov 15 2018, 11:07 AM


ronlieb closed this revision.Nov 16 2018, 2:08 PM

rL347008: [AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST