Page MenuHomePhabricator

AMDGPU add support for spilling to a user sgpr pointed buffers
ClosedPublic

Authored by airlied on Oct 9 2016, 11:53 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

airlied updated this revision to Diff 74108.Oct 9 2016, 11:53 PM
airlied retitled this revision from to AMDGPU add support for spilling to a user data SREG address..
airlied updated this object.
airlied set the repository for this revision to rL LLVM.
arsenm edited edge metadata.Oct 10 2016, 12:41 AM

I think ABI changes should be encoded as part of the triple (a radv environment type?) rather than a subtarget feature. This is supposed to be just the pointer and not the full scratch resource descriptor?

Alternatively we could just switch mesa to always doing this and remove the relocations

I think ABI changes should be encoded as part of the triple (a radv environment type?) rather than a subtarget feature. This is supposed to be just the pointer and not the full scratch resource descriptor?

Hmm I'll try and investigate adding a radv env triple tomorrow.

Just a pointer, I don't have space for 4 user gprs here. An alternative plan of action is to make the first descriptor on the first pointer in the user sgprs to point to the spill descriptor, I'm not sure if that causes problems though.

Alternatively we could just switch mesa to always doing this and remove the relocations

I'm not sure the GL driver has enough spare user sgprs to allow this to happen.

nhaehnle edited edge metadata.Oct 10 2016, 1:08 AM
nhaehnle added a subscriber: llvm-commits.

As mentioned on mesa-dev, I like the general approach, but instead of a machine feature to enable this I'd use a function attribute which designates one of the function arguments as the source for the spill pointer, e.g. "amdgpu-spill-ptr=4" to take the pointer from the fifth function argument (which would typically be SGPR 8/9).

I'm not sure the GL driver has enough spare user sgprs to allow this to happen.

True, GL will actually need to do some indirect loading at least for compute shaders. Also, we should really set the size part of the descriptor correctly anyway, so there's not much benefit to loading a full descriptor instead of just a 64-bit pointer.

airlied updated this revision to Diff 74195.Oct 10 2016, 10:35 PM
airlied edited edge metadata.

Updated diff to spill to a 64-bit buffer, pointed to by 64-bits in the buffer pointed to by SGPR0/1.

arsenm added inline comments.Oct 13 2016, 6:17 AM
lib/Target/AMDGPU/SIFrameLowering.cpp
245–270 ↗(On Diff #74195)

Formatting for this all looks wrong

247 ↗(On Diff #74195)

This is confusingly named because it's not S_MOV_B64

248 ↗(On Diff #74195)

You can just use the literal AMDGPU::SGPR0_SGPR1 if you're just using a known constant register. However, I would expect this to be getting the register from the MachineFunctionInfo rather than hardcoding it again inside FrameLowering

253 ↗(On Diff #74195)

This shouldn't be undef, the read value does matter

256 ↗(On Diff #74195)

Because this is using physical registers here, the sub register index isn't used (also sub0_sub1 of s[0:1] doesn't make sense)

256 ↗(On Diff #74195)

Because you are creating a load you should add a MachineMemOperand or else it will unduly constrain the post-RA scheduler. There are a few other dummy places that create read-only memory operands for similar purposes.

Also because it's a load I would hope we could do this much earlier than frame lowering. I'm thinking about how to properly initialize m0 in the HSA ABI which also requires a load so I will probably end up taking care of that eventually

airlied updated this revision to Diff 81697.Dec 15 2016, 5:53 PM
airlied retitled this revision from AMDGPU add support for spilling to a user data SREG address. to AMDGPU add support for spilling to a user sgpr pointed buffers.
airlied updated this object.
airlied edited edge metadata.
airlied removed rL LLVM as the repository for this revision.

This is missing the part touching the argument lowering. When the input sgpr0/1 is going to be used it needs to be marked in the initial argument lowering in LowerFormalArguments, or else the other argument usage might end up thinking it can use these

lib/Target/AMDGPU/SIFrameLowering.cpp
316 ↗(On Diff #81697)

I think this needs a better name, that definitely doesn't involve the word spill. We currently mix the terms scratch and private memory. How about hasPrivateMemoryPointerInput()?

326 ↗(On Diff #81697)

hasIndirectPrivaetMemoryPointerInput? I assume you are putting other things in this buffer besides just this one pointer, so maybe the name should be whatever you want to call that buffer.

333–335 ↗(On Diff #81697)

These go over 80 lines

338–340 ↗(On Diff #81697)

Usually we put a comment with the name of the operand after each one for the more complicated instructions, e.g. // offset

lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
132–135 ↗(On Diff #81697)

I don't really like using the attributes this way naming specific registers. This needs to always be available, so I don't see why you need to explicitly enable this particularly in the indirect case.

mareko added a subscriber: mareko.Dec 28 2016, 3:07 PM
mareko added inline comments.
lib/Target/AMDGPU/SIFrameLowering.cpp
326 ↗(On Diff #81697)

There are no other things in the scratch buffer. Only LLVM uses it. It doesn't matter what it's called from the Mesa's point of view.

lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
132–135 ↗(On Diff #81697)

I don't understand the comment. Of course it's always available. The driver just chooses one of the methods: 1) scratch relocation; 2) the pointer is in SGPR01; 3) the pointer is at the address pointed to by SGPR01

airlied updated this revision to Diff 82643.Dec 28 2016, 7:18 PM
airlied edited edge metadata.

This rewrites the patch, using a new triple OsType (needs renaming to something better), and adds an intrinsic to get access to the private buffer.

arsenm added inline comments.Dec 28 2016, 9:00 PM
include/llvm/IR/IntrinsicsAMDGPU.td
103–105 ↗(On Diff #82643)

The name shouldn't include private because this is the pointer to the buffer with more than just the private resource descriptor.

lib/Target/AMDGPU/AMDGPUSubtarget.h
156 ↗(On Diff #82643)

Did you try adding the version number to the end and checking that instead of adding a new OS?

airlied updated this revision to Diff 85525.Jan 23 2017, 8:42 PM

updates after taking to Tom on irc, still might cause a flag day.

arsenm accepted this revision.Jan 23 2017, 9:05 PM

LGTM except for some pedantry

lib/Target/AMDGPU/AMDGPUSubtarget.cpp
300 ↗(On Diff #85525)

Looks like it's over 80 columns

lib/Target/AMDGPU/AMDGPUSubtarget.h
318 ↗(On Diff #85525)

C++ style comment, capitalized

327 ↗(On Diff #85525)

Extra line

lib/Target/AMDGPU/SIRegisterInfo.cpp
1114–1117 ↗(On Diff #85525)

No return after else

This revision is now accepted and ready to land.Jan 23 2017, 9:05 PM
airlied updated this revision to Diff 85529.Jan 23 2017, 9:20 PM

fix pedantry
fix two bugs found in testing.

airlied updated this revision to Diff 85531.Jan 23 2017, 9:48 PM

fixed a regression in the when to emit spill setup path, should have just added isMesaGfx option not removed the other one.

This revision was automatically updated to reflect the committed changes.