This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Document amdgpu_cs_chain[_preserve] CCs. NFC
ClosedPublic

Authored by rovka on Jun 2 2023, 6:45 AM.

Details

Summary

Co-authored-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Diff Detail

Event Timeline

rovka created this revision.Jun 2 2023, 6:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2023, 6:45 AM
rovka requested review of this revision.Jun 2 2023, 6:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2023, 6:45 AM
Herald added a subscriber: wdng. · View Herald Transcript

Thanks, this looks pretty reasonable to me.

llvm/docs/AMDGPUUsage.rst
1127

This shouldn't be a property of the calling convention. We should just set the alignment attribute in the frontend to ensure this.

foad added a subscriber: foad.Jun 6 2023, 1:33 AM
foad added inline comments.
llvm/docs/AMDGPUUsage.rst
1111–1112

I feel like this is missing some context. What is "an llvm.amdgcn.cs.chain sequence in the function epilog"???

1113–1118

I'm not sure any of this belongs in a calling convention description. The dealloc_vgprs thing applies to all kernels regardless of calling convention, but probably doesn't need to be documented to the end user anyway.

1120–1121

So they're not launched by hardware? That feels like the biggest difference from amdgpu_cs, and should probably be mentioned first.

nhaehnle added a subscriber: t-tye.
nhaehnle added inline comments.
llvm/docs/AMDGPUUsage.rst
1111–1112

It's a new intrinsic that belongs to the same set of changes, so while the context is missing from this patch, I think it's fair to let it be added implicitly by the change that adds the intrinsic.

1113–1118

It is relevant if anybody wanted to try writing compatible code via some non-LLVM mechanism. @t-tye may have opinions on this as well.

arsenm added a subscriber: arsenm.Jun 6 2023, 5:33 AM
arsenm added inline comments.
llvm/docs/AMDGPUUsage.rst
1127

Why? Entry points require 256 byte align but regular code is fine with 4

nhaehnle added inline comments.Jun 6 2023, 11:28 AM
llvm/docs/AMDGPUUsage.rst
1127

Entry points require 256 for HW reasons. The 64 bytes we discussed is a SW choice that is orthogonal to the base functionality of amdgpu_cs_chain (making LSBs of function pointers available to stuff metadata in -- which we also could have considered doing with amdgpu_gfx functions).

rovka updated this revision to Diff 529522.Jun 8 2023, 12:55 AM

Address some of the previous comments and alphabetise.

rovka marked an inline comment as done.Jun 8 2023, 1:02 AM
rovka added inline comments.
llvm/docs/AMDGPUUsage.rst
1113–1118

Ok, I'll wait for more feedback on this.

1120–1121

Good point, fixed.

1127

I agree this might not be the most pertinent place to put it, but it feels like relevant information, like you said above. I don't feel very strongly about this, we could just omit it (or mention it elsewhere if you think there's a better place for it).

nhaehnle added inline comments.Jun 12 2023, 1:24 AM
llvm/docs/AMDGPUUsage.rst
1127

Yeah, but the alignment is relevant information in the context of the specific use of this feature in LLPC. I talked with @ruiling and he's going to use Function::setAlignment to explicitly set the desired alignment. So I'd prefer to remove it here. Or perhaps rephrase it as an "FYI, there's this use of the feature which is going to set explicit alignments on the functions". But at least to me that just seems a little weird.

nhaehnle added inline comments.Jun 12 2023, 1:26 AM
llvm/docs/AMDGPUUsage.rst
1127

And I think one big reason why it feels a little weird is that there could be tons of little "FYIs" that we could add about this feature. Why is this one special? And adding all of them would just blow up this document...

foad added inline comments.Jun 12 2023, 1:39 AM
llvm/docs/AMDGPUUsage.rst
1113–1118

It is relevant if anybody wanted to try writing compatible code via some non-LLVM mechanism.

Then I don't understand what the part about waitcnts, starting from "Waits for regular memory counters are not inserted", is trying to tell me. (And I speak as someone who is quite familiar with the dealloc vgprs issue!) It is written in the style "the compiler does X". Could you rephrase it more like "the required state at a function call boundary is Y"?

rovka updated this revision to Diff 530858.Jun 13 2023, 4:12 AM
rovka marked an inline comment as done.

Remove the bits about function alignment and MSG_DEALLOC_VGPRs.

nhaehnle accepted this revision.Jun 19 2023, 9:13 AM
This revision is now accepted and ready to land.Jun 19 2023, 9:13 AM
This revision was automatically updated to reflect the committed changes.