This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Expose CLI controls for IGroup ordering
AbandonedPublic

Authored by jrbyrnes on Jun 16 2022, 11:24 AM.

Details

Summary

This patch implements the necessary framework to allow users to specify custom ordering in the IGroupLP mutation. Specifically, it implements a parser and extracts some common IGroup data to a table, to force agreement between the producer and consumer of the IGroupLP ordering. In the current iteration, users can only specify an IGroup once (including via subgroup -- e.g. can't have VMEMWrite and VMEM groups). The user is expected to provide a simple comma separated string of IGroups, with optional sizes occuring after a specified IGroup.

Diff Detail

Event Timeline

jrbyrnes created this revision.Jun 16 2022, 11:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 11:24 AM
jrbyrnes requested review of this revision.Jun 16 2022, 11:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 11:24 AM
jrbyrnes updated this revision to Diff 437635.Jun 16 2022, 11:37 AM

Fix naming of the parser.

kerbowa added inline comments.Jun 16 2022, 12:18 PM
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
52

Can you comment that this is a map from IGroupClass and just use an enum class?

100

Can this just reuse/replace SchedBarrierMasks?

103

These enum declarations don't follow style conventions https://llvm.org/docs/CodingStandards.html

161

As above, should use a bitmask enum so the mappings are more explicit.

210

So this parser is just validating the input? Is there some way to avoid parsing twice?

254

Can these be removed now since the size is embedded into amdgpu-igrouplp-order?

460

Remove commented code.

463

Can we record the order in the enums somehow instead of re-parsing the strings?

jrbyrnes added inline comments.Jun 16 2022, 12:46 PM
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
210

I can rethink the design a bit. Would you prefer the CLI parser to create the PipelineOrderGroups (e.g. Pipeline used in adding edges)? I can probably implement this here using enum and array of constructors. Thanks for comments.

254

The idea was to default to the prior iteration's style if no amdgpu-igrouplp-order was specified, but I see how these options may be confusing. I'll remove them.

kerbowa added inline comments.Jun 16 2022, 4:24 PM
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
210

Seems like we could use a SchedGroup factory that just needs the parser to tell what the isMemberFn and size are. Creating and initializing two classes that do essentially the same thing seems not ideal, but I could be misunderstanding IGroupTableEntry.

254

I mostly just want to avoid having so many cl options. An alternative is to make them "really hidden".

jrbyrnes updated this revision to Diff 442638.Jul 6 2022, 11:03 AM
jrbyrnes marked 7 inline comments as done.

Addressed review comments.

Refactor implementation to change the product of parsing. The parser now generates a {SchedGroupMask, Optional<unsigned>} pair to represent the IGroup and MaxSize, which is directly used to construct SchedGroups in the mutation.

Mainly enabled by removal of canAddMIFn in favor of a common canAddMI for the SchedGroup class. Due to this, it is no longer necessary to map command line strings to canAddMIs. Additionally, handleGroup harcodes a map between Groups -> SubGroups (note: handleGroup will be removed in next iteration, as multiple groups of same type will be allowed). These two things allowed for removal of IGroupTable.

jrbyrnes updated this revision to Diff 442641.Jul 6 2022, 11:09 AM

Remove unnecessary debug code.

arsenm added inline comments.Aug 17 2023, 3:42 PM
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
315–334

Use StringSwitch

447–456

Who is intended to use this? Users aren't supposed to be exposed to cl::opts like this

jrbyrnes abandoned this revision.Aug 17 2023, 4:11 PM

We have since taken the UI for this framework in a different direction. Most of what can be achieved through this UI can be achieved with already existing controls / interface (e.g. SGB). I don't see any need to pursue this further, will reopen if there are opposite opinions.