This patch implements the necessary framework to allow users to specify custom ordering in the IGroupLP mutation. Specifically, it implements a parser and extracts some common IGroup data to a table, to force agreement between the producer and consumer of the IGroupLP ordering. In the current iteration, users can only specify an IGroup once (including via subgroup -- e.g. can't have VMEMWrite and VMEM groups). The user is expected to provide a simple comma separated string of IGroups, with optional sizes occuring after a specified IGroup.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp | ||
---|---|---|
40–41 | Can these be removed now since the size is embedded into amdgpu-igrouplp-order? | |
52 | Can you comment that this is a map from IGroupClass and just use an enum class? | |
100 | Can this just reuse/replace SchedBarrierMasks? | |
103 | These enum declarations don't follow style conventions https://llvm.org/docs/CodingStandards.html | |
161 | As above, should use a bitmask enum so the mappings are more explicit. | |
210 | So this parser is just validating the input? Is there some way to avoid parsing twice? | |
451 | Remove commented code. | |
454 | Can we record the order in the enums somehow instead of re-parsing the strings? |
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp | ||
---|---|---|
40–41 | The idea was to default to the prior iteration's style if no amdgpu-igrouplp-order was specified, but I see how these options may be confusing. I'll remove them. | |
210 | I can rethink the design a bit. Would you prefer the CLI parser to create the PipelineOrderGroups (e.g. Pipeline used in adding edges)? I can probably implement this here using enum and array of constructors. Thanks for comments. |
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp | ||
---|---|---|
40–41 | I mostly just want to avoid having so many cl options. An alternative is to make them "really hidden". | |
210 | Seems like we could use a SchedGroup factory that just needs the parser to tell what the isMemberFn and size are. Creating and initializing two classes that do essentially the same thing seems not ideal, but I could be misunderstanding IGroupTableEntry. |
Addressed review comments.
Refactor implementation to change the product of parsing. The parser now generates a {SchedGroupMask, Optional<unsigned>} pair to represent the IGroup and MaxSize, which is directly used to construct SchedGroups in the mutation.
Mainly enabled by removal of canAddMIFn in favor of a common canAddMI for the SchedGroup class. Due to this, it is no longer necessary to map command line strings to canAddMIs. Additionally, handleGroup harcodes a map between Groups -> SubGroups (note: handleGroup will be removed in next iteration, as multiple groups of same type will be allowed). These two things allowed for removal of IGroupTable.
We have since taken the UI for this framework in a different direction. Most of what can be achieved through this UI can be achieved with already existing controls / interface (e.g. SGB). I don't see any need to pursue this further, will reopen if there are opposite opinions.
Can these be removed now since the size is embedded into amdgpu-igrouplp-order?