This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][IGLP]: Add rules to SchedGroups
ClosedPublic

Authored by jrbyrnes on Mar 23 2023, 5:41 PM.

Details

Summary

As desired mutations are trending towards groups of "patterns" of instructions, and more complex mutations are desired, it makes sense to allow finer grain specification of SchedGroups. Previously, SchedGroups could only be defined in terms of the classes of Instructions allowed in the SchedGroup. This introduces the "rule" framework which allows for specifying SchedGroups based on how they relate to other SchedGroups.

Rules allow for greater expression of SchedGroups, while simultaneously reducing the number of SchedGroups an SU can map to, making things easier for PipelineSolver.

Diff Detail

Event Timeline

jrbyrnes created this revision.Mar 23 2023, 5:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2023, 5:41 PM
jrbyrnes requested review of this revision.Mar 23 2023, 5:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2023, 5:41 PM
jrbyrnes updated this revision to Diff 507929.Mar 23 2023, 5:43 PM

Remove accidental files

I like the general approach. It seems like things could get unwieldy with larger SchedGroups. You would need to have lots of checks vs Collection.size() which could be somewhat hard to work with.

Maybe allowing rules to look into other already built SchedGroups could be helpful? That way you could avoid some of the Collection.size() stuff.

Have you tried to implement Po Yen's proposal, or do you have a general idea about how this patch may try to implement it?

I like the general approach. It seems like things could get unwieldy with larger SchedGroups. You would need to have lots of checks vs Collection.size() which could be somewhat hard to work with.

Maybe allowing rules to look into other already built SchedGroups could be helpful? That way you could avoid some of the Collection.size() stuff.

Have you tried to implement Po Yen's proposal, or do you have a general idea about how this patch may try to implement it?

Thanks for the thoughts Austin.

First -- regarding Po Yen's proposal.
I have a general idea of how this patch may to implement it. Based on his presentation, it seems like we may want the following sched groups:

A. (1) ds_write, (1) buffer_load, (1), mfma

  • WAR dep on ds_write and buffer_load
  • no dep on buffer_load and mfma
  • sched group is interlinked (add edge between ds_w, buffer_load and buffer_load, mfma) (already partially linked)

B. (3) ds_read, (2) mfma

  • RAW deps on the ds_reads and mfmas
  • sched group is interlinked (already linked)

The A,B sched groups would be in different sync_pipelines (each sync_pipeline is just a repeat of the corresponding schedgroup # DAG MFMA / # SG MFMA times). We may want to include some MFMA sched groups to preserve their initial ordering (the more rules we have on SchedGroups, the fewer legal permutation for PipelineSolver).

This is just a general idea towards how it could work. For the actual optimization, I would want to look at the actual assembly produced and encode that into sched groups. Moreover, I would want to look into regressions.

Second -- regarding the collection.size() issue:

Yes, the collections.size() calls will probably become hard to manage. However, I think without grouping together related (by a rule) schedgroups, there will be an opposite problem of finding the corresponding schedgroup that is being referenced by a rule. Both these issues can likely be resolved with composite SchedGroups, or having the collection be an aggregate of structs, each with their own size / mask / rule(s). This way we can group together related collections (each easily sized), and allow cross inspection via rules.

jrbyrnes updated this revision to Diff 517939.Apr 28 2023, 8:33 AM

Enable splitting of SchedGroups -- enable rules to find relevant other SchedGroups.

jrbyrnes retitled this revision from [AMDGPU][IGLP] WIP/Demo: Add rules to SchedGroups to [AMDGPU][IGLP]: Add rules to SchedGroups.Apr 28 2023, 8:35 AM
jrbyrnes edited the summary of this revision. (Show Details)

Looks good thanks. I think adding some more structure around rules i.e. having things like rule types for common patterns could be a good addition in the future. Having these open-ended functions could lead to a lot of repeat code and disorganization if the number of strategies gets higher. Having base rules like isSuccessorWithProperty for example.

llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
926

Can you add an addrule function in SchedGroup so that we don't need to do this extra copying?

jrbyrnes updated this revision to Diff 528504.Jun 5 2023, 10:40 AM
jrbyrnes marked an inline comment as done.

Hi @kerbowa, thanks for your comment. I agree that there needs to be thought into code organization (in particular offering bases for patterns) -- https://reviews.llvm.org/D149773 shows how quickly the code can become messy.

jrbyrnes updated this revision to Diff 528507.Jun 5 2023, 10:44 AM

Add comment for new addRule method.

kerbowa accepted this revision.Jun 6 2023, 4:31 PM

LGTM after removing the DemoRules variable.

llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
926

DemoRules is unused now.

This revision is now accepted and ready to land.Jun 6 2023, 4:31 PM
jrbyrnes updated this revision to Diff 529123.Jun 6 2023, 7:05 PM
jrbyrnes marked an inline comment as done.

Remove unused variable

This revision was landed with ongoing or failed builds.Jun 6 2023, 7:20 PM
This revision was automatically updated to reflect the committed changes.