Creates a new scheduling strategy that attempts to maximize ILP for a single
wave.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit.ll | ||
---|---|---|
2–3 | Why has this changed? |
llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit.ll | ||
---|---|---|
2–3 | I renamed the iterative scheduler's cl flags to carry an "iterative" prefix, mostly for clarity and to avoid confusion with the scheduling strategy added in this patch. |
Is scheduling for maximum ILP the same thing as scheduling for minimum latency?
Does this patch have anything in common with lib/Target/AMDGPU/GCNILPSched.cpp? (Is that even maintained?)
llvm/test/CodeGen/AMDGPU/schedule-regpressure-limit.ll | ||
---|---|---|
2–3 | Oh I see. I somehow missed that change in AMDGPUTargetMachine.cpp. |
Yes, it's the same.
Does this patch have anything in common with lib/Target/AMDGPU/GCNILPSched.cpp? (Is that even maintained?)
That is part of the iterative scheduler. I don't know if it is maintained. I did see it was crashing on some lit tests if I changed waves-per-eu.
It was probably made obsolete by the recent changes to the default scheduler. @vpykhtin do you think there is still something useful in there?
llvm/lib/Target/AMDGPU/GCNSchedStrategy.h | ||
---|---|---|
32 | AFAIR PreRARematerialize has to be the last stage, because it was leaving some variables in an inconsistent state. Before the last refactoring there was even a static_assert about that. I see that you are building the stage pipeline within SchedStages, but it probably makes sense to reorder the enum and redefine operator++ to walk SchedStages instead of static-casting integers. |
llvm/lib/Target/AMDGPU/GCNSchedStrategy.h | ||
---|---|---|
32 | The idea is that different SchedStrategies may have different stages or permutations of stages. I already defined operator++ in a previous patch. I'm not sure what casting of integers you are referring to. In this patch, each SchedStrategy has the order of its stages defined in the SchedStages vector. I should make the max occupancy strategy assert that PreRARemat is the last stage in that vector and make all the checks relative to that vector. |
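The check proposed in the reply above could be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in GCNSchedStrategy.h: `StageID`, its enumerators, and `rematIsLastIfPresent` are hypothetical stand-ins, and the real strategy would presumably wrap the condition in an `assert` where it builds its SchedStages vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the stage IDs discussed in this review.
enum class StageID {
  InitialSchedule,
  UnclusteredReschedule,
  ClusteredLowOccupancyReschedule,
  PreRARematerialize,
  ILPInitialSchedule
};

// Returns true when PreRARematerialize is either absent from the pipeline
// or appears only as its final entry. The check is relative to the vector,
// not to the numeric order of the enumerators.
bool rematIsLastIfPresent(const std::vector<StageID> &Stages) {
  for (std::size_t I = 0; I < Stages.size(); ++I)
    if (Stages[I] == StageID::PreRARematerialize && I + 1 != Stages.size())
      return false;
  return true;
}
```

Checking against the vector keeps the invariant local to each strategy, so a strategy that omits the stage passes without any change.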
llvm/lib/Target/AMDGPU/GCNSchedStrategy.h | ||
---|---|---|
32 | I mean the cast at line 42. It might be clearer to take the next stage from the SchedStages vector rather than incrementing GCNSchedStageID numerically. Ten years from now nobody will remember why PreRARematerialize must be the last one, and the code can easily avoid that operator altogether, as runSchedStages already does with a range-based loop over SchedStages. Then imagine you would like to add it to the ILP pipeline too: that would simply break the operator++ logic. |
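To make the concern concrete, here is a small sketch contrasting the two ways of finding the next stage. All names (`StageID`, `nextByEnumOrder`, `nextInPipeline`) and the enumerator order are assumptions made for illustration; they are not taken from the patch.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stage IDs; the enumerator order here is illustrative only.
enum class StageID : int {
  InitialSchedule,
  UnclusteredReschedule,
  ClusteredLowOccupancyReschedule,
  PreRARematerialize,
  ILPInitialSchedule
};

// Numeric increment: only correct if every pipeline runs in enum order.
StageID nextByEnumOrder(StageID S) {
  return static_cast<StageID>(static_cast<int>(S) + 1);
}

// Vector walk: the successor is whatever this pipeline lists next.
std::optional<StageID> nextInPipeline(const std::vector<StageID> &Pipeline,
                                      StageID Current) {
  for (std::size_t I = 0; I + 1 < Pipeline.size(); ++I)
    if (Pipeline[I] == Current)
      return Pipeline[I + 1];
  return std::nullopt; // Current is the last stage or not in this pipeline.
}
```

With a hypothetical ILP pipeline of `{ILPInitialSchedule, PreRARematerialize}`, the vector walk yields `PreRARematerialize` after `ILPInitialSchedule`, while the numeric increment steps past the last enumerator. That is the breakage the comment above anticipates.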