We want more load/store clustering, but we also want
to keep register pressure low, and these are opposing goals.
Allow the scheduler to reschedule regions without mutations
applied if we hit a register limit.
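The fallback described above can be sketched as a small toy model. This is not the GCNSchedStrategy code: `Region`, `Result`, `scheduleRegion`, `scheduleWithFallback`, and the pressure numbers are all hypothetical names and values chosen for illustration, and keeping the mutation-free schedule only when it lowers pressure is an assumption rather than something the patch description states.

```cpp
#include <cassert>

// Hypothetical toy model: a region yields one register pressure when
// scheduled with mutations (clustering, macro fusion) and another without.
struct Region {
  unsigned PressureWithMutations;
  unsigned PressureWithoutMutations;
};

struct Result {
  unsigned Pressure;   // pressure of the schedule that was kept
  bool UsedMutations;  // whether the kept schedule had mutations applied
  unsigned Passes;     // how many times the region was scheduled
};

// Stand-in for running the scheduler on one region (assumption: the real
// scheduler would compute pressure, here it is simply looked up).
unsigned scheduleRegion(const Region &R, bool EnableMutations) {
  return EnableMutations ? R.PressureWithMutations
                         : R.PressureWithoutMutations;
}

// Schedule with mutations first. Only when the register limit is exceeded,
// reschedule the same region with mutations disabled.
Result scheduleWithFallback(const Region &R, unsigned RegLimit) {
  unsigned WithMut = scheduleRegion(R, /*EnableMutations=*/true);
  if (WithMut <= RegLimit)
    return {WithMut, true, 1};

  unsigned WithoutMut = scheduleRegion(R, /*EnableMutations=*/false);
  // Assumption: keep the mutation-free schedule only if it lowers pressure.
  if (WithoutMut < WithMut)
    return {WithoutMut, false, 2};
  return {WithMut, true, 2};
}
```

With a hypothetical limit of 24, a region with pressures `{30, 20}` is scheduled twice and keeps the mutation-free schedule, while a region with pressures `{22, 20}` is scheduled once and keeps its clustering.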
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This tries scheduling with all mutations disabled, including macro fusion, right?
Instead of adding another scheduling pass could you try always disabling mutations for the first pass, and have them enabled for the second pass with a fallback if we drop occupancy?
Yes. Macro fusion should have the same impact on register pressure as clustering. It is a form of clustering in a sense.
> Instead of adding another scheduling pass could you try always disabling mutations for the first pass, and have them enabled for the second pass with a fallback if we drop occupancy?
I thought about it but chose not to. It would effectively double the scheduling time. Note that I am only rescheduling when we run out of registers. If I did it the other way, I would always have to attempt rescheduling. As it stands, the impact on compile time should be minimal.
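The compile-time argument can be made concrete by counting scheduler runs under each strategy. The sketch below is a rough model only: the function names and the `HitsLimit` input are hypothetical, and it assumes each scheduling attempt on a region costs one run and that the alternative needs at least two runs per region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical input: HitsLimit[i] is true when region i exceeds the
// register limit after being scheduled with mutations enabled.

// Strategy in this patch: one run per region, plus one extra run only
// for regions that hit the register limit.
std::size_t runsWithFallbackOnly(const std::vector<bool> &HitsLimit) {
  std::size_t Runs = 0;
  for (bool Hit : HitsLimit)
    Runs += Hit ? 2 : 1;
  return Runs;
}

// Alternative raised in review: a first pass without mutations and a
// second pass with them, for every region (assumed two runs per region).
std::size_t runsWithAlwaysTwoPasses(const std::vector<bool> &HitsLimit) {
  return 2 * HitsLimit.size();
}
```

Under this model, four regions of which one hits the limit cost five runs with the fallback-only approach and eight with the always-two-passes approach; the two only converge when every region runs out of registers.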
LGTM.
| llvm/lib/Target/AMDGPU/GCNSchedStrategy.h | |
|---|---|
| 97 | Maybe a BitVector? Don't have a strong opinion though. |