Make MIMG NSA minimum addresses threshold an attribute that can
be set on a function or configured via command line.
This enables frontend tuning which allows increased NSA usage
where beneficial.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Seems reasonable.
I would also be in favour of changing the default to 2. That would tend to introduce fewer "mov"s to shuffle data around, at the expense of using a larger encoding on average for mimg instructions. But I think that is a good trade-off to make for the sake of performance.
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp | ||
---|---|---|
959 | Use max() | |
963 | Use max() |
Thank you for the quick review!
I did test the idea of using default 2, but I am not convinced yet that the benefit is universal enough.
Out of ~20000 pipelines, ~2000 had higher VGPR usage with threshold 2 and ~1000 had lower VGPR usage.
When VGPR usage decreases, it does so more beneficial ways though.
Spilling decreased in 2 pipelines, and only increased in 1.
Out of ~20000 pipelines, ~2000 had higher VGPR usage with threshold 2 and ~1000 had lower VGPR usage.
That's weird. I can't see why enabling NSA would consistently cause higher vgpr usage.
Out of ~20000 pipelines, ~2000 had higher VGPR usage with threshold 2 and ~1000 had lower VGPR usage.
That's weird. I can't see why enabling NSA would consistently cause higher vgpr usage.
FYI I looked at one case where this happens and it was caused by GCNNSAReassign making strange (well, different) decisions. So now I need to try to understand what that pass does.
Yes, I think we probably need to revisit the logic in GCNNSAReassign.
I seem to remember it lack some support for subregisters which might be part of the issue.
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp | ||
---|---|---|
59 | True, it would be nice to have this number only once. | |
959 | Pretty sure it is required, because std:max does not know how to handle cl::opt. |
On the "don't repeat yourself" principle you could either remove the cl::init here...