This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Make enable-flat-scratch a subtarget feature
Closed · Public

Authored by sebastian-ne on Feb 10 2022, 3:46 AM.

Details

Summary

Use a subtarget feature instead of a command-line argument to reduce global
state.
We want to enable flat scratch for graphics in some cases, and this
doesn't work well with command-line options.
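
As a rough illustration of why a feature string avoids global state, here is a minimal toy model. It is not the code from this revision: `ToySubtarget` and `parseFeatures` are hypothetical names invented for the sketch, and only the feature name `enable-flat-scratch` and the usual `+name`/`-name` feature-string convention are taken from LLVM. The point is that the setting travels with each subtarget object rather than living in one process-wide flag.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a per-function subtarget; NOT LLVM's real class.
struct ToySubtarget {
  bool SupportsFlatScratch = false; // assumed hardware capability
  bool EnableFlatScratch = false;   // set by "+enable-flat-scratch"

  // Flat scratch is used only if the hardware supports it and it was requested.
  bool useFlatScratch() const {
    return SupportsFlatScratch && EnableFlatScratch;
  }
};

// Parse a comma-separated feature string such as "+enable-flat-scratch,-foo".
// Later entries override earlier ones; unknown features are ignored.
ToySubtarget parseFeatures(const std::string &Features,
                           bool SupportsFlatScratch) {
  ToySubtarget ST;
  ST.SupportsFlatScratch = SupportsFlatScratch;

  std::istringstream In(Features);
  std::string Tok;
  while (std::getline(In, Tok, ',')) {
    // Each entry must be a '+' or '-' followed by a feature name.
    if (Tok.size() < 2 || (Tok[0] != '+' && Tok[0] != '-'))
      continue;
    if (Tok.substr(1) == "enable-flat-scratch")
      ST.EnableFlatScratch = (Tok[0] == '+');
  }
  return ST;
}
```

With this shape, two functions in the same process can be given different feature strings (for example a graphics shader with `+enable-flat-scratch` and a compute kernel without it), which a single global command-line flag cannot express.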

Event Timeline

sebastian-ne created this revision. Feb 10 2022, 3:46 AM
sebastian-ne requested review of this revision. Feb 10 2022, 3:46 AM
Herald added a project: Restricted Project. Feb 10 2022, 3:46 AM
foad added a comment. Feb 10 2022, 4:01 AM

Just curious: why is it not enabled by default on subtargets that support it? Then you would only need the attribute if you wanted to turn it off for some reason.

> Just curious: why is it not enabled by default on subtargets that support it? Then you would only need the attribute if you wanted to turn it off for some reason.

If we do so, I think it only makes sense for gfx10.3+ (there are a couple hardware bugs before that). It would need benchmarking on the compute side.
For graphics there are almost no differences when using flat scratch, but there is one improvement in a case where we spill a lot.

The title is inaccurate: this is a subtarget feature, not an attribute.

sebastian-ne retitled this revision from [AMDGPU] Make enable-flat-scratch an attribute to [AMDGPU] Make enable-flat-scratch a subtarget feature. Feb 10 2022, 6:41 AM
sebastian-ne edited the summary of this revision.
rampitec accepted this revision. Feb 10 2022, 8:55 AM

>> Just curious: why is it not enabled by default on subtargets that support it? Then you would only need the attribute if you wanted to turn it off for some reason.

> If we do so, I think it only makes sense for gfx10.3+ (there are a couple hardware bugs before that). It would need benchmarking on the compute side.
> For graphics there are almost no differences when using flat scratch, but there is one improvement in a case where we spill a lot.

Right, support is incomplete without the _ST addressing mode, so before gfx1030 it can fail under high pressure.

LGTM

This revision is now accepted and ready to land. Feb 10 2022, 8:55 AM
This revision was landed with ongoing or failed builds. Feb 11 2022, 9:23 AM
This revision was automatically updated to reflect the committed changes.