gfx90a allows the number of ACC registers (AGPRs) to be set
independently to the VGPR registers. For both HSA and PAL metadata, we
now include an "agpr_count" key to report the number of AGPRs set for
supported devices (gfx90a, gfx908, as determined by hasMAIInsts()).
This is collected from SIProgramInfo.NumAccVGPR for both HSA and PAL.
The AsmParser also now recognizes ".kernel.agpr_count" for supported
devices.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
50 ms | x64 debian > LLVM.Bindings/Go::go.test |
Event Timeline
Removing const from MCSymbol variable and creating fall through case for getTotalNumVgprs()
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp | ||
---|---|---|
1158 | The const does seem to be unnecessary. Should I remove from lines 1134 and 1144 as well, or leave that for another patch? |
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp | ||
---|---|---|
1168 | Really should avoid repeating this then. Usually things that need to be shared between codegen and MC go in AMDGPUBaseInfo |
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp | ||
---|---|---|
1168 | Ok perfect, I was looking for a good way to do this. So would you recommend lifting the implementation of getTotalNumVGPRs from AMDGPUResourceUsageAnalysis.cpp into Utils/AMDGPUBaseInfo.cpp, possibly as a separate patch? Moreover, there’s actually already a function defined as “getTotalNumVGPRs” in Utils/AMDGPUBaseInfo.cpp, but its implementation is different than what we’re looking for in the AsmParser. Here are the two overlapping functions. AMDGPUResourceUsageAnalysis.cpp int32_t AMDGPUResourceUsageAnalysis::SIFunctionResourceInfo::getTotalNumVGPRs( const GCNSubtarget &ST, int32_t ArgNumAGPR, int32_t ArgNumVGPR) const { if (ST.hasGFX90AInsts() && ArgNumAGPR) return alignTo(ArgNumVGPR, 4) + ArgNumAGPR; return std::max(ArgNumVGPR, ArgNumAGPR); } Utils/AMDGPUBaseInfo.cpp unsigned getTotalNumVGPRs(const MCSubtargetInfo *STI) { if (STI->getFeatureBits().test(FeatureGFX90AInsts)) return 512; if (!isGFX10Plus(*STI)) return 256; return STI->getFeatureBits().test(FeatureWavefrontSize32) ? 1024 : 512; } Are the overlapping names here intentional? Would it make sense to rename one of these in a separate patch, for example the AMDGPUResourceUsageAnalysis one to “getTotalNumUsedVGPRs” or something similar? |
clang-format: please reformat the code