This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Limit promote alloca to vector with VGPR budget
ClosedPublic

Authored by rampitec on Jul 1 2020, 12:11 PM.

Details

Summary

Allow only up to 1/4 of available VGPRs for the vectorization
of any given alloca.

Diff Detail

Event Timeline

rampitec created this revision. Jul 1 2020, 12:11 PM
Herald added a project: Restricted Project. Jul 1 2020, 12:11 PM
arsenm added inline comments. Jul 1 2020, 12:30 PM
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
444

This seems like a huge default. I thought we previously limited this to 16 VGPRs. There should probably also be a cl::opt for this.

rampitec marked an inline comment as done. Jul 1 2020, 1:02 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
444

We did not limit it to 16 VGPRs but to 16 elements, and that limit is still in place. The new limit solves a different problem: not running out of registers when only a limited number of them is available. For example, a workgroup size of 1024 leaves us with 64 VGPRs, so the limit would be 16.
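
The arithmetic in the comment above can be sketched as follows. This is only an illustration, not the code from the patch: the helper names vgprBudgetForAlloca and fitsVGPRBudget are hypothetical, and the sketch assumes that the budget is compared against the alloca size measured in 32-bit VGPRs (one dword each). Only the 1/4 fraction and the 64 -> 16 example come from this review.

```cpp
// Hypothetical helper: share of the available VGPRs that a single
// vectorized alloca may use. The 1/4 fraction is from the revision summary.
constexpr unsigned vgprBudgetForAlloca(unsigned AvailableVGPRs) {
  return AvailableVGPRs / 4;
}

// Hypothetical check: does an alloca of AllocaBytes fit into the budget?
// Assumption (not from the patch): each VGPR holds one 32-bit dword,
// so the size is rounded up to whole dwords before comparing.
constexpr bool fitsVGPRBudget(unsigned AllocaBytes, unsigned AvailableVGPRs) {
  return (AllocaBytes + 3) / 4 <= vgprBudgetForAlloca(AvailableVGPRs);
}

// The example from the review: 64 available VGPRs give a budget of 16.
static_assert(vgprBudgetForAlloca(64) == 16, "64 VGPRs -> budget of 16");
// A 64-byte alloca (16 dwords) fits; a 68-byte one (17 dwords) does not.
static_assert(fitsVGPRBudget(64, 64), "16 dwords fit into a budget of 16");
static_assert(!fitsVGPRBudget(68, 64), "17 dwords exceed a budget of 16");
```

With more registers available the budget grows proportionally, which is why this limit is separate from the fixed 16-element limit mentioned above.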

rampitec updated this revision to Diff 274902. Jul 1 2020, 1:17 PM
rampitec marked an inline comment as done.

Added cl::opt.

arsenm accepted this revision. Jul 1 2020, 3:26 PM
This revision is now accepted and ready to land. Jul 1 2020, 3:26 PM
This revision was automatically updated to reflect the committed changes.