This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix computation for getOccupancyWithLocalMemSize
ClosedPublic

Authored by arsenm on Mar 2 2020, 11:35 AM.

Details

Summary

The computation here didn't really make sense to me, and reported
wildy different results depending on the flat work group size
attribute.

I think this should really report a range derived from the possible
work group size bounds, and only allow an occupancy that is a multiple
of the group size.

Diff Detail

Event Timeline

arsenm created this revision.Mar 2 2020, 11:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2020, 11:35 AM
rampitec added inline comments.Mar 2 2020, 12:00 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
361

Yes, we cannot split a group. I think it is better to uncomment this.

arsenm marked an inline comment as done.Mar 2 2020, 12:28 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
361

The problem is this assumes the very worst case. This would hit 0 with the default / maximum group size, and need clamping to 1. This breaks every test. I was thinking I would try to apply this after changing this to report a range of occupancies

This revision is now accepted and ready to land.Mar 2 2020, 12:36 PM