This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: More bits of frame index are known to be zero
ClosedPublic

Authored by arsenm on Feb 22 2016, 6:15 PM.

Details

Reviewers
tstellarAMD
Summary

The maximum private allocation for the whole GPU is 4G,
so the maximum possible frame index for a single workitem is
that maximum size divided by the smallest dispatch granularity.

This increases the number of high bits known to be zero, which
enables more offset folding. Under this bound the maximum private
size per workitem is 128M, though in practice it may be smaller still.
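
The arithmetic behind that bound can be sketched as follows. This is an illustration only, not the code from the patch: the helper names are hypothetical, and the granularity of 32 workitems is an assumption inferred from the two figures in the summary (4G / 128M = 32).

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not the code from this revision.

// Total private (scratch) memory for the whole GPU, per the summary: 4G.
constexpr uint64_t MaxPrivateBytesTotal = uint64_t(1) << 32;

// Largest per-workitem private size when the total allocation is divided
// by the smallest dispatch granularity (measured in workitems).
constexpr uint64_t maxPrivateBytesPerWorkitem(uint64_t MinDispatchWorkitems) {
  return MaxPrivateBytesTotal / MinDispatchWorkitems;
}

// Number of high bits of a 32-bit frame index that must be zero when every
// valid index is strictly less than LimitBytes (0 < LimitBytes <= 2^32).
constexpr unsigned knownZeroHighBits32(uint64_t LimitBytes) {
  unsigned BitsNeeded = 0;
  // Count the bits required to represent the largest valid index.
  for (uint64_t MaxIndex = LimitBytes - 1; MaxIndex != 0; MaxIndex >>= 1)
    ++BitsNeeded;
  return 32 - BitsNeeded;
}

// Assumed granularity of 32 workitems (inferred, not stated in the summary).
static_assert(maxPrivateBytesPerWorkitem(32) == (uint64_t(1) << 27),
              "4G / 32 = 128M");
static_assert(knownZeroHighBits32(maxPrivateBytesPerWorkitem(32)) == 5,
              "indices below 128M leave the top 5 of 32 bits zero");
```

Under that assumption every frame index fits in 27 bits, so the top 5 bits of a 32-bit index are zero. A larger minimum granularity, or a smaller real per-workitem limit, would only increase the number of known zero bits.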

Event Timeline

arsenm updated this revision to Diff 48763. Feb 22 2016, 6:15 PM
arsenm retitled this revision to AMDGPU: More bits of frame index are known to be zero.
arsenm updated this object.
arsenm added a reviewer: tstellarAMD.
arsenm added a subscriber: llvm-commits.
tstellarAMD accepted this revision. Feb 24 2016, 7:05 PM
tstellarAMD edited edge metadata.

LGTM.

This revision is now accepted and ready to land. Feb 24 2016, 7:05 PM
arsenm closed this revision. Feb 27 2016, 12:31 PM

r262153