This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Merge BUFFER_LOAD_DWORD_OFFEN into x2, x4
ClosedPublic

Authored by mareko on Oct 16 2017, 6:04 AM.

Details

Summary

9.9% code size decrease in affected shaders.

Totals (changed stats only):
SGPRS: 2151462 -> 2170646 (0.89 %)
VGPRS: 1634612 -> 1640288 (0.35 %)
Spilled SGPRs: 8942 -> 8940 (-0.02 %)
Code Size: 52940672 -> 51727288 (-2.29 %) bytes
Max Waves: 373066 -> 371718 (-0.36 %)

Totals from affected shaders:
SGPRS: 283520 -> 302704 (6.77 %)
VGPRS: 227632 -> 233308 (2.49 %)
Spilled SGPRs: 3966 -> 3964 (-0.05 %)
Code Size: 12203080 -> 10989696 (-9.94 %) bytes
Max Waves: 44070 -> 42722 (-3.06 %)
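
The percentages above can be re-derived from the raw before/after counts. The following is a minimal sketch for checking them; the helper name `percent_change` is made up for illustration and is not part of any shader-stats tooling.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Before/after totals copied from the "affected shaders" block above.
affected = {
    "SGPRS": (283520, 302704),
    "VGPRS": (227632, 233308),
    "Spilled SGPRs": (3966, 3964),
    "Code Size": (12203080, 10989696),
    "Max Waves": (44070, 42722),
}

for name, (before, after) in affected.items():
    print(f"{name}: {before} -> {after} ({percent_change(before, after):.2f} %)")
```

The same helper applied to the "changed stats only" totals gives the smaller figures in the first block, since those are measured against a larger baseline.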

Diff Detail

Repository
rL LLVM

Event Timeline

mareko created this revision. Oct 16 2017, 6:04 AM
This revision is now accepted and ready to land. Oct 23 2017, 8:43 AM
arsenm edited edge metadata. Oct 23 2017, 9:42 AM

We should probably be doing this in IR. Supporting merging in this pass is still useful, since it's likely we would only ever want to form the X8/X16 versions when we have better register pressure info.
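
For readers unfamiliar with the kind of merging under discussion, here is a rough, simplified sketch of the idea: single-dword loads that share the same address operands and sit at contiguous 4-byte offsets are combined into x2 or x4 loads. It is not the pass's actual algorithm. The `BufferLoad` record, its field names, and the greedy strategy are all assumptions made for illustration, and the sketch ignores instruction ordering, dependencies, aliasing, and offset encoding limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferLoad:
    """Hypothetical model of a buffer load; not an LLVM data structure."""
    vaddr: str    # VGPR holding the per-lane offset (the OFFEN operand)
    srsrc: str    # SGPRs holding the buffer resource descriptor
    soffset: str  # scalar offset operand
    offset: int   # immediate byte offset
    dwords: int = 1  # 1, 2, or 4 dwords loaded


def merge_loads(loads):
    """Greedily merge contiguous single-dword loads into x4, then x2."""
    # Only loads with identical address operands are merge candidates.
    groups = {}
    for ld in loads:
        groups.setdefault((ld.vaddr, ld.srsrc, ld.soffset), []).append(ld.offset)

    merged = []
    for (vaddr, srsrc, soffset), offsets in groups.items():
        offsets = sorted(set(offsets))
        i = 0
        while i < len(offsets):
            width = 1
            for w in (4, 2):  # prefer the widest merge available
                run = offsets[i:i + w]
                if len(run) == w and all(run[k] == run[0] + 4 * k for k in range(w)):
                    width = w
                    break
            merged.append(BufferLoad(vaddr, srsrc, soffset, offsets[i], width))
            i += width
    return merged


loads = [BufferLoad("v0", "s[0:3]", "0", off) for off in (0, 4, 8, 12, 32)]
for ld in merge_loads(loads):
    print(f"offset {ld.offset}: x{ld.dwords}")
```

Fewer, wider loads mean fewer instructions to encode, which is consistent with the code size decrease reported in the summary. Wider loads also need contiguous destination registers, which is one plausible reason register counts rise slightly.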

This revision was automatically updated to reflect the committed changes.