This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: SimplifyDemandedElts for image intrinsics
ClosedPublic

Authored by arsenm on Apr 13 2017, 11:16 AM.

Details

Reviewers
arsenm
mareko
Summary

Causes some VGPR usage improvements in shaderdb, but
introduces some SGPR spilling regressions due to random
scheduling changes later.

https://ghostbin.com/paste/ghrtj

Diff Detail

Event Timeline

arsenm created this revision. Apr 13 2017, 11:16 AM
mareko added inline comments. Apr 13 2017, 12:46 PM
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
1734

Gather4 opcodes always return 4 VGPRs, and DMASK has a different meaning for them. Specifically, Gather4 reads 4 texels from memory and DMASK selects which color component is returned for those texels (i.e. 4x red channel, or 4x green channel, etc.). So DMASK shouldn't be changed by the compiler for gather4 opcodes.
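To make the distinction concrete, here is a minimal standalone sketch of how a DMASK could be narrowed from the set of demanded result elements. It is not the code from this patch: the function name `shrinkDmask` and the `isGather4` flag are hypothetical, and it assumes that for non-gather4 image opcodes the i-th returned element corresponds to the i-th set bit of DMASK.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not taken from the patch.
// dmask:        4-bit channel mask of the image instruction.
// demandedElts: bit i is set if result element i is actually used.
// isGather4:    true for gather4 opcodes, where DMASK selects a single
//               channel for all 4 texels and must be left untouched.
uint32_t shrinkDmask(uint32_t dmask, uint32_t demandedElts, bool isGather4) {
  if (isGather4)
    return dmask; // Always returns 4 VGPRs; DMASK is a channel selector.

  uint32_t newDmask = 0;
  unsigned resultIdx = 0; // Index of the result element for the current bit.
  for (unsigned bit = 0; bit < 4; ++bit) {
    uint32_t mask = 1u << bit;
    if (!(dmask & mask))
      continue; // Channel not enabled, so it occupies no result element.
    if (demandedElts & (1u << resultIdx))
      newDmask |= mask; // Keep only channels whose element is demanded.
    ++resultIdx;
  }
  return newDmask;
}
```

Under these assumptions, an image sample with DMASK 0xF whose users only read element 0 could be narrowed to DMASK 0x1, while a gather4 with DMASK 0x2 keeps its mask regardless of which elements are read.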

arsenm updated this revision to Diff 95308. Apr 14 2017, 9:48 AM

Don't handle gather4 (is getlod OK?)

mareko edited edge metadata. Apr 14 2017, 12:55 PM

GETLOD should be OK.

LGTM.

arsenm accepted this revision. Apr 17 2017, 8:26 AM

r300453

This revision is now accepted and ready to land. Apr 17 2017, 8:26 AM
arsenm closed this revision. Apr 17 2017, 8:26 AM