This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Track physreg uses in SILoadStoreOptimizer
ClosedPublic

Authored by nhaehnle on Jan 29 2018, 8:30 AM.

Details

Summary

This handles def-after-use of physregs, and allows us to merge loads and
stores even across some physreg defs (typically M0 defs).

Change-Id: I076484b2bda27c2cf46013c845a0380c5b89b67b
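
The def-after-use handling described in the summary can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from the patch: `Inst`, `MoveState`, `isPhysReg`, `track`, and `addIfDependent` are hypothetical names, registers are plain strings, and a leading `%` is assumed to mark a virtual register. The idea is that an instruction sitting between two mergeable memory operations must be moved along if it reads a tracked def, or if it redefines a physical register (such as M0 or SCC) that an already tracked instruction reads.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified instruction: only the registers it writes and reads.
struct Inst {
  std::set<std::string> Defs;
  std::set<std::string> Uses;
};

// Assumption for this sketch: names starting with '%' are virtual registers
// (SSA, defined once); anything else (m0, scc, ...) is a physical register.
static bool isPhysReg(const std::string &Reg) {
  return !Reg.empty() && Reg[0] != '%';
}

// Registers touched by the instructions that have to be moved so far.
struct MoveState {
  std::set<std::string> RegDefs;     // registers written by tracked instructions
  std::set<std::string> PhysRegUses; // physical registers read by tracked instructions
};

// Record the defs and the physical-register uses of a tracked instruction.
static void track(const Inst &I, MoveState &S) {
  S.RegDefs.insert(I.Defs.begin(), I.Defs.end());
  for (const std::string &R : I.Uses)
    if (isPhysReg(R))
      S.PhysRegUses.insert(R);
}

// Returns true (and starts tracking I) if I has to move together with the
// tracked instructions: either it reads one of their defs, or it redefines
// a physical register that one of them reads (def-after-use).
static bool addIfDependent(const Inst &I, MoveState &S) {
  bool ReadsTrackedDef =
      std::any_of(I.Uses.begin(), I.Uses.end(),
                  [&](const std::string &R) { return S.RegDefs.count(R) != 0; });
  bool RedefinesTrackedUse =
      std::any_of(I.Defs.begin(), I.Defs.end(), [&](const std::string &R) {
        return isPhysReg(R) && S.PhysRegUses.count(R) != 0;
      });
  if (!ReadsTrackedDef && !RedefinesTrackedUse)
    return false;
  track(I, S);
  return true;
}
```

In this model, an M0 def that follows a tracked instruction reading M0 is reported as dependent, while an M0 def seen when no tracked instruction reads M0 is not, which is what leaves room to merge across it.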

Event Timeline

nhaehnle created this revision. Jan 29 2018, 8:30 AM

I think you need a test with LDS combining that does not merge on VI but does merge on GFX9, because of the M0 defs.

mareko added inline comments. Jan 29 2018, 2:29 PM
test/CodeGen/AMDGPU/smrd.ll
248–252

Merging of these opcodes is disabled on GFX9 because of the cache line straddling bug.

A MIR test where an SCC use and def need to be moved would be good.

nhaehnle updated this revision to Diff 135247. Feb 21 2018, 6:49 AM
nhaehnle marked an inline comment as done.
  • fixed comment in test
  • add test to distinguish between gfx9 and pre-gfx9 wrt LDS M0 use
  • add MIR test for moving SCC def/use
mareko added inline comments. Feb 21 2018, 9:46 AM
test/CodeGen/AMDGPU/smrd.ll
248–252

Note that merging buffer loads is fully enabled on GFX9 now. See git commit ea06ecf3436ea455a7f304095ebf7f4f4ec989f3.

This revision is now accepted and ready to land. Feb 21 2018, 11:33 AM
This revision was automatically updated to reflect the committed changes.