This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Extend the SI Load/Store optimizer
ClosedPublic

Authored by piotr on Jul 18 2019, 3:58 AM.

Details

Summary

Extend the SI Load/Store optimizer to merge MIMG load instructions. Handle
different flavours of image_load and image_sample instructions.

When the instructions of the same subclass differ only in dmask, merge
them and update dmask accordingly.

Diff Detail

Event Timeline

piotr created this revision.Jul 18 2019, 3:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2019, 3:58 AM

I still think we should be handling these on the IR level

lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
459–460 ↗(On Diff #210524)

Can this actually, legitimately fail? I would expect this to be an assert

498 ↗(On Diff #210524)

C++ style comment

1114–1115 ↗(On Diff #210524)

Weird to put the assert string first

1120–1125 ↗(On Diff #210524)

static const

test/CodeGen/AMDGPU/merge-image-load.mir
4 ↗(On Diff #210524)

Can you include the merged MMO in the check lines

piotr updated this revision to Diff 210829.Jul 19 2019, 7:05 AM

V2: Addressed review comments.

I still think we should be handling these on the IR level

Matt, where do you think would be the right place to do this at the IR level? Presumably you have an existing pass in mind?

I still think we should be handling these on the IR level

Matt, where do you think would be the right place to do this at the IR level? Presumably you have an existing pass in mind?

Either a new pass, or teach the LoadStoreVectorizer about intrinsics. Things are generally easier in the IR, and I don't think this needs any information only available after selection

piotr added a comment.Jul 19 2019, 8:14 AM

I think a generic pass would not be suited for our image instructions, due to the dmask special treatment. I had decided to extend the si-load-store-opt pass, because similar transformations were already handled there.

piotr updated this revision to Diff 218359.Sep 2 2019, 6:54 AM

Rebased and added missing 'const' in SILoadStoreOptimizer::dmasksCanBeCombined.

piotr updated this revision to Diff 222992.EditedOct 3 2019, 5:29 AM

A recent patch in SILoadOptimizer (D65496) conflicted with this patch. I have rebased it to the latest master, but for clarity I will split the review into two parts: a separate review for the NFC refactoring (D68384) and the current review only for merging MIMG instruction. I will update the current review once D68384 has been merged.

piotr updated this revision to Diff 223187.Oct 4 2019, 4:51 AM

Rebased after D68384 has been merged.

nhaehnle added inline comments.Oct 8 2019, 6:20 AM
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
322–325 ↗(On Diff #223187)

This should probably check mayStore instead of mayLoad: we want to exclude both stores and atomics.

You could also move the check for TFE and LWE to here.

1051 ↗(On Diff #223187)

Please use an if-statement.

piotr updated this revision to Diff 224585.Oct 11 2019, 7:00 AM

Rebased and addressed review comments.

piotr marked 2 inline comments as done.Oct 11 2019, 7:07 AM
piotr added inline comments.
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
322–325 ↗(On Diff #223187)

Good point about atomics, I added the condition to bail out on mayStore()). I am keeping !mayLoad() to avoid merging IMAGE_GET_RESINFO.

For TFE/LWE I would like to keep the checks where they are, because I dislike extending getInstClass() with Instruction argument, and it would be necessary to query the actual value of TFE/LWE.

nhaehnle added inline comments.Oct 14 2019, 9:37 AM
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
322–325 ↗(On Diff #223187)

Can IMAGE_GET_RESINFO not be merged? I would kind of expect it can...

Fair point about TFE/LWE.

piotr marked an inline comment as done.Oct 14 2019, 11:03 PM
piotr added inline comments.
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
322–325 ↗(On Diff #223187)

Yes, ideally we would like to combine them too, but currently it would not work, because the pass makes assumptions that the instruction to be merged is either a load or store (assertion thrown otherwise) . I will add a TODO in the code to make it clear that IMAGE_GET_RESINFO is not handled.

piotr updated this revision to Diff 224956.Oct 14 2019, 11:14 PM

Added TODO.

This revision is now accepted and ready to land.Oct 15 2019, 2:11 AM
This revision was automatically updated to reflect the committed changes.