Extend the SI Load/Store optimizer to merge MIMG load instructions. Handle
different flavours of image_load and image_sample instructions.
When the instructions of the same subclass differ only in dmask, merge
them and update dmask accordingly.
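The dmask update can be illustrated with a small standalone sketch. It is not the code from the patch: the function names are hypothetical, and the combinability rule is an assumption (every channel enabled in the smaller mask must lie below every channel enabled in the larger one, so that the merged result splits cleanly back into the two original results).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: index of the lowest set bit. Precondition: mask != 0.
unsigned lowestSetBit(uint32_t mask) {
  unsigned pos = 0;
  while ((mask & 1u) == 0) {
    mask >>= 1;
    ++pos;
  }
  return pos;
}

// Assumed rule for this sketch: two non-empty 4-bit dmasks can be combined
// only if all channels of the smaller mask are below the lowest channel of
// the larger mask. This also rejects overlapping and identical masks.
bool dmasksCanBeCombined(uint32_t a, uint32_t b) {
  if (a == 0 || b == 0 || a > 0xF || b > 0xF)
    return false;
  uint32_t maxMask = std::max(a, b);
  uint32_t minMask = std::min(a, b);
  return minMask < (1u << lowestSetBit(maxMask));
}

// The merged instruction enables the union of both channel sets.
uint32_t mergeDMasks(uint32_t a, uint32_t b) {
  assert(dmasksCanBeCombined(a, b) && "masks overlap or interleave");
  return a | b;
}
```

Under these assumptions, masks 0x3 and 0xC merge into 0xF, while 0x5 and 0x2 are rejected because their channels interleave.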
Differential D64911
[AMDGPU] Extend the SI Load/Store optimizer
Closed, Public. Authored by piotr on Jul 18 2019, 3:58 AM.
Event Timeline

I still think we should be handling these at the IR level.

Matt, where do you think would be the right place to do this at the IR level? Presumably you have an existing pass in mind?

Either a new pass, or teach the LoadStoreVectorizer about intrinsics. Things are generally easier in the IR, and I don't think this needs any information that is only available after selection.

I think a generic pass would not be suited to our image instructions, because of the special treatment that dmask requires. I decided to extend the si-load-store-opt pass because similar transformations were already handled there.

A recent patch to SILoadStoreOptimizer (D65496) conflicted with this patch. I have rebased it onto the latest master, but for clarity I will split the review into two parts: a separate review for the NFC refactoring (D68384), and the current review only for merging MIMG instructions. I will update the current review once D68384 has been merged.

piotr added inline comments.
This revision is now accepted and ready to land. Oct 15 2019, 2:11 AM
Closed by commit rG02baaca742f7: [AMDGPU] Extend the SI Load/Store optimizer (authored by piotr). Oct 16 2019, 3:19 AM
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 222992
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
test/CodeGen/AMDGPU/merge-image-load.mir
test/CodeGen/AMDGPU/merge-image-sample.mir
This should probably check mayStore instead of mayLoad: we want to exclude both stores and atomics.
You could also move the check for TFE and LWE to here.
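The two suggestions above can be sketched as a single predicate. This is a hedged illustration only: `InstrFlags` and `isMergeableImageLoad` are hypothetical stand-ins for the properties queried on a machine instruction, and the sketch assumes that atomics are modeled as both loading and storing and that instructions with TFE or LWE set are simply skipped.

```cpp
#include <cassert>

// Hypothetical stand-in for the instruction properties discussed above.
struct InstrFlags {
  bool mayLoad;   // instruction reads memory
  bool mayStore;  // instruction writes memory
  bool tfe;       // TFE bit set
  bool lwe;       // LWE bit set
};

// A merge candidate is a pure load with neither TFE nor LWE set.
bool isMergeableImageLoad(const InstrFlags &I) {
  // Checking mayStore rejects stores and, under the assumption above,
  // atomics as well; a check on mayLoad alone would let atomics through.
  if (I.mayStore)
    return false;
  // Assumed here: TFE/LWE variants are not merged.
  if (I.tfe || I.lwe)
    return false;
  return I.mayLoad;
}

// Small self-check covering a plain load, an atomic, and a TFE load.
void checkExamples() {
  assert(isMergeableImageLoad({true, false, false, false}));
  assert(!isMergeableImageLoad({true, true, false, false}));
  assert(!isMergeableImageLoad({true, false, true, false}));
}
```

Keeping all of the rejection conditions in one place means later merge logic only has to reason about plain loads.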