Page MenuHomePhabricator

[AMDGPU] Don't combine memory intrs to v3i16

Authored by Flakebi on Jul 21 2020, 2:14 AM.



v3i16 and v3f16 currently cannot be legalized and lowered so they should
not be emitted by inst combining.

Moved the check down to still allow extracting 1 or 2 elements via the dmask.

Fixes image intrinsics being combined to return v3x16.

Diff Detail

Event Timeline

Flakebi created this revision.Jul 21 2020, 2:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 21 2020, 2:14 AM
piotr added a comment.Jul 21 2020, 3:51 AM

Looks good to me as a stop-gap solution.

Is there any real obstacle to handling these? Even the DAG has v3i16/v3f16 types now

I’m also trying to get it working properly (currently for SDag). I think I got the legalization/widening part working but I’m still trying to figure out how to select the right instruction patterns.

The next two weeks I’m on vacation, so it will still take a while. I think Marek wants a slightly quicker fix, probably something in mesa hit this.

arsenm accepted this revision.Jul 21 2020, 10:01 AM

LGTM but I think we should stop hacking around these

This revision is now accepted and ready to land.Jul 21 2020, 10:01 AM
This revision was automatically updated to reflect the committed changes.
mareko added a subscriber: mareko.Jul 29 2020, 2:57 AM

Can you please cherry-pick this to the LLVM 11 branch?

Can you please cherry-pick this to the LLVM 11 branch?