This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Resolve issues when picking between ds_read/write and ds_read2/write2
ClosedPublic

Authored by mbrkusanin on Dec 7 2020, 7:50 AM.

Details

Summary

Both ds_read_b128 and ds_read2_b64 are valid for 128-bit, 16-byte-aligned
loads, but the one that will be selected is determined either by the order in
tablegen or by the AddedComplexity attribute. Currently ds_read_b128 has
priority.

While ds_read2_b64 has lower alignment requirements, we cannot always
restrict ds_read_b128 to 16-byte alignment because of the
unaligned-access-mode option. This was causing ds_read_b128 to be selected
for 8-byte aligned loads regardless of the chosen access mode.

To resolve this we use two patterns for selecting ds_read_b128. One
requires 16-byte alignment and the other requires the
unaligned-access-mode option.

The same goes for ds_write2_b64 and ds_write_b128.
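
The intended outcome for a 128-bit load can be sketched as a small
standalone model. This is only an illustration of the rule stated in the
summary, not the TableGen patterns from the patch: the names DSLoad and
selectDSLoad128 are hypothetical, the Other case (alignment below 8 bytes
without unaligned-access-mode) is an assumption the summary does not
discuss, and whether an unaligned access is legal at all is decided
elsewhere and is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the selection rule; not LLVM code.
enum class DSLoad { ReadB128, Read2B64, Other };

// Picks the instruction for a 128-bit LDS load, given the known alignment
// in bytes and whether unaligned-access-mode is enabled.
DSLoad selectDSLoad128(uint32_t alignBytes, bool unalignedAccessMode) {
  // First ds_read_b128 pattern: requires 16-byte alignment.
  if (alignBytes >= 16)
    return DSLoad::ReadB128;
  // Second ds_read_b128 pattern: requires unaligned-access-mode.
  if (unalignedAccessMode)
    return DSLoad::ReadB128;
  // Otherwise an 8-byte aligned load falls through to ds_read2_b64.
  if (alignBytes >= 8)
    return DSLoad::Read2B64;
  // Assumption: lower alignments are handled some other way.
  return DSLoad::Other;
}
```

Under this model an 8-byte aligned load picks ds_read2_b64 unless
unaligned-access-mode is enabled, which is the behaviour the summary asks
for. The write case would presumably mirror it with ds_write_b128 and
ds_write2_b64.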

Event Timeline

mbrkusanin created this revision. Dec 7 2020, 7:50 AM
mbrkusanin requested review of this revision. Dec 7 2020, 7:50 AM
foad added inline comments. Dec 7 2020, 8:00 AM
llvm/lib/Target/AMDGPU/DSInstructions.td
683

Can't we relax this predicate to "isGFX7Plus" instead of duplicating the patterns?

mbrkusanin updated this revision to Diff 310111. Dec 8 2020, 1:55 AM
mbrkusanin marked an inline comment as done.
mbrkusanin added inline comments.
llvm/lib/Target/AMDGPU/DSInstructions.td
683

Right. That way we do not increase the number of patterns.

foad accepted this revision. Dec 10 2020, 3:34 AM

Looks good, thanks!

This revision is now accepted and ready to land. Dec 10 2020, 3:34 AM