This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][LSV] Restrict forming extra large vectors
Needs Review · Public

Authored by piotr on May 30 2023, 4:58 AM.

Details

Reviewers: None
Group Reviewers: Restricted Project

Summary

Restrict the bitwidth of the largest vector type used in the load/store
vectorizer (LSV) to 128 bits for the buffer and constant address spaces.

This avoids a potential SGPR pressure increase in shaders where multiple
resources are used. There is not enough context in LSV to determine whether
forming large vectors is beneficial for performance, and currently there is no
late phase in the compiler that would split vectors if register pressure were
too high (it could be argued that one should be added).

The extra-large loads/stores could still be formed late in the backend in
si-load-store-optimizer, which also has some logic to avoid unbounded register
pressure increases, with the exception of s_load_dwordx16, which is not formed
there. s_load_dwordx16 is a tricky instruction to get right anyway, because
it can cause massive register pressure and fragmentation.
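
The restriction described above can be illustrated with a minimal, standalone sketch. It is not the code from this revision: the AddrSpace enum and the names cappedVectorBits and dwords are hypothetical and exist only for this example, and leaving other address spaces unchanged is an assumption of the sketch. It shows how a 128-bit cap limits a load in the buffer or constant address space to 4 dwords, whereas a 512-bit vector corresponds to the 16 dwords of s_load_dwordx16.

```cpp
// Hypothetical stand-ins for address spaces; these are NOT the real
// AMDGPUAS enumerators or their numeric values.
enum class AddrSpace { Flat, Global, Constant, Buffer };

constexpr unsigned kDwordBits = 32;          // one dword is 32 bits
constexpr unsigned kRestrictedMaxBits = 128; // cap named in the summary

// Sketch: clamp the requested vector width (in bits) for the buffer and
// constant address spaces; other address spaces are left untouched here
// (an assumption of this example).
constexpr unsigned cappedVectorBits(AddrSpace AS, unsigned RequestedBits) {
  const bool Restricted =
      AS == AddrSpace::Buffer || AS == AddrSpace::Constant;
  return (Restricted && RequestedBits > kRestrictedMaxBits)
             ? kRestrictedMaxBits
             : RequestedBits;
}

// Number of 32-bit dwords covered by a vector of the given bit width.
constexpr unsigned dwords(unsigned Bits) { return Bits / kDwordBits; }

// A 512-bit request in the constant address space is clamped to 4 dwords.
static_assert(dwords(cappedVectorBits(AddrSpace::Constant, 512)) == 4,
              "128-bit cap corresponds to a 4-dword load");
// Without the cap, 512 bits would span 16 dwords (the x16 case).
static_assert(dwords(512) == 16, "512 bits is 16 dwords");
```

Under these assumptions, narrower requests pass through unchanged, so only the extra-large cases are affected.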

Event Timeline

piotr created this revision. May 30 2023, 4:58 AM
Herald added a project: Restricted Project. May 30 2023, 4:58 AM
piotr requested review of this revision. May 30 2023, 4:58 AM
piotr added a reviewer: Restricted Project. May 30 2023, 5:00 AM

What we should do is teach rematerialization to split scalar loads

piotr added a comment. May 30 2023, 6:25 AM

> What we should do is teach rematerialization to split scalar loads

Yes, that would work as well. Just to note, currently scalar loads are not rematerializable at all.